Sample dataset:
Price=c(6651, 7255, 25465, 35645, 2556, 3665)
NumberPurchased=c(25, 30, 156, 250, 12, 16)
Type=c("A", "A", "C", "C", "B", "B")
Source=c("GSC", "MYL", "TTC", "ZAF", "CAN", "HLT")
df1 <- data.frame(Price, NumberPurchased, Type, Source)
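I would like to create a new data frame with two additional variables (ID and PurchaseDate), but with more rows, where the number of rows depends on the variable Type.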
The rules I want to apply: if Type = A, PurchaseDate is "2013", "2014". If Type = B, PurchaseDate is "2013". If Type = C, PurchaseDate is "2013", "2014", "2015".
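If Type is A, divide Price and NumberPurchased by 2 and create 2 rows, one for each PurchaseDate specified above. If Type is B, leave the row as is, with PurchaseDate 2013. If Type is C, divide Price and NumberPurchased by 3 and create 3 rows, one for each PurchaseDate specified above.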
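The rules above can be written down as a small lookup table, shown here as a minimal sketch. The names `years_by_type` and `divisor_by_type` are made up for illustration and are not part of the original data.

```r
# Hypothetical lookup table: the purchase years allowed for each Type
years_by_type <- list(
  A = c("2013", "2014"),
  B = "2013",
  C = c("2013", "2014", "2015")
)

# The divisor for Price and NumberPurchased is the number of years
# listed for that Type, which is also the number of rows to create
divisor_by_type <- lengths(years_by_type)
divisor_by_type
```

Under this reading, the number of output rows and the divisor are always the same number, so one value per Type is enough to drive the whole transformation.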
So I would like something like this as the new dataset:
Price=c(3325.5, 3325.5, 3627.5, 3627.5, 8488.3, 8488.3, 8488.3, 11881.6, 11881.6, 11881.6, 2556, 3665)
NumberPurchased=c(12.5, 12.5, 15, 15, 52, 52, 52, 83.3, 83.3, 83.3, 12, 16)
Type=c("A", "A", "A", "A", "C", "C", "C", "C", "C", "C","B", "B")
Source=c("GSC", "GSC", "MYL", "MYL", "TTC","TTC", "TTC", "ZAF", "ZAF","ZAF", "CAN", "HLT")
PurchaseDate=c("2013", "2014", "2013", "2014", "2013", "2014", "2015", "2013", "2014", "2015", "2013", "2013")
ID=c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 6)
df2 <- data.frame(Price, NumberPurchased, Type, Source, PurchaseDate, ID)
Any insights?
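Here is one possible approach. First we create an index for Type (B = 1, A = 2, C = 3), then repeat each row that many times, and then use the data.table package to compute the new variables by group.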
library(data.table)
setDT(df1)[, indx := as.numeric(factor(Type, levels = c("B", "A", "C")))]
# setDT(df1)[, indx := ifelse(Type == "C", 3, 2)] # Alternative index per your comment
df2 <- df1[rep(seq_len(.N), indx)]
df2[, `:=`(
  Price = Price/.N,
  PurchaseDate = 2013:(2013 + (.N - 1)),
  NumberPurchased = NumberPurchased/.N,
  ID = .GRP
),
by = .(Source, Type)][]
# Price NumberPurchased Type Source indx PurchaseDate ID
# 1: 3325.500 12.50000 A GSC 2 2013 1
# 2: 3325.500 12.50000 A GSC 2 2014 1
# 3: 3627.500 15.00000 A MYL 2 2013 2
# 4: 3627.500 15.00000 A MYL 2 2014 2
# 5: 8488.333 52.00000 C TTC 3 2013 3
# 6: 8488.333 52.00000 C TTC 3 2014 3
# 7: 8488.333 52.00000 C TTC 3 2015 3
# 8: 11881.667 83.33333 C ZAF 3 2013 4
# 9: 11881.667 83.33333 C ZAF 3 2014 4
# 10: 11881.667 83.33333 C ZAF 3 2015 4
# 11: 2556.000 12.00000 B CAN 1 2013 5
# 12: 3665.000 16.00000 B HLT 1 2013 6
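For comparison, the same three steps (index, repeat rows, divide) can be sketched in base R without data.table. This is only an illustrative alternative, not the approach used above; the names `n_years`, `times` and `res` are hypothetical, and the sketch assumes Type is a character column rather than a factor.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps Type as
# character so it can be used to index the lookup vector by name
df1 <- data.frame(
  Price = c(6651, 7255, 25465, 35645, 2556, 3665),
  NumberPurchased = c(25, 30, 156, 250, 12, 16),
  Type = c("A", "A", "C", "C", "B", "B"),
  Source = c("GSC", "MYL", "TTC", "ZAF", "CAN", "HLT"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: number of purchase years per Type
n_years <- c(B = 1L, A = 2L, C = 3L)
times <- unname(n_years[df1$Type])

# Repeat each row once per purchase year
res <- df1[rep(seq_len(nrow(df1)), times), ]
rownames(res) <- NULL

# Split the totals evenly across the repeated rows
res$Price <- res$Price / rep(times, times)
res$NumberPurchased <- res$NumberPurchased / rep(times, times)

# Years run 2013, 2014, ... within each original row;
# ID is the position of the original row in df1
res$PurchaseDate <- 2012L + sequence(times)
res$ID <- rep(seq_len(nrow(df1)), times)
res
```

As in the data.table version, PurchaseDate comes out as an integer here; wrapping it in `as.character()` would give the string values shown in the question.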