# 纵向数据集中的累积变量构造

``````     ID age work
1   Dan  10    0
2   Dan  11    0
3   Dan  12    0
4   Dan  13    0
5   Dan  14    0
6   Dan  15    3
7   Dan  16    5
8   Dan  17    8
9   Dan  18    5
10  Dan  19   12
11 Jeff  20    0
12 Jeff  16    0
13 Jeff  17    0
14 Jeff  18    0
15 Jeff  19    0
16 Jeff  20    0
17 Jeff  21    8
18 Jeff  22   10
19 Jeff  23   12
20 Jeff  24   12
21 Jeff  25   12
22 Jeff  26   12
23 Jeff  27   12
24 Jeff  28   12
25 Jeff  29   12
``````

``````     ID age work cumwork
1   Dan  10    0       0
2   Dan  11    0       0
3   Dan  12    0       0
4   Dan  13    0       0
5   Dan  14    0       0
6   Dan  15    3       3
7   Dan  16    5       8
8   Dan  17    8      16
9   Dan  18    5      21
10  Dan  19   12      33
11 Jeff  20    0       0
12 Jeff  16    0       0
13 Jeff  17    0       0
14 Jeff  18    0       0
15 Jeff  19    0       0
16 Jeff  20    0       0
17 Jeff  21    8       8
18 Jeff  22   10      18
19 Jeff  23   12      30
20 Jeff  24   12      42
21 Jeff  25   12      54
22 Jeff  26   12      66
23 Jeff  27   12      78
24 Jeff  28   12      90
25 Jeff  29   12     102
``````

``````# Generate test data set
x=data.frame(ID=c(rep("Dan",times=10),rep("Jeff",times=15)),age=c(10:20,16:29),work=c(rep(0,times=5),3,5,8,5,12,rep(0,times=6),8,10,rep(12,times=7)),stringsAsFactors=F)

# Generate cumulative work experience variable
x\$cumwork=x\$work

for(r in 2:nrow(x)){
if(x\$ID[r]==x\$ID[r-1]){
x\$cumwork[r]=x\$cumwork[r-1]+x\$cumwork[r]
}
}
``````

A5C1D2H2I1M1N2O1R2T1

`ave`对于这些类型的任务很方便。您要使用的功能是`cumsum`

``````x\$cumwork <- ave(x\$work, x\$ID, FUN = cumsum)
x
#      ID age work cumwork
# 1   Dan  10    0       0
# 2   Dan  11    0       0
# 3   Dan  12    0       0
# 4   Dan  13    0       0
# 5   Dan  14    0       0
# 6   Dan  15    3       3
# 7   Dan  16    5       8
# 8   Dan  17    8      16
# 9   Dan  18    5      21
# 10  Dan  19   12      33
# 11 Jeff  20    0       0
# 12 Jeff  16    0       0
# 13 Jeff  17    0       0
# 14 Jeff  18    0       0
# 15 Jeff  19    0       0
# 16 Jeff  20    0       0
# 17 Jeff  21    8       8
# 18 Jeff  22   10      18
# 19 Jeff  23   12      30
# 20 Jeff  24   12      42
# 21 Jeff  25   12      54
# 22 Jeff  26   12      66
# 23 Jeff  27   12      78
# 24 Jeff  28   12      90
# 25 Jeff  29   12     102
``````

``````library(data.table)
DT <- data.table(x)
DT[, cumwork := cumsum(work), by = ID]
``````

0条评论