How to calculate running total for prior years in R?

Goulou

I have a set of variables in the dataset -- I want to simply calculate the running total (and the running mean) for all these variables, based on all prior years.

To illustrate. This is how my data looks like, including the total run variable that I want to generate.

country year    X1  X2  X3  X4  X5  running_total

Bahamas 1990    0   0   0   0   1   NA
Bahamas 1991    0   0   1   1   0   1
Bahamas 1992    1   1   0   0   1   3
Bahamas 1993    0   0   0   0   0   6
Bahamas 1994    1   1   0   1   1   6
Bahamas 1995    0   0   1   0   0   10
Bahamas 1996    0   1   0   1   0   11
Bahamas 1997    1   0   1   0   1   13
Bahamas 1998    0   1   0   1   0   16
Bahamas 1999    1   0   1   0   1   18
Bahamas 2000    0   1   0   1   0   21
Bahamas 2001    1   0   1   0   1   23
Bahamas 2002    0   1   0   1   0   26
Bahamas 2003    1   0   0   0   1   28
Bahamas 2004    0   0   0   1   0   30
Bahamas 2005    1   1   0   0   0   31
Bahamas 2006    0   0   1   1   1   33
Bahamas 2007    1   0   0   0   0   36
Bahamas 2008    0   0   1   1   1   37
Bahamas 2009    1   1   0   0   0   40
Bahamas 2010    0   0   1   1   1   42
Bahamas 2011    1   1   0   0   0   45
Bolivia 1990    0   0   0   0   0   NA
Bolivia 1991    0   0   1   1   0   0
Bolivia 1992    0   0   0   0   0   2
Bolivia 1993    0   0   1   0   0   2
Bolivia 1994    0   0   0   0   0   3
Bolivia 1995    0   0   0   0   0   3
Bolivia 1996    0   0   0   0   0   3
Bolivia 1997    0   0   0   0   0   3
Bolivia 1998    0   0   0   0   0   3
Bolivia 1999    0   0   0   0   0   3
Bolivia 2000    0   1   0   1   0   3
Bolivia 2001    0   0   0   0   0   5
Bolivia 2002    0   0   0   0   0   5
Bolivia 2003    0   0   0   0   0   5
Bolivia 2004    0   0   0   0   0   5
Bolivia 2005    0   0   0   0   0   5
Bolivia 2006    0   0   0   0   0   5
Bolivia 2007    0   0   0   0   0   5
Bolivia 2008    0   0   0   0   1   5
Bolivia 2009    0   0   0   0   0   6
Bolivia 2010    0   0   0   0   1   6
Bolivia 2011    0   0   0   0   0   7

Starting year 1990 ==NA. For example, running total for 1991 is based on 1990. Running total for 1992 is based on 1990-1991. running total for 1993 is based on 1990-1992- running total for 1994 is based on 1990-1993. And so on...until 2011. Then it starts the same procedur for new country B.

I tried the following code below but it doesn't work the way I want. Surely, I need to specify it better, but how?

DF$csum <- ave(DF$X1, DF$X2,DF$X3,DF$X4,DF$X5,FUN=cumsum)

In addition, I would like to generate running mean based on the same logic.

Any help here would be much appreciated!

structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Bahamas", "Bolivia"), class = "factor"), year = c(1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L), X1 = c(0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X2 = c(0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X3 = c(0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X4 = c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X5 = c(1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L), running_total = c(NA, 1L, 3L, 6L, 6L, 10L, 11L, 13L, 16L, 18L, 21L, 23L, 26L, 28L, 30L, 31L, 33L, 36L, 37L, 40L, 42L, 45L, NA, 0L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 7L)), .Names = c("country", "year", "X1", "X2", "X3", "X4", "X5", "running_total"), class = "data.frame", row.names = c(NA, -44L))

sindri_baldur
library(data.table)
setDT(df)
df[, xt := X1+X2+X3+X4+X5]
df[, rt2 := shift(cumsum(xt)), by = country]

Actually it can be solved with an one-liner:

df[, rt3 := {xt=X1+X2+X3+X4+X5; shift(cumsum(xt))}, by = country]
# Or as Ryan points out:
df[, rt2 := shift(cumsum(Reduce(`+`, .SD))) , by = country , .SDcols = grep('^X.*', names(df), value = T)]

All resulting in:

    country year X1 X2 X3 X4 X5 running_total xt rt2
 1: Bahamas 1990  0  0  0  0  1            NA  1  NA
 2: Bahamas 1991  0  0  1  1  0             1  2   1
 3: Bahamas 1992  1  1  0  0  1             3  3   3
 4: Bahamas 1993  0  0  0  0  0             6  0   6
 5: Bahamas 1994  1  1  0  1  1             6  4   6
 6: Bahamas 1995  0  0  1  0  0            10  1  10
 7: Bahamas 1996  0  1  0  1  0            11  2  11
 8: Bahamas 1997  1  0  1  0  1            13  3  13
 9: Bahamas 1998  0  1  0  1  0            16  2  16
10: Bahamas 1999  1  0  1  0  1            18  3  18
11: Bahamas 2000  0  1  0  1  0            21  2  21
12: Bahamas 2001  1  0  1  0  1            23  3  23
13: Bahamas 2002  0  1  0  1  0            26  2  26
14: Bahamas 2003  1  0  0  0  1            28  2  28
15: Bahamas 2004  0  0  0  1  0            30  1  30
16: Bahamas 2005  1  1  0  0  0            31  2  31
17: Bahamas 2006  0  0  1  1  1            33  3  33
18: Bahamas 2007  1  0  0  0  0            36  1  36
19: Bahamas 2008  0  0  1  1  1            37  3  37
20: Bahamas 2009  1  1  0  0  0            40  2  40
21: Bahamas 2010  0  0  1  1  1            42  3  42
22: Bahamas 2011  1  1  0  0  0            45  2  45
23: Bolivia 1990  0  0  0  0  0            NA  0  NA
24: Bolivia 1991  0  0  1  1  0             0  2   0
25: Bolivia 1992  0  0  0  0  0             2  0   2
26: Bolivia 1993  0  0  1  0  0             2  1   2
27: Bolivia 1994  0  0  0  0  0             3  0   3
28: Bolivia 1995  0  0  0  0  0             3  0   3
29: Bolivia 1996  0  0  0  0  0             3  0   3
30: Bolivia 1997  0  0  0  0  0             3  0   3
31: Bolivia 1998  0  0  0  0  0             3  0   3
32: Bolivia 1999  0  0  0  0  0             3  0   3
33: Bolivia 2000  0  1  0  1  0             3  2   3
34: Bolivia 2001  0  0  0  0  0             5  0   5
35: Bolivia 2002  0  0  0  0  0             5  0   5
36: Bolivia 2003  0  0  0  0  0             5  0   5
37: Bolivia 2004  0  0  0  0  0             5  0   5
38: Bolivia 2005  0  0  0  0  0             5  0   5
39: Bolivia 2006  0  0  0  0  0             5  0   5
40: Bolivia 2007  0  0  0  0  0             5  0   5
41: Bolivia 2008  0  0  0  0  1             5  1   5
42: Bolivia 2009  0  0  0  0  0             6  0   6
43: Bolivia 2010  0  0  0  0  1             6  1   6
44: Bolivia 2011  0  0  0  0  0             7  0   7
    country year X1 X2 X3 X4 X5 running_total xt rt2

この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。

侵害の場合は、連絡してください[email protected]

編集
0

コメントを追加

0

関連記事

分類Dev

How to calculate the running total using aggregate

分類Dev

How to calculate running total for a SortedMap / TreeMap

分類Dev

How to calculate total cost

分類Dev

How can I calculate network measures for each account only considering the last 6 month period prior to the transaction date in r?

分類Dev

Pandas: Calculate running difference between prior record between date-time stamps

分類Dev

How to count the total minutes from two dates with different years?

分類Dev

How to calculate total seconds of many overlapping DateTimes

分類Dev

How to calculate with php the total with a VAT percentage in PHP

分類Dev

how to calculate total amount of Jenkin's jobs

分類Dev

How to calculate age (in years) based on Date of Birth and getDate()

分類Dev

Bernoulli Prior in R STAN

分類Dev

Oracle SQL - trying to calculate running total with Group By without having an existing numerical column to sum

分類Dev

How to calculate the Total amount depending upon the subtotal and shipping charges in php?

分類Dev

How to calculate the percent of the overall total sum, per entry?

分類Dev

How to get all the values of one column and calculate a total

分類Dev

How to apply status on a account that has its running total reaching zero

分類Dev

SQL Running total with reset

分類Dev

Running total in Microsoft SSMS

分類Dev

Calculate total duration for properties in object

分類Dev

How to calculate the Cartesian Power of a list in R

分類Dev

how calculate a conditional mutate in R with dplyr?

分類Dev

How do I calculate the total sum of my cart to my dictionary list based on user input menu?

分類Dev

Running total based on ng:repeat

分類Dev

Sum Running Total in Crystal Report

分類Dev

Javascript: Add or Subtract to running total

分類Dev

Rewrite running total sql query

分類Dev

Decreasing running total in oracle sql

分類Dev

how to start postgres prior to run docker

分類Dev

calculate total value and get percentage value of each using that total value

Related 関連記事

  1. 1

    How to calculate the running total using aggregate

  2. 2

    How to calculate running total for a SortedMap / TreeMap

  3. 3

    How to calculate total cost

  4. 4

    How can I calculate network measures for each account only considering the last 6 month period prior to the transaction date in r?

  5. 5

    Pandas: Calculate running difference between prior record between date-time stamps

  6. 6

    How to count the total minutes from two dates with different years?

  7. 7

    How to calculate total seconds of many overlapping DateTimes

  8. 8

    How to calculate with php the total with a VAT percentage in PHP

  9. 9

    how to calculate total amount of Jenkin's jobs

  10. 10

    How to calculate age (in years) based on Date of Birth and getDate()

  11. 11

    Bernoulli Prior in R STAN

  12. 12

    Oracle SQL - trying to calculate running total with Group By without having an existing numerical column to sum

  13. 13

    How to calculate the Total amount depending upon the subtotal and shipping charges in php?

  14. 14

    How to calculate the percent of the overall total sum, per entry?

  15. 15

    How to get all the values of one column and calculate a total

  16. 16

    How to apply status on a account that has its running total reaching zero

  17. 17

    SQL Running total with reset

  18. 18

    Running total in Microsoft SSMS

  19. 19

    Calculate total duration for properties in object

  20. 20

    How to calculate the Cartesian Power of a list in R

  21. 21

    how calculate a conditional mutate in R with dplyr?

  22. 22

    How do I calculate the total sum of my cart to my dictionary list based on user input menu?

  23. 23

    Running total based on ng:repeat

  24. 24

    Sum Running Total in Crystal Report

  25. 25

    Javascript: Add or Subtract to running total

  26. 26

    Rewrite running total sql query

  27. 27

    Decreasing running total in oracle sql

  28. 28

    how to start postgres prior to run docker

  29. 29

    calculate total value and get percentage value of each using that total value

ホットタグ

アーカイブ