R equivalent of Excel's "Sumif(s)" function across like columns

DR_

I am fairly new to R (new to this site as well) and trying to understand how to aggregate data across columns in a situation where there is more than 1 identifier (in this case, two: PERSON_ID and PRODUCT_ID).

Please see my example below. To the right of the two identifiers within my data frame are five columns containing weekly sales figures. I need to aggregate the weekly data so that:

1: Week columns with the same name are summed (typically this is something I can easily accomplish in MS Excel using the sumif/sumifs function)

2: Any rows containing the same PERSON_ID and PRODUCT_ID combination are summed as well.

In this particular case, notice that the week of 6/2/2017 appears in more than one column. Meanwhile, PERSON_ID 0003603 appears twice for the same PRODUCT_ID, 3024.

PERSON_ID    PRODUCT_ID    6/23/2017   6/16/2017   6/9/2017   6/2/2017   6/2/2017
0003603      3024          10.000      5.000       4.000      3.000      2.000
0003603      3024          1.000       2.000       3.000      8.000      1.000     
0007654      2111          8.000       3.000       2.000      1.000      0.000
0008885      3025          0.000       0.000       1.000      3.000      9.000
0950645      3024          6.000       5.000       4.000      3.000      2.000

My actual data frame contains in excess of 1 million records, so an approach using the data.table package would be ideal, as far as I can tell.

Can someone please shed some light on how to solve this particular problem in R?

MichaelChirico

melting your data (reshaping long) is the way to go. If I understand what you're after correctly, it's simply:

x = fread('PERSON_ID    PRODUCT_ID  6/23/2017   6/16/2017   6/9/2017    6/2/2017    6/2/2017
0003603 3024    10.000  5.000   4.000   3.000   2.000
0003603 3024    1.000   2.000   3.000   8.000   1.000
0007654 2111    8.000   3.000   2.000   1.000   0.000
0008885 3025    0.000   0.000   1.000   3.000   9.000
0950645 3024    6.000   5.000   4.000   3.000   2.000',
          colClasses = c('character', 'character', rep('numeric', 5L)))

xmlt = 
  melt(x, id.vars = c('PERSON_ID', 'PRODUCT_ID'),
       variable.name = 'week', value.name = 'sales')

xmlt[ , week := as.IDate(week, format = '%m/%d/%Y')]

xmlt[ , .(total_sales = sum(sales)), 
      keyby = .(PERSON_ID, PRODUCT_ID, week)]
    PERSON_ID PRODUCT_ID       week total_sales
#  1:   0003603       3024 2017-06-02          14
#  2:   0003603       3024 2017-06-09           7
#  3:   0003603       3024 2017-06-16           7
#  4:   0003603       3024 2017-06-23          11
#  5:   0007654       2111 2017-06-02           1
#  6:   0007654       2111 2017-06-09           2
#  7:   0007654       2111 2017-06-16           3
#  8:   0007654       2111 2017-06-23           8
#  9:   0008885       3025 2017-06-02          12
# 10:   0008885       3025 2017-06-09           1
# 11:   0008885       3025 2017-06-16           0
# 12:   0008885       3025 2017-06-23           0
# 13:   0950645       3024 2017-06-02           5
# 14:   0950645       3024 2017-06-09           4
# 15:   0950645       3024 2017-06-16           5
# 16:   0950645       3024 2017-06-23           6

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Java

Pandas: Excel sumifs equivalent

From Dev

Excel - SUMIFS for multiple columns

From Dev

SUMIFS function across multiple sheets

From Dev

SUMIFS function excel

From Dev

R: Equivalent for Sumif and Countif by Categorical variable across columns in R

From Dev

What is equivalent function of excel "fdist" function in R?

From Dev

Excel SUMIFS function with complex criteria

From Dev

SQL Server equivalent of Excel's TINV function

From Dev

What is the MATLAB equivalent of Excel's NORMSDIST function?

From Dev

Cognos equivalent of excel's sumif() function

From Dev

C# equivalent of Excel's TINV function

From Dev

What is the MATLAB equivalent of Excel's NORMSDIST function?

From Dev

Cognos equivalent of excel's sumif() function

From Dev

SQL Server equivalent of Excel's TINV function

From Dev

Python's equivalent for R's dput() function

From Dev

Is there a Python equivalent to R's sample() function?

From Java

Equivalent of R's factor function in Pandas

From Dev

Python equivalent of R's head and tail function

From Dev

R: Is there an equivalent of Stata's ibn. function?

From Dev

Expand the for loop across n number of columns by making a function in R

From Dev

Panda's equivalent of R's order() for arranging dataframe columns

From Dev

What is the R equivalent of SQL's "LIKE '%searched_word%'"?

From Dev

Passing multiple values to SUMIFS function in Excel

From Dev

Using Max function result in SumIFS Excel

From Dev

Excel SUMIFS function not returning results consistently

From Dev

Equivalent of Excel Round function

From Dev

SQL Server - any equivalent of Excel's CHOOSE function?

From Dev

SQL round down by significance (equivalent of excel's floor function)

From Dev

Grouped function across multiple columns

Related Related

  1. 1

    Pandas: Excel sumifs equivalent

  2. 2

    Excel - SUMIFS for multiple columns

  3. 3

    SUMIFS function across multiple sheets

  4. 4

    SUMIFS function excel

  5. 5

    R: Equivalent for Sumif and Countif by Categorical variable across columns in R

  6. 6

    What is equivalent function of excel "fdist" function in R?

  7. 7

    Excel SUMIFS function with complex criteria

  8. 8

    SQL Server equivalent of Excel's TINV function

  9. 9

    What is the MATLAB equivalent of Excel's NORMSDIST function?

  10. 10

    Cognos equivalent of excel's sumif() function

  11. 11

    C# equivalent of Excel's TINV function

  12. 12

    What is the MATLAB equivalent of Excel's NORMSDIST function?

  13. 13

    Cognos equivalent of excel's sumif() function

  14. 14

    SQL Server equivalent of Excel's TINV function

  15. 15

    Python's equivalent for R's dput() function

  16. 16

    Is there a Python equivalent to R's sample() function?

  17. 17

    Equivalent of R's factor function in Pandas

  18. 18

    Python equivalent of R's head and tail function

  19. 19

    R: Is there an equivalent of Stata's ibn. function?

  20. 20

    Expand the for loop across n number of columns by making a function in R

  21. 21

    Panda's equivalent of R's order() for arranging dataframe columns

  22. 22

    What is the R equivalent of SQL's "LIKE '%searched_word%'"?

  23. 23

    Passing multiple values to SUMIFS function in Excel

  24. 24

    Using Max function result in SumIFS Excel

  25. 25

    Excel SUMIFS function not returning results consistently

  26. 26

    Equivalent of Excel Round function

  27. 27

    SQL Server - any equivalent of Excel's CHOOSE function?

  28. 28

    SQL round down by significance (equivalent of excel's floor function)

  29. 29

    Grouped function across multiple columns

HotTag

Archive