Aggregate a column based on NAs in a different column

Soheil

I want to aggregate group2 based on NAs in group1:

Datetime            group1  group2
2011-08-08 21:00:00   1       1
2011-08-08 21:10:00   NA      2
2011-08-08 21:20:00   NA      3
2011-08-08 21:30:00   2       4
2011-08-08 21:40:00   NA      5
2011-08-08 21:50:00   NA      6
2011-08-08 22:00:00   3       7

This is my desired output:

Datetime            group1  group2
2011-08-08 21:00:00   1       1
2011-08-08 21:30:00   2       9 
2011-08-08 22:00:00   3       18

Edit: 9=2+3+4 and 18=5+6+7.

I tried the following, but it does not work:

aggregate(group2~group1, data=Data, subset(Data,group1==NA),sum)

Any suggestions are appreciated. Can I do this with aggregate, or should I use a different package?

Rich Scriven

It looks like na.locf from package zoo would be quite useful here.

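Assuming dat is your original data, na.locf with fromLast = TRUE fills each NA in group1 with the next non-NA value below it, so every row is assigned to the group that closes its run. We can then take the Datetime values of the rows where group1 is not NA and use cbind to bring them together with the aggregated group2 data.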

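library(zoo)
# Datetime values of the rows that close each group (group1 is not NA)
Datetime <- dat$Datetime[!is.na(dat$group1)]
# Fill NAs in group1 backward, sum group2 per group, then attach the dates
cbind(Datetime, aggregate(group2~group1, na.locf(dat, fromLast = TRUE), sum))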
#              Datetime group1 group2
# 1 2011-08-08 21:00:00      1      1
# 2 2011-08-08 21:30:00      2      9
# 3 2011-08-08 22:00:00      3     18
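
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same idea. The data frame is rebuilt by hand from the table in the question, and the names filled_group1, sums and result are only illustrative; it also applies na.locf to the group1 column alone rather than to the whole data frame, which is a small variation on the call above.

```r
library(zoo)  # provides na.locf()

# Sample data rebuilt by hand from the question (Datetime kept as plain text here).
dat <- data.frame(
  Datetime = c("2011-08-08 21:00:00", "2011-08-08 21:10:00",
               "2011-08-08 21:20:00", "2011-08-08 21:30:00",
               "2011-08-08 21:40:00", "2011-08-08 21:50:00",
               "2011-08-08 22:00:00"),
  group1 = c(1, NA, NA, 2, NA, NA, 3),
  group2 = 1:7,
  stringsAsFactors = FALSE
)

# Fill each NA in group1 with the next non-NA value below it
# (fromLast = TRUE carries observations backward instead of forward).
filled_group1 <- na.locf(dat$group1, fromLast = TRUE)

# Sum group2 within each filled group.
sums <- aggregate(group2 ~ group1,
                  data = data.frame(group1 = filled_group1, group2 = dat$group2),
                  FUN = sum)

# Attach the timestamp of the row that closes each group.
result <- cbind(Datetime = dat$Datetime[!is.na(dat$group1)], sums)
result
```

The intermediate filled_group1 makes it easy to check the grouping before aggregating: for the sample data it should read 1 2 2 2 3 3 3.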

PS: Thanks for updating/editing your question (+1).
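
On the "can I do it with aggregate?" part: a base R version without zoo also seems possible. The sketch below is an alternative, not part of the approach above; it assumes the last row always has a non-NA group1, and the names closes_group, run_id and filled are made up for illustration. It also uses is.na(), because a comparison such as group1 == NA always evaluates to NA and so cannot be used to select rows.

```r
# Sample data rebuilt by hand from the question.
dat <- data.frame(
  Datetime = c("2011-08-08 21:00:00", "2011-08-08 21:10:00",
               "2011-08-08 21:20:00", "2011-08-08 21:30:00",
               "2011-08-08 21:40:00", "2011-08-08 21:50:00",
               "2011-08-08 22:00:00"),
  group1 = c(1, NA, NA, 2, NA, NA, 3),
  group2 = 1:7,
  stringsAsFactors = FALSE
)

# TRUE on the rows that close a group. Use is.na(): `group1 == NA` is always NA.
closes_group <- !is.na(dat$group1)

# A new run starts on the row after each closing row, so counting the
# closing rows seen so far (shifted down by one) numbers the runs 1, 2, 3, ...
run_id <- cumsum(c(FALSE, head(closes_group, -1))) + 1

# Look up the group1 value that closes each run.
filled <- dat$group1[closes_group][run_id]

# Sum group2 per run and attach the closing timestamps.
sums <- aggregate(group2 ~ group1,
                  data = data.frame(group1 = filled, group2 = dat$group2),
                  FUN = sum)
result <- cbind(Datetime = dat$Datetime[closes_group], sums)
result
```

If the data could end with NA rows after the last non-NA group1, those rows would get an NA group and be dropped by aggregate, so that case would need separate handling.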

