Diff on each subset of a data frame column

Melissa Salazar

I have a data frame with ID, year, and month columns. I need to group by year and month and get the unique IDs in each group. I then want to compare each group's unique IDs with those of the prior year/month group and count how many IDs were added and how many were removed.

I'm somewhat shooting in the dark, but I tried the following, which doesn't work:

connections <- df %>%
  group_by(year, month) %>%
  arrange(year, month) %>%
  diff_data(unique(as.vector(~ID)), lag(unique(as.vector(~ID))))

Sample Data

df <- data.frame(ID = c("A1", "A2", "A3", "A1", "A2", "A4", "A1", "A4", "A5"),
                 year = c(2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2012),
                 month = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

Desired Output

Ben

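I would first use aggregate on both year and month to collect the unique IDs in each group. This approach then lists the IDs added and deleted for each month relative to the same month in the prior year, and takes the length of each list to count how many were added and deleted. For the first year of each month there is no prior group, so every ID counts as added and none as deleted.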

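library(tidyverse)

df %>%
  # unique IDs for every year/month combination
  aggregate(ID ~ year + month, ., unique, drop = FALSE) %>%
  # compare within each month, ordered by year
  group_by(month) %>%
  arrange(year) %>%
  # lag(ID) is the same month's ID set from the previous year (NA for the first year)
  mutate(addedID = mapply(setdiff, ID, lag(ID), SIMPLIFY = FALSE),
         num_addedID = lapply(addedID, length),
         deletedID = mapply(setdiff, lag(ID), ID, SIMPLIFY = FALSE),
         # drop the NA from the first year before counting
         num_deletedID = lapply(deletedID, function(x) length(na.omit(x)))) %>%
  ungroup() %>%
  arrange(month, year) %>%
  as.data.frame()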

Output

  year month ID addedID num_addedID deletedID num_deletedID
1 2010     1 A1      A1           1        NA             0
2 2011     1 A1                   0                       0
3 2012     1 A1                   0                       0
4 2010     2 A3      A3           1        NA             0
5 2011     2 A2      A2           1        A3             1
6 2012     2 A4      A4           1        A2             1
7 2010     3 A3      A3           1        NA             0
8 2011     3 A4      A4           1        A3             1
9 2012     3 A5      A5           1        A4             1
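
The added and deleted columns above rest on two calls to setdiff with the arguments swapped. A minimal base R sketch of that set logic, using illustrative values for prev and curr that are not taken from the output, might look like this:

```r
prev <- c("A1", "A2", "A3")   # IDs in the earlier period (illustrative values)
curr <- c("A1", "A2", "A4")   # IDs in the later period (illustrative values)

added   <- setdiff(curr, prev)   # IDs present now but not before
deleted <- setdiff(prev, curr)   # IDs present before but not now

# When there is no earlier period, lag() supplies NA in place of an ID set.
first_added   <- setdiff(curr, NA)   # every current ID counts as added
first_deleted <- setdiff(NA, curr)   # a lone NA, which is why na.omit() is
                                     # applied before counting deletions
n_first_deleted <- length(na.omit(first_deleted))
```

This is why the first year of each month shows NA under deletedID with a count of 0.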

Data used for the output above (the second ID is A3 here, whereas the question's sample has A2)

df <- data.frame(ID=c("A1", "A3", "A3", "A1", "A2","A4", "A1", "A4", "A5"),
                 year= c(2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2012), 
                 month= c(1, 2, 3, 1, 2, 3, 1, 2, 3))
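
The answer compares each month with the same month one year earlier. If the comparison should instead run against the immediately preceding year/month period, a base R sketch along the following lines may help. It uses the question's sample data, and the names ids_by_period, prev_ids, and result are illustrative assumptions, not part of the original posts.

```r
# Sample data from the question
df <- data.frame(
  ID    = c("A1", "A2", "A3", "A1", "A2", "A4", "A1", "A4", "A5"),
  year  = c(2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2012),
  month = c(1, 2, 3, 1, 2, 3, 1, 2, 3)
)

# Zero-padded "YYYY-MM" keys sort chronologically when sorted as text
key <- sprintf("%d-%02d", as.integer(df$year), as.integer(df$month))

# Unique IDs per period, in chronological order
ids_by_period <- lapply(split(df$ID, key), unique)

# ID set of the preceding period; the first period gets an empty set
prev_ids <- unname(c(list(character(0)), head(ids_by_period, -1)))

added   <- Map(setdiff, ids_by_period, prev_ids)   # new in this period
deleted <- Map(setdiff, prev_ids, ids_by_period)   # gone since last period

result <- data.frame(
  period      = names(ids_by_period),
  num_added   = unname(lengths(added)),
  num_deleted = unname(lengths(deleted))
)
print(result)
```

Using an empty character vector for the first period avoids the NA handling needed with lag(), at the cost of treating all IDs in the first period as added.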
