Count rows by group based on values

bvowe

I have a dataframe, DATA, containing students and their test dates. I want to create a variable called WANT that, within each STUDENT, numbers the unique months (not the rows), as shown in the sample of the WANT variable below:

library(dplyr)

set.seed(0)

DATA <- data.frame("STUDENT" = sample(1:5, 100, replace = TRUE),
                   "TESTDATE" = sample(seq(as.Date('2010/01/01'), as.Date('2010/12/31'), by = "day"), 100, replace = TRUE))
    
DATA <- DATA %>% arrange(STUDENT, TESTDATE)
    
DATA$WANT <- c(1,1,1,2,2,3,3,4,4,5,5,6,7,7,8,8,9,1,1,1,2,3,3,4,5,5,6,7,8,8,9,10,10, rep(NA, 67))

My attempt only numbers the rows within each student, which is not what I want:

DATA %>% group_by(STUDENT) %>% mutate(WANT = 1:n())

akrun

We can extract the month part of TESTDATE and use match, or equivalently as.integer(factor(WANT2, levels = unique(WANT2))):

library(dplyr)
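out <- DATA %>%
  group_by(STUDENT) %>%
  # month number of each test date
  mutate(WANT2 = as.integer(format(TESTDATE, '%m')),
         # position of each month among the student's months, in order of first appearance
         WANT2 = match(WANT2, unique(WANT2))) %>%
  ungroup()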

-output

> head(out, 22) %>% as.data.frame
   STUDENT   TESTDATE WANT WANT2
1        1 2010-01-20    1     1
2        1 2010-01-31    1     1
3        1 2010-02-10    2     2
4        1 2010-02-10    2     2
5        1 2010-03-27    3     3
6        1 2010-04-20    4     4
7        1 2010-04-21    4     4
8        1 2010-05-02    5     5
9        1 2010-05-06    5     5
10       1 2010-05-13    5     5
11       1 2010-05-20    5     5
12       1 2010-06-17    6     6
13       1 2010-08-22    7     7
14       1 2010-08-25    7     7
15       1 2010-08-27    7     7
16       1 2010-08-30    7     7
17       1 2010-09-06    8     8
18       1 2010-09-30    8     8
19       1 2010-10-27    9     9
20       1 2010-10-31    9     9
21       1 2010-12-10   10    10
22       1 2010-12-21   10    10
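
The factor alternative mentioned above is not shown in the answer's code, so here is a minimal sketch of it next to match. The data frame toy and the column names MONTH, WANT_MATCH and WANT_FACTOR are made up for illustration and are not part of the question's DATA.

```r
library(dplyr)

# Small hand-made sample (hypothetical, not the question's DATA)
toy <- data.frame(
  STUDENT  = c(1, 1, 1, 1, 2, 2, 2),
  TESTDATE = as.Date(c("2010-01-20", "2010-01-31", "2010-03-05", "2010-03-27",
                       "2010-02-10", "2010-06-17", "2010-06-18"))
)

toy_out <- toy %>%
  group_by(STUDENT) %>%
  mutate(MONTH = format(TESTDATE, "%m"),
         # index of each month among the student's months, in order of appearance
         WANT_MATCH = match(MONTH, unique(MONTH)),
         # same idea: factor levels in order of appearance, converted to integer codes
         WANT_FACTOR = as.integer(factor(MONTH, levels = unique(MONTH)))) %>%
  ungroup()

toy_out$WANT_MATCH   # 1 1 2 2 1 2 2
toy_out$WANT_FACTOR  # 1 1 2 2 1 2 2
```

Both columns restart at 1 for each student because the computation runs inside group_by(STUDENT).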

If the same month in different years should count separately, use the year-month instead:

out <- DATA %>%
  group_by(STUDENT) %>%
  # year-month label, e.g. "2010-01"
  mutate(WANT2 = format(TESTDATE, '%Y-%m'),
         # position of each year-month among the student's year-months
         WANT2 = match(WANT2, unique(WANT2))) %>%
  ungroup()
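
The sample DATA only covers 2010, so the two versions give the same result there. A small sketch with hypothetical dates spanning two years shows where they differ; toy2, BY_MONTH and BY_YEARMONTH are illustrative names, not from the question.

```r
library(dplyr)

# Hypothetical dates spanning two years for a single student
toy2 <- data.frame(
  STUDENT  = c(1, 1, 1, 1),
  TESTDATE = as.Date(c("2010-01-15", "2010-02-03", "2011-01-09", "2011-02-20"))
)

toy2_out <- toy2 %>%
  group_by(STUDENT) %>%
  mutate(M  = format(TESTDATE, "%m"),      # month only
         YM = format(TESTDATE, "%Y-%m"),   # year and month
         BY_MONTH     = match(M, unique(M)),
         BY_YEARMONTH = match(YM, unique(YM))) %>%
  ungroup()

toy2_out$BY_MONTH      # 1 2 1 2  (January 2011 reuses January 2010's index)
toy2_out$BY_YEARMONTH  # 1 2 3 4  (each year-month gets its own index)
```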
