R: Count how many times a value has occurred before within a certain range of rows

sakwa

I have a dataframe like this:

df <- data.frame("subj.no" = rep(1:3, each = 24), 
                 "trial.no" = rep(1:3, each = 8, length.out = 72), 
                 "item" = c(rep(c("ball", "book"), 4), rep(c("doll", "rope"), 4), rep(c("fish", "box"), 4), rep(c("paper", "candle"), 4), rep(c("horse", "marble"), 4), rep(c("doll", "rope"), 4), rep(c("tree", "dog"), 4), rep(c("ball", "book"), 4), rep(c("horse", "marble"), 4)),
                 "rep.no" = rep(1:4, each = 2, length.out = 72),
                 "DV" = c(1,0,1,0,1,0,0,1,1,0,1,0,0,0,1,0,1,0,1,0,1,0,0,0,0,1,1,1,1,0,0,1,0,1,1,0,0,1,0,1,1,1,0,1,0,0,
                      1,0,0,1,1,0,1,0,0,1,1,1,1,0,0,0,0,0,0,1,0,1,0,1,1,0))

I now want to create another column, DV.no, which says how many times the value 1 has occurred so far within that combination of subj.no, trial.no and item (i.e. that this is the nth occurrence of 1). For DV == 0, the value in the new column should be 0.

So the resulting vector should look like this:

DV.no = c(1,0,2,0,3,0,0,1,1,0,2,0,0,0,3,0,1,0,2,0,3,0,0,0,0,1,1,2,2,0,0,3,0,1,1,0,0,2,0,3,1,1,0,2,0,0,2,0,0,1,1,0,2,0,0,2,1,1,2,0,0,0,0,0,0,1,0,2,0,3,1,0)

So basically, for each unique combination of values in subj.no, trial.no and item, whenever the value of DV is 1, then 1 should be added to the count in the new variable.

(Remark: The column rep.no is not part of the relevant value combination. But it's in the df anyway, and since I didn't know if it's useful for the solution, I left it there.)

How can this be done in R?

akrun

We can do a grouped cumsum on the 'DV' column and multiply it by 'DV', so that rows where DV is 0 get 0:

library(dplyr)
df <- df %>%
    group_by(subj.no, trial.no, item) %>%
    mutate(DV.no = cumsum(DV) * DV) %>%  # running count of 1s; 0 where DV is 0
    ungroup()
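
To see why the multiplication by DV is needed, here is a minimal sketch on a single toy group. The vector `dv` and the names `running` and `dv_no` are made up for illustration and are not part of the question's data.

```r
# Toy DV values for one group (hypothetical, not taken from df)
dv <- c(1, 0, 1, 1, 0)

# Running count of 1s seen so far, including the current row
running <- cumsum(dv)   # 1 1 2 3 3

# Multiplying by dv resets the rows where dv is 0 back to 0
dv_no <- running * dv   # 1 0 2 3 0
```

Without the multiplication, rows with DV == 0 would carry over the previous count instead of showing 0.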

Or in base R with ave

df$DV.no <- with(df, DV * ave(DV, subj.no, trial.no, item, FUN = cumsum))
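
As a quick sanity check, the base R approach can be run on the first 16 rows of the question's data (subj.no 1, trial.no 1 and 2) and compared with the first 16 values of the expected vector from the question. The names `small` and `expected` are only used for this sketch.

```r
# First 16 rows of the question's data: subject 1, trials 1 and 2
small <- data.frame(
  subj.no  = rep(1, 16),
  trial.no = rep(1:2, each = 8),
  item     = c(rep(c("ball", "book"), 4), rep(c("doll", "rope"), 4)),
  DV       = c(1,0,1,0,1,0,0,1, 1,0,1,0,0,0,1,0)
)

# Grouped running count of 1s, set to 0 wherever DV is 0
small$DV.no <- with(small, DV * ave(DV, subj.no, trial.no, item, FUN = cumsum))

# First 16 values of the expected DV.no vector given in the question
expected <- c(1,0,2,0,3,0,0,1, 1,0,2,0,0,0,3,0)

# Throws an error if the computed column differs from the expected values
stopifnot(all(small$DV.no == expected))
```

The same comparison should work on the full df with the full expected vector, since the counting restarts for every combination of subj.no, trial.no and item.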

edited at 2020-12-15