I have some weekly data slices that have a name, date, and a rank.
I would like to create a new column to flag when they are consistently ranked 2 or higher in the two previous periods.
name<- c("a","a","a","a",
"b","b","b","b",
"c","c","c","c")
date<- c("2015-01-19","2014-01-12","2014-01-05","2014-01-01",
"2015-01-19","2014-01-12","2014-01-05","2014-01-01",
"2015-01-19","2014-01-12","2014-01-05","2014-01-01")
rank<- c(1,2,1,3,
2,3,2,1,
3,1,3,2)
df<-data.frame(name,date,rank)
df<-arrange(df,date,desc(rank))
df
name date rank
a 2014-01-01 3
c 2014-01-01 2
b 2014-01-01 1
c 2014-01-05 3
b 2014-01-05 2
a 2014-01-05 1
b 2014-01-12 3
a 2014-01-12 2
c 2014-01-12 1
c 2015-01-19 3
b 2015-01-19 2
a 2015-01-19 1
For example: For a on Jan 19, it was ranked 1 and it was also ranked less than 2 on both Jan 12, and Jan 5, so flag this with a 1 For b on Jan 19, it was ranked 2, but it was not ranked less than 2 on both previous dates, and therefore, flag with a 0. For c on Jan 19, it was not ranked less than 2, so flag with 0.
Here's the output I'd like to see:
name date rank consistent
a 1/1/2014 3 NA
c 1/1/2014 2 NA
b 1/1/2014 1 NA
c 1/5/2014 3 NA
b 1/5/2014 2 NA
a 1/5/2014 1 NA
b 1/12/2014 3 0
a 1/12/2014 2 0
c 1/12/2014 1 0
c 1/19/2015 3 0
b 1/19/2015 2 0
a 1/19/2015 1 1
Thanks so much!
I have a data.table
-based solution:
library(data.table)
setDT(d)[,consistent:=rank < 3 & c(NA,rank[-.N])<3 & c(NA,NA,rank[-c(.N-1,.N)])<3,name]
d
# name date rank consistent
# 1: a 2014-01-01 3 FALSE
# 2: c 2014-01-01 2 NA
# 3: b 2014-01-01 1 NA
# 4: c 2014-01-05 3 FALSE
# 5: b 2014-01-05 2 NA
# 6: a 2014-01-05 1 FALSE
# 7: b 2014-01-12 3 FALSE
# 8: a 2014-01-12 2 FALSE
# 9: c 2014-01-12 1 FALSE
# 10: c 2015-01-19 3 FALSE
# 11: b 2015-01-19 2 FALSE
# 12: a 2015-01-19 1 TRUE
It does not produce the consistent
column exactly as in the sample output, but makes the job flagging the requested rows.
UPDATE
Here is the updated solution that produces output consistent with the one in the OP's question:
setDT(d)[,consistent:=rowSums(cbind(rank < 3, c(NA,rank[-.N])<3, c(NA,NA,rank[-c(.N-1,.N)])<3)) %/%3 ,name]
print(d)
# name date rank consistent
# 1: a 2014-01-01 3 NA
# 2: c 2014-01-01 2 NA
# 3: b 2014-01-01 1 NA
# 4: c 2014-01-05 3 NA
# 5: b 2014-01-05 2 NA
# 6: a 2014-01-05 1 NA
# 7: b 2014-01-12 3 0
# 8: a 2014-01-12 2 0
# 9: c 2014-01-12 1 0
# 10: c 2015-01-19 3 0
# 11: b 2015-01-19 2 0
# 12: a 2015-01-19 1 1
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments