I have a data table like this:
   user                time follow_group
1:    1 2017-09-01 00:01:01            1
2:    1 2017-09-01 00:01:20            1
3:    1 2017-09-01 00:03:01            1
4:    1 2017-09-01 00:10:01            2
5:    1 2017-09-01 00:11:01            2
6:    2 2017-09-01 00:01:03            1
7:    2 2017-09-01 00:01:08            1
8:    2 2017-09-01 00:03:01            1
From this, I want to get, for each user, all the records that have that user's highest follow_group.
So what I tried was
data[max(follow_group), , by = list(user)]
But this gives me an error:
Error in `[.data.table`(data, max(follow_group), :
'by' or 'keyby' is supplied but not j
Any help is appreciated. Thanks.
You can do this with data.table:
library(data.table)
setDT(df)[, .SD[follow_group == max(follow_group)], by = user]
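On the error in the question: data.table calls have the shape DT[i, j, by], where the first slot selects rows and the second is the expression evaluated per group. The original call put max(follow_group) in the first slot and left the second empty, which is what the "'by' or 'keyby' is supplied but not j" message points at. As a rough sketch of moving the aggregate into the second slot, the following computes each user's maximum and joins it back to the table. The names dt, per_user and res_join are made up for this illustration, and the table is rebuilt inline so the block runs on its own.

```r
library(data.table)

# Sample data from the question, rebuilt as a data.table (dt is an illustrative name)
dt <- data.table(
  user = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  time = c("2017-09-01 00:01:01", "2017-09-01 00:01:20", "2017-09-01 00:03:01",
           "2017-09-01 00:10:01", "2017-09-01 00:11:01", "2017-09-01 00:01:03",
           "2017-09-01 00:01:08", "2017-09-01 00:03:01"),
  follow_group = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L)
)

# Per-user maximum, computed in the j slot (second argument) with by = user
per_user <- dt[, .(follow_group = max(follow_group)), by = user]

# Join back on both columns to keep every row that sits at its user's maximum
res_join <- dt[per_user, on = .(user, follow_group)]
print(res_join)
```

This is only one way to phrase it; the .SD version above does the same filtering in a single call.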
Or with dplyr:
library(dplyr)
df %>%
group_by(user) %>%
filter(follow_group == max(follow_group))
Result:
   user                time follow_group
1:    1 2017-09-01 00:10:01            2
2:    1 2017-09-01 00:11:01            2
3:    2 2017-09-01 00:01:03            1
4:    2 2017-09-01 00:01:08            1
5:    2 2017-09-01 00:03:01            1
# A tibble: 5 x 3
# Groups:   user [2]
   user time                follow_group
  <int> <chr>                      <int>
1     1 2017-09-01 00:10:01            2
2     1 2017-09-01 00:11:01            2
3     2 2017-09-01 00:01:03            1
4     2 2017-09-01 00:01:08            1
5     2 2017-09-01 00:03:01            1
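A variant that should return the same rows uses data.table's .I symbol, which holds row numbers in the full table. The sketch below collects the row numbers at each user's maximum and then subsets once; this form is often suggested when the .SD version feels slow on large tables, though that is worth measuring on your own data. The names dt, idx and res are illustrative, and the table is rebuilt inline so the block is self-contained.

```r
library(data.table)

# Sample data from the question, rebuilt as a data.table (dt is an illustrative name)
dt <- data.table(
  user = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  time = c("2017-09-01 00:01:01", "2017-09-01 00:01:20", "2017-09-01 00:03:01",
           "2017-09-01 00:10:01", "2017-09-01 00:11:01", "2017-09-01 00:01:03",
           "2017-09-01 00:01:08", "2017-09-01 00:03:01"),
  follow_group = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L)
)

# Within each user group, keep the global row numbers where follow_group
# equals that group's maximum; the unnamed j result lands in column V1
idx <- dt[, .I[follow_group == max(follow_group)], by = user]$V1

# Subset the original table once using those row numbers
res <- dt[idx]
print(res)
```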
Data:
df = structure(list(user = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), time = c("2017-09-01 00:01:01",
"2017-09-01 00:01:20", "2017-09-01 00:03:01", "2017-09-01 00:10:01",
"2017-09-01 00:11:01", "2017-09-01 00:01:03", "2017-09-01 00:01:08",
"2017-09-01 00:03:01"), follow_group = c(1L, 1L, 1L, 2L, 2L,
1L, 1L, 1L)), class = "data.frame", .Names = c("user", "time",
"follow_group"), row.names = c(NA, -8L))