I have a data frame like this:
gender <- c("m","m","m","m","m","f","f","f","f","f")
age <- c(18,28,39,49,3,
13,16,6,19,37)
df <- data.frame(gender, age, stringsAsFactors = FALSE)
I am trying to create an ageband column that bins age into groups of 5, from 0 to 50.
library(dplyr)

df %>%
  # bin ages into left-closed intervals [0,5), [5,10), ..., [45,50)
  mutate(ageband = cut(age, breaks = seq(0, 50, 5), right = FALSE)) %>%
  group_by(gender, ageband) %>%
  mutate(population = 1) %>%
  summarize(population = sum(population, na.rm = TRUE))
I get this output:
gender ageband population
1 f [5,10) 1
2 f [10,15) 1
3 f [15,20) 2
4 f [35,40) 1
5 m [0,5) 1
6 m [15,20) 1
7 m [25,30) 1
8 m [35,40) 1
9 m [45,50) 1
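This does not show the gender/ageband combinations that have no rows. I want those missing combinations to appear as rows with population = 0.
My desired output is: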
gender ageband population
1 f [0,5) 0
2 f [5,10) 1
3 f [10,15) 1
4 f [15,20) 2
5 f [20,25) 0
6 f [25,30) 0
7 f [30,35) 0
8 f [35,40) 1
9 f [40,45) 0
10 f [45,50) 0
11 m [0,5) 1
12 m [5,10) 0
13 m [10,15) 0
14 m [15,20) 1
15 m [20,25) 0
16 m [25,30) 1
17 m [30,35) 0
18 m [35,40) 1
19 m [40,45) 0
20 m [45,50) 1
I tried this, but it did not work:
df %>%
mutate(ageband = cut( age, breaks = seq(0, 50, 5), right = FALSE)) %>%
group_by(gender, ageband) %>%
mutate(population = 1) %>%
summarize(population = sum(population, na.rm = TRUE)) %>%
mutate(population = coalesce(population, 0L))
Can someone point me in the right direction?
By adding tidyr, you can do the following:
library(dplyr)
library(tidyr)

df %>%
  mutate(ageband = cut(age, breaks = seq(0, 50, 5), right = FALSE)) %>%
  count(gender, ageband) %>%
  complete(ageband, nesting(gender), fill = list(n = 0)) %>%
  arrange(gender, ageband)
ageband gender n
<fct> <chr> <dbl>
1 [0,5) f 0
2 [5,10) f 1
3 [10,15) f 1
4 [15,20) f 2
5 [20,25) f 0
6 [25,30) f 0
7 [30,35) f 0
8 [35,40) f 1
9 [40,45) f 0
10 [45,50) f 0
11 [0,5) m 1
12 [5,10) m 0
13 [10,15) m 0
14 [15,20) m 1
15 [20,25) m 0
16 [25,30) m 1
17 [30,35) m 0
18 [35,40) m 1
19 [40,45) m 0
20 [45,50) m 1
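
As a cross-check, here is a minimal base R sketch of the same idea, assuming only the example data from the question. It relies on cut() returning a factor that carries all ten band levels whether or not they are observed, and on table() counting every combination of those levels, including the empty ones. The name counts is just an illustrative choice, and this is an alternative to the tidyr pipeline above rather than a replacement for it.

```r
# Rebuild the example data from the question.
gender <- c("m", "m", "m", "m", "m", "f", "f", "f", "f", "f")
age <- c(18, 28, 39, 49, 3, 13, 16, 6, 19, 37)
df <- data.frame(gender, age, stringsAsFactors = FALSE)

# cut() returns a factor whose levels cover every band from [0,5) to [45,50).
df$ageband <- cut(df$age, breaks = seq(0, 50, 5), right = FALSE)

# table() cross-tabulates gender against all factor levels, so unobserved
# combinations are kept with a count of zero.
counts <- as.data.frame(
  table(gender = df$gender, ageband = df$ageband),
  responseName = "population"
)

# Sort by gender, then by band (ageband is a factor, so level order is used).
counts <- counts[order(counts$gender, counts$ageband), ]
rownames(counts) <- NULL

print(counts)
```

This also suggests why the coalesce() attempt in the question changed nothing: after summarize() the missing combinations are not NA values, they are absent rows, so there is nothing for coalesce() to replace.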