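I would like to know which command to use for the following: I want to compute a population estimate for each city in the "Name" column and each year in the "Year" column. The "growth" column gives the growth rate as a multiplier. So as a formula, it would be something like: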
Population[Lucknow,2030] = Population[Lucknow, 2020] * growth[2030]
and so on. The df is as follows:
df <- data.frame(YEAR=c(2020,2020,2020,2030,2040,2050), NAME=c("Lucknow","Delhi","Hyderadabad",NA,NA,NA), POPULATION=c(3704, 29274,10275,NA,NA,NA), growth=c(1.0,1.0,1.0,1.10,1.18,1.24))
Year Name Population growth
2020 Lucknow 3704 1.0000000
2020 Delhi 29274 1.0000000
2020 Hyderabad 10275 1.0000000
2030 <NA> NA 1.10
2040 <NA> NA 1.18
2050 <NA> NA 1.24
Edit: following what Dom wrote below (thanks!), the input would be:
df <- tibble(
  year = rep(c(2020, 2030, 2040, 2050), each = 3),
  city = rep(c("Lucknow", "Delhi", "Hyderadabad"), times = 4),
  pop = c(3704, 29274, 10275, rep(NA_integer_, times = 9)),
  growth = rep(c(1.0, 1.10, 1.18, 1.24), each = 3)
)
year city pop growth
<dbl> <chr> <dbl> <dbl>
1 2020 Lucknow 3704 1
2 2020 Delhi 29274 1
3 2020 Hyderadabad 10275 1
4 2030 Lucknow NA 1.1
5 2030 Delhi NA 1.1
6 2030 Hyderadabad NA 1.1
7 2040 Lucknow NA 1.18
8 2040 Delhi NA 1.18
9 2040 Hyderadabad NA 1.18
10 2050 Lucknow NA 1.24
11 2050 Delhi NA 1.24
12 2050 Hyderadabad NA 1.24
The output should look like this:
Year Name Population growth
2020 Lucknow 3704 1.0000000
2020 Delhi 29274 1.0000000
2020 Hyderabad 10275 1.0000000
2030 Lucknow 4074.4 1.1000000
2030 Delhi 32201.4 1.1000000
2030 Hyderabad 11302.5 1.1000000
....
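How do I fill in the NAs in the tibble?
I have made various attempts with merge and dplyr::mutate, but since this is a vector operation I cannot work out what I need to do here. I would appreciate any pointers to the right command for such a basic operation.
Thanks!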
Using dplyr:
library(dplyr)
df %>%
  arrange(city, year) %>%          # sort so each city's 2020 row comes first
  group_by(city) %>%
  mutate(pop = pop[1] * growth)    # baseline population times growth factor
# A tibble: 12 x 4
# Groups: city [3]
year city pop growth
<dbl> <chr> <dbl> <dbl>
1 2020 Delhi 29274 1
2 2030 Delhi 32201. 1.1
3 2040 Delhi 34543. 1.18
4 2050 Delhi 36300. 1.24
5 2020 Hyderadabad 10275 1
6 2030 Hyderadabad 11303. 1.1
7 2040 Hyderadabad 12124. 1.18
8 2050 Hyderadabad 12741 1.24
9 2020 Lucknow 3704 1
10 2030 Lucknow 4074. 1.1
11 2040 Lucknow 4371. 1.18
12 2050 Lucknow 4593. 1.24
Using base R:
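# Sort so that each city's 2020 row comes first
df <- df[order(df[["city"]], df[["year"]]), ]
# For each city, multiply its first (2020) population by its growth factors
df[["pop"]] <-
  unlist(
    lapply(
      unique(df[["city"]]),
      function(x) with(df[df[["city"]] == x, ], pop[1] * growth)
    )
  )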
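A shorter base R variant, offered only as a sketch and not part of the original answer: ave() can broadcast each city's baseline population to all of that city's rows, so no sorting is needed. It assumes every city has exactly one non-missing pop value (its 2020 baseline). The variable name baseline is made up for illustration, and the data is rebuilt with data.frame() only to keep the example self-contained.

```r
# Same data as in the question, built with base R only
df <- data.frame(
  year = rep(c(2020, 2030, 2040, 2050), each = 3),
  city = rep(c("Lucknow", "Delhi", "Hyderadabad"), times = 4),
  pop = c(3704, 29274, 10275, rep(NA, times = 9)),
  growth = rep(c(1.0, 1.10, 1.18, 1.24), each = 3)
)

# For each city, take the first non-missing pop (assumed to be the baseline)
# and repeat it across all of that city's rows
baseline <- ave(df$pop, df$city, FUN = function(p) p[!is.na(p)][1])

# Scale the baseline by each row's growth factor
df$pop <- baseline * df$growth
```

Because the baseline is looked up by "first non-missing value" rather than by position, this version does not depend on the row order of df.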
Using data.table:
library(data.table)
setDT(df)[order(city, year), pop := pop[1] * growth, city]
Data:
df <- tibble(
year = rep(c(2020, 2030, 2040, 2050), each = 3),
city = rep(c("Lucknow", "Delhi", "Hyderadabad"), times = 4),
pop = c(3704, 29274, 10275, rep(NA, times = 9)),
growth = rep(c(1.0, 1.10, 1.18, 1.24), each = 3)
)
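If you are starting from the original data frame in the question (one row per city for 2020, then one row per future year with NA names), one possible way to get to the same result is a cross join between the cities and the yearly growth factors. This is a rough sketch under that assumption; the names base, growth_by_year and out are invented for illustration.

```r
# Original layout from the question: NA names for the future years
df <- data.frame(
  YEAR = c(2020, 2020, 2020, 2030, 2040, 2050),
  NAME = c("Lucknow", "Delhi", "Hyderadabad", NA, NA, NA),
  POPULATION = c(3704, 29274, 10275, NA, NA, NA),
  growth = c(1.0, 1.0, 1.0, 1.10, 1.18, 1.24)
)

# Baseline rows: one per city, with its known 2020 population
base <- df[!is.na(df$NAME), c("NAME", "POPULATION")]

# Lookup table: one growth factor per year
growth_by_year <- unique(df[, c("YEAR", "growth")])

# by = NULL asks merge() for the Cartesian product (every city x every year)
out <- merge(base, growth_by_year, by = NULL)
out$POPULATION <- out$POPULATION * out$growth

# Restore the column order from the question and sort by year, then city
out <- out[order(out$YEAR, out$NAME), c("YEAR", "NAME", "POPULATION", "growth")]
```

The result has one row per city and year, which is the same shape as the tibble used in the answers above.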