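I received a flattened data set, and the flattening left missing values. For each combination of id, type, and Date, I need to bring the hours value up onto the row that holds the dollar value, so that the rows with a missing dollar value can be removed.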
id <- c(1, 2, 1, 1, 1, 2, 1)
dollar <- as.numeric(c(100, 200, 300, 500, NA, NA, NA))
hours <- as.numeric(c(NA, NA, NA, NA, 5, 10, 12))
type <- c("Engineer", "Engineer", "Operating", "Part", "Engineer", "Engineer", "Operating")
Date <- c("2020-01-02", "2020-01-03", "2020-01-02", "2020-01-04", "2020-01-02", "2020-01-03", "2020-01-02")
# Combine the vectors into a data frame named df
df <- data.frame(id, dollar, hours, type, Date)
  id dollar hours      type       Date
1  1    100  <NA>  Engineer 2020-01-02
2  2    200  <NA>  Engineer 2020-01-03
3  1    300  <NA> Operating 2020-01-02
4  1    500  <NA>      Part 2020-01-04
5  1   <NA>     5  Engineer 2020-01-02
6  2   <NA>    10  Engineer 2020-01-03
7  1   <NA>    12 Operating 2020-01-02
I want to reshape my data as follows.
  id dollar hours      type       Date
1  1    100     5  Engineer 2020-01-02
2  2    200    10  Engineer 2020-01-03
3  1    300    12 Operating 2020-01-02
4  1    500  <NA>      Part 2020-01-04
The rows must be grouped not only by id but also matched on type and Date. "id" is categorical, "type" has 17 categories, and "Date" spans 3 years.
Please help.
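Here is one approach using the tidyverse. Group by id, type, and Date, fill the missing (NA) values in each group from the other rows of that group, and then keep one row per group.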
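library(tidyverse)

df %>%
  group_by(id, type, Date) %>%                        # one group per id/type/Date combination
  fill(c(dollar, hours), .direction = "updown") %>%   # fill NAs from the other rows in the group
  slice(1)                                            # keep the first row of each group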
Output
# A tibble: 4 x 5
# Groups:   id, type, Date [4]
     id dollar hours type      Date
  <dbl>  <dbl> <dbl> <fct>     <fct>
1     1    100     5 Engineer  2020-01-02
2     1    300    12 Operating 2020-01-02
3     1    500    NA Part      2020-01-04
4     2    200    10 Engineer  2020-01-03
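If you would rather not load the tidyverse, the same result can probably be reached with a left join in base R. The following is only a sketch, not the method from the answer above. It assumes that each id/type/Date combination has at most one row with a dollar value and at most one row with an hours value; with duplicates, merge() would return every pairing. The names keys, dollars, hrs, and result are made up for illustration.

```r
# Rebuild the sample data from the question.
df <- data.frame(
  id     = c(1, 2, 1, 1, 1, 2, 1),
  dollar = c(100, 200, 300, 500, NA, NA, NA),
  hours  = c(NA, NA, NA, NA, 5, 10, 12),
  type   = c("Engineer", "Engineer", "Operating", "Part",
             "Engineer", "Engineer", "Operating"),
  Date   = c("2020-01-02", "2020-01-03", "2020-01-02", "2020-01-04",
             "2020-01-02", "2020-01-03", "2020-01-02"),
  stringsAsFactors = FALSE
)

# Columns that identify a matching pair of rows (illustrative name).
keys <- c("id", "type", "Date")

# Split into the rows that carry a dollar value and the rows that carry hours.
dollars <- df[!is.na(df$dollar), c(keys, "dollar")]
hrs     <- df[!is.na(df$hours),  c(keys, "hours")]

# Left join: keep every dollar row and attach hours where all keys match.
# Dollar rows with no matching hours row keep NA in hours.
result <- merge(dollars, hrs, by = keys, all.x = TRUE)

# Restore the original column order.
result <- result[, c("id", "dollar", "hours", "type", "Date")]
print(result)
```

Unlike fill() followed by slice(1), this drops any hours row that has no matching dollar row, so check that this is what you want before using it on the full data.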