我有一个看起来像这样的df(dput()
在问题末尾):
id leg_activity
1044 home, pt, work, adpt, home, adpt, work, adpt, home
1215 home, adpt, work, adpt, home, car, outside, car, work, car, work, car, home
1238 home, work, leisure, adpt, home
1458 home, outside, pt, home, adpt, leisure, adpt, home
1569 home, car_passenger, leisure, walk, work, walk, leisure, walk, work, adpt, home, walk, home
1615 home, adpt, work, walk, work, walk, work, pt, home
我想将的单元格leg_activity
分成多个列。字符串的分离应尽快, home
发生。想法是查看两个home
条目之间会发生什么。对于第一行,所需结果将如下所示:
id 1 2
1044 home, pt, work, adpt, home home, adpt, work, adpt, home
如您所见,我想保留home
两个新单元格。
到目前为止,我的方法无法做到这一点,它只是删除了,home
但我需要它出现在剪切的两侧:
library(dplyr)
df <- agents %>%
mutate(n = row_number()) %>%
separate_rows(leg_activity, sep = 'home, ') %>%
group_by(n, id) %>%
mutate(n2 = row_number()) %>%
spread(n2, leg_activity) %>%
select(-n)
非常感谢您的提前帮助!
数据尝试代码:
structure(list(id = c(9L, 10L, 11L, 96L, 97L, 101L, 103L, 248L,
499L, 1044L, 1215L, 1238L, 1458L, 1569L, 1615L, 1626L, 1734L,
1735L, 1790L, 1912L, 9040L, 14858L, 14859L, 14967L, 15011L, 15012L,
15015L, 15045L, 15050L, 15058L, 15060L, 15086L, 15088L, 15094L,
15109L, 15113L, 15152L, 15157L, 15192L, 15193L, 15222L, 15230L,
15231L, 15234L, 15235L, 15237L, 15256L, 15257L, 15258L, 15269L
), leg_activity = c("home, adpt, shop, car_passenger, home, adpt, work, adpt, home, work, outside, pt, home",
"home, pt, outside, pt, home, car, leisure, car, other, car, leisure, car, leisure, car, other, car, leisure, car, other, car, leisure, car, home, adpt, leisure, adpt, home",
"home, work, adpt, home", "home, car, work, car, home, work, adpt, home",
"home, adpt, work, car_passenger, leisure, car_passenger, work, adpt, home, car_passenger, outside, car_passenger, outside, car_passenger, home",
"home, bike, outside, car_passenger, outside, car_passenger, outside, bike, home, adpt, leisure, adpt, home, bike, leisure, bike, home",
"home, adpt, work, adpt, home, walk, other, pt, home", "home, adpt, work, walk, home, adpt, work, walk, home",
"home, adpt, leisure, adpt, home, bike, outside, bike, home",
"home, pt, work, adpt, home, adpt, work, adpt, home", "home, adpt, work, adpt, home, car, outside, car, work, car, work, car, home",
"home, work, leisure, adpt, home", "home, outside, pt, home, adpt, leisure, adpt, home",
"home, car_passenger, leisure, walk, work, walk, leisure, walk, work, adpt, home, walk, home",
"home, adpt, work, walk, work, walk, work, pt, home", "home, car, work, pt, leisure, adpt, work, car, home, car, home",
"home, adpt, other, adpt, home, car, home", "home, adpt, other, adpt, home",
"home, education, walk, shop, walk, education, pt, outside, home, adpt, leisure, adpt, home",
"home, adpt, work, adpt, home, walk, home", "home, adpt, work, pt, leisure, adpt, work, adpt, work, adpt, home, adpt, other, walk, home",
"home, adpt, work, adpt, home, adpt, work, adpt, home, walk, leisure, walk, home",
"home, adpt, work, adpt, home, work, adpt, home, walk, leisure, walk, home",
"home, adpt, work, adpt, home, car_passenger, outside, car_passenger, leisure, car_passenger, home, car_passenger, home",
"home, adpt, other, adpt, home, car, work, car, home", "home, adpt, education, adpt, leisure, adpt, home, walk, leisure, walk, home",
"home, car_passenger, other, pt, home, walk, other, walk, home, car_passenger, other, walk, home, adpt, other, adpt, home",
"home, work, pt, work, adpt, work, adpt, home", "home, adpt, leisure, adpt, home, car, shop, car, other, car, home",
"home, adpt, work, adpt, home, walk, other, adpt, home", "home, adpt, work, adpt, home, car_passenger, leisure, car_passenger, home",
"home, car, other, car, home, adpt, shop, adpt, home", "home, pt, work, adpt, home",
"home, adpt, work, adpt, home", "home, adpt, work, adpt, home",
"home, walk, education, adpt, home, walk, education, walk, home, bike, leisure, bike, home",
"home, adpt, shop, adpt, home, car, home", "home, adpt, leisure, walk, leisure, walk, leisure, adpt, home",
"home, adpt, shop, pt, home, adpt, other, adpt, home", "home, adpt, other, adpt, home, car_passenger, leisure, walk, home",
"home, adpt, work, adpt, home, car_passenger, shop, car_passenger, home",
"home, adpt, other, adpt, work, adpt, home", "home, adpt, work, adpt, home, adpt, other, walk, shop, walk, home, car, outside, car, outside, car, outside, car, home",
"home, adpt, other, adpt, home", "home, adpt, education, adpt, home, adpt, education, adpt, home",
"home, pt, work, adpt, work, adpt, work, adpt, work, adpt, home, adpt, work, adpt, home",
"home, walk, other, car_passenger, education, walk, home, car_passenger, education, adpt, home",
"home, walk, shop, walk, home, walk, leisure, adpt, leisure, adpt, home",
"home, adpt, work, adpt, home, walk, shop, walk, home, walk, leisure, walk, home, walk, home",
"home, adpt, leisure, adpt, home")), row.names = c(NA, 50L), class = "data.frame")
下面是使用的溶液dplyr
,stringr
和tidyr
(或更好tidyverse
):
df %>%
separate_rows(leg_activity, sep = "home, ") %>%
mutate(leg_activity = str_replace(leg_activity, "(, (home)*)$", "")) %>%
filter(leg_activity != "") %>%
mutate(leg_activity = str_c("home, ", leg_activity, ", home")) %>%
group_by(id) %>%
mutate(n = row_number()) %>%
pivot_wider(names_from = n, values_from = leg_activity)
我想有更好的解决方案,但这应该可以为您提供所需的输出。
> df2
# A tibble: 50 x 5
# Groups: id [50]
id `1` `2` `3` `4`
<int> <chr> <chr> <chr> <chr>
1 9 home, adpt, shop, car_passen~ home, adpt, work, adpt, home home, work, ~ NA
2 10 home, pt, outside, pt, home home, car, leisure, car, other~ home, adpt, ~ NA
3 11 home, work, adpt, home NA NA NA
4 96 home, car, work, car, home home, work, adpt, home NA NA
5 97 home, adpt, work, car_passen~ home, car_passenger, outside, ~ NA NA
6 101 home, bike, outside, car_pas~ home, adpt, leisure, adpt, home home, bike, ~ NA
7 103 home, adpt, work, adpt, home home, walk, other, pt, home NA NA
8 248 home, adpt, work, walk, home home, adpt, work, walk, home NA NA
9 499 home, adpt, leisure, adpt, h~ home, bike, outside, bike, home NA NA
10 1044 home, pt, work, adpt, home home, adpt, work, adpt, home NA NA
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句