How do I convert character data into a matrix format in R?


是的

I have the following data frame that I want to convert. Currently, it looks like this:

    ID      Items                                               items.split
1   2729    Bicycle                                             Bicycle
2   3979    TV, Mobile Phone, Bicycle, Water Tank               c("TV", "Mobile Phone", "Bicycle", "Water Tank")
3   3860    Mobile Phone, Bicycle, Fan                          c("Mobile Phone", "Bicycle", "Fan")
4   2357    Mobile Phone, Motorbike                             c("Mobile Phone", "Motorbike")
5   2278    TV, Mobile Phone, Wagon/Cart, Motorbike, Plow       c("TV", "Mobile Phone", "Wagon/Cart", "Motorbike", "Plow")
6   3277    TV, Mobile Phone, Bicycle, Motorbike, Fan           c("TV", "Mobile Phone", "Bicycle", "Motorbike", "Fan")
7   3501    Mobile Phone, Bicycle, Water Tank                   c("Mobile Phone", "Bicycle", "Water Tank")
8   3880    Tractor, Mobile Phone, Wagon/Cart, Motorbike, Plow  c("Tractor", "Mobile Phone", "Wagon/Cart", "Motorbike", "Plow")
9   3207    DVD Player, Bicycle, Plow                           c("DVD Player", "Bicycle", "Plow")
10  3928    TV, Mobile Phone, Bicycle, Fan                      c("TV", "Mobile Phone", "Bicycle", "Fan")

I would like to convert the data frame above into the following format:

       Bicycle    TV      Mobile Phone    Water Tank [etc...]
2729     1         0       0                 0
3979     1         1       1                 1
3860     1         0       1                 0
[etc...]

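I don't work with strings or characters very often, so I'm stuck on how to manipulate the items.split variable in particular. I have seen questions like this one and this one, but I don't want an overall frequency count of the words; I want the counts attached to each ID. So I think what I'm struggling with is combining something like a frequency function such as FreqMat with a simple dplyr command that links the result to each ID.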

Any help is greatly appreciated. The data is below.

df <- structure(list(ID = c(2729L, 3979L, 3860L, 2357L, 2278L, 3277L, 
3501L, 3880L, 3207L, 3928L), Items = c("Bicycle", "TV, Mobile Phone, Bicycle, Water Tank", 
"Mobile Phone, Bicycle, Fan", "Mobile Phone, Motorbike", "TV, Mobile Phone, Wagon/Cart, Motorbike, Plow", 
"TV, Mobile Phone, Bicycle, Motorbike, Fan", "Mobile Phone, Bicycle, Water Tank", 
"Tractor, Mobile Phone, Wagon/Cart, Motorbike, Plow", "DVD Player, Bicycle, Plow", 
"TV, Mobile Phone, Bicycle, Fan"), items.split = list("Bicycle", 
    c("TV", "Mobile Phone", "Bicycle", "Water Tank"), c("Mobile Phone", 
    "Bicycle", "Fan"), c("Mobile Phone", "Motorbike"), c("TV", 
    "Mobile Phone", "Wagon/Cart", "Motorbike", "Plow"), c("TV", 
    "Mobile Phone", "Bicycle", "Motorbike", "Fan"), c("Mobile Phone", 
    "Bicycle", "Water Tank"), c("Tractor", "Mobile Phone", "Wagon/Cart", 
    "Motorbike", "Plow"), c("DVD Player", "Bicycle", "Plow"), 
    c("TV", "Mobile Phone", "Bicycle", "Fan"))), row.names = c(NA, 
10L), class = "data.frame")

Ronak Shah

You can use cSplit_e from the splitstackshape package:

splitstackshape::cSplit_e(df, "Items", type = "character", fill = 0, drop = TRUE)


#     ID                                        items.split Items_Bicycle Items_DVD Player Items_Fan
#1  2729                                            Bicycle             1                0         0
#2  3979              TV, Mobile Phone, Bicycle, Water Tank             1                0         0
#3  3860                         Mobile Phone, Bicycle, Fan             1                0         1
#4  2357                            Mobile Phone, Motorbike             0                0         0
#5  2278      TV, Mobile Phone, Wagon/Cart, Motorbike, Plow             0                0         0
#6  3277          TV, Mobile Phone, Bicycle, Motorbike, Fan             1                0         1
#7  3501                  Mobile Phone, Bicycle, Water Tank             1                0         0
#8  3880 Tractor, Mobile Phone, Wagon/Cart, Motorbike, Plow             0                0         0
#9  3207                          DVD Player, Bicycle, Plow             1                1         0
#10 3928                     TV, Mobile Phone, Bicycle, Fan             1                0         1

#   Items_Mobile Phone Items_Motorbike Items_Plow Items_Tractor Items_TV Items_Wagon/Cart Items_Water Tank
#1                   0               0          0             0        0                0                0
#2                   1               0          0             0        1                0                1
#3                   1               0          0             0        0                0                0
#4                   1               1          0             0        0                0                0
#5                   1               1          1             0        1                1                0
#6                   1               1          0             0        1                0                0
#7                   1               0          0             0        0                0                1
#8                   1               1          1             1        0                1                0
#9                   0               0          1             0        0                0                0
#10                  1               0          0             0        1                0                0
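If you would rather avoid an extra package and end up with an actual matrix that has the IDs as row names, as in the desired output of the question, one possible base R sketch follows. It assumes the items.split list column from the question and, to stay self-contained, rebuilds only the first three rows of the data. The names `items` and `mat` are made up for illustration.

```r
# First three rows of the question's data, rebuilt here so the sketch runs on its own.
df <- data.frame(ID = c(2729L, 3979L, 3860L))
df$items.split <- list(
  "Bicycle",
  c("TV", "Mobile Phone", "Bicycle", "Water Tank"),
  c("Mobile Phone", "Bicycle", "Fan")
)

# Every distinct item becomes one column of the result.
items <- sort(unique(unlist(df$items.split)))

# For each ID, flag which of the distinct items appear in its vector (1 = yes, 0 = no).
# vapply() returns one column per ID, so transpose to get one row per ID.
# Note: this assumes more than one distinct item; with a single item vapply()
# would return a plain vector instead of a matrix.
mat <- t(vapply(
  df$items.split,
  function(x) as.integer(items %in% x),
  integer(length(items))
))

# Label rows with the IDs and columns with the item names.
dimnames(mat) <- list(df$ID, items)

print(mat)
```

Rows can then be looked up by ID as a character string, for example `mat["3979", "TV"]`. Because the values are 0/1 flags rather than counts, an item listed twice for the same ID would still show as 1.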
