For each row, which columns contain a value

Froom2

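I want to look across a number of columns for each respondent and, if exactly one of those columns contains a particular string, put that column's name into a new column.

Example data frame: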

data <- structure(list(ParticipantID = 1:5, Usual = c("Pear", "Pear", 
"Apple", NA, NA), Pear_Freq = c("3 or more times a week", "3 or more times a week", 
"Once a week", "Once a week", "3 or more times a week"), Apple_Freq = c("Never", 
"Once a week", "3 or more times a week", "Never", "3 or more times a week"
), Peach_Freq = c("Once a week", "Never", "Never", "3 or more times a week", 
"Once a week")), .Names = c("ParticipantID", "Usual", "Pear_Freq", 
"Apple_Freq", "Peach_Freq"), class = "data.frame", row.names = c(NA, 
-5L))

So what I want to get out of it is a new column containing:

ParticipantID   Newcol
1               Pear
2               Pear
3               Apple
4               Peach
5               NA

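(This is a way of checking whether what people say and what they do match, and of filling in the blanks in the "Usual" column.)

So far I have some code that adds a count to a new column, so that I can select the people who ticked "3 or more times a week" in exactly one column (rather than in 2 or 0):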

test$tempcol <- NA
test$tempcol <- apply(test[,Freqcols], 1, function(x) sum(grepl("3 or more times a week", x)))

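(I feel I shouldn't need grepl for this, since I really want to match the whole cell rather than a pattern.)

Then I tried using which to get the index of the columns containing "3 or more times a week" for each respondent, like this: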
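
On the grepl point, here is a minimal sketch of an exact-match count using == instead. The small freq data frame is a made-up stand-in for the frequency columns, not the data from the post:

```r
# Hypothetical stand-in for the *_Freq columns of the example data
freq <- data.frame(
  Pear_Freq  = c("3 or more times a week", "Once a week", "3 or more times a week"),
  Apple_Freq = c("Never", "Never", "3 or more times a week"),
  stringsAsFactors = FALSE
)

# == compares the whole cell, so a cell that merely contains the phrase
# as part of a longer string would not be counted
n_hits <- rowSums(freq == "3 or more times a week")
n_hits
```

Comparing a data frame with a single string gives a logical matrix, so rowSums counts the matching cells per row without needing apply.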

which(apply(test, 1, function(x) any(grepl("3 or more times a week", x))))

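But of course that only tells me that every person has "3 or more times a week" at least once.

I was then hoping to use that to paste the fruit part of the column heading into a new cell, but I'm a bit lost as to how to actually get there :( Any suggestions would be much appreciated.

Talat

You could try the following: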

data$newcol <- apply(data[3:5], 1, function(x) 
   ifelse(length(which(x == "3 or more times a week")) != 1, NA,
   unlist(strsplit(names(data[3:5])[which(x == "3 or more times a week")], "_")))[1])

#  ParticipantID Usual              Pear_Freq             Apple_Freq             Peach_Freq newcol
#1             1  Pear 3 or more times a week                  Never            Once a week   Pear
#2             2  Pear 3 or more times a week            Once a week                  Never   Pear
#3             3 Apple            Once a week 3 or more times a week                  Never  Apple
#4             4  <NA>            Once a week                  Never 3 or more times a week  Peach
#5             5  <NA> 3 or more times a week 3 or more times a week            Once a week   <NA>

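You start by using which to check which responses have the frequency "3 or more times a week". If that response occurs more or fewer than once in the row, you return NA. If it occurs exactly once, which tells you the index of the column where it occurs, and names(data[3:5]) is used to look up the matching column name. To get only the fruit part of the name, split it at "_" (and unlist the resulting list), then use only the first element to write into the new column.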
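
The same logic can also be sketched without apply, as an alternative to the code above rather than a replacement for it. This assumes the frequency columns can be recognised by a "_Freq" suffix; the names freq_cols, hits and fruit are introduced here for illustration only:

```r
# Example data from the question (the Usual column is left out for brevity)
data <- data.frame(
  ParticipantID = 1:5,
  Pear_Freq  = c("3 or more times a week", "3 or more times a week",
                 "Once a week", "Once a week", "3 or more times a week"),
  Apple_Freq = c("Never", "Once a week", "3 or more times a week",
                 "Never", "3 or more times a week"),
  Peach_Freq = c("Once a week", "Never", "Never",
                 "3 or more times a week", "Once a week"),
  stringsAsFactors = FALSE
)

# Assumption: every frequency column ends in "_Freq"
freq_cols <- grep("_Freq$", names(data), value = TRUE)

# Logical matrix: TRUE where the whole cell equals the target string
hits <- data[freq_cols] == "3 or more times a week"

# Fruit names are the column names with the suffix removed
fruit <- sub("_Freq$", "", freq_cols)

# Keep the fruit only when exactly one column matches, otherwise NA
data$newcol <- ifelse(rowSums(hits) == 1,
                      fruit[max.col(hits * 1, ties.method = "first")],
                      NA)
data$newcol
```

max.col returns the position of the largest value in each row, which for a 0/1 matrix is the first matching column; the rowSums(hits) == 1 test then discards rows with zero or several matches.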