I am trying to flag certain rows in my data based on several conditions in a data frame.
My data looks like this:
X <- structure(list(Website = c("www.something.at", "www.something.nl", "www.something.ch", "www.something.dk", "www.something.at"),
Country = c("German", "Netherlands", "German", "Denmark", "Austria")),
.Names = c("Website", "Country"), row.names = c(NA, 10L), class = "data.frame")
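What I need to do is add a new column that labels each row according to certain conditions. Where Country equals "German", I need to look at the website URL and use an IF function to label the row with a different country name, i.e. Austria or Switzerland.
I have got as far as the code below, and I hope I am missing something very simple: the code tags Switzerland correctly, but in every other case everything is tagged as Austria.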
for(i in 1:nrow(X)){
  if(length(grep("German", X$Country[i]))>0)
    if(length(grep("\\.at$", X$Website[i]))>0)
      X$Website_2[i] <- "Austria"
    else
      if(length(grep("\\.ch$", X$Website[i]))>0)
        X$Website_2[i] <- "Switzerland"
}
Any help is much appreciated!
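One likely cause of the symptom described above is that the Website_2 column does not exist before the loop runs. The first assignment, X$Website_2[i] <- "Austria", then has to create the column from a single value, and R recycles that value down every row; later iterations only overwrite individual rows. The following is a minimal sketch under that assumption, with the sample data rebuilt via data.frame() and the column initialised to NA before the loop. The braces and grepl() calls are stylistic choices here, not part of the original code.

```r
# Sample data from the question, rebuilt with data.frame() for a clean 5-row frame.
X <- data.frame(
  Website = c("www.something.at", "www.something.nl", "www.something.ch",
              "www.something.dk", "www.something.at"),
  Country = c("German", "Netherlands", "German", "Denmark", "Austria"),
  stringsAsFactors = FALSE
)

# Create the column up front so that assignments inside the loop
# only ever touch row i instead of creating (and recycling into) the column.
X$Website_2 <- NA_character_

for (i in seq_len(nrow(X))) {
  if (X$Country[i] == "German") {
    if (grepl("\\.at$", X$Website[i])) {
      X$Website_2[i] <- "Austria"
    } else if (grepl("\\.ch$", X$Website[i])) {
      X$Website_2[i] <- "Switzerland"
    }
  }
}

X$Website_2
```

With the column initialised, rows that are not "German" stay NA rather than inheriting "Austria" from the first assignment.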
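Are you looking for something like this? (By the way, your dput seems to have a problem: it states that there are 10 rows, but there are only 5 values, so I changed that here as well.)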
> X <- structure(list(Website = c("www.something.at", "www.something.nl", "www.something.ch", "www.something.dk", "www.something.at"),
+ Country = c("German", "Netherlands", "German", "Denmark", "Austria")),
+ .Names = c("Website", "Country"), row.names = c(NA, 5L), class = "data.frame")
>
>
# we use toupper() to make the comparison robust against different capitalization schemes
# instead of nesting another ifelse, we use the fact that we can add to logical values
# and use the resulting number to index into our country vector.
> X<-within(X,
+ cleanCountry <- ifelse(toupper(Country)=="GERMAN",
+ c("Switzerland", "Austria")[1+grepl("\\.at$", Website)],
+ Country))
> X
Website Country cleanCountry
1 www.something.at German Austria
2 www.something.nl Netherlands Netherlands
3 www.something.ch German Switzerland
4 www.something.dk Denmark Denmark
5 www.something.at Austria Austria
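The indexing trick mentioned in the comments can be looked at in isolation. This is only an illustrative sketch: the vector names websites, is_at and labels are made up for the example, and www.something.de is a hypothetical URL that does not appear in the original data.

```r
websites <- c("www.something.at", "www.something.ch", "www.something.de")

# grepl() returns FALSE/TRUE; adding 1 coerces that to 1/2,
# which selects the first or second element of the label vector.
is_at <- grepl("\\.at$", websites)
labels <- c("Switzerland", "Austria")[1 + is_at]

labels
```

Because there are only two labels to choose from, every site that does not end in .at is labelled Switzerland, including the hypothetical .de one. If rows marked "German" can carry other domains, a nested ifelse() or a lookup table keyed on the domain suffix may be a safer choice.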