I know there are many similar questions to this one, but I just can't figure this out.
I want an ifelse function to go over many columns in a data frame. I want to add two variables to the data frame, "C03_only" and "only_c02_and_c09". I am only interested in entries that contain the values "C02", "C03", and "C09".
Example data:
mydf<- data.frame(id=1:4,
x1=c("A02", "C02", "C03", "M01"),
x2=c("B02", "", "C02", "C09"),
x3=c("C03", "C03", "C09", "C02") )
R> mydf
  id  x1  x2  x3
1  1 A02 B02 C03
2  2 C02     C03
3  3 C03 C02 C09
4  4 M01 C09 C02
The new dataset should look like:
R> mydf
  id  x1  x2  x3 C03_only only_c02_and_c09
1  1 A02 B02 C03        1                0
2  2 C02     C03        0                0
3  3 C03 C02 C09        0                0
4  4 M01 C09 C02        0                1
I first tried something like this:
mydf$C03_only <- with(mydf,ifelse(x1 != "C02" | "C09" & x2 !="C02" | "C09" & x3== "C03",1,0))
which didn't work, and the idea is a non-starter anyway because I have many columns. Similarly, I tried something with a for loop:
mydf$C03_only<-rep(0,nrow(mydf))
for (i in 2:nrow(mydf)){
if (mydf$x1[i]!="C02" && mydf$x2[i]!="C09" && mydf$x3[i]=="C03"){
mydf$C03_only[i]<-1}
}
This also didn't work (it is only partially finished), although with enough tinkering it probably would.
I think the best approach is to use the apply function, but I can't get it working:
mydf$C03_only<- apply(mydf[,-1], MARGIN=1, FUN=function(x){
ifelse(any(x == "C03") & any(x != "C09" & x != "C02") , 1, 0)
}
)
mydf$only_c02_and_c09<- apply(mydf[,-1], MARGIN=1, FUN=function(x){
ifelse(any(x == "C02" & x == "C09") & any(x != "C03") , 1, 0)
}
)
These are close but no cigar. I need to replace any with something, but I'm not sure what. Perhaps I could put the values of interest in a vector and run some conditional statement using %in% on it, but I'm not sure how.
Any suggestions would be great, thanks.
We can apply the conditions by row with apply. Note: a unary plus sign in front of a parenthesized expression coerces it from logical to numeric. For example, +(x) gives the same 0/1 values as as.numeric(x), stored as integers. The two new columns can then be computed as:
mydf$C03_only <- apply(mydf, 1, function(x) +(any(x=="C03") & all(x != "C02" & x != "C09")))
mydf$only_c02_and_c09 <- apply(mydf, 1, function(x) +(!any(x=="C03") & sum(x == "C02" | x == "C09") >= 2L))
mydf
#   id  x1  x2  x3 C03_only only_c02_and_c09
# 1  1 A02 B02 C03        1                0
# 2  2 C02     C03        0                0
# 3  3 C03 C02 C09        0                0
# 4  4 M01 C09 C02        0                1
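The question also asks about %in%. As a rough sketch of that idea (a variation, not part of the answer above), the same flags can be written with %in% so that each condition reads as "this code appears somewhere in the row". It assumes every column other than id holds codes; code_cols is a hypothetical helper name, and it is built before the new columns are added so that they are not scanned as codes. It also reads only_c02_and_c09 as "both C02 and C09 are present and C03 is not", which is one interpretation of the question.

```r
# Example data from the question
mydf <- data.frame(id = 1:4,
                   x1 = c("A02", "C02", "C03", "M01"),
                   x2 = c("B02", "",    "C02", "C09"),
                   x3 = c("C03", "C03", "C09", "C02"),
                   stringsAsFactors = FALSE)

# Assumption: all columns except id contain codes (code_cols is a made-up name)
code_cols <- setdiff(names(mydf), "id")

# C03 present in the row, and neither C02 nor C09 present
mydf$C03_only <- apply(mydf[code_cols], 1, function(x)
  +("C03" %in% x && !any(c("C02", "C09") %in% x)))

# Both C02 and C09 present in the row, and C03 absent
mydf$only_c02_and_c09 <- apply(mydf[code_cols], 1, function(x)
  +(all(c("C02", "C09") %in% x) && !("C03" %in% x)))

mydf
```

Because the columns are selected by name, adding more code columns to the data frame should not require changing the conditions.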