I have a simple data frame:
d <- data.frame(var1=c(5,5,5),var1_c=c(5,NA,6),var2 =c(6,6,6),var2_c = c(8,6,NA))
with lots of rows and lots of variables, all named "varXXX" and "varXXX_c". Every time there is an NA in a varXXX_c column, I want to replace it with the value from the matching varXXX column. In short, I want to do:
d[is.na(d$var1_c),"var1_c"] <- d$var1[is.na(d$var1_c)]
but I am trying to find a better way than copy-pasting this line and changing "1" to the number of each variable.
I would prefer a solution in base R or dplyr, but would be grateful for any help!
We can use grepl to build a logical index of the column names that match var followed by one or more digits (\\d+), then _ and c. Similarly, a second logical index picks out the names where var is followed by one or more digits (\\d+) up to the end of the string ($). We then subset the columns with each index and replace the NA values (is.na(d[i1])) with the corresponding elements of d[i2].
i1 <- grepl("var\\d+_c", names(d))
i2 <- grepl('var\\d+$', names(d))
d[i1][is.na(d[i1])] <- d[i2][is.na(d[i1])]
NOTE: This assumes that the varXXX and varXXX_c columns appear in the same order, so that the k-th column selected by i1 pairs with the k-th column selected by i2.
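If the column order cannot be relied on, one possible alternative is to pair each varXXX_c column with its varXXX counterpart by name. This is only a sketch in base R, not part of the approach above; the helper names c_cols, cc, src and miss are made up for illustration, and it assumes every source column is named exactly like its _c column without the suffix.

```r
# Example data from the question
d <- data.frame(var1 = c(5, 5, 5), var1_c = c(5, NA, 6),
                var2 = c(6, 6, 6), var2_c = c(8, 6, NA))

# Names of the columns to fill (hypothetical helper name: c_cols)
c_cols <- grep("^var\\d+_c$", names(d), value = TRUE)

for (cc in c_cols) {
  src <- sub("_c$", "", cc)        # matching source column, e.g. "var1"
  if (!src %in% names(d)) next     # skip when there is no counterpart
  miss <- is.na(d[[cc]])           # rows where the _c value is missing
  d[[cc]][miss] <- d[[src]][miss]  # copy the value from the source column
}
```

Because the pairing is done through the column names, this should give the same result even if the columns are reordered, at the cost of an explicit loop.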