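I need help writing R code that will detect when an expected column is missing from a data frame and add it as a column of zeros before aggregating.
Here is an example of the code I am working with: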
# load library
library(dplyr)
# set variables
a <- c("Jenny", "Jenny", "John", "Jenny", "John")
b <- c(1,0,1,0,1)
C <- c(0,1,1,1,0)
# bind into dataframe
dat <- cbind.data.frame(a, b, C)
# Subsequent (imaginary) code joins dat to another dataset. The join is supposed to add
# another variable called "d". For whatever reason, d does not exist,
# so dat still only has three variables: a, b & C.
# The script now runs an aggregating function,
# but the aggregating function expects four variables: a, b, C & d.
dat_A <- dat %>%
  group_by(a) %>%
  summarise(b_new = sum(b),
            c_new = sum(C),
            d_new = sum(d))
# because "d" is missing, R returns an error. I need code which will
# detect "d" is missing and create a dummy variable for this variable
# with zero value and bind to dat before aggregating.
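Check whether each string (a required column name) is present in the data frame, and create any that are missing with a value of 0, as in this example: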
x <- data.frame(a = 1:5) # Example data
x
#> a
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
to_check <- c("a", "b", "c") # column names to check for; change this to suit your data
x[, setdiff(to_check, names(x))] <- 0 # this creates any missing columns as 0
x
#> a b c
#> 1 1 0 0
#> 2 2 0 0
#> 3 3 0 0
#> 4 4 0 0
#> 5 5 0 0
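Applied to the data from the question, the same idea might look like the sketch below. The `required` vector and the `missing_cols` helper are names introduced here for illustration only; the list of required columns is an assumption based on what the `summarise()` call uses. The `if` guard simply skips the assignment when nothing is missing.

```r
library(dplyr)

# Data from the question: three columns (a, b, C); "d" is missing
dat <- data.frame(
  a = c("Jenny", "Jenny", "John", "Jenny", "John"),
  b = c(1, 0, 1, 0, 1),
  C = c(0, 1, 1, 1, 0)
)

# Columns the aggregation expects (assumed list; adjust to your script)
required <- c("a", "b", "C", "d")

# Find the required columns that are absent and create them, filled with 0
missing_cols <- setdiff(required, names(dat))
if (length(missing_cols) > 0) {
  dat[, missing_cols] <- 0
}

# The aggregation from the question now runs without error
dat_A <- dat %>%
  group_by(a) %>%
  summarise(b_new = sum(b),
            c_new = sum(C),
            d_new = sum(d))
```

With this in place, `dat` has a fourth column `d` of zeros, so `d_new` is 0 for every group, while `b_new` and `c_new` are computed from the real data as before.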