I have a data frame that looks like this:
TIMEdbMerge CopyNumber Study Sample HRE
TC015II NA TC015 II neg
TC015III 0 NA NA NA
TC015III NA TC015 III neg
TC015Quadrantic NA TC015 Quadrantic 24
TC016I NA TC016 I NA
TC016II 1 NA NA NA
TC016II NA TC016 II neg
TC016Quadrantic NA TC016 Quadrantic 6
TC017I NA TC017 I NA
TC017II 3 NA NA NA
TC017II NA TC017 II +
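It is the result of a complicated merge that I don't have time to untangle. As a workaround, I just want to merge the duplicated rows so that the actual values replace the NAs within each pair of duplicates. The result should look like this: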
TIMEdbMerge CopyNumber Study Sample HRE
TC015II NA TC015 II neg
TC015III 0 TC015 III neg
TC015Quadrantic NA TC015 Quadrantic 24
TC016I NA TC016 I NA
TC016II 1 TC016 II neg
TC016Quadrantic NA TC016 Quadrantic 6
TC017I NA TC017 I NA
TC017II 3 TC017 II +
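I know how to remove duplicated rows, but I don't know how to tell R to merge them, taking a value from whichever of the duplicated rows has one that is not NA. Should I use aggregate?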
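We can use na.locf (from the zoo package) together with ave to fill the NA elements of 'CopyNumber' with the non-NA element within each group ('TIMEdbMerge'). Then remove the rows in which all of the 'Study', 'Sample' and 'HRE' columns are NA. Note that na.locf carries the last observation forward, so this relies on the non-NA 'CopyNumber' row coming first within its group, as it does in the example data.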
library(zoo)
df1$CopyNumber <- with(df1, ave(CopyNumber, TIMEdbMerge,
FUN=function(x) na.locf(x, na.rm=FALSE)))
df1[rowSums(is.na(df1[3:5]))!=3,]
# TIMEdbMerge CopyNumber Study Sample HRE
#1 TC015II NA TC015 II neg
#3 TC015III 0 TC015 III neg
#4 TC015Quadrantic NA TC015 Quadrantic 24
#5 TC016I NA TC016 I <NA>
#7 TC016II 1 TC016 II neg
#8 TC016Quadrantic NA TC016 Quadrantic 6
#9 TC017I NA TC017 I <NA>
#11 TC017II 3 TC017 II +
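The question asks whether aggregate could do the job. Here is a minimal base R sketch of that idea, assuming each 'TIMEdbMerge' group holds at most one distinct non-NA value per column. The helper name first_non_na is made up for illustration and is not part of base R. The data frame is repeated so that the block runs on its own.

```r
# Example data, identical to df1 in this post
df1 <- data.frame(
  TIMEdbMerge = c("TC015II", "TC015III", "TC015III", "TC015Quadrantic",
                  "TC016I", "TC016II", "TC016II", "TC016Quadrantic",
                  "TC017I", "TC017II", "TC017II"),
  CopyNumber = c(NA, 0L, NA, NA, NA, 1L, NA, NA, NA, 3L, NA),
  Study = c("TC015", NA, "TC015", "TC015", "TC016", NA, "TC016", "TC016",
            "TC017", NA, "TC017"),
  Sample = c("II", NA, "III", "Quadrantic", "I", NA, "II", "Quadrantic",
             "I", NA, "II"),
  HRE = c("neg", NA, "neg", "24", NA, NA, "neg", "6", NA, NA, "+"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-NA value of a vector, or NA if there is none
first_non_na <- function(x) x[!is.na(x)][1]

# Collapse every column to one value per TIMEdbMerge key
collapsed <- aggregate(df1[-1],
                       by = list(TIMEdbMerge = df1$TIMEdbMerge),
                       FUN = first_non_na)
collapsed
```

Unlike the na.locf approach, this does not depend on the order of rows within a group, but it silently keeps only the first value if a group contains conflicting non-NA entries.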
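Or use left_join (or merge from base R) to join the original dataset with a subset of it that keeps only the rows where 'CopyNumber' is not NA, and then, as above, filter out the rows in which all 3 columns are NA.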
library(dplyr)
left_join(df1, filter(df1, !is.na(CopyNumber)) %>%
select(1:2),
by='TIMEdbMerge') %>%
select(-2) %>%
filter(rowSums(is.na(.[2:4]))!=3)
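The merge alternative mentioned above is not spelled out, so here is one way it might look in base R. This is a sketch rather than the answer's own code: the intermediate names cn, merged and result are illustrative, and it assumes each 'TIMEdbMerge' key has at most one non-NA 'CopyNumber' row. The data frame is repeated so that the block runs on its own.

```r
# Example data, identical to df1 in this post
df1 <- data.frame(
  TIMEdbMerge = c("TC015II", "TC015III", "TC015III", "TC015Quadrantic",
                  "TC016I", "TC016II", "TC016II", "TC016Quadrantic",
                  "TC017I", "TC017II", "TC017II"),
  CopyNumber = c(NA, 0L, NA, NA, NA, 1L, NA, NA, NA, 3L, NA),
  Study = c("TC015", NA, "TC015", "TC015", "TC016", NA, "TC016", "TC016",
            "TC017", NA, "TC017"),
  Sample = c("II", NA, "III", "Quadrantic", "I", NA, "II", "Quadrantic",
             "I", NA, "II"),
  HRE = c("neg", NA, "neg", "24", NA, NA, "neg", "6", NA, NA, "+"),
  stringsAsFactors = FALSE
)

# Lookup table: key plus CopyNumber, only where CopyNumber is known
cn <- df1[!is.na(df1$CopyNumber), c("TIMEdbMerge", "CopyNumber")]

# Left join: drop the original CopyNumber, then attach the known values by key
merged <- merge(df1[setdiff(names(df1), "CopyNumber")], cn,
                by = "TIMEdbMerge", all.x = TRUE)

# Drop the rows in which Study, Sample and HRE are all NA
result <- merged[rowSums(is.na(merged[c("Study", "Sample", "HRE")])) != 3, ]
result
```

Note that merge sorts the result by the key column by default and places 'CopyNumber' last, so the column order differs from the original data frame.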
# Data used in the examples above
df1 <- structure(list(TIMEdbMerge = c("TC015II", "TC015III",
"TC015III",
"TC015Quadrantic", "TC016I", "TC016II", "TC016II", "TC016Quadrantic",
"TC017I", "TC017II", "TC017II"), CopyNumber = c(NA, 0L, NA, NA,
NA, 1L, NA, NA, NA, 3L, NA), Study = c("TC015", NA, "TC015",
"TC015", "TC016", NA, "TC016", "TC016", "TC017", NA, "TC017"),
Sample = c("II", NA, "III", "Quadrantic", "I", NA, "II",
"Quadrantic", "I", NA, "II"), HRE = c("neg", NA, "neg", "24",
NA, NA, "neg", "6", NA, NA, "+")), .Names = c("TIMEdbMerge",
"CopyNumber", "Study", "Sample", "HRE"), class = "data.frame",
row.names = c(NA, -11L))