I have the following df:
df <- structure(list(position = c(44188968, 44188969, 44188970, 44188975,
44188977, 44188978), code1 = c(1, 0, 1, 0, 0, 1)), class = "data.frame", row.names = c(NA,
-6L))
>df
position code1
44188968 1
44188969 0
44188970 1
44188975 0
44188977 0
44188978 1
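I would like to add another column, code2 (1 if true, 0 otherwise), when the following condition is met: for each position, check whether any other positions lie within +/- 3. If so, the other position must have code1 = 1. I would then get the following: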
position code1 code2
44188968 1 1
44188969 0 1
44188970 1 1
44188975 0 0
44188977 0 1
44188978 1 0
Could you guide me on how to get such a table?
Edit: I forgot to mention that my data contains NA values
position code1
44188968 1
44188969 0
44188970 1
44188975 0
44188977 0
44188978 1
NA 1
NA 0
44189323 NA
In case of an NA value, code2 is also NA
EDIT2: As requested by @jazzurro, I am providing all the possible patterns in my data
df <- structure(list(position = c(44188968, 44188969, 44188970, 44188975,
44188977, 44188979, 44188980, 44189323, 44189324, 44189328, 44189330,
44189334), code1 = c(1, 0, 1, 0, 0, 1, NA, NA, 1, NA, NA, NA)), class =
"data.frame", row.names = c(NA,
-12L))
>df
position code1
44188968 1
44188969 0
44188970 1
44188975 0
44188977 0
44188979 1
44188980 NA
44189323 NA
44189324 1
44189328 NA
44189330 NA
44189334 NA
The desired output is as follows:
position code1 code2 # explanations
44188968 1 1 # code2 is 1 because 44188970 falls in the window of +/- 3 and code1 of 44188970 is 1. code1 of 44188969 is 0 so it is not taken into account.
44188969 0 1 # code2 is 1 because 44188968 and 44188970 fall in the window of +/- 3 and code1 of both is 1.
44188970 1 1 # code2 is 1 because 44188968 falls in the window of +/- 3 and code1 of 44188968 is 1.
44188975 0 0 # code2 is 0 because 44188977 falls in the window of +/- 3 but code1 of 44188977 is 0.
44188977 0 1 # code2 is 1 because 44188979 falls in the window of +/- 3 and code1 of 44188979 is 1. code1 of 44188975 is 0 so it is not taken into account.
44188979 1 0 # code2 is 0 because 44188977 falls in the window of +/- 3 but code1 of 44188977 is 0. code1 of 44188980 is NA so it is not taken into account.
44188980 NA 1 # code2 is 1 because 44188979 falls in the window of +/- 3 and code1 of 44188979 is 1.
44189323 NA 1 # code2 is 1 because 44189324 falls in the window of +/- 3 and code1 of 44189324 is 1.
44189324 1 0 # code2 is 0 because 44189323 falls in the window of +/- 3 but code1 of 44189323 is NA.
44189328 NA 0 # code2 is 0 because 44189330 falls in the window of +/- 3 but code1 of 44189330 is NA.
44189330 NA 0 # code2 is 0 because 44189328 falls in the window of +/- 3 but code1 of 44189328 is NA.
44189334 NA 0 # code2 is 0 because nothing falls in the window of +/- 3.
Thanks in advance.
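The EDIT2 rules can be written down as a small per-row check. This is only a sketch under stated assumptions: `has_flagged_neighbour` is a hypothetical helper name, the window is taken as inclusive +/- 3, and an NA in code1 is treated as "not 1". Note that the first table in the question only comes out right if 3 itself is excluded (44188975 and 44188978 are exactly 3 apart, yet 44188975 gets code2 = 0), so the `width` argument may need to be 2 for that data.

```r
df <- data.frame(
  position = c(44188968, 44188969, 44188970, 44188975, 44188977, 44188979,
               44188980, 44189323, 44189324, 44189328, 44189330, 44189334),
  code1 = c(1, 0, 1, 0, 0, 1, NA, NA, 1, NA, NA, NA)
)

# Hypothetical helper (not from the original post): TRUE when some *other* row
# lies within `width` of row i and has code1 == 1. NA in code1 never counts.
has_flagged_neighbour <- function(i, position, code1, width = 3) {
  near <- abs(position - position[i]) <= width
  near[i] <- FALSE  # a row is not its own neighbour
  any(near & code1 == 1, na.rm = TRUE)
}

df$code2 <- as.numeric(vapply(
  seq_len(nrow(df)), has_flagged_neighbour, logical(1),
  position = df$position, code1 = df$code1
))
df
```

On this data the sketch yields the code2 column listed in the desired output.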
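Here is my attempt. Given what you said above, the range you are talking about is +/- 2 (3 is not included). I created two numeric vectors that identify the plus/minus 2 range for each position. Then I ran a logical check: does any position number fall inside each range while its code1 equals 1? Next, I unnested the list column, check, and created a new column called dum_position. I extracted the rows where position and dum_position do not hold the same number and check is TRUE. At this point, position contains the numbers we are looking for.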
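library(tidyverse)
# For each position, build a logical vector (one entry per row of df) that is TRUE
# where a row's position lies within +/- 2 and that row has code1 == 1.
# At this stage the row itself is still included.
mutate(df, check = map2(.x = position - 2,
                        .y = position + 2,
                        .f = function(x, y) {between(position, x, y) & code1 == 1})) %>%
  unnest(check) %>%                        # one row per (position, candidate) pair
  group_by(position) %>%
  mutate(dum_position = df$position) %>%   # label each check with the candidate's position
  filter(position != dum_position & check == TRUE) %>%  # drop self-matches, keep hits
  distinct(position) %>%
  unlist -> mynums
# Set code2 to 1 for rows whose position is in mynums, 0 otherwise
mutate(df, code2 = if_else(position %in% mynums, 1, 0))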
# position code1 code2
#1 44188968 1 1
#2 44188969 0 1
#3 44188970 1 1
#4 44188975 0 0
#5 44188977 0 1
#6 44188978 1 0
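As a cross-check, the same result can probably be reached in base R without the tidyverse. The following is a minimal sketch, not the method used in the answer: it swaps the map2/unnest pipeline for a pairwise distance matrix built with outer(), keeps the +/- 2 window, and treats NA in code1 as "not 1" (an assumption). The names `near`, `is_one`, and `hit` are illustrative only.

```r
df <- data.frame(
  position = c(44188968, 44188969, 44188970, 44188975, 44188977, 44188978),
  code1 = c(1, 0, 1, 0, 0, 1)
)
width <- 2  # assumed window, matching the +/- 2 range used in the answer

# near[i, j] is TRUE when position j lies within `width` of position i
near <- abs(outer(df$position, df$position, "-")) <= width
diag(near) <- FALSE  # a row must not match itself

# Column j is kept only if row j has code1 == 1 (NA counts as "not 1")
is_one <- !is.na(df$code1) & df$code1 == 1
hit <- near & matrix(is_one, nrow = nrow(df), ncol = nrow(df), byrow = TRUE)

# A row gets code2 = 1 when at least one qualifying neighbour exists
df$code2 <- as.numeric(rowSums(hit) > 0)
df
```

The matrix holds n x n entries, so this approach suits data of moderate size; for very long position vectors a sorted or rolling approach would likely scale better.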