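I have a list of names, and thresholds assigned to those names that determine whether I assigned each name appropriately.
You can recreate a test dataset with the following: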
df <- data.frame(level1 = c("Eukaryota","Eukaryota","Eukaryota","Eukaryota","Eukaryota"),
level2=c("Opisthokonta","Alveolata","Opisthokonta","Alveolata","Alveolata"),
level3=c("Fungi","Ciliophora","Fungi","Ciliophora","Dinoflagellata"),
level4=c("Basidiomycota","Spirotrichea","Basidiomycota","Spirotrichea","Dinophyceae"),
value = c("100;5;4;2", "100;100;100;100", "100;80;60;50", "90;50;40;40","100;80;20;0"))
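I'd like to use tidyverse, mutate() and case_when() to find the taxonomic level that passes the appropriate thresholds. The tidyverse statement below splits up the thresholds and then attempts to do this. My sticking points are the choice between case_when() and ifelse() (would ifelse() be more appropriate?) and how to fill the new column using the level1:level3 syntax, since doing this any other way would be painful!
df_updated <- df %>%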
separate(value, c("threshold1","threshold2", "threshold3", "threshold4"), sep =";") %>%
mutate(Name_updated = case_when(
threshold4 >= 50 ~ unite(level1:level4, sep = ";"), #Fill with all taxonomic names to level4
threshold4 < 50 & threshold3 >= 60 ~ unite(level1:level3, sep = ";"), #If last threshold is <50, only fill with taxonomic names to level3
threshold4 < 50 & threshold3 < 60 & threshold2 >= 50 ~ unite(level1:level2, sep = ";"), #If thresholds for level 3 and 4 are below, fill only level1;level2
TRUE ~ level1)) %>% #Otherwise fill with only level 1
data.frame
Desired output
> df_updated$Name_updated
# Output of this new list:
Eukaryota
Eukaryota;Alveolata;Ciliophora;Spirotrichea
Eukaryota;Opisthokonta;Fungi;Basidiomycota
Eukaryota;Alveolata
Eukaryota;Alveolata
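The ideal next step would be to write a function that lets the user specify the thresholds used in the script. So I really do need a way to detect which thresholds pass.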
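The problem is unite(), and also the type of the columns created by separate(). unite() takes a whole data frame and returns a data frame, so it cannot supply a vector on the right-hand side of case_when(). By default, separate() uses convert = FALSE, so the new threshold columns are of class character and comparisons against numbers are not done numerically.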
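To illustrate the type issue in isolation, here is a minimal sketch. The toy data frame and the names as_chr and as_int are made up for this example and are not part of the question.

```r
library(tidyr)

# Toy data (hypothetical) with the same "a;b" layout as the value column
toy <- data.frame(value = c("100;5", "90;50"))

# Default convert = FALSE: the new columns stay character
as_chr <- separate(toy, value, c("threshold1", "threshold2"), sep = ";")

# convert = TRUE: the new columns are converted to integer
as_int <- separate(toy, value, c("threshold1", "threshold2"), sep = ";",
                   convert = TRUE)

class(as_chr$threshold1)  # "character"
class(as_int$threshold1)  # "integer"

# On character columns, 50 is coerced to "50" and strings are compared,
# so "100" sorts before "50"
as_chr$threshold1 >= 50   # FALSE  TRUE
as_int$threshold1 >= 50   # TRUE   TRUE
```

The pipeline that follows uses convert = TRUE for this reason.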
library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
df %>%
type.convert(as.is = TRUE) %>%
separate(value, c("threshold1","threshold2",
"threshold3", "threshold4"), sep =";", convert = TRUE) %>%
mutate(Name_updated =
case_when(
threshold4 >= 50 ~
select(., starts_with('level')) %>%
reduce(str_c, sep=";"),
threshold4 < 50 & threshold3 >= 60 ~
select(., level1:level3) %>%
reduce(str_c, sep=";"),
threshold4 < 50 & threshold3 < 60 & threshold2 >= 50 ~
select(., level1:level2) %>%
reduce(str_c, sep=";"),
TRUE ~ level1))
# level1 level2 level3 level4 threshold1 threshold2 threshold3 threshold4
#1 Eukaryota Opisthokonta Fungi Basidiomycota 100 5 4 2
#2 Eukaryota Alveolata Ciliophora Spirotrichea 100 100 100 100
#3 Eukaryota Opisthokonta Fungi Basidiomycota 100 80 60 50
#4 Eukaryota Alveolata Ciliophora Spirotrichea 90 50 40 40
#5 Eukaryota Alveolata Dinoflagellata Dinophyceae 100 80 20 0
# Name_updated
#1 Eukaryota
#2 Eukaryota;Alveolata;Ciliophora;Spirotrichea
#3 Eukaryota;Opisthokonta;Fungi;Basidiomycota
#4 Eukaryota;Alveolata
#5 Eukaryota;Alveolata
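For the next step mentioned in the question, a function with user-specified thresholds, one possible sketch is shown here. The function name assign_names and its cutoffs argument are hypothetical, and paste() is swapped in for the select() and reduce(str_c) combination used above. Because case_when() takes the first matching condition, the explicit "threshold4 < 50 &" guards are dropped; this assumes there are no missing threshold values.

```r
library(dplyr)
library(tidyr)

# Test data from the question
df <- data.frame(
  level1 = rep("Eukaryota", 5),
  level2 = c("Opisthokonta", "Alveolata", "Opisthokonta", "Alveolata", "Alveolata"),
  level3 = c("Fungi", "Ciliophora", "Fungi", "Ciliophora", "Dinoflagellata"),
  level4 = c("Basidiomycota", "Spirotrichea", "Basidiomycota", "Spirotrichea", "Dinophyceae"),
  value  = c("100;5;4;2", "100;100;100;100", "100;80;60;50", "90;50;40;40", "100;80;20;0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: `cutoffs` gives the minimum value required to keep
# level2, level3 and level4, in that order.
assign_names <- function(df, cutoffs = c(50, 60, 50)) {
  df %>%
    separate(value, paste0("threshold", 1:4), sep = ";", convert = TRUE) %>%
    mutate(Name_updated = case_when(
      # Conditions are checked top to bottom; the first TRUE one wins
      threshold4 >= cutoffs[3] ~ paste(level1, level2, level3, level4, sep = ";"),
      threshold3 >= cutoffs[2] ~ paste(level1, level2, level3, sep = ";"),
      threshold2 >= cutoffs[1] ~ paste(level1, level2, sep = ";"),
      TRUE ~ level1
    ))
}

default_names <- assign_names(df)$Name_updated
strict_names  <- assign_names(df, cutoffs = c(90, 90, 90))$Name_updated
```

With the default cutoffs, default_names reproduces the Name_updated column shown above. With the stricter cutoffs, only the second row keeps all four levels.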