I am using the following dataset:
structure(list(...1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12), V1 = c("overstress", "flicker", "lotteri", "life",
"charg", "capac", "health", "drain", "degrad", "protector", "bright",
"use", "overstress", "flicker", "lotteri", "life", "charg", "capac",
"health", "drain", "degrad", "protector", "bright", "use", "overstress",
"flicker", "lotteri", "life", "charg", "capac", "health", "drain",
"degrad", "protector", "bright", "use"), term = c("corr1", "corr1",
"corr1", "corr1", "corr1", "corr1", "corr1", "corr1", "corr1",
"corr1", "corr1", "corr1", "corr2", "corr2", "corr2", "corr2",
"corr2", "corr2", "corr2", "corr2", "corr2", "corr2", "corr2",
"corr2", "corr3", "corr3", "corr3", "corr3", "corr3", "corr3",
"corr3", "corr3", "corr3", "corr3", "corr3", "corr3"), correlation = c(0.5,
0.43, 0.42, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.53,
0.29, 0.25, 0.25, 0.23, 0.2, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 0.45, 0.16, 0.15)), row.names = c(NA, -36L), class = c("tbl_df",
"tbl", "data.frame"))
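I want to change the values corr1, corr2, and corr3 in the term column to toil1, toil2, and toil3. I tried the following code, but all I get is the warning shown after it: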
three_terms_corrs_gathered$term <- if
(three_terms_corrs_gathered$term == "corr1"){toil1} else if
(three_terms_corrs_gathered$term == "corr2"){toil2} else
{toil3}
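Warning message:
In if (three_terms_corrs_gathered$term == "corr1") { :
  the condition has length > 1 and only the first element will be used
So it only applies the first condition. What am I doing wrong?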
Three options:
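1. A "merge" mindset. This works very well when you have many distinct matches to make: it keeps the code short, and the lookup table is easy to inspect and maintain. The example here has only two substitutions, but the code does not change whether corrs_df has 2 rows or 200, and entries in corrs_df that match nothing are silently discarded, harming nothing.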
library(dplyr)
# `dat` holds the dataset from the question
corrs_df <- data.frame(term = c("corr1", "corr2"), newterm = c("toil1", "toil2"))
dat %>%
  left_join(corrs_df, by = "term") %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 5
#    ...1 V1         term  correlation newterm
#   <dbl> <chr>      <chr>       <dbl> <chr>
# 1     1 overstress corr1        0.5  toil1
# 2     2 flicker    corr1        0.43 toil1
# 3     3 lotteri    corr1        0.42 toil1
# 4     4 life       corr3       NA    <NA>
# 5     5 charg      corr3       NA    <NA>
# 6     6 capac      corr3       NA    <NA>
dat %>%
  left_join(corrs_df, by = "term") %>%
  mutate(term = coalesce(newterm, term)) %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 5
#    ...1 V1         term  correlation newterm
#   <dbl> <chr>      <chr>       <dbl> <chr>
# 1     1 overstress toil1        0.5  toil1
# 2     2 flicker    toil1        0.43 toil1
# 3     3 lotteri    toil1        0.42 toil1
# 4     4 life       corr3       NA    <NA>
# 5     5 charg      corr3       NA    <NA>
# 6     6 capac      corr3       NA    <NA>
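(You can obviously append %>% select(-newterm) to drop the helper column.) The coalesce function effectively says "give me the first non-NA value among these variables". An NA appears in newterm when the associated term is not present in corrs_df, which we take to mean that no change should be made.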
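To make the coalesce step concrete, here is a minimal sketch on toy data. The names toy and lookup and their values are made up for illustration; only the pattern (join, coalesce, drop the helper column) comes from the answer above.

```r
library(dplyr)

# Toy vectors standing in for the joined columns (hypothetical values).
term    <- c("corr1", "corr2", "corr3")
newterm <- c("toil1", "toil2", NA)  # NA: "corr3" has no row in the lookup table

# coalesce() takes, position by position, the first non-NA value.
coalesce(newterm, term)
# [1] "toil1" "toil2" "corr3"

# The same idea end to end on a small data frame, dropping the helper column.
toy    <- data.frame(term = term, stringsAsFactors = FALSE)
lookup <- data.frame(term = c("corr1", "corr2"),
                     newterm = c("toil1", "toil2"),
                     stringsAsFactors = FALSE)

toy_fixed <- toy %>%
  left_join(lookup, by = "term") %>%
  mutate(term = coalesce(newterm, term)) %>%
  select(-newterm)

toy_fixed$term
# [1] "toil1" "toil2" "corr3"
```

Adding a corr3 row to lookup would be enough to rename it too; the pipeline itself stays the same.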
2. dplyr::case_when. (If you prefer data.table, data.table::fcase does effectively the same thing.)
dat %>%
  mutate(
    term = case_when(
      term == "corr1" ~ "toil1",
      term == "corr2" ~ "toil2",
      TRUE ~ term)
  ) %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 4
#    ...1 V1         term  correlation
#   <dbl> <chr>      <chr>       <dbl>
# 1     1 overstress toil1        0.5
# 2     2 flicker    toil1        0.43
# 3     3 lotteri    toil1        0.42
# 4     4 life       corr3       NA
# 5     5 charg      corr3       NA
# 6     6 capac      corr3       NA
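The question also asks for corr3 to become toil3, which the case_when call above leaves unchanged. Extending it is a matter of adding one more condition. A minimal sketch on a toy tibble (toy and its "other" value are made up for illustration):

```r
library(dplyr)

# Hypothetical stand-in for the `term` column, plus one value with no rule.
toy <- tibble(term = c("corr1", "corr2", "corr3", "other"))

toy_recoded <- toy %>%
  mutate(
    term = case_when(
      term == "corr1" ~ "toil1",
      term == "corr2" ~ "toil2",
      term == "corr3" ~ "toil3",
      TRUE ~ term  # fallback: anything unmatched keeps its current value
    )
  )

toy_recoded$term
# [1] "toil1" "toil2" "toil3" "other"
```

Conditions are checked in order and the first match wins, so the TRUE fallback has to come last.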
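3. Nested ifelse. Actually, since you are using dplyr, it is much better to use if_else, for several reasons (for example, it is stricter about the types of its true and false values).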
dat %>%
  mutate(
    term = if_else(term == "corr1", "toil1",
                   if_else(term == "corr2", "toil2", term))
  ) %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 4
#    ...1 V1         term  correlation
#   <dbl> <chr>      <chr>       <dbl>
# 1     1 overstress toil1        0.5
# 2     2 flicker    toil1        0.43
# 3     3 lotteri    toil1        0.42
# 4     4 life       corr3       NA
# 5     5 charg      corr3       NA
# 6     6 capac      corr3       NA
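This works fine for one or two levels of nesting, but in my opinion it looks cluttered and is hard to follow. In my experience, code that is hard to follow is hard to maintain, and it becomes very easy to put a particular option or value in the wrong place. I think maintainability and readability are very important.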
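As for why the original attempt warns: if () tests a single condition, not one per row, so it only ever looks at the first element of term (from R 4.2.0 on this is an error rather than a warning). The replacement values also need to be quoted strings, since a bare toil1 is looked up as a variable. A minimal base R sketch of the asker's assignment rewritten with the vectorized ifelse; the data frame here is a toy stand-in with only the term column:

```r
# Toy stand-in for the question's data frame (only the `term` column).
three_terms_corrs_gathered <- data.frame(
  term = rep(c("corr1", "corr2", "corr3"), each = 2),
  stringsAsFactors = FALSE
)

old <- three_terms_corrs_gathered$term

# ifelse() evaluates the test for every element, unlike if ().
three_terms_corrs_gathered$term <- ifelse(
  old == "corr1", "toil1",
  ifelse(old == "corr2", "toil2", "toil3")
)

three_terms_corrs_gathered$term
# [1] "toil1" "toil1" "toil2" "toil2" "toil3" "toil3"
```

Note that this version maps everything that is not corr1 or corr2 to toil3, exactly as the original else branch intended; the dplyr options above keep unmatched values unchanged instead.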