R: concatenate column names into a new column, ordered by their values

aryl

I am trying to build a string that lists the column names in the order of their values.

library(tibble)
library(dplyr)

set.seed(100)

# Three value columns; values of 20 or less are turned into NA
df <- tibble(id = 1:5,
             col1 = sample(1:50, 5),
             col2 = sample(1:50, 5),
             col3 = sample(1:50, 5)) %>% 
  mutate_at(vars(-id), ~if_else(. <= 20, NA_integer_, .))

# A tibble: 5 x 4
     id  col1  col2  col3
  <int> <int> <int> <int>
1     1    NA    44    NA
2     2    38    23    34
3     3    48    22    NA
4     4    25    NA    48
5     5    NA    NA    43

res <- df %>% 
  add_column(order = c('col2',
                       'col2_col3_col1',
                       'col2_col1',
                       'col1_col3',
                       'col3'))

# A tibble: 5 x 5
     id  col1  col2  col3 order         
  <int> <int> <int> <int> <chr>         
1     1    NA    44    NA col2          
2     2    38    23    34 col2_col3_col1
3     3    48    22    NA col2_col1    
4     4    25    NA    48 col1_col3    
5     5    NA    NA    43 col3

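My current data looks like df, and the column I want to add is the order column in res. The order of the elements in the string is determined by the values in each column, and NAs need to be skipped. I am trying to identify, for each ID, the order in which the columns were populated, since the values are times in days. However, not every ID has a value in every column, so there are missing values throughout. I usually work within the tidyverse, but any solution or idea would be appreciated.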

akrun

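An easier option is apply: loop over the rows (MARGIN = 1), remove the NA elements, order the remaining non-NA values, use that index to get the column names, and paste them together.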

df$order <- apply(df[-1], 1, function(x) {
  x1 <- x[!is.na(x)]                           # drop the missing values
  paste(names(x1)[order(x1)], collapse = "_")  # column names sorted by value
})
df$order
#[1] "col2"           "col2_col3_col1" "col2_col1"      "col1_col3"      "col3"
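
To see what the function does for one row, the steps can be traced by hand. This is only an illustrative sketch: the named vectors x and y are typed in from rows 2 and 4 of the printed example and are not part of the original answer.

```r
# Row 2 of the example data, written out by hand as a named vector
x <- c(col1 = 38L, col2 = 23L, col3 = 34L)
ord <- order(x)                                    # positions from smallest to largest value
row_order <- paste(names(x)[ord], collapse = "_")
row_order
# [1] "col2_col3_col1"

# Row 4 has a missing value, which is dropped before ordering
y <- c(col1 = 25L, col2 = NA, col3 = 48L)
y1 <- y[!is.na(y)]
row_order_na <- paste(names(y1)[order(y1)], collapse = "_")
row_order_na
# [1] "col1_col3"
```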

Or with the tidyverse

library(dplyr)
library(tidyr)
library(stringr)
df %>%
   pivot_longer(cols = -id, values_drop_na = TRUE) %>%
   arrange(id,  value) %>%
   group_by(id) %>%
   summarise(order = str_c(name, collapse="_")) %>% 
   right_join(df) %>%
   select(names(df), order)
# A tibble: 5 x 5
#     id  col1  col2  col3 order         
#  <int> <int> <int> <int> <chr>         
#1     1    NA    44    NA col2          
#2     2    38    23    34 col2_col3_col1
#3     3    48    22    NA col2_col1     
#4     4    25    NA    48 col1_col3     
#5     5    NA    NA    43 col3

Or with pmap from purrr

library(purrr)
df %>% 
   mutate(order = pmap_chr(select(., starts_with('col')), ~
         {x <- c(...)
         x1 <- x[!is.na(x)]
         str_c(names(x1)[order(x1)], collapse="_")}))
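
Two edge cases are not covered by the example data: a row where every column is NA, and a row with tied values. The sketch below checks how the apply version behaves in base R only; df_edge and edge_order are hypothetical names with made-up values, and the tidyverse and purrr variants may treat these cases differently (for instance, the pivot_longer version would likely leave NA rather than an empty string for an all-NA row).

```r
# Hypothetical data: row 2 has a tie between col1 and col2, row 3 is all NA
df_edge <- data.frame(
  id   = 1:3,
  col1 = c(NA, 30L, NA),
  col2 = c(44L, 30L, NA),
  col3 = c(NA, 12L, NA)
)

# Same logic as the apply solution above
edge_order <- apply(df_edge[-1], 1, function(x) {
  x1 <- x[!is.na(x)]                           # drop the missing values
  paste(names(x1)[order(x1)], collapse = "_")  # tied values keep their column order
})
edge_order
```

Here the tied columns come out in their original left-to-right order, and the all-NA row yields an empty string, so a follow-up step such as dplyr::na_if(order, "") could be used if NA is preferred.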
