Rearranging longitudinal data

Histelheim

I have a dataset that looks roughly like this:

case Year      2001 2002 2003 2004
1    2003      0    0    0    3
2    2002      0    5    3    2
3    2001      3    3    2    2

I am trying to restructure it so that each column represents the first year, second year (and so on), counted from the "Year" variable, i.e.:

case Year      yr1  yr2  yr3 yr4
1    2003      0    3    0    0 
2    2002      5    3    2    0
3    2001      3    3    2    2

This code downloads the dataset and tries the solution suggested by @akrun, but it fails.

library("devtools")
df1 <- source_gist("b4c44aa67bfbcd6b72b9")

df1[-(1:2)] <- do.call(rbind,lapply(seq_len(nrow(df1)), function(i) {x <- df1[i, ]; x1 <- unlist(x[-(1:2)]); indx <- which(!is.na(x1))[1]; i <- as.numeric(names(indx))-x[,2]+1; x2 <- x1[!is.na(x1)]; x3 <- rep(NA, length(x1)); x3[i:(i+length(x2)-1)]<- x2; x3}))

This produces:

Error in i:(i + length(x2) - 1) : NA/NaN argument
In addition: Warning message:
In FUN(1:234[[1L]], ...) : NAs introduced by coercion

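How can I transform the data so that each column represents the first year, second year (and so on), counted from the value of the "Year" variable in each row?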
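For context, the two messages above can be reproduced in isolation. This is only a sketch: it assumes the year columns carry a non-numeric prefix such as `X` (the name `X2001` here is hypothetical), which would make `as.numeric(names(indx))` return `NA`.

```r
# Hypothetical column name, assuming an "X" prefix on the year columns
nm <- "X2001"

# as.numeric() cannot parse the prefixed name, so it yields NA
# (and, without suppressWarnings(), a coercion warning)
start <- suppressWarnings(as.numeric(nm))
is.na(start)

# Using NA as an endpoint of `:` is an error; capture its message
res <- tryCatch(start:(start + 2), error = function(e) conditionMessage(e))
res

# Stripping the prefix first gives a usable number
as.numeric(sub("^X", "", nm))
```

If the prefix assumption holds, removing it before the conversion avoids both the warning and the error.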

dimitris_ps

This performs the transformation you are looking for.

library("devtools")
df1 <- source_gist("b4c44aa67bfbcd6b72b9")
temp <- df1[[1]]

library(dplyr); library(tidyr); library(stringi) 

temp <- temp %>% 
  gather(new.Years, X, -Year) %>%  # wide to long: stack the year columns into key (new.Years) / value (X) pairs
  mutate(Year.temp=paste0(rownames(temp), "-", Year)) %>% # combine the row number with Year to get a unique row id
  mutate(new.Years = as.numeric(gsub("X", "", new.Years)), diff = new.Years-Year+1) %>% # offset from the start year: 1 for the first year, 2 for the second, ...
  mutate(diff=paste0("yr", stri_sub(paste0("0", (ifelse(diff>0, diff, 0))), -2, -1))) %>% # turn offsets into labels yr01, yr02, ...; years before Year become yr00
  select(-new.Years) %>% filter(diff != "yr00") %>% # drop the new.Years column and the rows that precede the start year
  spread(diff, X) %>%  # long to wide: one column per yrNN label
  select(-Year.temp) # drop the helper Year.temp column

temp[is.na(temp)] <- 0 # replace NA with 0

temp %>% View
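The same reshaping can also be sketched in base R on the small example from the question, which makes the result easy to check by hand. This is an illustration rather than a replacement for the code above: the `X` prefix on the year columns is an assumption, and `shift_row()` is a hypothetical helper.

```r
# Toy data from the question; the "X" prefix on year columns is an assumption
toy <- data.frame(
  case  = 1:3,
  Year  = c(2003, 2002, 2001),
  X2001 = c(0, 0, 3),
  X2002 = c(0, 5, 3),
  X2003 = c(0, 3, 2),
  X2004 = c(3, 2, 2)
)

year_cols <- grep("^X[0-9]{4}$", names(toy), value = TRUE)
years     <- as.numeric(sub("^X", "", year_cols))
vals      <- as.matrix(toy[year_cols])

# shift_row() (hypothetical helper): keep the values from the row's start
# year onward, then pad the tail with zeros so all rows have equal length
shift_row <- function(i) {
  keep <- unname(vals[i, years >= toy$Year[i]])
  c(keep, rep(0, length(years) - length(keep)))
}

# vapply() returns one column per row of toy, so transpose the result
shifted <- t(vapply(seq_len(nrow(toy)), shift_row, numeric(length(years))))
colnames(shifted) <- paste0("yr", seq_along(years))

result <- cbind(toy[c("case", "Year")], shifted)
result
```

For the three example rows, `result` should match the target table in the question: 0 3 0 0, 5 3 2 0 and 3 3 2 2 in columns `yr1` to `yr4`.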

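Note that this works for spans of up to 99 years, because the column labels keep only the last two digits of the year offset.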
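The limit comes from how the labels are built. The sketch below mimics that labelling with base R string functions instead of `stringi` (`pad2()` is a hypothetical stand-in, not part of the answer's code) and shows a wider alternative using `sprintf()`.

```r
# pad2() (hypothetical): prepend "0", then keep only the last two characters,
# mirroring stri_sub(paste0("0", diff), -2, -1)
pad2 <- function(d) {
  s <- paste0("0", d)
  paste0("yr", substr(s, nchar(s) - 1, nchar(s)))
}

pad2(c(1, 12, 100, 101))
# → "yr01" "yr12" "yr00" "yr01"

# sprintf() pads to a fixed width without truncating longer numbers
sprintf("yr%03d", c(1L, 12L, 100L, 101L))
# → "yr001" "yr012" "yr100" "yr101"
```

With two-character labels, an offset of 100 becomes `yr00` (which the `filter()` step would drop) and 101 collides with `yr01`, so a wider format would be needed for longer series.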
