I am working on an R program that needs to take the mean of all results belonging to the same experiment. For instance, suppose there are two experiments, experiment 1 and experiment 2. Experiment 1 has three results per row, and experiment 2 has two results per row. The program should calculate the mean result of experiment 1 and the mean result of experiment 2.
cols<-c('experiment 1 result 1','experiment 1 result 2','experiment 1 result 3','experiment 2 result 1','experiment 2 result 2')
df <- data.frame(matrix(ncol = 5, nrow = 1))
colnames(df)<-cols
df[1,]<-c(1,3,2,2,4)
For the given example, the output should be the following data frame:
cols<-c('experiment 1','experiment 2')
df <- data.frame(matrix(ncol = 2, nrow = 1))
colnames(df)<-cols
df[1,]<-c(2,3)
Depending on the situation, the number of experiments and the number of results per experiment can vary. I am therefore looking for a generic approach to solve this. Is there somebody who could help me with this?
Thank you in advance.
Keep only the experiment part of the column names by stripping the " result N" suffix:
sub(' result \\d+', '', names(df))
#[1] "experiment 1" "experiment 1" "experiment 1" "experiment 2" "experiment 2"
Use this as a grouping variable in tapply to get:
tapply(unlist(df), sub(' result \\d+', '', names(df)), mean)
#experiment 1 experiment 2
# 2 3
For more than one row, we can use split.default:
sapply(split.default(df, sub(' result \\d+', '', names(df))), rowMeans)
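To make the multi-row case concrete, here is a minimal sketch. The two-row data frame `df2` and the helper names `groups` and `res` are made up for illustration and are not part of the question. It assumes the column names follow the same "experiment N result M" pattern.

```r
# Hypothetical two-row data set using the question's column naming scheme
df2 <- data.frame(
  `experiment 1 result 1` = c(1, 4),
  `experiment 1 result 2` = c(3, 6),
  `experiment 1 result 3` = c(2, 8),
  `experiment 2 result 1` = c(2, 10),
  `experiment 2 result 2` = c(4, 20),
  check.names = FALSE  # keep the spaces in the column names
)

# Strip the " result N" suffix so columns of one experiment share a label
groups <- sub(' result \\d+$', '', names(df2))

# Split the columns by experiment, then average each group row by row
res <- sapply(split.default(df2, groups), rowMeans)
res
#   experiment 1 experiment 2
# 1            2            3
# 2            6           15

# With several rows the result is a matrix; convert it if a data frame is needed
out <- as.data.frame(res)
```

Because the grouping labels are derived from the column names, the same lines should work unchanged for any number of experiments and any number of results per experiment, as long as the naming pattern holds.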