r - Calculate mean and sum values grouped by the first row

user7987850

I have a data frame, and I would like to calculate the mean of all x values and the sum of all y values, grouped by the labels in the first row of the data frame.

The data frame to be processed and the expected result are both given below.

Here are the data.

dt=structure(list(year = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("1980", 
    "1981", "1982", "1985", "group"), class = "factor"), x1 = structure(c(4L, 
    1L, 3L, 2L, 1L), .Label = c("1", "2", "4", "A"), class = "factor"), 
        y1 = structure(c(4L, 1L, 3L, 2L, 2L), .Label = c("1", "3", 
        "5", "A"), class = "factor"), x2 = structure(c(5L, 1L, 4L, 
        3L, 2L), .Label = c("2", "4", "5", "6", "A"), class = "factor"), 
        y2 = structure(c(4L, 1L, 3L, 3L, 2L), .Label = c("3", "5", 
        "7", "A"), class = "factor"), x3 = structure(c(4L, 1L, 3L, 
        2L, 1L), .Label = c("4", "6", "8", "B"), class = "factor"), 
        y3 = structure(c(4L, 1L, 3L, 2L, 1L), .Label = c("3", "5", 
        "6", "B"), class = "factor"), x4 = structure(c(4L, 1L, 3L, 
        2L, 3L), .Label = c("2", "4", "5", "C"), class = "factor"), 
        y4 = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("3", "4", 
        "5", "6", "C"), class = "factor"), x5 = structure(c(5L, 2L, 
        1L, 3L, 4L), .Label = c("3", "4", "6", "7", "C"), class = "factor"), 
        y5 = structure(c(4L, 2L, 1L, 3L, 2L), .Label = c("2", "5", 
        "8", "C"), class = "factor")), class = "data.frame", row.names = c(NA, 
    -5L))

And the expected result:

result_expected <- structure(list(year = c(1980L, 1981L, 1982L, 1985L), A_x_mean = c(1.5, 
5, 3.5, 2.5), A_y_sum = c(4L, 12L, 10L, 8L), B_x_mean = c(4L, 
8L, 6L, 4L), B_y_sum = c(3L, 6L, 5L, 3L), C_x_mean = 3:6, C_y_sum = c(8L, 
6L, 13L, 11L)), class = "data.frame", row.names = c(NA, -4L))

I have searched for keywords on Google and Stack Overflow but found no suitable answer. My current thinking is to first extract the unique groups A, B, C from the first row:

require(tidyverse)
group_variables <- dt%>%gather(key,value)%>%distinct(value)%>%arrange(value)

then loop over group_variables with a for loop:

for i in group_variables{......}

Or can I change the structure of the data frame with gather and spread from tidyr and then use dplyr, something like the following code?

dt_new %>% group_by(group) %>%
          summarise(mean=mean(x,na.rm=TRUE),
          sum=sum(y,na.rm=TRUE))
StupidWolf

First we take out the first row, which holds the group labels, reshape the remaining rows to long format, reduce x1, x2, x3 to x (and likewise for y), and attach the group labels back:

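library(tidyverse)  # dplyr and tidyr provide the verbs used below

# Group labels from the first row, named by column (x1 = "A", y1 = "A", ...)
group_var = sapply(dt[1,-1],as.character)
mat <- 
dt[-1,] %>% pivot_longer(-year) %>% 
   # factor -> character -> numeric, so we get the values and not the level codes
   mutate(value=as.numeric(as.character(value))) %>% 
   # look up each column's group label by column name
   mutate(group=as.character(group_var[as.character(name)])) %>% 
   # keep only the leading letter: x1, x2, ... become x
   mutate(name=substr(name,1,1))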

mat
# A tibble: 40 x 4
   year  name  value group
   <fct> <chr> <dbl> <chr>
 1 1980  x         1 A    
 2 1980  y         1 A    
 3 1980  x         2 A    
 4 1980  y         3 A    
 5 1980  x         4 B    
 6 1980  y         3 B    
 7 1980  x         2 C    
 8 1980  y         3 C    
 9 1980  x         4 C    
10 1980  y         5 C   
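
The group lookup above relies on indexing a named character vector by column name. A minimal base R sketch of that step on its own, using a shortened, hypothetical version of group_var (the names lookup, cols, groups and vars are illustrative only):

```r
# Named character vector: column name -> group label (shortened, hypothetical)
lookup <- c(x1 = "A", y1 = "A", x3 = "B", y3 = "B")

# Column names as they would appear in the long table
cols <- c("x1", "y3", "x3")

# Indexing by name returns the matching labels in the order of cols
groups <- unname(lookup[cols])

# The first character tells us whether a column is an x or a y
vars <- substr(cols, 1, 1)

print(groups)
print(vars)
```

Indexing by name keeps the mapping correct even if the columns are reordered, which positional indexing would not.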

What's left is to group by year, name and group and apply the relevant summary function (mean for x, sum for y), so we define a helper function:

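# DF: long table with columns year, name, value, group
# func: name of the summary function as a string, e.g. "mean" or "sum"
func = function(DF,func){
DF %>% 
   group_by(group,name,year) %>% 
   summarise_all(func) %>%
   # build the output column name, e.g. A_x_mean
   mutate(label=paste(group,name,func,sep="_")) %>%
   ungroup %>%
   select(year,value,label) %>%
   # one column per label, one row per year
   pivot_wider(values_from=value,names_from=label)
}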

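And we apply it separately to the x and y parts of the data, then bind the two results by column (note that cbind keeps the year column from both parts):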

cbind(func(mat %>% filter(name=="x"),"mean"),func(mat %>% filter(name=="y"),"sum"))

  year A_x_mean B_x_mean C_x_mean year A_y_sum B_y_sum C_y_sum
1 1980      1.5        4        3 1980       4       3       8
2 1981      5.0        8        4 1981      12       6       6
3 1982      3.5        6        5 1982      10       5      13
4 1985      2.5        4        6 1985       8       3      11
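
As a variation, the same result can be sketched in a single pipeline, which avoids the repeated year column. This is not the code from the answer above. It assumes dplyr 1.0 or later (for the .groups argument) and uses a small hypothetical data frame, toy, with character columns instead of factors; the names toy, groups and res are illustrative only.

```r
library(dplyr)
library(tidyr)

# Toy data in the same layout as dt: the first row holds the group labels
# (assumption: character columns rather than factors)
toy <- data.frame(
  year = c("group", "1980", "1981"),
  x1 = c("A", "1", "4"), y1 = c("A", "1", "5"),
  x2 = c("A", "2", "6"), y2 = c("A", "3", "7"),
  x3 = c("B", "4", "8"), y3 = c("B", "3", "6"),
  stringsAsFactors = FALSE
)

# Named vector: column name -> group label
groups <- unlist(toy[1, -1])

res <- toy[-1, ] %>%
  pivot_longer(-year) %>%
  mutate(
    value = as.numeric(value),
    group = unname(groups[name]),   # look up the label by column name
    var   = substr(name, 1, 1)      # "x" or "y"
  ) %>%
  group_by(year, group, var) %>%
  # mean for the x columns, sum for the y columns
  summarise(
    value = if (first(var) == "x") mean(value) else sum(value),
    .groups = "drop"
  ) %>%
  mutate(label = paste(group, var, ifelse(var == "x", "mean", "sum"), sep = "_")) %>%
  select(year, label, value) %>%
  pivot_wider(names_from = label, values_from = value)

print(res)
```

Because the summary is chosen per group inside summarise, no cbind is needed and year appears only once. With the factor columns of the original dt, the conversion would need as.character first, as in the answer above.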
