Create column with grouped values based on another column

JoeF

I'm sure this has been asked before, but I don't know what to search for, so I apologise in advance.

Let's say that I have the following data frame:

grades <- data.frame(a = 1:40, b = sample(45:100, 40))

Using dplyr, I want to create a new variable that indicates the grade the student received, based on the following criteria: 90-100 = excellent, 80-90 = very good, etc.

I thought I could get that result by nesting ifelse() inside of mutate():

grades %>%
mutate(ifelse(b >= 90, "excellent"), 
       ifelse(b >= 80 & b < 90, "very_good"),
       ifelse(b >= 70 & b < 80, "fair"),
       ifelse(b >= 60 & b < 70, "poor", "fail"))

This doesn't work, as I get the error message "argument no is missing, with no default". I thought the "no" would be the "fail" at the end, but obviously I'm getting the syntax wrong.

I can get this to work if I first filter the original data individually, and then call ifelse, as follows:

a <- grades %>%
     filter( b >= 90) %>%
     mutate(final = ifelse(b >= 90, "excellent"))

and then rbind a, b, c, etc. Obviously, this isn't how I want to do it, but I wanted to understand the syntax of ifelse(). I'm guessing the latter works because there aren't any values that fail the condition, but I still can't figure out how to get it to work when there is more than one ifelse.

talat

Define vectors with the levels and labels and then use cut on the b column:

levels <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("Fail", "Poor", "fair", "very good", "excellent")
grades %>% mutate(x = cut(b, levels, labels = labels))
    a   b         x
1   1  66      Poor
2   2  78      fair
3   3  97 excellent
4   4  46      Fail
5   5  89 very good
6   6  57      Fail
7   7  80      fair
8   8  98 excellent
9   9 100 excellent
10 10  93 excellent
11 11  59      Fail
12 12  51      Fail
13 13  69      Poor
14 14  75      fair
15 15  72      fair
16 16  48      Fail
17 17  74      fair
18 18  54      Fail
19 19  62      Poor
20 20  64      Poor
21 21  88 very good
22 22  70      Poor
23 23  85 very good
24 24  58      Fail
25 25  95 excellent
26 26  56      Fail
27 27  65      Poor
28 28  68      Poor
29 29  91 excellent
30 30  76      fair
31 31  82 very good
32 32  55      Fail
33 33  96 excellent
34 34  83 very good
35 35  61      Poor
36 36  60      Fail
37 37  77      fair
38 38  47      Fail
39 39  73      fair
40 40  71      fair

Or using data.table:

library(data.table)
setDT(grades)[, x := cut(b, levels, labels)]

Or simply in base R:

grades$x <- cut(grades$b, levels, labels)
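As for the ifelse() syntax asked about in the question: each ifelse() call needs all three arguments (test, yes, no), so the next condition has to go into the "no" slot of the previous call rather than being passed to mutate() as a separate argument. A minimal sketch in base R, where grade_label is a hypothetical helper name used only for illustration:

```r
# grade_label() is a hypothetical helper, not part of any package.
# Each ifelse() gets test, yes and no; the "no" argument holds the next
# ifelse(), so the conditions are checked from the top grade downwards.
grade_label <- function(b) {
  ifelse(b >= 90, "excellent",
  ifelse(b >= 80, "very_good",
  ifelse(b >= 70, "fair",
  ifelse(b >= 60, "poor", "fail"))))
}

grades <- data.frame(a = 1:40, b = sample(45:100, 40))
grades$final <- grade_label(grades$b)
```

Because the conditions are tested in order, the upper bounds (b < 90 and so on) are no longer needed. The same nested expression should also work inside mutate(), e.g. mutate(final = grade_label(b)).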

Note

After taking another close look at your initial approach, I noticed that you need to include right = FALSE in the cut call, because, for example, 90 points should be "excellent", not just "very good". The right argument defines on which side each interval is closed. The default, right = TRUE, closes the intervals on the right, e.g. (80, 90], whereas the conditions in your approach close them on the left, e.g. [80, 90). So in dplyr, it would then be:

grades %>% mutate(x = cut(b, levels, labels, right = FALSE))

and accordingly in the other options.
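To see the difference, here is a small check on the boundary scores only. It reuses the levels and labels vectors from above; the names boundary, default_right and left_closed are just made up for this illustration:

```r
levels <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("Fail", "Poor", "fair", "very good", "excellent")

# Scores that sit exactly on a break point
boundary <- c(60, 70, 80, 90)

# Default (right = TRUE): intervals are (lower, upper], so a boundary
# score falls into the lower category
default_right <- as.character(cut(boundary, levels, labels))
# "Fail" "Poor" "fair" "very good"

# right = FALSE: intervals are [lower, upper), so a boundary score
# falls into the higher category
left_closed <- as.character(cut(boundary, levels, labels, right = FALSE))
# "Poor" "fair" "very good" "excellent"
```

This is also why, in the output printed above (produced with the default), a score of 80 shows up as "fair" and 70 as "Poor".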
