How to do groupby max to create new columns in pandas dataframe

Rahul rajan

I have a dataframe like the one below, to which I want to apply some SQL logic.

df.head(25)

ORDER_ID    CODE    STATUS_DATE                       RNK
19837715    0400    22/10/19 08:11:08.000000000 AM GMT  2
19837715    0400    22/10/19 10:00:03.000000000 AM GMT  1
19837715    0400    22/10/19 10:47:08.000000000 AM GMT  3
19837715    0500    22/10/19 10:00:00.000000000 AM GMT  1
19837715    1100    01/11/19 10:02:00.000000000 AM GMT  1
19837715    1240    02/11/19 08:00:00.000000000 AM GMT  1
19837833    0400    22/10/19 08:13:09.000000000 AM GMT  3
19837833    0400    22/10/19 08:22:09.000000000 AM GMT  4
19837833    0400    23/10/19 04:30:10.000000000 AM GMT  1
19837833    0400    23/10/19 09:30:07.000000000 PM GMT  2
19837833    0500    23/10/19 01:08:00.000000000 AM GMT  1
19837833    0500    23/10/19 04:30:00.000000000 AM GMT  3
19840750    0500    23/10/19 12:30:00.000000000 PM GMT  1
19840750    1100    01/11/19 10:06:02.000000000 AM GMT  1
19840750    1240    02/11/19 08:40:05.000000000 AM GMT  1
19840750    1305    05/11/19 07:21:03.000000000 AM GMT  2
19840750    1305    05/11/19 08:22:03.000000000 AM GMT  1
19840750    1400    09/11/19 06:13:12.000000000 AM GMT  3

I want to apply the SQL logic below to this dataframe.

select
    order_id
    , TRUNC(MAX(decode(df.code, '0400', STATUS_DATE, Null))) act_0400
    , TRUNC(MAX(decode(df.code, '0500', STATUS_DATE, Null))) act_0500
from
    dataframe df  
where 
      df.rnk =1 
group by    
    order_id

Here I am trying to create new columns act_0400 and act_0500 by taking the maximum date from the STATUS_DATE column for rows where RNK = 1, grouped by order ID.

Expected Output

ORDER_ID    ACT_0400    ACT_0500
19837715    22/10/2019  22/10/2019
19837833    23/10/2019  23/10/2019
19840750                23/10/2019

How could this be done in pandas?

jezrael

First convert STATUS_DATE to datetimes with to_datetime and keep only the date part with Series.dt.date. Then filter with boolean indexing and Series.isin, reshape with DataFrame.pivot_table using max as the aggregate, and finally clean up with DataFrame.rename_axis, DataFrame.add_prefix and DataFrame.reset_index:

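import pandas as pd

# dayfirst=True: the timestamps are day/month/year, so 01/11/19 is 1 November
df['STATUS_DATE'] = pd.to_datetime(df['STATUS_DATE'], dayfirst=True).dt.date
# keep only rank-1 rows for the two codes; use ['0400', '0500'] if CODE is stored as strings
df = (df[(df['RNK'] == 1) & df['CODE'].isin([400, 500])]
        .pivot_table(index="ORDER_ID", columns="CODE", values="STATUS_DATE", aggfunc='max')
        .rename_axis(None, axis=1)
        .add_prefix('ACT_')
        .reset_index())
print(df)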
   ORDER_ID     ACT_400     ACT_500
0  19837715  2019-10-22  2019-10-22
1  19837833  2019-10-23  2019-10-23
2  19840750         NaN  2019-10-23
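
The output above shows ACT_400 rather than ACT_0400 because CODE was read as an integer there. As a minimal self-contained sketch of the same filter-then-pivot approach, the version below makes a few assumptions that are not in the original answer: CODE is kept as a string so the leading zero survives, the timestamps are parsed with an explicit day-first format after dropping the trailing " GMT", and reindex is used so both target columns exist even if a code never has a rank-1 row. The sample rows are a subset of the question's data.

```python
import pandas as pd

# Subset of the question's rows. CODE is stored as a string here (assumption)
# so that '0400' keeps its leading zero.
df = pd.DataFrame({
    "ORDER_ID": [19837715, 19837715, 19837715, 19837833,
                 19837833, 19837833, 19840750, 19840750],
    "CODE": ["0400", "0400", "0500", "0400", "0400", "0500", "0500", "1100"],
    "STATUS_DATE": [
        "22/10/19 08:11:08.000000000 AM GMT",
        "22/10/19 10:00:03.000000000 AM GMT",
        "22/10/19 10:00:00.000000000 AM GMT",
        "23/10/19 04:30:10.000000000 AM GMT",
        "23/10/19 09:30:07.000000000 PM GMT",
        "23/10/19 01:08:00.000000000 AM GMT",
        "23/10/19 12:30:00.000000000 PM GMT",
        "01/11/19 10:06:02.000000000 AM GMT",
    ],
    "RNK": [2, 1, 1, 1, 2, 1, 1, 1],
})

# Drop the literal " GMT" suffix, parse day-first with a 12-hour clock,
# then truncate to midnight (the counterpart of TRUNC in the SQL).
stamps = df["STATUS_DATE"].str.replace(" GMT", "", regex=False)
df["STATUS_DATE"] = pd.to_datetime(
    stamps, format="%d/%m/%y %I:%M:%S.%f %p"
).dt.normalize()

codes = ["0400", "0500"]
mask = (df["RNK"] == 1) & df["CODE"].isin(codes)  # WHERE rnk = 1, codes of interest

out = (
    df[mask]
    .pivot_table(index="ORDER_ID", columns="CODE",
                 values="STATUS_DATE", aggfunc="max")  # MAX(...) GROUP BY order_id
    .reindex(columns=codes)       # guarantee one column per requested code
    .rename_axis(None, axis=1)    # drop the 'CODE' label on the columns
    .add_prefix("ACT_")
    .reset_index()
)
print(out)
```

Orders with no rank-1 row for a code get NaT in that column. To display the dates as in the expected output, each column can be formatted afterwards, for example with out["ACT_0400"].dt.strftime("%d/%m/%Y").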
