How can I concatenate data from another column when I use groupby and aggregation in a pandas DataFrame?

SrikanthR

I start with the following dataframe, then perform a groupby and an aggregation to merge overlapping time ranges. I want to add another column to the final dataframe, formed by concatenating the TEXT values of the overlapping rows.

df['newid']=(df['START']-df['END'].shift()).dt.total_seconds().gt(0).cumsum()
print (df.to_string(index=False))

                ELEMENT                                    TEXT               START                 END  newid
 OLT2227-LT3-PON0-ONT03           USECASE1 - ALARM1 -NO OVERLAP 2021-01-19 18:00:00 2021-01-19 19:00:00      0
 OLT2227-LT3-PON0-ONT03          USECASE1 - ALARM2 - NO OVERLAP 2021-01-19 19:10:00 2021-01-19 20:00:12      1
 OLT2227-LT3-PON0-ONT05     USECASE2 - ALARM1 - Fully Contained 2021-01-19 18:00:00 2021-01-19 23:00:00      1
 OLT2227-LT3-PON0-ONT05     USECASE2 - ALARM2 - Fully Contained 2021-01-19 19:00:00 2021-01-19 20:00:12      1
 OLT2227-LT3-PON0-ONT10  USECASE3 - ALARM1 - START-END-RELATION 2021-01-19 22:00:00 2021-01-19 22:30:00      2
 OLT2227-LT3-PON0-ONT10  USECASE3 - ALARM2 - START-END-RELATION 2021-01-19 22:30:00 2021-01-19 23:00:12      2
 OLT2227-LT3-PON0-ONT21                         USECASE3-ALARM1 2021-01-19 22:00:00 2021-01-19 22:10:00      2
 OLT2227-LT3-PON0-ONT21                  USECASE3-ALARM2-NO-END 2021-01-19 22:15:00 2042-01-19 20:00:12      3
  OLT2227-LT3-PON0-ONT4                               USECASE-4 2021-01-19 17:30:00 2042-01-19 20:00:12      3
  OLT2227-LT3-PON0-ONT4                               USECASE-4 2021-01-19 20:00:00 2021-01-19 23:00:00      3
 OLT2227-LT3-PON0-ONT99                               USECASE-5 2021-01-19 17:30:00 2021-01-19 22:00:00      3
 OLT2227-LT3-PON0-ONT99                               USECASE-5 2021-01-19 20:00:00 2042-01-19 20:00:12      3

newdf=df.groupby(['newid','ELEMENT']).agg({'START':'min','END':'max'}).reset_index(level=1)
print (newdf.to_string(index=False))

                ELEMENT               START                 END
 OLT2227-LT3-PON0-ONT03 2021-01-19 18:00:00 2021-01-19 19:00:00
 OLT2227-LT3-PON0-ONT03 2021-01-19 19:10:00 2021-01-19 20:00:12
 OLT2227-LT3-PON0-ONT05 2021-01-19 18:00:00 2021-01-19 23:00:00
 OLT2227-LT3-PON0-ONT10 2021-01-19 22:00:00 2021-01-19 23:00:12
 OLT2227-LT3-PON0-ONT21 2021-01-19 22:00:00 2021-01-19 22:10:00
 OLT2227-LT3-PON0-ONT21 2021-01-19 22:15:00 2042-01-19 20:00:12
  OLT2227-LT3-PON0-ONT4 2021-01-19 17:30:00 2042-01-19 20:00:12
 OLT2227-LT3-PON0-ONT99 2021-01-19 17:30:00 2042-01-19 20:00:12

As you can see, the last dataframe contains only the columns ELEMENT, START and END. What I would like to get instead is a dataframe that also concatenates the TEXT values during the aggregation:

                ELEMENT               START                 END                    TEXT
 OLT2227-LT3-PON0-ONT03 2021-01-19 18:00:00 2021-01-19 19:00:00     USECASE1 - ALARM1 -NO OVERLAP
 OLT2227-LT3-PON0-ONT03 2021-01-19 19:10:00 2021-01-19 20:00:12     USECASE1 - ALARM2 - NO OVERLAP
 OLT2227-LT3-PON0-ONT05 2021-01-19 18:00:00 2021-01-19 23:00:00     USECASE2 - ALARM1 - Fully Contained; USECASE2 - ALARM2 - Fully Contained
 OLT2227-LT3-PON0-ONT10 2021-01-19 22:00:00 2021-01-19 23:00:12     USECASE3 - ALARM1 - START-END-RELATION; USECASE3 - ALARM2 - START-END-RELATION
 OLT2227-LT3-PON0-ONT21 2021-01-19 22:00:00 2021-01-19 22:10:00     USECASE3-ALARM1
 OLT2227-LT3-PON0-ONT21 2021-01-19 22:15:00 2042-01-19 20:00:12     USECASE3-ALARM2-NO-END 
  OLT2227-LT3-PON0-ONT4 2021-01-19 17:30:00 2042-01-19 20:00:12     USECASE-4 ; USECASE-4
 OLT2227-LT3-PON0-ONT99 2021-01-19 17:30:00 2042-01-19 20:00:12     USECASE-5 ; USECASE-5
 

Can anyone please help?

Pablo C

You can aggregate the TEXT column with the method str.join, bound to the separator you want:

(df.groupby(['newid','ELEMENT'])
    .agg({'START': 'min', 'END':'max', 'TEXT': ' ; '.join})
    .reset_index(1))

Output (TEXT column only):

USECASE1 - ALARM1 -NO OVERLAP
USECASE1 - ALARM2 - NO OVERLAP
USECASE2 - ALARM1 - Fully Contained ; USECASE2 - ALARM2 - Fully Contained
USECASE3 - ALARM1 - START-END-RELATION ; USECASE3 - ALARM2 - START-END-RELATION
USECASE3-ALARM1
USECASE3-ALARM2-NO-END
USECASE-4 ; USECASE-4
USECASE-5 ; USECASE-5
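
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same approach. The four-row dataframe is a shortened, made-up stand-in for the data in the question (the ELEMENT and TEXT values are abbreviated for illustration), and the name result is just a placeholder.

```python
import pandas as pd

# Shortened stand-in for the question's data (values abbreviated, assumed for illustration).
df = pd.DataFrame({
    "ELEMENT": ["ONT03", "ONT03", "ONT05", "ONT05"],
    "TEXT": ["ALARM1", "ALARM2", "ALARM1", "ALARM2"],
    "START": pd.to_datetime([
        "2021-01-19 18:00:00", "2021-01-19 19:10:00",
        "2021-01-19 18:00:00", "2021-01-19 19:00:00",
    ]),
    "END": pd.to_datetime([
        "2021-01-19 19:00:00", "2021-01-19 20:00:12",
        "2021-01-19 23:00:00", "2021-01-19 20:00:12",
    ]),
})

# Same grouping key as in the question: start a new id whenever a row
# begins after the previous row's END.
df["newid"] = (df["START"] - df["END"].shift()).dt.total_seconds().gt(0).cumsum()

# Aggregate START/END as before and join the TEXT values of each group.
result = (
    df.groupby(["newid", "ELEMENT"])
      .agg({"START": "min", "END": "max", "TEXT": " ; ".join})
      .reset_index(level=1)
)

print(result.to_string(index=False))
```

Note that str.join only accepts strings, so if TEXT can contain missing values they would need to be dropped or filled before joining.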
