How to use pandas groupby with aggregation for counting total time taken by a particular event for a particular ID?

Darknorth

Sample input:

    ID       Timestamp       Event
1   1   2010-05-31 13:43:45    a
2   1   2010-05-31 13:44:25    a
3   1   2010-05-31 13:44:55    a
4   1   2010-05-31 13:45:35    b
5   1   2010-05-31 13:47:05    b
6   1   2010-05-31 13:47:45    a
7   1   2010-05-31 13:49:45    c
8   1   2010-05-31 13:50:45    b
9   1   2010-06-01 09:20:00    c
10  1   2010-06-01 09:22:00    c
11  1   2010-06-01 09:23:20    c
12  1   2010-06-01 09:24:00    a
13  1   2010-06-01 09:24:15    a
14  1   2010-06-01 09:25:00    b
15  1   2010-06-01 09:25:42    b
16  2   2010-05-31 11:25:38    a
17  2   2010-05-31 11:26:00    c
18  2   2010-05-31 11:26:45    a
19  2   2010-05-31 11:27:10    a
20  2   2010-05-31 11:27:37    a
21  2   2010-05-31 11:29:35    b
22  2   2010-05-31 11:30:25    b
 .  .             .            .
 .  .             .            .
 .  .             .            .

This is a pseudocode of what I'm currently doing, but its not working properly

# Right now I'm looping through the dataframe, using this pseudocode:
for row in df.itertuples():
    check for ID match: (say, 1 == 1)
        check for event match: (say a == a)
           calculate time duration, 
           check for < 3600
                store in a dictionary of the events ('a', 'b',etc = keys), cumulative
                store this  events dictionary for the ID
           else Increment session accordingly
        else
           store the previous occurrence time of the previous event,
           use it for the next occurrence of the event, using a dictionary;

           At the same time, find the current event's duration w.r.t.
           the time stored at the previous changing of the event(in the above line), 
         again check for < 3600 
                   store in a dictionary of the events ('a', 'b',etc = keys), cumulative
                   store this  events dictionary for the ID
       else Increment session accordingly
    else
        reset events dictionary to zero values
        reset the event change dictionary to zero values

This is a sample output, but similar to what I want to do.

    ID   a_time   b_time   c_time   Session
1    1     120      200      100       2
2    2     235      340      145       3
.    .      .        .        .        .
.    .      .        .        .        .

Actually there are 7 unique events in all, I've just mentioned 3
to keep it less complicated.

{Basically “a_time” = sum of time taken by all occurences the event ‘a’ for the current ID (if the time duration for that occurrence isn’t greater than 3600s; else increase the session count accordingly  session+time-duration/3600, OR by plus 1 only if the date changes as well)

‘’ = similar entry}

Hope: Is there a more pythonic, or a vectorized way to do this? Like using

df.groupby(['ID'])['Timestamp'].diff()

I used a lot of versions of this, but it didn't work for me, as I get "no aggregation function available for pd.groupby"

If I didn't need to use the time logs then I'd just use the command

df.groupby(['ID'])['Event'].value_counts().unstack().fillna(0)

to get an output which would be the count of each of the unique events for each ID.

My obtained output, on using the pseudocode approach is just the repetition of the 1st row (just some random numbers shown here, not calculated) shown below, and I'm not able to see all the events as well, just three:

        ID       a_time       b_time       c_time
1       1         120          200           100

Using groupby with ['Timestamp'].diff() as mentioned in the above part gives me error "no aggregation function available for pd.groupby". I even used sort_values, apply but I always got this same error (just mentioned in the above line).

Darknorth

I solved it myself. I figured out that I could just use this:

df.at[index, col] = df.at[index, col] + duration

Basically, as dataframe can visualized as a matrix, hence I still can access a dataframe by it's cell, so I managed to solve it.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

How to decrease running time of this particular solution?

From Dev

The particular digit in the number of total

From Dev

Pandas: how to get a particular group after groupby?

From Dev

How to make a function run for a particular time period?

From Dev

How to get avg time taken by grouping a particular column using sql?

From Dev

How to schedule a request at a particular time?

From Dev

How to sequently apply a function on a particular event in jQuery?

From Dev

How to use drawRect in a particular area?

From Dev

How to trigger a particular Transaction at same time

From Dev

In pandas how to filter based on a particular weekday and range of time

From Dev

how to trim a video in swift for a particular time

From Dev

How does form_for decide what action is to be taken for a particular request?

From Dev

Pandas: Delete rows of a DataFrame if total count of a particular column occurs only 1 time

From Dev

How to check whether a time is between particular range?

From Dev

How to process the records in Pandas to find the total time for 'true' and then 'false' and then again for 'true' for sorted data on particular day

From Dev

How to schedule a download only to start at a particular time?

From Dev

How to remove a lot of packages installed at a particular time?

From Dev

How to convert Particular number to time of the day Javascript

From Dev

How to Force Automatic Updates to Install at a particular time

From Dev

How to save the time of execution of a particular transaction ?

From Dev

Mysql query formation error in calculating total time taken by particular projects

From Dev

How to use drawRect in a particular area?

From Dev

How to check for the same ID, and then increment for that particular id?

From Dev

Pandas: Delete rows of a DataFrame if total count of a particular column occurs only 1 time

From Dev

Pandas: sort within groupby on a particular column

From Dev

Tkinter how to catch this particular key event

From Dev

How to display notification at a particular time

From Dev

Pandas: How to sum particular rows

From Dev

How Can I use filter for remove particular event?

Related Related

  1. 1

    How to decrease running time of this particular solution?

  2. 2

    The particular digit in the number of total

  3. 3

    Pandas: how to get a particular group after groupby?

  4. 4

    How to make a function run for a particular time period?

  5. 5

    How to get avg time taken by grouping a particular column using sql?

  6. 6

    How to schedule a request at a particular time?

  7. 7

    How to sequently apply a function on a particular event in jQuery?

  8. 8

    How to use drawRect in a particular area?

  9. 9

    How to trigger a particular Transaction at same time

  10. 10

    In pandas how to filter based on a particular weekday and range of time

  11. 11

    how to trim a video in swift for a particular time

  12. 12

    How does form_for decide what action is to be taken for a particular request?

  13. 13

    Pandas: Delete rows of a DataFrame if total count of a particular column occurs only 1 time

  14. 14

    How to check whether a time is between particular range?

  15. 15

    How to process the records in Pandas to find the total time for 'true' and then 'false' and then again for 'true' for sorted data on particular day

  16. 16

    How to schedule a download only to start at a particular time?

  17. 17

    How to remove a lot of packages installed at a particular time?

  18. 18

    How to convert Particular number to time of the day Javascript

  19. 19

    How to Force Automatic Updates to Install at a particular time

  20. 20

    How to save the time of execution of a particular transaction ?

  21. 21

    Mysql query formation error in calculating total time taken by particular projects

  22. 22

    How to use drawRect in a particular area?

  23. 23

    How to check for the same ID, and then increment for that particular id?

  24. 24

    Pandas: Delete rows of a DataFrame if total count of a particular column occurs only 1 time

  25. 25

    Pandas: sort within groupby on a particular column

  26. 26

    Tkinter how to catch this particular key event

  27. 27

    How to display notification at a particular time

  28. 28

    Pandas: How to sum particular rows

  29. 29

    How Can I use filter for remove particular event?

HotTag

Archive