pandas - keep only True values after groupby a DataFrame

Fabio Lamanna

I've been working on a DataFrame with User_IDs, DateTime objects and other information, like the following extract:

User_ID;Latitude;Longitude;Datetime
222583401;41.4020375;2.1478710;2014-07-06 20:49:20
287280509;41.3671346;2.0793115;2013-01-30 09:25:47
329757763;41.5453577;2.1175164;2012-09-25 08:40:59
189757330;41.5844998;2.5621569;2013-10-01 11:55:20
624921653;41.5931846;2.3030671;2013-07-09 20:12:20
414673119;41.5550136;2.0965829;2014-02-24 20:15:30
414673119;41.5550136;2.0975829;2014-02-24 20:16:30
414673119;41.5550136;2.0985829;2014-02-24 20:17:30

I've grouped Users with:

g = df.groupby(['User_ID','Datetime'])

and then checked which users have more than one Datetime entry:

df = df.groupby('User_ID')['Datetime'].apply(lambda g: len(g)>1)

This gives the following boolean Series:

User_ID
189757330    False
222583401    False
287280509    False
329757763    False
414673119     True
624921653    False
Name: Datetime, dtype: bool

This mask identifies the User_IDs I want to keep. Now I would like to keep only the rows whose User_ID maps to True and write them to a new file, for instance with DataFrame.to_csv. The expected DataFrame would contain only the users with more than one Datetime entry:

User_ID;Latitude;Longitude;Datetime
414673119;41.5550136;2.0965829;2014-02-24 20:15:30
414673119;41.5550136;2.0975829;2014-02-24 20:16:30
414673119;41.5550136;2.0985829;2014-02-24 20:17:30

How can I access the boolean values for each User_ID? Thanks for your kind help.

EdChum

Assign the result of df.groupby('User_ID')['Datetime'].apply(lambda g: len(g)>1) to a variable so you can use it for boolean indexing, then pass the index of the True entries to isin to filter your original df:

In [366]:

users = df.groupby('User_ID')['Datetime'].apply(lambda g: len(g)>1)
users

Out[366]:
User_ID
189757330    False
222583401    False
287280509    False
329757763    False
414673119     True
624921653    False
Name: Datetime, dtype: bool

In [367]:   
users[users]

Out[367]:
User_ID
414673119    True
Name: Datetime, dtype: bool

In [368]:
users[users].index

Out[368]:
Int64Index([414673119], dtype='int64')

In [361]:
df[df['User_ID'].isin(users[users].index)]

Out[361]:
     User_ID   Latitude  Longitude            Datetime
5  414673119  41.555014   2.096583 2014-02-24 20:15:30
6  414673119  41.555014   2.097583 2014-02-24 20:16:30
7  414673119  41.555014   2.098583 2014-02-24 20:17:30
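
As a cross-check, here is a self-contained sketch that rebuilds the sample rows from the question and compares the isin approach with a one-step variant. The variant uses transform('size'), which is not part of the original answer; the names raw, via_isin and via_transform are only illustrative.

```python
import io

import pandas as pd

# Sample rows copied from the question (semicolon-separated).
raw = """User_ID;Latitude;Longitude;Datetime
222583401;41.4020375;2.1478710;2014-07-06 20:49:20
287280509;41.3671346;2.0793115;2013-01-30 09:25:47
329757763;41.5453577;2.1175164;2012-09-25 08:40:59
189757330;41.5844998;2.5621569;2013-10-01 11:55:20
624921653;41.5931846;2.3030671;2013-07-09 20:12:20
414673119;41.5550136;2.0965829;2014-02-24 20:15:30
414673119;41.5550136;2.0975829;2014-02-24 20:16:30
414673119;41.5550136;2.0985829;2014-02-24 20:17:30
"""
df = pd.read_csv(io.StringIO(raw), sep=";", parse_dates=["Datetime"])

# Two-step version, as in the answer: one boolean per user,
# then isin on the index of the True entries.
users = df.groupby("User_ID")["Datetime"].size() > 1
via_isin = df[df["User_ID"].isin(users[users].index)]

# One-step variant (an assumption, not from the answer): broadcast each
# group's size back onto its rows and compare row by row.
via_transform = df[df.groupby("User_ID")["Datetime"].transform("size") > 1]

print(via_transform)
```

Both selections should pick the same three rows here; the transform form simply skips the intermediate per-user Series.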

You can then call to_csv on the filtered DataFrame as usual.
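
A minimal sketch of that last step, assuming you want the same semicolon-separated layout as the input and no index column. The small frame and the file name in the comment are made up for illustration.

```python
import pandas as pd

# Small stand-in for the original data (two rows taken from the question).
df = pd.DataFrame(
    {
        "User_ID": [222583401, 414673119, 414673119],
        "Latitude": [41.4020375, 41.5550136, 41.5550136],
        "Longitude": [2.1478710, 2.0965829, 2.0975829],
        "Datetime": pd.to_datetime(
            ["2014-07-06 20:49:20", "2014-02-24 20:15:30", "2014-02-24 20:16:30"]
        ),
    }
)

# Same filter as in the answer.
users = df.groupby("User_ID")["Datetime"].apply(lambda g: len(g) > 1)
filtered = df[df["User_ID"].isin(users[users].index)]

# With no path given, to_csv returns the CSV text as a string.
# To write a file instead, pass a path, e.g. (hypothetical name):
#   filtered.to_csv("multi_visit_users.csv", sep=";", index=False)
csv_text = filtered.to_csv(sep=";", index=False)
print(csv_text)
```

Passing index=False drops the row labels (1 and 2 in this sketch), which the expected output in the question does not include.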
