How to select rows whose values start with specific letters, by group, in a Python DataFrame?

JEG

I have the following dataframe "data", composed of IDs and their associated cluster numbers:

ID      cluster 
FP_101   1  
FP_102   1     
SP_209   3
SP_300   3
SP_209   1
FP_45    90
SP_50    90
FP_398   100
...

I would like to print the clusters that contain more than one ID starting with SP and/or FP. I think I have the two parts of the answer, but I do not know how to combine them properly:

  • data = data[data['ID'].str.startswith('FP')] (same for SP)
  • selection function: data = data.groupby(['cluster']).filter(lambda x: x['ID'].nunique() > 1)

For the example above, the result should be:

    ID      cluster 
    FP_101   1  
    FP_102   1
    SP_209   1     
    SP_209   3
    SP_300   3

How can I combine these functions to obtain this result?

Megha John

This is my understanding of your question; let me know if it helps:

  1. Separating SP & FP

df['Prefix'] = df['ID'].apply(lambda x: x.split('_')[0])

  2. Grouping by clusters

df2 = df.groupby(['cluster', 'Prefix'], as_index = False).agg({'ID':['nunique','unique']})

  3. Filtering

df2.columns = df2.columns.to_flat_index().str.join('')

df2[df2['IDnunique']>1]
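
The filtered frame above lists (cluster, Prefix) pairs rather than the original rows. As a rough sketch of one way to get back to the rows shown in the question, the same idea can be expressed with a group-wise transform. This assumes a cluster qualifies when at least one of its prefixes (FP or SP) has more than one distinct ID, which is an interpretation inferred from cluster 90 being absent from the expected result. The sample frame is rebuilt from the rows visible in the question, and the names `wanted`, `ids_per_prefix`, `qualifying` and `result` are illustrative only.

```python
import pandas as pd

# Sample rows copied from the question (the elided "..." rows are left out).
data = pd.DataFrame({
    "ID": ["FP_101", "FP_102", "SP_209", "SP_300",
           "SP_209", "FP_45", "SP_50", "FP_398"],
    "cluster": [1, 1, 3, 3, 1, 90, 90, 100],
})

# Step 1: the prefix is the text before the first underscore.
data["Prefix"] = data["ID"].str.split("_").str[0]

# Keep only IDs that start with one of the prefixes of interest.
wanted = data[data["Prefix"].isin(["FP", "SP"])]

# Step 2: number of distinct IDs per (cluster, Prefix), aligned to each row.
ids_per_prefix = wanted.groupby(["cluster", "Prefix"])["ID"].transform("nunique")

# Step 3 (assumed rule): a cluster qualifies if any of its prefixes
# has more than one distinct ID; all rows of that cluster are then kept.
qualifying = wanted.loc[ids_per_prefix > 1, "cluster"].unique()

result = (
    wanted[wanted["cluster"].isin(qualifying)]
    .drop(columns="Prefix")
    .sort_values(["cluster", "ID"])
    .reset_index(drop=True)
)
print(result)
```

On the sample rows this keeps clusters 1 and 3, which matches the five rows listed in the question. If the intended rule is different (for example, counting FP and SP IDs together), only the grouping keys in step 2 would need to change.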
