Pandas: split dataframe on values of ID column and write to csv, generate filenames from unique values in column

user2165857

I have a pandas dataframe I would like to iterate over. A simplified example of my dataframe:

chr    start    end    Gene    Value    MoreData
chr1    123    123    HAPPY    41.1    3.4
chr1    125    129    HAPPY    45.9    4.5
chr1    140    145    HAPPY    39.3    4.1
chr1    342    355    SAD    34.2    9.0
chr1    360    361    SAD    44.3    8.1
chr1    390    399    SAD    29.0    7.2
chr1    400    411    SAD    35.6    6.5
chr1    462    470    LEG    20.0    2.7

I would like to iterate over each unique gene and create a new file named after it:

for Gene in df: ## this is where I need the most help
    OutFileName = Gene+".pdf"

For the example above, I should get three iterations, producing three output files and three dataframes:

# HAPPY.pdf
chr1    123    123    HAPPY    41.1    3.4
chr1    125    129    HAPPY    45.9    4.5
chr1    140    145    HAPPY    39.3    4.1

# SAD.pdf
chr1    342    355    SAD    34.2    9.0
chr1    360    361    SAD    44.3    8.1
chr1    390    399    SAD    29.0    7.2
chr1    400    411    SAD    35.6    6.5

# LEG.pdf
chr1    462    470    LEG    20.0    2.7

Each resulting chunk of the dataframe will be passed to another function that performs the analysis and returns the contents to be written to the file.

EdChum

You can get the unique values by calling unique() on the column, iterate over them, build the filename, and write each subset out to CSV:

genes = df['Gene'].unique()
for gene in genes:
    outfilename = gene + '.pdf'
    print(outfilename)
    df[df['Gene'] == gene].to_csv(outfilename)

HAPPY.pdf
SAD.pdf
LEG.pdf
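
Note that to_csv writes the row index and a header line by default, so the files produced above will not look exactly like the chunks in the question. A minimal sketch of one way to get closer to that layout, assuming tab-separated output without index or header is what is wanted (the scratch directory and the `written` dict are only there for illustration):

```python
import io
import os
import tempfile

import pandas as pd

# Sample data from the question, read as whitespace-separated text.
raw = """chr start end Gene Value MoreData
chr1 123 123 HAPPY 41.1 3.4
chr1 125 129 HAPPY 45.9 4.5
chr1 140 145 HAPPY 39.3 4.1
chr1 342 355 SAD 34.2 9.0
chr1 360 361 SAD 44.3 8.1
chr1 390 399 SAD 29.0 7.2
chr1 400 411 SAD 35.6 6.5
chr1 462 470 LEG 20.0 2.7
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

out_dir = tempfile.mkdtemp()  # assumption: write into a scratch directory
written = {}                  # hypothetical bookkeeping: gene -> file path

for gene in df["Gene"].unique():
    path = os.path.join(out_dir, gene + ".pdf")
    # index=False and header=False drop the row labels and column names;
    # sep="\t" writes tab-separated rows instead of comma-separated ones.
    df[df["Gene"] == gene].to_csv(path, sep="\t", index=False, header=False)
    written[gene] = path
```

The files still contain plain delimited text; the .pdf extension is kept only because the question asks for it.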

A more pandas-thonic method is to group by 'Gene' and then iterate over the groups:

gp = df.groupby('Gene')
# gp.groups is a dict-like mapping of each 'Gene' value to the index labels of its rows
for g in gp.groups.items():
    # g is a (gene, index labels) pair; select that gene's rows by label
    print(df.loc[g[1]])
    
    chr  start  end   Gene  Value  MoreData
0  chr1    123  123  HAPPY   41.1       3.4
1  chr1    125  129  HAPPY   45.9       4.5
2  chr1    140  145  HAPPY   39.3       4.1
    chr  start  end Gene  Value  MoreData
3  chr1    342  355  SAD   34.2       9.0
4  chr1    360  361  SAD   44.3       8.1
5  chr1    390  399  SAD   29.0       7.2
6  chr1    400  411  SAD   35.6       6.5
    chr  start  end Gene  Value  MoreData
7  chr1    462  470  LEG     20       2.7
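
Iterating over the groupby object itself also works: it yields (key, sub-dataframe) pairs, so each chunk can be handed straight to the analysis step mentioned in the question. A minimal sketch, where `analyze` and its `Ratio` column are hypothetical stand-ins for whatever the real analysis function does, and sort=False is used to keep the genes in order of first appearance:

```python
import io

import pandas as pd

# Sample data from the question, read as whitespace-separated text.
raw = """chr start end Gene Value MoreData
chr1 123 123 HAPPY 41.1 3.4
chr1 125 129 HAPPY 45.9 4.5
chr1 140 145 HAPPY 39.3 4.1
chr1 342 355 SAD 34.2 9.0
chr1 360 361 SAD 44.3 8.1
chr1 390 399 SAD 29.0 7.2
chr1 400 411 SAD 35.6 6.5
chr1 462 470 LEG 20.0 2.7
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")


def analyze(chunk):
    """Hypothetical placeholder for the asker's analysis function."""
    # Returns a new dataframe with one extra derived column.
    return chunk.assign(Ratio=chunk["Value"] / chunk["MoreData"])


results = {}  # output filename -> analyzed chunk
# Each iteration yields the gene name and the rows belonging to it.
for gene, chunk in df.groupby("Gene", sort=False):
    results[gene + ".pdf"] = analyze(chunk)
```

Each value in `results` is an ordinary dataframe, so it can be written out with to_csv under its key as the filename.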
