Pandas DataFrame compare columns to a threshold column using where()

Lamakaha

I need to null out values in several columns where their absolute value is less than the corresponding value in the threshold column.

        import pandas as pd
        import numpy as np

        df = pd.DataFrame({'key1': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
                           'key2': [2000, 2001, 2002, 2001, 2002],
                           'data1': np.random.randn(5),
                           'data2': np.random.randn(5),
                           'threshold': [0.5, 0.4, 0.6, 0.1, 0.2]}).set_index(['key1', 'key2'])

                data1     data2  threshold
key1   key2
Ohio   2000  0.201240  0.083833        0.5
       2001 -1.993489 -1.081208        0.4
       2002  0.759038 -1.688769        0.6
Nevada 2001 -0.543916  1.412679        0.1
       2002 -1.545781  0.181224        0.2

This gives me the error "cannot join with no level specified and no overlapping names":

        df.where(df.abs()>df['threshold'])

This works, but obviously only against a scalar:

        df.where(df.abs()>0.5)

                        data1     data2  threshold
        key1   key2
        Ohio   2000       NaN       NaN        NaN
               2001 -1.993489 -1.081208        NaN
               2002  0.759038 -1.688769        NaN
        Nevada 2001 -0.543916  1.412679        NaN
               2002 -1.545781       NaN        NaN

BTW, this does appear to give me an OK result, but I still want to find out how to do it with the where() method:

        df.apply(lambda x:x.where(x.abs()>x['threshold']),axis=1)

Marius

Here's a slightly different option using the DataFrame.gt (greater than) method.

df[df.abs().gt(df['threshold'], axis='rows')]
# Output might not look the same because of different random numbers,
# use np.random.seed() for reproducible random number gen
Out[13]: 
                data1     data2  threshold
key1   key2                               
Ohio   2000       NaN       NaN        NaN
       2001  1.954543  1.372174        NaN
       2002       NaN       NaN        NaN
Nevada 2001  0.275814  0.854617        NaN
       2002       NaN  0.204993        NaN
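
Since the question asked specifically about where(), here is a minimal sketch of how the same row-aligned mask could be passed to it. This is not part of the original answer. It assumes the fixed numbers printed in the question instead of random data, and the names value_cols, mask, result and masked_all are made up for illustration. axis=0 is used here to align the threshold Series on the row index.

```python
import pandas as pd

# Fixed values copied from the question's printed frame, so the result is reproducible.
df = pd.DataFrame({
    'key1': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
    'key2': [2000, 2001, 2002, 2001, 2002],
    'data1': [0.201240, -1.993489, 0.759038, -0.543916, -1.545781],
    'data2': [0.083833, -1.081208, -1.688769, 1.412679, 0.181224],
    'threshold': [0.5, 0.4, 0.6, 0.1, 0.2],
}).set_index(['key1', 'key2'])

# One-liner: gt(..., axis=0) compares every column to the threshold row by row,
# and where() keeps values where the mask is True and puts NaN elsewhere.
# The threshold column is also nulled, because abs(t) > t is never True here.
masked_all = df.where(df.abs().gt(df['threshold'], axis=0))

# Variant that leaves the threshold column untouched (hypothetical column list).
value_cols = ['data1', 'data2']
mask = df[value_cols].abs().gt(df['threshold'], axis=0)
result = df.copy()
result[value_cols] = df[value_cols].where(mask)

print(result)
```

The second variant is probably closer to what you want if the threshold column needs to survive the masking.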
