How do I apply a value from a dataframe based on the value of a multi-index of another dataframe?

Thornhale

I have the following:

Dataframe 1 (Multi-index dataframe):

                             |      Assay_A       |
---------------------------------------------------
Index_A | Index_B | Index_C  | mean | std | count |
---------------------------------------------------
   128     12345      AAA      123    2     4

Dataframe 2:

    Index | Col_A | Col_B | Col_C | mean
    -------------------------------------
      1      128    12345   AAA     456

where Col_X = Index_X for a,b,c.

I have been spending all morning trying to do the following:

How do I pick the correct mean in dataframe 2 (which has to match up on Col ABC) so I can do mathematical operations on it. For example, I want to take the mean of dataframe 1 and divide it by the correctly chosen mean of dataframe 2.

Ideally, I want to store the results of the operation in a new column. So the final output should look like this:

                             |            Assay_A          |
------------------------------------------------------------
Index_A | Index_B | Index_C  | mean | std | count | result |
------------------------------------------------------------
   128     12345      AAA      123    2     4       0.26

Perhaps there is an easier way to do this I would be open to any such suggestions as well.

ehhh

what I suggest you do is 1) rename the columns of Dataframe 2 to the respective names of the index columns of Dataframe 1, 2) reset the index on Dataframe 1, and 3) merge the two tables based on the now matching column names. Afterwards you can compute whatever you like. The MultiIndex on the columns of Dataframe 2 adds a bit of additional overhead.

Explicitly:

import pandas as pd

# re-create table1
row_index = pd.MultiIndex.from_tuples([(128, 12345, 'AAA')])
row_index.names=['Index_A', 'Index_B', 'Index_C']
table1 = pd.DataFrame(data={'mean': 123, 'std': 2, 'count': 4}, index=row_index)
table1.columns = pd.MultiIndex.from_tuples(zip(['Assay A'] * 3, table1.columns))
print "*** table 1:"
print table1
print ""

# re-create table2
table2 = pd.DataFrame([{'Col_A': 128, 'Col_B': 12345, 'Col_C': 'AAA', 'mean': 456}], index=[1])
table2.index.name = 'Index'
print "*** table 2:"
print table2
print ""

# re-name columns of table2 to match names of respective index columns in table1
table2 = table2.rename(columns={'Col_A': 'Index_A', 'Col_B': 'Index_B', 'Col_C': 'Index_C'})

# Drop 'Assay A' index level on columns of table1;
# without doing that, the following reset_index() will produce a column multi-index
# for Index_A/B/C, so column names will not match the simple column index of table2_renamed.
# If you need to keep the 'Assay A' level here, you will need to also construct a column
# multi-index for table2_renamed (with empty values for the second level).
table1.columns = table1.columns.levels[1]

# Move index columns of table1 back to regular columns
table1 = table1.reset_index()

# Merge the two tables on the now common column names. 'mean' appears in both tables,
# give the column from table2 a suffix '_2'.
joint = pd.merge(table1.reset_index(), table2, on=['Index_A', 'Index_B', 'Index_C'], suffixes={'', '_2'})
print "*** joint, before re-setting index:"
print joint
print ""

# Restore index of the joint table
joint = joint.set_index(['Index_A', 'Index_B', 'Index_C'])

# Compute the 'result'
joint['result'] = joint['mean'] / joint['mean_2']

# drop unused columns
joint = joint.drop(['index', 'mean_2'], axis=1)

# restore column index level
joint.columns = pd.MultiIndex.from_tuples(zip(['Assay A'] * 4, joint.columns))

print "*** final result:"
print joint
print ""

The script output is:

*** table 1:
                        Assay A         
                          count mean std
Index_A Index_B Index_C                 
128     12345   AAA           4  123   2

*** table 2:
       Col_A  Col_B Col_C  mean
Index                          
1        128  12345   AAA   456

*** joint, before re-setting index:
   index  Index_A  Index_B Index_C  count  mean  std  mean_2
0      0      128    12345     AAA      4   123    2     456

*** final result:
                        Assay A                   
                          count mean std    result
Index_A Index_B Index_C                           
128     12345   AAA           4  123   2  0.269737

Hope that helps!

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

How do I assign value to a row in a dataframe based on another in r?

From Dev

How do i lookup a row on one dataframe based on the column cell value and append that to a row on another dataframe?

From Dev

Setting value on one dataframe based on the id and value from another dataframe

From Dev

How to get value from python dataframe based on column index?

From Dev

How can I get a value from other dataframe's column based on other index?

From Dev

How do i change a single index value in pandas dataframe?

From Dev

How do I filter a pandas DataFrame based on value counts?

From Dev

How can I retrieve a row by index value from a Pandas DataFrame?

From Dev

How do I filter DataFrame rows based on a key value pair from a list of dictionaries column field?

From Dev

How can I add a dataframe converted to an array as an element of another dataframe based on a key value

From Dev

With PySpark dataframe locate value from one array based on index and copy to another array

From Dev

Pandas Dataframe apply() method provides a row object, but how do you access the index value

From Dev

How do I substract a value from a column in a dataframe, with unique rows

From Dev

how to drop rows where the index succeed another index value in a dataframe?

From Dev

How do I multiply all elements of a column of a dataframe with value specified for that column in another dataframe?

From Dev

How to add new columns to dataframe with value taken from another dataframe?

From Dev

Copy value from matching index in another dataframe after criteria matched

From Java

If one dataframe value exists in another dataframe, then get a value from the dataframe

From Dev

Change the value from another dataframe

From Dev

How do you update a column's value in a dataframe based off of another row?

From Java

Python Filling dataframe values based on Column Index present in another dataframe value

From Dev

How do I get a table value based on the value of another string?

From Dev

Replace Pandas Dataframe Value Based on Index Range

From Dev

Python Pandas: Find index based on value in DataFrame

From Dev

How do I merge two columns in a dataframe based on a datetime index?

From Dev

Conditional Sums based on another DataFrame column value

From Dev

How do I filter rows based on whether a column value is in a Set of Strings in a Spark DataFrame

From Dev

How can I create an "ID" column in one dataframe based on the index of another dataframe?

From Dev

updating column in one dataframe with value from another dataframe based on matching values

Related Related

  1. 1

    How do I assign value to a row in a dataframe based on another in r?

  2. 2

    How do i lookup a row on one dataframe based on the column cell value and append that to a row on another dataframe?

  3. 3

    Setting value on one dataframe based on the id and value from another dataframe

  4. 4

    How to get value from python dataframe based on column index?

  5. 5

    How can I get a value from other dataframe's column based on other index?

  6. 6

    How do i change a single index value in pandas dataframe?

  7. 7

    How do I filter a pandas DataFrame based on value counts?

  8. 8

    How can I retrieve a row by index value from a Pandas DataFrame?

  9. 9

    How do I filter DataFrame rows based on a key value pair from a list of dictionaries column field?

  10. 10

    How can I add a dataframe converted to an array as an element of another dataframe based on a key value

  11. 11

    With PySpark dataframe locate value from one array based on index and copy to another array

  12. 12

    Pandas Dataframe apply() method provides a row object, but how do you access the index value

  13. 13

    How do I substract a value from a column in a dataframe, with unique rows

  14. 14

    how to drop rows where the index succeed another index value in a dataframe?

  15. 15

    How do I multiply all elements of a column of a dataframe with value specified for that column in another dataframe?

  16. 16

    How to add new columns to dataframe with value taken from another dataframe?

  17. 17

    Copy value from matching index in another dataframe after criteria matched

  18. 18

    If one dataframe value exists in another dataframe, then get a value from the dataframe

  19. 19

    Change the value from another dataframe

  20. 20

    How do you update a column's value in a dataframe based off of another row?

  21. 21

    Python Filling dataframe values based on Column Index present in another dataframe value

  22. 22

    How do I get a table value based on the value of another string?

  23. 23

    Replace Pandas Dataframe Value Based on Index Range

  24. 24

    Python Pandas: Find index based on value in DataFrame

  25. 25

    How do I merge two columns in a dataframe based on a datetime index?

  26. 26

    Conditional Sums based on another DataFrame column value

  27. 27

    How do I filter rows based on whether a column value is in a Set of Strings in a Spark DataFrame

  28. 28

    How can I create an "ID" column in one dataframe based on the index of another dataframe?

  29. 29

    updating column in one dataframe with value from another dataframe based on matching values

HotTag

Archive