Add values to one column of a pandas dataframe based on the values in another

juniper-

Say I have two matrices, an original and a reference:

import pandas as pa
print "Original Data Frame"
# Create a dataframe
oldcols = {'col1':['a','a','b','b'], 'col2':['c','d','c','d'], 'col3':[1,2,3,4]}
a = pa.DataFrame(oldcols)
print "Original Table:"
print a

print "Reference Table:"
b = pa.DataFrame({'col1':['x','x'], 'col2':['c','d'], 'col3':[10,20]})
print b

Where the tables look like this:

Original Data Frame
Original Table:
  col1 col2  col3
0    a    c     1
1    a    d     2
2    b    c     3
3    b    d     4

Reference Table:
  col1 col2  col3
0    x    c    10
1    x    d    20

Now I want to subtract from the third column (col3) of the original table (a), the value in the reference table (c) in the row where the second columns of the two tables match. So the first row of table two should have the value 10 added to the third column, because the row of table b where the column is col2 is 'c' has a value of 10 in col3. Make sense? Here's some code that does that:

col3 = []
for ix, row in a.iterrows():
    col3 += [row[2] + b[b['col2'] == row[1]]['col3']]

a['col3'] = col3
print "Output Table:"
print a

Yielding the following output:

Output Table:
  col1 col2  col3
0    a    c  [11]
1    a    d  [22]
2    b    c  [13]
3    b    d  [24]

My question is, is there a more elegant way to do this? Also, the results in 'col3' should not be lists. Solutions using numpy are also welcome.

behzad.nouri

I did not quite understand your description of what you are trying to do, but the output you have shown can be generated by first merging the two data frames and then some simple operations;

>>> df = a.merge(b.filter(['col2', 'col3']), how='left',
                 left_on='col2', right_on='col2', suffixes=('', '_'))
>>> df
  col1 col2  col3  col3_
0    a    c     1     10
1    b    c     3     10
2    a    d     2     20
3    b    d     4     20

[4 rows x 4 columns]
>>> df.col3_.fillna(0, inplace=True) # in case there are no matches
>>> df.col3 += df.col3_
>>> df
  col1 col2  col3  col3_
0    a    c    11     10
1    b    c    13     10
2    a    d    22     20
3    b    d    24     20

[4 rows x 4 columns]
>>> df.drop('col3_', axis=1, inplace=True)
>>> df
  col1 col2  col3
0    a    c    11
1    b    c    13
2    a    d    22
3    b    d    24

[4 rows x 3 columns]

If values in col2 in b are not unique, then probably you also need something like:

>>> b.groupby('col2', as_index=False)['col3'].aggregate(sum)

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

Add values to one column of a pandas dataframe based on the values in another

From Dev

Select column dynamically in Pandas dataframe based on values in a list or another column

From Dev

Add up one column based on the values of another column

From Dev

Add column from one dataframe to another, for values present in overlapping column

From Dev

Record values of one column based on another column: Python & Pandas

From Dev

How to add new column with values based on another column in pandas

From Dev

Rolling over values from one column to other based on another dataframe

From Dev

Average values of one column based on the values of another

From Dev

Replace column values based on another dataframe python pandas - better way?

From Dev

Pandas Set multiple column and row values to nan based on another dataframe

From Dev

Incrementing Values in Column Based on Values in Another (Pandas)

From Dev

update non-available values of one pandas column based on another

From Dev

Changing values of one column based on the other three columns in pandas dataframe

From Dev

Finding a timedelta in pandas dataframe based upon specific values in one column

From Dev

Changing values of one column based on the other three columns in pandas dataframe

From Dev

Pandas, add column reading values from another dataframe

From Dev

subtracting values in one column based on another column

From Dev

Create a column based on another dataframe values

From Dev

replace column values in one dataframe by values of another dataframe

From Dev

Replace values from a column of one dataframe by values from another dataframe

From Dev

Excel concatenation of one column based on values in another

From Dev

What is the fastest way to populate one pandas dataframe based on values from another pandas dataframe?

From Dev

Creating a dataframe from another one by column values

From Dev

Creating a dataframe from another one by column values

From Dev

Pandas, subtract values based on value of another column

From Java

pandas: modifying values in dataframe from another column

From Java

Pandas - map values from one dataframe to another

From Dev

Replace values in pandas dataframe based on column names

From Dev

Find Pandas dataframe column based on values, in Python

Related Related

  1. 1

    Add values to one column of a pandas dataframe based on the values in another

  2. 2

    Select column dynamically in Pandas dataframe based on values in a list or another column

  3. 3

    Add up one column based on the values of another column

  4. 4

    Add column from one dataframe to another, for values present in overlapping column

  5. 5

    Record values of one column based on another column: Python & Pandas

  6. 6

    How to add new column with values based on another column in pandas

  7. 7

    Rolling over values from one column to other based on another dataframe

  8. 8

    Average values of one column based on the values of another

  9. 9

    Replace column values based on another dataframe python pandas - better way?

  10. 10

    Pandas Set multiple column and row values to nan based on another dataframe

  11. 11

    Incrementing Values in Column Based on Values in Another (Pandas)

  12. 12

    update non-available values of one pandas column based on another

  13. 13

    Changing values of one column based on the other three columns in pandas dataframe

  14. 14

    Finding a timedelta in pandas dataframe based upon specific values in one column

  15. 15

    Changing values of one column based on the other three columns in pandas dataframe

  16. 16

    Pandas, add column reading values from another dataframe

  17. 17

    subtracting values in one column based on another column

  18. 18

    Create a column based on another dataframe values

  19. 19

    replace column values in one dataframe by values of another dataframe

  20. 20

    Replace values from a column of one dataframe by values from another dataframe

  21. 21

    Excel concatenation of one column based on values in another

  22. 22

    What is the fastest way to populate one pandas dataframe based on values from another pandas dataframe?

  23. 23

    Creating a dataframe from another one by column values

  24. 24

    Creating a dataframe from another one by column values

  25. 25

    Pandas, subtract values based on value of another column

  26. 26

    pandas: modifying values in dataframe from another column

  27. 27

    Pandas - map values from one dataframe to another

  28. 28

    Replace values in pandas dataframe based on column names

  29. 29

    Find Pandas dataframe column based on values, in Python

HotTag

Archive