Efficient calculation on a pandas dataframe

learn2day

I need to make my code faster. The problem is very simple, but I can't find a good way to do the calculation without looping through the whole DataFrame.

I've got three DataFrames: A, B and C.

A and B have 3 columns each, and the following format:

A (10 rows):

     Canal Gerencia grad
0    'ABC'   'DEF'   23
etc...

B (25 rows):

     Marca  Formato  grad
0    'GHI'   'JKL'    43
etc...

DataFrame C, on the other hand, has 5 columns:

C (5000 rows):

     Marca  Formato  Canal  Gerencia  grad
0    'GHI'   'JKL'    'ABC'   'DEF'   -102
etc...

I need a vector with the same length as DataFrame 'C' that adds up the matching values of 'grad' from the three tables, for example:

m = 'GHI'
f = 'JKL'
c = 'ABC'
g = 'DEF'
res = C['grad'][C['Marca']==m][C['Formato']==f][C['Canal']==c][C['Gerencia']==g] + A['grad'][A['Canal']==c][A['Gerencia']==g] + B['grad'][B['Formato']==f][B['Marca']==m]
>>-36

I tried looping through the C DataFrame, but it is too slow. I understand I should avoid looping through the DataFrame, but I don't know how to do this. My current code is the following (it works, but it is VERY slow):

res=[]
for row_index, row in C.iterrows():
    vec1 = A['Gerencia']==row['Gerencia']
    vec2 = A['Canal']==row['Canal']
    vec3 = B['Marca']==row['Marca']
    vec4 = B['Formato']==row['Formato']
    grad = row['grad']
    res.append(grad + sum(A['grad'][vec1][vec2])+ sum(B['grad'][vec3][vec4]))

I would really appreciate any help on making this routine quicker. Thank you!

Ami Tavory

IIUC, you need to merge C with A:

C = pd.merge(C, A, on=['Canal', 'Gerencia'])

(this will add a column to it) and then merge the result with B:

C = pd.merge(C, B, on=['Marca', 'Formato'])

(again adding a column to C)

At this point, check C for the names of the three grad columns (by default pandas appends _x and _y to column names that collide in a merge); say they are grad_foo, grad_bar, grad_baz. Then just add them:

C.grad_foo + C.grad_bar + C.grad_baz
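
To make the steps above concrete, here is a minimal sketch on tiny made-up frames (the values other than the first row are invented for illustration). It departs from the plain merges in two ways, both of which are my assumptions about what the original loop is meant to do: it uses how='left' with fillna(0) so that rows of C with no match in A or B are kept and contribute 0, and it pre-aggregates A and B with groupby so that duplicate keys are summed rather than duplicating rows of C. The names grad_A, grad_B, A_sum and B_sum are hypothetical; renaming before the merge just avoids guessing the suffixed column names.

```python
import pandas as pd

# Small illustrative frames with the same column layout as in the question.
A = pd.DataFrame({"Canal": ["ABC", "XYZ"],
                  "Gerencia": ["DEF", "DEF"],
                  "grad": [23, 5]})
B = pd.DataFrame({"Marca": ["GHI", "MNO"],
                  "Formato": ["JKL", "JKL"],
                  "grad": [43, 7]})
C = pd.DataFrame({"Marca": ["GHI", "MNO", "GHI"],
                  "Formato": ["JKL", "JKL", "JKL"],
                  "Canal": ["ABC", "XYZ", "QQQ"],   # "QQQ" has no match in A
                  "Gerencia": ["DEF", "DEF", "DEF"],
                  "grad": [-102, 10, 1]})

# Sum 'grad' per key (mirrors the sum(...) in the loop) and give each
# lookup column an explicit name so no merge suffixes are needed.
A_sum = (A.groupby(["Canal", "Gerencia"], as_index=False)["grad"].sum()
          .rename(columns={"grad": "grad_A"}))
B_sum = (B.groupby(["Marca", "Formato"], as_index=False)["grad"].sum()
          .rename(columns={"grad": "grad_B"}))

# Left merges keep every row of C, in C's original order.
merged = (C.merge(A_sum, on=["Canal", "Gerencia"], how="left")
           .merge(B_sum, on=["Marca", "Formato"], how="left"))

# Unmatched rows have NaN in grad_A / grad_B; treat them as 0.
res = merged["grad"] + merged["grad_A"].fillna(0) + merged["grad_B"].fillna(0)
print(res.tolist())
```

If every key in C is guaranteed to appear exactly once in A and in B, the groupby and fillna steps are unnecessary and the two plain merges are enough.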

