Faster way to transform group with mean value in Pandas

YXD

I have a pandas DataFrame in which I am trying to replace the values in each group with the mean of that group. On my machine, the line df["signal"].groupby(g).transform(np.mean) takes about 10 seconds to run with N and N_TRANSITIONS set to the numbers below.

Is there any faster way to achieve the same result?

import pandas as pd
import numpy as np
from time import time

np.random.seed(0)

N = 120000
N_TRANSITIONS = 1400

# generate groups
transition_points = np.random.permutation(np.arange(N))[:N_TRANSITIONS]
transition_points.sort()
transitions = np.zeros((N,), dtype=bool)  # np.bool was removed from NumPy; the built-in bool is equivalent
transitions[transition_points] = True
g = transitions.cumsum()

df = pd.DataFrame({ "signal" : np.random.rand(N)})

# here is my bottleneck for large N
tic = time()
result = df["signal"].groupby(g).transform(np.mean)
toc = time()
print(toc - tic)
YXD

Inspired by Jeff's answer, this is the fastest method on my machine. Here grp is the groupby object from the question, i.e. grp = df["signal"].groupby(g):

pd.Series(np.repeat(grp.mean().values, grp.count().values))
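To make the one-liner easier to check, here is a minimal sketch that rebuilds the data from the question and compares the repeat-based result against the plain transform. It assumes grp is df["signal"].groupby(g), as above. It also swaps count() for size(), which is my substitution rather than part of the original answer: count() skips NaN values, so the repeat lengths could come up short if the signal contained missing data.

```python
import numpy as np
import pandas as pd

np.random.seed(0)

N = 120000
N_TRANSITIONS = 1400

# Rebuild the group labels from the question. g is non-decreasing,
# so every group occupies one contiguous run of rows.
transition_points = np.sort(np.random.permutation(N)[:N_TRANSITIONS])
transitions = np.zeros(N, dtype=bool)
transitions[transition_points] = True
g = transitions.cumsum()

df = pd.DataFrame({"signal": np.random.rand(N)})

# Assumption: this is the groupby object the answer calls grp.
grp = df["signal"].groupby(g)

# Reference result. The string alias "mean" selects pandas' built-in grouped mean.
expected = grp.transform("mean")

# Repeat each group's mean once per row of that group.
# size() counts every row; count() would skip rows holding NaN.
fast = pd.Series(
    np.repeat(grp.mean().values, grp.size().values),
    index=df.index,
)

print(np.allclose(expected.values, fast.values))
```

Note that the repeat trick only lines up with the original rows because g is sorted and each group is contiguous, which holds for labels built with cumsum. For arbitrary, interleaved group labels it would put means in the wrong rows, and grp.transform("mean") is the safer choice.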
