pandas groupby resample unexpected result

piRSquared

Setup

import pandas as pd

df = pd.DataFrame({'grp': [1, 2] * 2, 'value': range(4)},
                  index=pd.Index(pd.date_range('2016-03-01', periods=7)[::2], name='Date')
                 ).sort_values('grp')

I wanted to group by 'grp' and resample my index daily, forward filling missing values. I expected this to work:

print(df.groupby('grp').resample('D').ffill())

            grp  value
Date                  
2016-03-01    1      0
2016-03-05    1      2
2016-03-03    2      1
2016-03-07    2      3

It did not. So I tried this:

print(df.groupby('grp', group_keys=False).apply(lambda g: g.resample('D').ffill()))

            grp  value
Date                  
2016-03-01    1      0
2016-03-02    1      0
2016-03-03    1      0
2016-03-04    1      0
2016-03-05    1      2
2016-03-03    2      1
2016-03-04    2      1
2016-03-05    2      1
2016-03-06    2      1
2016-03-07    2      3

This worked. Shouldn't these two methods produce the same output? What am I missing?


Response to ayhan's comment

import sys

print(sys.version)
print(pd.__version__)

2.7.11 |Anaconda custom (x86_64)| (default, Dec  6 2015, 18:57:58) 
[GCC 4.2.1 (Apple Inc. build 5577)]
0.18.0

ayhan showed that both methods give the same result on Python 3 with pandas 0.18.1.

After updating pandas to 0.18.1:

2.7.11 |Anaconda custom (x86_64)| (default, Dec  6 2015, 18:57:58) 
[GCC 4.2.1 (Apple Inc. build 5577)]
0.18.1

The issue has been resolved.

ayhan

This looks like one of the issues caused by the changes to the resample API in version 0.18.0.

It works as expected in 0.18.1:

df.groupby('grp').resample('D').ffill()
Out[2]: 
                grp  value
grp Date                  
1   2016-03-01    1      0
    2016-03-02    1      0
    2016-03-03    1      0
    2016-03-04    1      0
    2016-03-05    1      2
2   2016-03-03    2      1
    2016-03-04    2      1
    2016-03-05    2      1
    2016-03-06    2      1
    2016-03-07    2      3
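
To check on your own installation whether the two approaches agree, a minimal sketch along these lines may help. It is not from the original posts: the names via_resample and via_apply are illustrative, and selecting only the 'value' column is my own choice, made because whether the grouping column is carried into the result appears to have varied between pandas releases. The sort_values call from the setup is left out because groupby orders the groups itself.

```python
import pandas as pd

# Same data as the question's setup: four rows, two days apart, alternating groups.
df = pd.DataFrame(
    {'grp': [1, 2] * 2, 'value': list(range(4))},
    index=pd.Index(pd.date_range('2016-03-01', periods=7)[::2], name='Date'),
)

# Approach 1: resample directly on the grouped column.
via_resample = df.groupby('grp')['value'].resample('D').ffill()

# Approach 2: resample each group separately inside apply.
via_apply = df.groupby('grp')['value'].apply(lambda s: s.resample('D').ffill())

print(pd.__version__)
print(via_resample)
print(via_apply)

# True when both approaches forward-fill to the same daily values.
print(via_resample.tolist() == via_apply.tolist())
```

If the final line prints False, the installed version likely shows the discrepancy described in the question, and upgrading pandas is worth trying first.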
