resample irregularly spaced data in pandas

durbachit Published at Dev

durbachit

Is it somehow possible to use resample on irregularly spaced data? (I know that the documentation says it's for "resampling of regular time-series data", but I wanted to try if it works on irregular data, too. Maybe it doesn't, or maybe I am doing something wrong.)

In my real data, I generally have 2 samples per hour, with the time difference between them usually ranging from 20 to 40 minutes. So I was hoping to resample them to a regular hourly series.

To test whether I am using it right, I used a random list of dates that I already had, so it may not be the best example, but at least a solution that works for it will be very robust. Here it is:

    fraction  number                time
0   0.729797       0 2014-10-23 15:44:00
1   0.141084       1 2014-10-30 19:10:00
2   0.226900       2 2014-11-05 21:30:00
3   0.960937       3 2014-11-07 05:50:00
4   0.452835       4 2014-11-12 12:20:00
5   0.578495       5 2014-11-13 13:57:00
6   0.352142       6 2014-11-15 05:00:00
7   0.104814       7 2014-11-18 07:50:00
8   0.345633       8 2014-11-19 13:37:00
9   0.498004       9 2014-11-19 22:47:00
10  0.131665      10 2014-11-24 15:28:00
11  0.654018      11 2014-11-26 10:00:00
12  0.886092      12 2014-12-04 06:37:00
13  0.839767      13 2014-12-09 00:50:00
14  0.257997      14 2014-12-09 02:00:00
15  0.526350      15 2014-12-09 02:33:00

Now I want to resample these for example monthly:

df_new = df.set_index(pd.DatetimeIndex(df['time']))
df_new['fraction'] = df.fraction.resample('M',how='mean')
df_new['number'] = df.number.resample('M',how='mean')

But I get TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex' - unless I did something wrong with assigning the datetime index, it must be due to the irregularity?

So my questions are:

1. Am I using it correctly?
2. If 1==True, is there no straightforward way to resample the data?

(I only see a solution in first reindexing the data to get finer intervals, interpolate the values in between and then reindexing it to hourly interval. If it is so, then a question regarding the correct implementation of reindex will follow shortly.)

root

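Irregular spacing is not the problem. The TypeError is raised because resample is called on df.fraction and df.number, which come from the original df and still carry its default RangeIndex; only df_new has the datetime index. You also don't need to build a DatetimeIndex explicitly: just set 'time' as the index and pandas will take care of the rest, as long as the 'time' column has been converted to datetime using pd.to_datetime or some other method. Additionally, you don't need to resample each column individually if you're using the same method; just resample the entire DataFrame.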

import pandas as pd

# Convert to datetime, if necessary.
df['time'] = pd.to_datetime(df['time'])

# Set the index and resample (using month start freq for compact output).
df = df.set_index('time')
df = df.resample('MS').mean()

The resulting output:

            fraction  number
time                        
2014-10-01  0.435441     0.5
2014-11-01  0.430544     6.5
2014-12-01  0.627552    13.5
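
For the hourly case described in the question (about two samples per hour, 20 to 40 minutes apart), here is a minimal sketch of two possible approaches. The timestamps and values are made up for illustration, and the names s, grid, hourly_mean and interpolated are not from the original post. It assumes that either a per-hour average or a linear interpolation in time is acceptable for the data.

```python
import pandas as pd

# Hypothetical irregular samples, roughly two per hour (values are made up).
times = pd.to_datetime([
    "2014-10-23 15:10", "2014-10-23 15:40",
    "2014-10-23 16:05", "2014-10-23 16:45",
    "2014-10-23 17:20", "2014-10-23 17:50",
    "2014-10-23 18:15",
])
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], index=times, name="value")

# "60min" is one hour; spelled this way to sidestep the 'H' vs 'h'
# alias difference between pandas versions.
HOURLY = "60min"

# Option 1: bin means. Each hourly bin averages the samples that fall in it;
# a bin with no samples would come out as NaN.
hourly_mean = s.resample(HOURLY).mean()

# Option 2: linear interpolation in time onto an hourly grid.
# Build the grid inside the observed range so nothing is extrapolated.
grid = pd.date_range(
    s.index.min().ceil(HOURLY),
    s.index.max().floor(HOURLY),
    freq=HOURLY,
)
interpolated = (
    s.reindex(s.index.union(grid))   # add the grid points as NaN rows
     .interpolate(method="time")     # weight by actual time distance
     .reindex(grid)                  # keep only the hourly points
)

print(hourly_mean)
print(interpolated)
```

The first option is the direct hourly analogue of the monthly resample above. The second is closer to the reindex-and-interpolate idea mentioned in the question, and may be preferable when a value at the exact hour is needed rather than an average over the hour.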

Edited at 2020-10-25
