I have sales data for different customers on different dates, but the dates are not continuous. I would like to resample the data to daily frequency within each customer's own date range, with 0 for days that have no data. How can I do this?
import numpy as np
import pandas as pd
df = pd.DataFrame({'id': list('aababcbc'),
                   'date': pd.date_range('2022-01-01', periods=8),
                   'value': range(8)}).sort_values('id')
df
id date value
0 a 2022-01-01 0
1 a 2022-01-02 1
3 a 2022-01-04 3
2 b 2022-01-03 2
4 b 2022-01-05 4
6 b 2022-01-07 6
5 c 2022-01-06 5
7 c 2022-01-08 7
The required output is as follows:
id date value
a 2022-01-01 0
a 2022-01-02 1
a 2022-01-03 0 ** there is no data for a on this day
a 2022-01-04 3
b 2022-01-03 2
b 2022-01-04 0 ** there is no data for b on this day
b 2022-01-05 4
b 2022-01-06 0 ** there is no data for b on this day
b 2022-01-07 6
c 2022-01-06 5
c 2022-01-07 0 ** there is no data for c on this day
c 2022-01-08 7
df.groupby(['id']).resample('D',on='date')['value'].sum().reset_index()
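df["date"] = pd.to_datetime(df["date"])  # resample needs datetime values
# select "value" so the grouping column "id" is not aggregated as well
df.set_index("date").groupby("id")["value"].resample("1d").sum().reset_index()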