I have this function:
def source_revenue(self):
    items = self.data.items()
    df = pd.DataFrame(
        {'SOURCE OF BUSINESS': [i[0] for i in items], 'INCOME': [i[1] for i in items]})
    pivoting = pd.pivot_table(df, index=['SOURCE OF BUSINESS'], values=['INCOME'])
    suming = pivoting.sum(index=(0), columns=(1))
This function produces the following result:
INCOME 216424.9
dtype: float64
Without the sum, it returns the full DataFrame, as shown below:
INCOME
SOURCE OF BUSINESS
BYD - Other 500.0
BYD - Retail 1584.0
BYD - Transport 42498.0
BYD Beverage - A La Carte 39401.5
BYD Food - A La Carte 瓦厂食品-零点 68365.0
BYD Food - Catering Banquet 53796.0
BYD Rooms 瓦厂房间 5148.0
GS - Retail 386.0
GS Food - A La Carte 48.0
Orchard Retail 130.0
SCH - Food - A La Carte 96.0
SCH - Retail 375.4
SCH - Transport 888.0
SCH Beverage - A La Carte 119.0
Spa 3052.0
XLM Beverage - A La Carte 38.0
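The reason I'm doing this is that I'm trying to get the total of all the returned rows: sum them and append the total to the DataFrame.
Initially I tried using margins=True (I read here that it sums the totals and appends them to the DataFrame, but it didn't).
So I'd like to know whether there is a way to return the DataFrame but also sum the values and append the total to the end of the DataFrame, the way margins=True is supposed to.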
I think you can use groupby instead of pivot_table, because groupby is faster here.
You can use pivot_table, but the default value of aggfunc is np.mean. It is easy to forget that:
pivoting = pd.pivot_table(df,
                          index=['SOURCE OF BUSINESS'],
                          values=['INCOME'],
                          aggfunc=np.mean)
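To see why the default matters, here is a minimal sketch with made-up data (the rows and amounts are hypothetical, not taken from the question) in which one source appears twice, so the mean and the sum differ. It passes aggfunc as the string 'sum', which should behave like np.sum here.

```python
import pandas as pd

# Hypothetical data: 'Spa' appears twice, so mean and sum give different results.
df = pd.DataFrame({
    'SOURCE OF BUSINESS': ['Spa', 'Spa', 'Orchard Retail'],
    'INCOME': [100.0, 300.0, 130.0],
})

# Default aggfunc (mean): duplicate sources are averaged.
by_mean = pd.pivot_table(df, index=['SOURCE OF BUSINESS'], values=['INCOME'])

# Explicit aggfunc='sum': duplicate sources are added up.
by_sum = pd.pivot_table(df, index=['SOURCE OF BUSINESS'], values=['INCOME'],
                        aggfunc='sum')

mean_spa = by_mean.loc['Spa', 'INCOME']
sum_spa = by_sum.loc['Spa', 'INCOME']
print(mean_spa, sum_spa)
```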
I think you need aggfunc=np.sum:
print(df)
A B C D
0 zoo one small 1
1 zoo one large 2
2 zoo one large 2
3 foo two small 3
4 foo two small 3
5 bar one large 4
6 bar one small 5
7 bar two small 6
8 bar two large 7
print(pd.pivot_table(df, values='D', index=['A'], aggfunc=np.sum))
A
bar 22
foo 6
zoo 5
Name: D, dtype: int64
df1 = df.groupby('A')['D'].sum()
print(df1)
A
bar 22
foo 6
zoo 5
Name: D, dtype: int64
If you need to add a Total to the Series, use loc and sum:
print(df1.sum())
33
df1.loc['Total'] = df1.sum()
print(df1)
A
bar 22
foo 6
zoo 5
Total 33
Name: D, dtype: int64
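Applied to the function from the question, the same idea might look like the sketch below. It is only an assumption about how the pieces fit together: source_revenue is rewritten here as a plain function that takes the dict directly, and the sample amounts are just three rows picked from the output above.

```python
import pandas as pd

def source_revenue(data):
    # Build the two-column frame from a {source: income} dict.
    df = pd.DataFrame({
        'SOURCE OF BUSINESS': list(data.keys()),
        'INCOME': list(data.values()),
    })
    # Sum income per source, then append the grand total as a new label.
    totals = df.groupby('SOURCE OF BUSINESS')['INCOME'].sum()
    totals.loc['Total'] = totals.sum()
    return totals

result = source_revenue({'Spa': 3052.0, 'Orchard Retail': 130.0, 'GS - Retail': 386.0})
print(result)
```

This returns the per-source rows and the total in one Series, so the caller does not have to choose between the full result and the sum.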
Timings:
In [111]: %timeit df.groupby('A')['D'].sum()
1000 loops, best of 3: 581 µs per loop
In [112]: %timeit pd.pivot_table(df, values='D', index=['A'], aggfunc=np.sum)
100 loops, best of 3: 2.28 ms per loop
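Outside IPython, a rough equivalent of these timings can be sketched with the standard timeit module. The helper function names and the repeat count are my own choices, and the numbers will vary with the machine and the pandas version, so no expected output is shown.

```python
import timeit
import pandas as pd

# Same columns A and D as the sample frame above.
df = pd.DataFrame({
    'A': ['zoo', 'zoo', 'zoo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'D': [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

def with_groupby():
    return df.groupby('A')['D'].sum()

def with_pivot_table():
    return pd.pivot_table(df, values='D', index=['A'], aggfunc='sum')

n = 200  # number of calls to average over (arbitrary)
t_groupby = timeit.timeit(with_groupby, number=n) / n
t_pivot = timeit.timeit(with_pivot_table, number=n) / n
print('groupby:     %.1f us per call' % (t_groupby * 1e6))
print('pivot_table: %.1f us per call' % (t_pivot * 1e6))
```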
To add Total to your df, use setting with enlargement:
print(df)
INCOME
SOURCE OF BUSINESS
BYD - Other 500.0
BYD - Retail 1584.0
BYD - Transport 42498.0
BYD Beverage - A La Carte 39401.5
BYD Food - A La Carte 68365.0
BYD Food - Catering Banquet 53796.0
BYD Rooms 5148.0
GS - Retail 386.0
GS Food - A La Carte 48.0
Orchard Retail 130.0
SCH - Food - A La Carte 96.0
SCH - Retail 375.4
SCH - Transport 888.0
SCH Beverage - A La Carte 119.0
Spa 3052.0
XLM Beverage - A La Carte 38.0
df.loc['Total', 'INCOME'] = df['INCOME'].sum()
print(df)
INCOME
SOURCE OF BUSINESS
BYD - Other 500.0
BYD - Retail 1584.0
BYD - Transport 42498.0
BYD Beverage - A La Carte 39401.5
BYD Food - A La Carte 68365.0
BYD Food - Catering Banquet 53796.0
BYD Rooms 5148.0
GS - Retail 386.0
GS Food - A La Carte 48.0
Orchard Retail 130.0
SCH - Food - A La Carte 96.0
SCH - Retail 375.4
SCH - Transport 888.0
SCH Beverage - A La Carte 119.0
Spa 3052.0
XLM Beverage - A La Carte 38.0
Total 216424.9
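Since the question asked about margins=True, here is a sketch of that route as well. My guess is that margins=True alone did not give a total because the margin row is computed with the same aggfunc, which defaults to the mean. Combining it with aggfunc='sum' should append a summed row, and margins_name renames it from the default 'All'. The three rows are just a sample of the data above.

```python
import pandas as pd

# Small sample of the question's data, for illustration only.
df = pd.DataFrame({
    'SOURCE OF BUSINESS': ['Spa', 'Orchard Retail', 'GS - Retail'],
    'INCOME': [3052.0, 130.0, 386.0],
})

table = pd.pivot_table(
    df,
    index=['SOURCE OF BUSINESS'],
    values=['INCOME'],
    aggfunc='sum',         # sum instead of the default mean
    margins=True,          # append a margin row computed with aggfunc
    margins_name='Total',  # label for that row (default is 'All')
)
print(table)

total = table.loc['Total', 'INCOME']
```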