I have a dataframe:
name  salary  department  position
a     25000   x           normal employee
b     50000   y           normal employee
c     10000   y           experienced employee
d     20000   x           experienced employee
I would like to get a result in the format below:
dept  total salary  salary_percentage  count_normal_employee  count_experienced_employee
x     45000         45000/105000       1                      1
y     60000         60000/105000       1                      1
You can use pivot_table with fillna for df1; groupby with sum, then divide the new column total salary by the sum of the original column salary, for df2; and finally merge:
import pandas as pd

# pivot df to count names per department and position, fill NaN with 0
df1 = df.pivot_table(index='department', columns='position', values='name', aggfunc='count').fillna(0).reset_index()
# drop the columns' index name ('position') for a cleaner df
df1.columns.name = None
print(df1)
  department  experienced employee  normal employee
0          x                     1                1
1          y                     1                1
# sum salary per department and rename the column salary to total salary
df2 = df.groupby('department')['salary'].sum().reset_index().rename(columns={'salary':'total salary'})
# share of each department in the overall salary sum
df2['salary_percentage'] = df2['total salary'] / df['salary'].sum()
print(df2)
  department  total salary  salary_percentage
0          x         45000           0.428571
1          y         60000           0.571429
print(pd.merge(df1, df2, on=['department']))
  department  experienced employee  normal employee  total salary  \
0          x                     1                1         45000
1          y                     1                1         60000

   salary_percentage
0           0.428571
1           0.571429
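For reference, the steps above can be put together into one self-contained sketch. It rebuilds the sample data from the question and renames the columns to match the requested headers. The variable names counts, totals and result are only illustrative, and passing fill_value=0 to pivot_table is a swapped-in alternative to calling fillna(0) afterwards.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'salary': [25000, 50000, 10000, 20000],
    'department': ['x', 'y', 'y', 'x'],
    'position': ['normal employee', 'normal employee',
                 'experienced employee', 'experienced employee'],
})

# Count employees per department and position; missing combinations become 0.
# The rename turns e.g. 'normal employee' into 'count_normal_employee'.
counts = (df.pivot_table(index='department', columns='position',
                         values='name', aggfunc='count', fill_value=0)
            .rename(columns=lambda c: 'count_' + c.replace(' ', '_'))
            .reset_index())
counts.columns.name = None

# Total salary per department and its share of the overall salary sum
totals = (df.groupby('department')['salary'].sum()
            .reset_index()
            .rename(columns={'salary': 'total salary'}))
totals['salary_percentage'] = totals['total salary'] / df['salary'].sum()

# Join both parts and order the columns as in the requested format
result = pd.merge(totals, counts, on='department')
result = result.rename(columns={'department': 'dept'})
result = result[['dept', 'total salary', 'salary_percentage',
                 'count_normal_employee', 'count_experienced_employee']]
print(result)
```

With the sample data, total salary is 45000 for x and 60000 for y, and the salary_percentage column sums to 1.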