Below is my df; I need to analyze the data.
  gender dob      list
0 M      01/01/87 [['Office/Work'], ['31-35'], ['Salaried']]
1 M      01/01/94 [['Movies,Restaurants'], ['21-25'], ['Salaried']]
2 M      01/01/95 [['College/Park'], ['21-25'], ['Student']]
3 F      01/01/97 [['College'], ['21-25'], ['Student']]
Expected result 1. I need to find out how many people in the dataset are salaried
df['salaried']
Total = 2, Male = 2, Female = 0
Total = 2, Male = 1, Female = 1
Total = 1, Male = 1, Female = 0
Grouped by the different age groups, df['age_group']
Age_Group Total Male Female
['21-25'] 3     2    1
['31-35'] 1     1    0
What is the percentage of males relative to females?
round(len(df.loc[df['gender'] == 'M']) / (len(df.loc[df['gender'] == 'M']) + len(df.loc[df['gender'] == 'F'])),2)*100
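Note that the expression above gives the share of males among all rows, not a male-to-female ratio. As a minimal sketch, the same share can also be obtained with value_counts(normalize=True), which returns each gender's fraction of all rows. This assumes the gender column contains only 'M' and 'F'; the four-row frame is just the sample from the question, and the names share, male_pct and female_pct are illustrative.

```python
import pandas as pd

# Sample gender column from the question.
df = pd.DataFrame({'gender': ['M', 'M', 'M', 'F']})

# Fraction of rows per gender, converted to a percentage.
share = (df['gender'].value_counts(normalize=True) * 100).round(2)

male_pct = share['M']
female_pct = share['F']
print(male_pct, female_pct)
```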
Use explode, which splits the list elements of a column into rows.
import pandas as pd
df = pd.DataFrame({
    'gender': ['M', 'M', 'M', 'F'],
    'B': [[['Office/Work'], ['31-35'], ['Salaried']],
          [['Movies,Restaurants'], ['21-25'], ['Salaried']],
          [['College/Park'], ['21-25'], ['Student']],
          [['College'], ['21-25'], ['Student']]]})
df:
gender B
0 M [[Office/Work], [31-35], [Salaried]]
1 M [[Movies,Restaurants], [21-25], [Salaried]]
2 M [[College/Park], [21-25], [Student]]
3 F [[College], [21-25], [Student]]
x=df.explode('B')
x:
gender B
0 M [Office/Work]
0 M [31-35]
0 M [Salaried]
1 M [Movies,Restaurants]
1 M [21-25]
1 M [Salaried]
2 M [College/Park]
2 M [21-25]
2 M [Student]
3 F [College]
3 F [21-25]
3 F [Student]
x['B']=x.B.astype(str)
final_df=x.groupby(['B','gender']).size().unstack(fill_value=0)
final_df:
gender F M
B
['21-25'] 1 2
['31-35'] 0 1
['College'] 1 0
['College/Park'] 0 1
['Movies,Restaurants'] 0 1
['Office/Work'] 0 1
['Salaried'] 0 2
['Student'] 1 1
You can compute the total from the F and M columns.
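As a minimal sketch of that last step, assuming the same df as in the answer, the total can be added by summing the F and M columns row by row. The column name Total is chosen here only to match the expected output in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'gender': ['M', 'M', 'M', 'F'],
    'B': [
        [['Office/Work'], ['31-35'], ['Salaried']],
        [['Movies,Restaurants'], ['21-25'], ['Salaried']],
        [['College/Park'], ['21-25'], ['Student']],
        [['College'], ['21-25'], ['Student']],
    ],
})

# One row per inner list, then stringify so the values can be grouped.
x = df.explode('B')
x['B'] = x['B'].astype(str)

# Count rows per (value, gender) pair and pivot gender into columns.
final_df = x.groupby(['B', 'gender']).size().unstack(fill_value=0)

# Row-wise total across the two gender columns.
final_df['Total'] = final_df['F'] + final_df['M']

print(final_df.loc[["['Salaried']", "['Student']", "['21-25']", "['31-35']"]])
```

The rows for ['Salaried'] and for the two age groups can then be compared directly against the totals listed in the question.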