I have a CSV file like this:
ID Value Amount
---- ------- -------
A 3 2
A 4 4
B 3 6
C 5 5
A 3 2
B 10 1
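I want to get the sum of the "Value" or "Amount" column, grouped by the "ID" column. For example, for "A" the output should be the sum of all the values associated with A, i.e. [3 + 4 + 3].
My code: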
import csv
file = open('datafile.csv')
rows = csv.DictReader(file)
summ = 0.0
count = 0
for r in rows:
    summ = summ + int(r['Value'])
    count = count + 1
print("Mean for column Value is: ", (summ / count))
file.close()
You can use a defaultdict of list to group the data by the ID column, then use sum() to produce the totals.
from collections import defaultdict

with open('datafile.csv') as f:
    d = defaultdict(list)
    next(f)  # skip first header line
    next(f)  # skip second header line
    for line in f:
        id_, value, amount = line.split()
        d[id_].append((int(value), int(amount)))

# sum and average of column Value by ID
for id_ in d:
    total = sum(t[0] for t in d[id_])
    average = total / float(len(d[id_]))
    print('{}: sum = {}, avg = {:.2f}'.format(id_, total, average))
Output for your input data:
A: sum = 10, avg = 3.33
C: sum = 5, avg = 5.00
B: sum = 13, avg = 6.50
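The question also asks about the Amount column. The grouping above could be wrapped in a function that picks the column by name, roughly as sketched here. This assumes the whitespace-separated layout with two header lines shown in the question; the name `sum_by_id`, its `column` parameter, and the inline `SAMPLE` string are hypothetical and only there to make the example self-contained.

```python
from collections import defaultdict

# Sample data copied from the question, inlined so the sketch runs without a file.
SAMPLE = """\
ID Value Amount
---- ------- -------
A 3 2
A 4 4
B 3 6
C 5 5
A 3 2
B 10 1
"""

def sum_by_id(lines, column):
    """Return {id: (total, average)} for the named column ('Value' or 'Amount')."""
    rows = iter(lines)
    header = next(rows).split()   # ['ID', 'Value', 'Amount']
    next(rows)                    # skip the dashed separator line
    index = header.index(column)  # position of the requested column
    groups = defaultdict(list)
    for line in rows:
        fields = line.split()
        if not fields:            # ignore blank lines
            continue
        groups[fields[0]].append(int(fields[index]))
    return {key: (sum(vals), sum(vals) / float(len(vals)))
            for key, vals in groups.items()}

results = sum_by_id(SAMPLE.splitlines(), 'Amount')
for key in sorted(results):
    total, average = results[key]
    print('{}: sum = {}, avg = {:.2f}'.format(key, total, average))
# A: sum = 8, avg = 2.67
# B: sum = 7, avg = 3.50
# C: sum = 5, avg = 5.00
```

To read from the real file instead, an open file object can be passed in place of `SAMPLE.splitlines()`, since the function only needs an iterable of lines.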
The grouping can also be done with a standard Python dictionary instead of a defaultdict. The solution is very similar:
with open('datafile.csv') as f:
    d = {}
    next(f)  # skip first header line
    next(f)  # skip second header line
    for line in f:
        id_, value, amount = line.split()
        d[id_] = d.get(id_, []) + [(int(value), int(amount))]

# sum and average of column Value by ID
for id_ in d:
    total = sum(t[0] for t in d[id_])
    average = total / float(len(d[id_]))
    print('{}: sum = {}, avg = {:.2f}'.format(id_, total, average))
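If only the sum and average are needed, the individual rows do not have to be stored at all. Here is a minimal sketch of that variant, keeping a running total and count per ID with `dict.setdefault`. The helper name `running_totals` is hypothetical, and the input is assumed to be already-parsed (id, value) pairs taken from the question's data.

```python
def running_totals(rows):
    """rows: iterable of (id, value) pairs; returns {id: [total, count]}."""
    totals = {}
    for id_, value in rows:
        # setdefault returns the existing [total, count] entry or creates one
        entry = totals.setdefault(id_, [0, 0])
        entry[0] += value
        entry[1] += 1
    return totals

# (ID, Value) pairs from the question's table
data = [('A', 3), ('A', 4), ('B', 3), ('C', 5), ('A', 3), ('B', 10)]
totals = running_totals(data)
for id_ in sorted(totals):
    total, count = totals[id_]
    print('{}: sum = {}, avg = {:.2f}'.format(id_, total, total / float(count)))
# A: sum = 10, avg = 3.33
# B: sum = 13, avg = 6.50
# C: sum = 5, avg = 5.00
```

This also avoids building a new list on every row, which the `d.get(id_, []) + [...]` form does.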