I have daily temperature files that I want to merge into a single yearly file.
For example, the input files:
Day_1.dat
Toronto -22.5
Montreal -10.6
Day_2.dat
Toronto -15.5
Montreal -1.5
Day_3.dat
Toronto -5.5
Montreal 10.6
Desired output file:
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6
So far, this is the code I have written for this part of the program:
#Open files for reading (input) and appending (output)
readFileObj = gzip.open(readfilename, 'r') #call built in utility to unzip file for reading
appFileObj = open(outFileName, 'a')
for line in readfileobj:
    fileString = readFileObj.read(line.split()[-1]+'\n') # read last 'word' of each line
    outval = "" + str(float(filestring) +"\n" #buffer with a space and then signal end of line
    appFileObj.write(outval) #this is where I need formatting help to append outval
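Here, iterating over fileinput.input lets us walk through all the files one line at a time. We split each line on whitespace, then use the city name as the key and store the corresponding temperature (or whatever value follows the city) in a list.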
import fileinput
d = {}
for line in fileinput.input(['Day_1.dat', 'Day_2.dat', 'Day_3.dat']):
    city, temp = line.split()
    d.setdefault(city, []).append(temp)
Now d contains:
{'Toronto': ['-22.5', '-15.5', '-5.5'],
'Montreal': ['-10.6', '-1.5', '10.6']}
Now we can simply iterate over this dictionary and write the data to the output file.
with open('output_file', 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
Output:
$ cat output_file
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6
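Note that in Python versions before 3.7, plain dictionaries do not guarantee any particular order, so the output here could just as well have listed Montreal first and then Toronto. If the order matters on such versions, use collections.OrderedDict; from Python 3.7 onward, a regular dict preserves insertion order.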
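If the city order needs to be fixed regardless of the Python version, the same grouping can be sketched with collections.OrderedDict. This is only a minimal illustration: the helper name group_by_city and the in-memory sample lines are assumptions made for the example, standing in for the lines read from the daily files.

```python
from collections import OrderedDict


def group_by_city(lines):
    """Group temperatures by city, keeping cities in first-seen order."""
    grouped = OrderedDict()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        city, temp = parts
        grouped.setdefault(city, []).append(temp)
    return grouped


# Sample lines standing in for the contents of Day_1.dat .. Day_3.dat
sample = [
    "Toronto -22.5", "Montreal -10.6",
    "Toronto -15.5", "Montreal -1.5",
    "Toronto -5.5", "Montreal 10.6",
]
result = group_by_city(sample)
for city, values in result.items():
    print(city, " ".join(values))
```

Cities come out in the order they first appear in the input, so Toronto is written before Montreal here.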
A working version of your code:
import gzip

d = {}
# Assuming `filenames` is a list of all the gzip files to be opened.
for readfilename in filenames:
    # Populate the dictionary by collecting data from each file.
    # 'rt' opens the file in text mode, so each line is a str on Python 3.
    with gzip.open(readfilename, 'rt') as f:
        for line in f:
            city, temp = line.split()
            d.setdefault(city, []).append(temp)
# Now write to the output file: one line per city
with open(outFileName, 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
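The answer assumes a ready-made list of gzip files. One detail that may matter when building that list is ordering: sorted alphabetically, Day_10 comes before Day_2, which would scramble the columns. The following is a self-contained sketch under stated assumptions: the Day_N.dat.gz naming pattern, the helper names day_number and merge_daily_files, and the temporary sample files are all made up for illustration and do not come from the original question.

```python
import gzip
import os
import re
import tempfile


def day_number(path):
    """Extract the integer N from a name like Day_N.dat.gz (assumed pattern)."""
    match = re.search(r"Day_(\d+)", os.path.basename(path))
    return int(match.group(1)) if match else 0


def merge_daily_files(filenames, out_path):
    """Collect one temperature per city per file and write one line per city."""
    d = {}
    # Sort by day number so that Day_10 follows Day_2 instead of Day_1
    for name in sorted(filenames, key=day_number):
        with gzip.open(name, "rt") as f:  # text mode yields str lines
            for line in f:
                parts = line.split()
                if len(parts) != 2:
                    continue  # skip blank or malformed lines
                city, temp = parts
                d.setdefault(city, []).append(temp)
    with open(out_path, "w") as out:
        for city, values in d.items():
            out.write("{} {}\n".format(city, " ".join(values)))
    return d


# Build small gzip-compressed sample files in a temporary directory
tmpdir = tempfile.mkdtemp()
samples = {
    "Day_1.dat.gz": "Toronto -22.5\nMontreal -10.6\n",
    "Day_2.dat.gz": "Toronto -15.5\nMontreal -1.5\n",
    "Day_10.dat.gz": "Toronto -5.5\nMontreal 10.6\n",
}
paths = []
for name, text in samples.items():
    path = os.path.join(tmpdir, name)
    with gzip.open(path, "wt") as f:
        f.write(text)
    paths.append(path)

out_path = os.path.join(tmpdir, "yearly.dat")
merged = merge_daily_files(paths, out_path)
```

Sorting on the extracted day number instead of the raw filename keeps each city's values in calendar order even when the day numbers have different digit counts.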