Averaging values from the data in a text file

Posted by debugcn in Dev

Your father

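I have a text file like the one below. It contains blocks of two-column numeric data, and each block is followed by a line of text: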

1   23
2   29
3   21
4   18
5   19
6   18
7   19
8   24
Cluster analysis done for this configuration!

1   23
2   22
3   19
4   18
5   23
6   17
7   19
8   31
9   21
10   27
11   19
Cluster analysis done for this configuration!

1   22
2   26
3   27
4   23
5   25
6   32
7   23
8   19
9   19
10   18
11   30
12   21
13   23
14   16
Cluster analysis done for this configuration!

1   23
2   19
3   23
4   27
5   20
6   17
7   15
8   22
9   16
10   23
11   20
12   23
Cluster analysis done for this configuration!

The desired output would be:

1 22.75
2 24.0
3 22.5
4 21.5
5 21.75
6 21.0
7 19.0
8 24.0
9 18.666666666666668
10 22.666666666666668
11 23.0
12 22.0
13 23.0
14 16.0

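I want the average of the second-column values for each number in the first column. In this example, the average for "1" is (23 + 23 + 22 + 23) / 4 = 22.75, and likewise for "2", "3", and so on. Note that the number of lines between the "Cluster analysis ..." strings differs from block to block. That is fine: here, for example, the average for "14" is simply 16, because no block other than the third has a value for "14".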

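My idea was to read all the numbers between the "Cluster analysis ..." strings, store them in an array or something similar, and then take the averages, but I could not get that working in code. Can anyone give me a lead?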

I have no preference regarding the programming language; it just needs to solve the problem. I was thinking of bash/shell, but Python is welcome too.

Enrico

awk '/^[0-9]+ +[0-9]+$/ {   # pick only lines with two numbers
         arr[$1] += $2      # accumulate the values in bins indexed by column 1
         n[$1]++            # keep track of how many values are in each bin
     }
     END {                            # finally,
         for (e in arr)               # for each bin (awk visits them in no fixed order)
             print e, arr[e]/n[e]     # print the key and its average
     }' your_input_file | sort -n     # sort the result numerically by key
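
Since Python was also welcome, here is a minimal sketch of the same binning idea in Python. It is not a definitive implementation: the function name column_averages and the pattern name TWO_INTS are made up for illustration, and it assumes every data line holds exactly two whitespace-separated integers. Any other line, including blank lines and the "Cluster analysis ..." marker, is skipped.

```python
import re
from collections import defaultdict

# Hypothetical helper pattern: a line holding exactly two integers,
# with optional surrounding whitespace.
TWO_INTS = re.compile(r"^\s*(\d+)\s+(\d+)\s*$")


def column_averages(lines):
    """Return {key: mean of second-column values}, ordered by key."""
    totals = defaultdict(int)  # running sum per first-column key
    counts = defaultdict(int)  # how many values were seen per key
    for line in lines:
        match = TWO_INTS.match(line)
        if match is None:
            # Blank lines and the "Cluster analysis ..." marker land here.
            continue
        key, value = int(match.group(1)), int(match.group(2))
        totals[key] += value
        counts[key] += 1
    # Build the result in ascending key order so it prints sorted.
    return {key: totals[key] / counts[key] for key in sorted(totals)}


# Small made-up sample with the same layout as the input file.
sample = """1   23
2   29
Cluster analysis done for this configuration!

1   23
2   22
3   19
""".splitlines()

for key, avg in column_averages(sample).items():
    print(key, avg)
```

To run it on the real data, pass an open file instead of the sample list, for example column_averages(open("your_input_file")). Keys are converted to integers so that "10" sorts after "9" rather than after "1".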
