I want to use awk to split the first column of a file on "(" and count how many times each second element produced by the split occurs.
cluster1(2 genes, 2 taxa): column2 column 3
cluster1(2 genes, 2 taxa): column2 column 3
cluster1(3 genes, 2 taxa): column2 column 3
cluster1(3 genes, 2 taxa): column2 column 3
cluster1(4 genes, 2 taxa): column2 column 3
So my output would be
2 genes, 2 taxa = 2
3 genes, 2 taxa = 2
4 genes, 2 taxa = 1
Thanks for your help, Kate
$ awk -F '[()]' '{arr[$2]++} END{for(i in arr) print i " = " arr[i]}' data
4 genes, 2 taxa = 1
3 genes, 2 taxa = 2
2 genes, 2 taxa = 2
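The one-liner above lets -F do the splitting, and awk does not guarantee the order in which "for (i in arr)" visits the keys, which is why the counts come out in a different order from the question. As a minimal sketch of the same count written with an explicit split() call, as the question describes, and piped through sort for a stable order (the temporary file and the names data, parts, count and result are only illustrative):

```shell
# Recreate the sample input from the question in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
cluster1(2 genes, 2 taxa): column2 column 3
cluster1(2 genes, 2 taxa): column2 column 3
cluster1(3 genes, 2 taxa): column2 column 3
cluster1(3 genes, 2 taxa): column2 column 3
cluster1(4 genes, 2 taxa): column2 column 3
EOF

# split() cuts each line at "(" or ")"; parts[2] is the text between them.
# count[] tallies each distinct value; sort makes the output order stable.
result=$(awk '{ split($0, parts, /[()]/); count[parts[2]]++ }
              END { for (key in count) print key " = " count[key] }' "$data" | sort)

printf '%s\n' "$result"
rm -f "$data"
```

The whole line is split rather than $1, because with the default whitespace field separator $1 would only be "cluster1(2".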
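Or pipe through uniq -c to do the counting (uniq only merges adjacent identical lines, so the input must already be grouped, as it is here):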
$ grep -oP '(?<=\().*(?=\))' data | uniq -c | awk '{print $2,$3,$4,$5 " =",$1}'
2 genes, 2 taxa = 2
3 genes, 2 taxa = 2
4 genes, 2 taxa = 1
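The -P option used above is specific to GNU grep. As a more portable variant, here is a sketch that extracts the parenthesized text with sed and sorts before uniq -c, so input that is not already grouped is still counted correctly. It assumes each line contains one "(...)" pair; the temporary file and the names data, n and result are only illustrative.

```shell
# Sample input, deliberately not grouped, to show why the sort is needed.
data=$(mktemp)
cat > "$data" <<'EOF'
cluster1(3 genes, 2 taxa): column2 column 3
cluster1(2 genes, 2 taxa): column2 column 3
cluster1(4 genes, 2 taxa): column2 column 3
cluster1(2 genes, 2 taxa): column2 column 3
cluster1(3 genes, 2 taxa): column2 column 3
EOF

# sed keeps only the text between the first "(" and the next ")".
# sort groups identical lines so that uniq -c can count them.
# awk saves the count, strips it from the front of the line,
# then prints "text = count".
result=$(sed -n 's/^[^(]*(\([^)]*\)).*/\1/p' "$data" |
  sort |
  uniq -c |
  awk '{ n = $1; sub(/^[ \t]*[0-9]+[ \t]+/, ""); print $0 " = " n }')

printf '%s\n' "$result"
rm -f "$data"
```

Stripping the count with sub() instead of printing $2 through $5 keeps the text intact even when it has more or fewer than four words.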