I have two files to compare. I already know how to compare columns and print based on a condition. My problem is that I need to check whether the value in column 2 of file1 lies within a range defined by columns 2 and 3 of file2, for the same key in column 1. If it does, column 4 of file2 should be printed alongside the file1 row.
file1:
scaffold1_size11 12
scaffold2_size22 26
scaffold3_size33 67
file2:
scaffold1_size11 1 10 Os01
scaffold1_size11 12 20 Os08
scaffold1_size11 29 59 Os07
scaffold2_size22 17 24 Os09
scaffold2_size22 27 38 Os09
scaffold2_size22 39 60 Os10
scaffold2_size22 67 78 Os10
scaffold3_size33 15 27 Os03
scaffold3_size33 29 62 Os08
scaffold3_size33 64 78 Os02
scaffold3_size33 80 98 Os01
Desired output:
scaffold1_size11 12 Os08
scaffold2_size22 26
scaffold3_size33 67 Os02
How should this be done?
There's a standard idiom in awk which uses FNR (the record number within the current file) and NR (the overall record number) to detect when you're reading the first file: the two are equal only while the first file is being read. You read and save the values of the first file in arrays, and then use the arrays while reading the second file.
In this context, you want to read file1 first, saving the records based on the value in column 1 ($1). This assumes that the keys in file1 (the first field) are unique. Then, when reading the second file, check whether the saved value for that key lies within the range given by columns 2 and 3, and print the matching row:
awk 'FNR == NR { val[$1] = $2 }
FNR != NR { if ($1 in val && val[$1] >= $2 && val[$1] <= $3)
print $1, val[$1], $4
}' file1 file2
With the data shown in the question, this prints:
scaffold1_size11 12 Os08
scaffold3_size33 67 Os02
Note that this is different from the sample output in the question, which is:
scaffold1_size11 12 Os08
scaffold2_size22 26
scaffold3_size33 67 Os02
The value 26 for scaffold2_size22 falls in the gap between the ranges 17 to 24 and 27 to 38 in file2, so no range matches and the command above prints nothing for that key, whereas the question keeps the row without a fourth column.
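If the goal is the question's output exactly, with every file1 row kept and the label appended only when some range matches, one possible approach is to swap the roles of the files: store all ranges from file2 first, then scan them for each file1 row. The following is only a sketch under that assumption; the array names (n, lo, hi, tag), the label variable, and the scratch directory are made up for illustration.

```shell
# Recreate the sample data from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/file1" <<'EOF'
scaffold1_size11 12
scaffold2_size22 26
scaffold3_size33 67
EOF
cat > "$workdir/file2" <<'EOF'
scaffold1_size11 1 10 Os01
scaffold1_size11 12 20 Os08
scaffold1_size11 29 59 Os07
scaffold2_size22 17 24 Os09
scaffold2_size22 27 38 Os09
scaffold2_size22 39 60 Os10
scaffold2_size22 67 78 Os10
scaffold3_size33 15 27 Os03
scaffold3_size33 29 62 Os08
scaffold3_size33 64 78 Os02
scaffold3_size33 80 98 Os01
EOF

# file2 is passed first so that its ranges are stored before file1 is read.
result=$(awk '
FNR == NR {                        # first file on the command line: file2
    n[$1]++                        # how many ranges this key has so far
    lo[$1, n[$1]]  = $2 + 0        # + 0 forces a numeric value
    hi[$1, n[$1]]  = $3 + 0
    tag[$1, n[$1]] = $4
    next
}
{                                  # second file: file1
    label = ""
    for (i = 1; i <= n[$1]; i++) {
        if ($2 + 0 >= lo[$1, i] && $2 + 0 <= hi[$1, i]) {
            label = tag[$1, i]     # keep the first matching range
            break
        }
    }
    if (label != "")
        print $1, $2, label
    else
        print $1, $2               # no range matched: keep the row as is
}' "$workdir/file2" "$workdir/file1")

printf '%s\n' "$result"
rm -r "$workdir"
```

Because file1 drives the output here, the rows come out in file1 order, and duplicate keys in file1 are no longer a problem. The cost is holding all of file2 in memory.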
You'll also see the idiom used like:
awk 'FNR == NR { …save…; next }
{ …process… }'
The next skips the second block of code while reading the first file. It might be marginally more efficient, but I tend to like the explicit clarity of the two inverted conditions.
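For concreteness, here is a sketch of the range check written in the next style. It uses only a few rows of the question's data, written to a hypothetical scratch directory, so treat it as an illustration rather than a drop-in replacement.

```shell
# Scratch copies of a few rows from the question's data.
workdir=$(mktemp -d)
printf '%s\n' 'scaffold1_size11 12' 'scaffold3_size33 67' > "$workdir/file1"
printf '%s\n' \
    'scaffold1_size11 1 10 Os01' \
    'scaffold1_size11 12 20 Os08' \
    'scaffold3_size33 29 62 Os08' \
    'scaffold3_size33 64 78 Os02' > "$workdir/file2"

result=$(awk '
FNR == NR { val[$1] = $2; next }   # file1: save the value, skip the rule below
($1 in val) && val[$1] >= $2 && val[$1] <= $3 {
    print $1, val[$1], $4          # reached only while reading file2
}' "$workdir/file1" "$workdir/file2")

printf '%s\n' "$result"
rm -r "$workdir"
```

With next in the first rule, the second rule needs no FNR != NR guard, so the range test can sit directly in the pattern position.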
If spacing in the output is an issue, use an appropriate printf statement in place of the print.
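As a rough illustration of the printf variant, the sketch below pads the name to 20 columns and right-aligns the number in 5. Those widths are arbitrary assumptions, as are the scratch directory and the reduced data set; adjust them to the real data.

```shell
# Scratch copies of a few rows from the question's data.
workdir=$(mktemp -d)
printf '%s\n' 'scaffold1_size11 12' 'scaffold3_size33 67' > "$workdir/file1"
printf '%s\n' \
    'scaffold1_size11 12 20 Os08' \
    'scaffold3_size33 29 62 Os08' \
    'scaffold3_size33 64 78 Os02' > "$workdir/file2"

result=$(awk '
FNR == NR { val[$1] = $2 }
FNR != NR { if ($1 in val && val[$1] >= $2 && val[$1] <= $3)
                # %-20s left-justifies the name, %5d right-justifies the number
                printf "%-20s %5d  %s\n", $1, val[$1], $4
          }' "$workdir/file1" "$workdir/file2")

printf '%s\n' "$result"
rm -r "$workdir"
```

Unlike print, printf does not append a newline or use the output field separator, so the format string has to spell out both.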