I wrote a simple shell script that works but it's terribly inefficient. It takes too long to run on larger files. I'm looking for a faster solution.
Sample input files:
data.csv:
1,data,data
3,data,data
4,data,data
9,data,data
...
matches.txt:
3
9
16
17
...
The script I wrote iterates through each item in matches.txt. It uses `sed` to match the beginning of the lines in the CSV file and comments them out by prepending `**`.
```bash
#!/bin/bash
# Split matches.txt on newlines (and CR, for CRLF files) into an array
IFS=$'\r\n' GLOBIGNORE='*' command eval 'XYZ=($(cat matches.txt))'
for id in "${XYZ[@]}"
do
    # Prepend ** to the addressed line (BSD/OS X sed in-place syntax)
    sed -i '' "${id}s/^/**/" data.csv
done
```
I'm using OS X, so the `sed` parameters are slightly different.
Rather than calling `sed` in a loop, you can use this `awk`:

```shell
awk -F ',' 'FNR==NR{a[$1]++; next} $1 in a{$0 = "**" $0} 1' matches.txt data.csv
```

Output:

```
1,data,data
**3,data,data
4,data,data
**9,data,data
```
To save the output from `awk` back into the file:

```shell
awk -F ',' 'FNR==NR{a[$1]++; next} $1 in a{$0 = "**" $0} 1' matches.txt data.csv > _tmp
mv _tmp data.csv
```
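To try the pipeline end to end, the sketch below recreates the sample files from the question and runs the answer's command (the file names `data.csv`, `matches.txt`, and `_tmp` are taken from the post):

```shell
#!/bin/sh
# Recreate the sample inputs from the question
printf '%s\n' '1,data,data' '3,data,data' '4,data,data' '9,data,data' > data.csv
printf '%s\n' 3 9 16 17 > matches.txt

# Single awk pass: load the ids from matches.txt into array a, then
# prepend ** to any data.csv line whose first field is one of those ids
awk -F ',' 'FNR==NR{a[$1]++; next} $1 in a{$0 = "**" $0} 1' matches.txt data.csv > _tmp
mv _tmp data.csv

cat data.csv
# 1,data,data
# **3,data,data
# 4,data,data
# **9,data,data
```

As a side note, if you have GNU awk 4.1 or later available, `gawk -i inplace` can edit the file in place without the explicit temp-file dance.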
Explanation:

- `-F ','` - Use comma as the field separator
- `FNR==NR` - Execute this block for the first file only
- `{a[$1]++; next}` - Build an array keyed on `$1` from the first file, then move to the next line
- `$1 in a{$0 = "**" $0}` - For the second file, if the first field is in array `a`, prepend `**` to the current line
- `1` - Default awk action (print the line)