I have a CSV file named test.csv. It looks like this:
1,Color
1,Width
2,Color
2,Height
I want to find out how many distinct values are in the first column. The shell script should return 2 in this case.
I tried running sort -u -t, -k2,2 test.csv, which I saw on another question, but it printed out far more info than I need.
How do I write a shell script that prints the number of distinct values in the first column of test.csv?
Using awk you can do:
awk -F, '!seen[$1]++{c++} END{print c}' test.csv
2
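This awk command uses the first field, $1, as a key in the associative array seen. The expression !seen[$1]++ is true only the first time a key appears: the stored value is still 0 when it is tested, and the post-increment then raises it to 1. Each time that happens we increment the counter c, and the END block prints c after the last line has been read.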
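For comparison, the same count can be sketched as a pipeline of standard tools: cut extracts the first column, sort -u removes duplicates, and wc -l counts the remaining lines. The function name count_distinct_first_col is made up for this illustration. Like the awk version, it assumes plain comma-separated fields with no quoted commas.

```shell
#!/bin/sh
# count_distinct_first_col is a hypothetical helper name used only for this sketch.
# It prints the number of distinct values in column 1 of the CSV file given as $1.
count_distinct_first_col() {
    # cut: keep only the first comma-separated field
    # sort -u: collapse duplicate values into one line each
    # wc -l: count the lines that are left
    # tr -d ' ': strip the padding some wc implementations add
    cut -d, -f1 "$1" | sort -u | wc -l | tr -d ' '
}
```

Calling count_distinct_first_col test.csv on the file from the question should print 2. The attempt in the question, sort -u -t, -k2,2, de-duplicates on the second column and prints the surviving lines instead of counting them, which is why it showed more than a single number.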