I want to sort a list of data by its first column, which is an IP address.
192.168.1.100
192.168.1.101
192.168.1.110
192.168.1.119
192.168.1.20
192.168.1.30
192.168.1.33
192.168.1.54
192.168.1.64
192.168.1.6
192.168.1.91
On my first machine, I tested sort -n and it worked as I expected:
# coreutils, version: 8.31, release: 23
192.168.1.6
192.168.1.20
192.168.1.30
192.168.1.33
192.168.1.54
192.168.1.64
192.168.1.91
192.168.1.100
192.168.1.101
192.168.1.110
192.168.1.119
But on my second machine, it doesn't sort properly:
# coreutils, version: 8.4
192.168.1.100
192.168.1.101
192.168.1.110
192.168.1.119
192.168.1.20
192.168.1.30
192.168.1.33
192.168.1.54
192.168.1.6
192.168.1.64
192.168.1.91
Both machines have the same locale, en_US.UTF-8.
Why is this happening? How can I resolve it?
Without an explicit key position, sort uses the entire line as the key. Since the first three octets are the same on every line, the order is decided character by character within the last octet. Since 1 sorts before 2, the lines ending in 100 and 101 appear before those ending in 20 and 30.
Define the proper key position and use a numerical sort. In your case, set the input delimiter to . (with -t'.') and let sort work its magic on the 4th field only. The 4,4 in -k4,4 means: start at the 4th field delimited by . and stop at the same 4th field.
sort -n -t'.' -k4,4 file
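The command above only compares the last octet, which is enough here because the first three octets are identical. If they can differ, one numeric key per octet is a common extension. The following is a sketch under that assumption; the function name sort_ipv4 and the sample addresses are made up for illustration.

```shell
# Hypothetical helper: sort IPv4 addresses read from stdin,
# comparing each dot-separated field numerically, left to right.
sort_ipv4() {
  LC_ALL=C sort -t . -k1,1n -k2,2n -k3,3n -k4,4n
}

sorted=$(printf '%s\n' 192.168.1.100 10.0.0.5 192.168.1.6 192.168.0.20 | sort_ipv4)
printf '%s\n' "$sorted"
# Prints:
# 10.0.0.5
# 192.168.0.20
# 192.168.1.6
# 192.168.1.100
```

Putting the n modifier on each key keeps every field numeric without relying on a global -n.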
You can also override any other locale settings defined on your system and use the system's default (the C locale) by setting LC_ALL=C for this command only. See "What does LC_ALL=C do?" to understand why.
LC_ALL=C sort -n -t'.' -k4,4 file
Thanks to Kamil Maciorowski's comment which highlighted the actual issue.
The first machine seems to be using a locale where locale thousands_sep returns a dot (.). It is probably not en_US.UTF-8 (at least not as LC_NUMERIC). The second machine doesn't use . as the thousands separator.
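To check this on each machine, the separators of the active numeric locale can be queried directly. A small sketch, assuming the POSIX locale utility is available; the variable name c_sep is illustrative.

```shell
# Show the decimal point and thousands separator of the locale in effect.
locale decimal_point thousands_sep

# For comparison: the C locale defines no thousands separator at all.
c_sep=$(LC_ALL=C locale thousands_sep)
[ -z "$c_sep" ] && echo "C locale: no thousands separator"
```

If the first command prints . as the thousands separator on one machine but not on the other, that would be consistent with the difference in sort -n output described above.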