GNU coreutils `sort` behaves differently


annahri

I want to sort a list of data by its first column, which is an IP address.

192.168.1.100
192.168.1.101
192.168.1.110
192.168.1.119
192.168.1.20
192.168.1.30
192.168.1.33
192.168.1.54
192.168.1.64
192.168.1.6
192.168.1.91

On my first machine, I tested sort -n and it worked as I expected:

# coreutils, version: 8.31, release: 23

192.168.1.6
192.168.1.20
192.168.1.30
192.168.1.33
192.168.1.54
192.168.1.64
192.168.1.91
192.168.1.100
192.168.1.101
192.168.1.110
192.168.1.119

But on my second machine, it does not sort properly:

# coreutils, version: 8.4

192.168.1.100
192.168.1.101
192.168.1.110
192.168.1.119
192.168.1.20
192.168.1.30
192.168.1.33
192.168.1.54
192.168.1.6
192.168.1.64
192.168.1.91

Both machines have the same locale, en_US.UTF-8.

Why is this happening? How can I resolve it?

Inian

Without an explicit key position, sort uses the entire line as the key. With -n, only the leading numeric prefix of that key is compared, and in a locale where . is the decimal point that prefix is 192.168 on every line. The lines therefore tie numerically, and sort falls back to comparing the whole lines character by character. Since 1 sorts before 2, the lines ending in 100 and 101 appear before the others.
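The ordering seen on the second machine can be reproduced with a minimal sketch like the one below. It assumes GNU sort and forces the C locale so that no thousands separator is in play; the variable names are only illustrative.

```shell
# Addresses taken from the question, in arbitrary order
input=$(printf '%s\n' 192.168.1.20 192.168.1.6 192.168.1.100 192.168.1.64)

# Whole-line numeric sort in the C locale: every line has the same numeric
# prefix (192.168), so the tie is broken by comparing the lines byte by byte
whole_line=$(printf '%s\n' "$input" | LC_ALL=C sort -n)
printf '%s\n' "$whole_line"
```

The result lists 100 first, then 20, 6 and 64, which is the same pattern as the output from the second machine.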

Define the proper key position and use a numeric sort. In your case, set the field delimiter to . and let sort work on the 4th field only. The 4,4 means the key starts at the 4th field, as delimited by ., and stops at that same field.

sort -n -t'.' -k4,4 file
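The command above compares only the last octet, which is enough when the first three octets are identical. If the list may span several networks, a common extension is one numeric key per octet. The sketch below assumes GNU sort; the addresses outside 192.168.1.x are made up for illustration and do not come from the question.

```shell
# Mixed addresses; only the 192.168.1.x entries come from the question
input=$(printf '%s\n' 192.168.1.100 10.0.0.5 192.168.1.6 192.168.0.200 192.168.1.20)

# One numeric key per octet, compared left to right
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -t. -k1,1n -k2,2n -k3,3n -k4,4n)
printf '%s\n' "$sorted"
```

Because . is the field delimiter here, no key ever contains a dot, so the result should not depend on the locale's decimal point or thousands separator.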

You can also override any locale settings defined on your system and use the default C locale by setting LC_ALL=C for this command only. See "What does LC_ALL=C do?" to understand why.

LC_ALL=C sort -n -t'.' -k4,4 file

Thanks to Kamil Maciorowski's comment, which highlighted the actual issue:

The first machine seems to be using a locale where locale thousands_sep returns . (a dot). It is probably not en_US.UTF-8, at least not for LC_NUMERIC. The second machine doesn't use . as the thousands separator.
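To compare the two machines, the separator can be queried directly. This is a small sketch that assumes the locale utility is available, as on glibc systems. If the first machine prints a dot here, sort -n would ignore the dots inside each address and compare it as one large integer, which would explain why the whole-line sort happened to work there.

```shell
# Thousands separator in the current environment (may be empty)
current_sep=$(locale thousands_sep)
printf 'current thousands_sep: "%s"\n' "$current_sep"

# The C locale defines no thousands separator, which is why LC_ALL=C
# makes sort behave the same way on both machines
c_sep=$(LC_ALL=C locale thousands_sep)
printf 'C locale thousands_sep: "%s"\n' "$c_sep"
```

Running this on both machines and comparing the first line of output is a quick way to confirm or rule out the comment's explanation.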

Edited at 2021-07-21
