Welcome to the next pikoTutorial!
Basic usage
At its most basic, the sort command sorts lines in a file alphabetically. For example, if you have a file named data.txt, you can print its sorted content with:
sort data.txt
Of course, such console output can be redirected to a file:
sort data.txt > sorted_data.txt
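One caveat with redirection, shown here as a minimal sketch: redirecting the output back into the input file (sort data.txt > data.txt) leaves you with an empty file, because the shell truncates the target before sort gets to read it. The -o option lets sort write the result to the same file safely. The directory and file contents below are made up for illustration.

```shell
# Throwaway directory and sample file (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$tmpdir/data.txt"

# -o writes the result to the named file. It may be the input file itself,
# because sort reads all of its input before it opens the output.
LC_ALL=C sort -o "$tmpdir/data.txt" "$tmpdir/data.txt"

# Join the lines of the file into one line for display.
in_place=$(paste -s -d ' ' "$tmpdir/data.txt")
echo "$in_place"

rm -r "$tmpdir"
```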
Numerical sorting
Instead of performing the default alphabetical sorting, you can sort the file content numerically by adding the -n option:
sort -n data.txt
Note for advanced: you may have noticed that for a file with 3 lines, e.g. “3 1 2”, the default sort (without the -n option) already outputs the lines in the correct order “1 2 3”, so what’s the point of adding the -n option? The difference shows up with multi-digit numbers (alphabetically, “10” goes before “9”) and with files containing mixed letters and numbers. Let’s say there is a file with lines “1 a 2 b”. The default sort will put them in alphabetical order “1 2 a b”, while sort with the -n option will output “a b 1 2”. Why does the numerical sort put letters first? The answer lies in how sort treats lines that don’t start with a number: it assumes their value to be 0, and 0 goes in front of both 1 and 2.
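To see both effects side by side, here is a small sketch. The file names, the sample values and the sorted_line helper are assumptions made for this example. LC_ALL=C is set so that the alphabetical order does not depend on your locale.

```shell
# Throwaway directory with two sample files (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' 10 9 2 100 > "$tmpdir/numbers.txt"
printf '%s\n' 1 a 2 b > "$tmpdir/mixed.txt"

# Hypothetical helper: sort with the given arguments and join the
# resulting lines into a single space-separated line.
sorted_line() { LC_ALL=C sort "$@" | paste -s -d ' ' -; }

numbers_default=$(sorted_line "$tmpdir/numbers.txt")    # character by character
numbers_numeric=$(sorted_line -n "$tmpdir/numbers.txt") # by numeric value
mixed_default=$(sorted_line "$tmpdir/mixed.txt")        # digits before letters
mixed_numeric=$(sorted_line -n "$tmpdir/mixed.txt")     # letters count as 0

echo "numbers, default: $numbers_default"
echo "numbers, -n:      $numbers_numeric"
echo "mixed, default:   $mixed_default"
echo "mixed, -n:        $mixed_numeric"

rm -r "$tmpdir"
```

With the sample data above, the default sort puts 10 and 100 in front of 2 and 9, and only the -n variant gives the order you would expect from numbers.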
Reverse order
To reverse the order of the sort, call:
sort -r data.txt
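The -r option can be combined with other options. As a small sketch (file name and values are made up for this example), combining it with -n gives the largest number first:

```shell
# Throwaway directory and sample file (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' 10 9 2 100 > "$tmpdir/numbers.txt"

# -r alone reverses the default alphabetical order.
reverse_alpha=$(LC_ALL=C sort -r "$tmpdir/numbers.txt" | paste -s -d ' ' -)

# -n and -r together (written here as -nr) reverse the numerical order.
reverse_numeric=$(LC_ALL=C sort -nr "$tmpdir/numbers.txt" | paste -s -d ' ' -)

echo "-r:  $reverse_alpha"
echo "-nr: $reverse_numeric"

rm -r "$tmpdir"
```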
Removing duplicates
sort also provides an option for basic filtering of the output by removing duplicates:
sort -u data.txt
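A short sketch of what this looks like in practice. The file contents are made up for this example. For whole-line comparisons like this one, the result matches the familiar sort | uniq pipeline:

```shell
# Throwaway directory and sample file with repeated lines (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' banana apple banana cherry apple > "$tmpdir/data.txt"

# -u keeps only one line out of each group of equal lines.
unique_sorted=$(LC_ALL=C sort -u "$tmpdir/data.txt" | paste -s -d ' ' -)

# The same result obtained by sorting first and filtering with uniq.
via_uniq=$(LC_ALL=C sort "$tmpdir/data.txt" | uniq | paste -s -d ' ' -)

echo "sort -u:     $unique_sorted"
echo "sort | uniq: $via_uniq"

rm -r "$tmpdir"
```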
Sorting by column
Often there are files which consist of multiple columns:
5 2 7
2 8 5
1 7 4
If the columns are separated with white space, we can sort such a file not only by whole lines, but also by any given column. To sort it by the second column, call:
sort -k 2 data.txt
This will output:
5 2 7
1 7 4
2 8 5
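One detail worth knowing: -k 2 defines a sort key that starts at the second column and runs to the end of the line, so the later columns take part in the comparison too. To compare the second column only, give the key an end as well: -k 2,2. The sketch below uses made-up data in which the second column is identical in both lines, so the two variants give different results.

```shell
# Throwaway directory and sample file (hypothetical contents).
# Both lines have the same value in the second column.
tmpdir=$(mktemp -d)
printf '%s\n' 'a 1 z' 'b 1 y' > "$tmpdir/data.txt"

# Key = second column up to the end of the line: "1 y" goes before "1 z".
key_to_end=$(LC_ALL=C sort -k 2 "$tmpdir/data.txt" | paste -s -d '|' -)

# Key = second column only: the keys are equal, so sort falls back to
# comparing the whole lines and "a 1 z" goes first.
key_only=$(LC_ALL=C sort -k 2,2 "$tmpdir/data.txt" | paste -s -d '|' -)

echo "-k 2:   $key_to_end"
echo "-k 2,2: $key_only"

rm -r "$tmpdir"
```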
If your file contains data separated by some other delimiter, e.g. a comma, you must specify this delimiter explicitly with the -t option:
sort -t, -k 2 data.txt
Note for beginners: remember that not all characters can be used directly in the command line. If your delimiter is e.g. a semicolon, you must quote it and provide it as -t";", because the shell would otherwise treat the semicolon as the end of the command.
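Here is a sketch that puts the delimiter and the column options together. The semicolon-separated data is made up for this example. The n appended to the key (-k 2,2n) asks for a numerical comparison of that column only.

```shell
# Throwaway directory and sample file (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' 'pear;10' 'apple;9' 'fig;100' > "$tmpdir/data.txt"

# Quoted semicolon as the delimiter, second column compared alphabetically.
by_text=$(LC_ALL=C sort -t ';' -k 2,2 "$tmpdir/data.txt" | paste -s -d ' ' -)

# Same delimiter, second column compared numerically.
by_number=$(LC_ALL=C sort -t ';' -k 2,2n "$tmpdir/data.txt" | paste -s -d ' ' -)

echo "-k 2,2:  $by_text"
echo "-k 2,2n: $by_number"

rm -r "$tmpdir"
```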
Randomizing lines
As it turns out, sort allows not only for sorting, but also for the opposite: randomizing the lines in the given file:
sort -R data.txt
Don’t confuse it with the -r option, which stands for “reverse”.
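A small sketch of what to expect from -R, assuming GNU sort (the option is not part of POSIX). The output order changes from run to run, so the example only checks that no line is lost. It also shows one property that may surprise you: -R orders lines by a random hash, so identical lines always end up next to each other. If you need a shuffle that treats duplicates independently, the shuf command from GNU coreutils may be a better fit. All file contents below are made up.

```shell
# Throwaway directory and sample files (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e > "$tmpdir/data.txt"
printf '%s\n' x y x y x > "$tmpdir/dups.txt"

# Random order: the exact result differs between runs.
shuffled=$(sort -R "$tmpdir/data.txt")
echo "$shuffled"

# Sorting the shuffled lines again gives back the sorted original,
# so nothing was lost or added.
restored=$(printf '%s\n' "$shuffled" | LC_ALL=C sort | paste -s -d ' ' -)

# Identical lines stay adjacent, so uniq collapses them into 2 groups.
groups=$(sort -R "$tmpdir/dups.txt" | uniq | wc -l | tr -d ' ')

rm -r "$tmpdir"
```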