9

The title is a summary of what I'm trying to achieve, but I'll give an example to illustrate what my problem is and how I've been trying to solve it.

Example folder

Let's say I've got a folder on a Linux system with the following files: .a, .A, .b, .B, a, A, b and B.

Thunar

When I open the folder in Thunar, my file manager of choice, the files display in this order:

.a .A .b .B a A b B 

This is an output that makes sense to me; first the hidden files (or directories), then sorted alphabetically (where the case is taken into account). Preferably, I'd have the upper-case files sorted before the lower-case ones, but it's not too bad. In other words, this is the output that I'm trying to achieve with ls.

ls

When I want to list the files of this folder via ls, this is what I get:

    $ ls -lA
    total 0
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .a
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 a
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .A
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 A
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .b
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 b
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .B
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 B

Here, the hidden files aren't sorted to the top, but the files overall are sorted in a 'sensible' alphabetical order.

Experimenting with LC_ALL=C and LC_COLLATE=C

A couple of solutions for sorting the hidden files to the top are to temporarily set either LC_ALL or LC_COLLATE to C (I'm struggling to see the difference between the two, so an explanation there would be much appreciated):

    $ LC_ALL=C ls -lA
    total 0
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .A
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .B
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .a
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .b
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 A
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 B
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 a
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 b
    $ LC_COLLATE=C ls -lA
    total 0
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .A
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .B
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .a
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 .b
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 A
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 B
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 a
    -rw-r--r-- 1 lucas lucas 0 Jan 26 14:58 b

As you can see, this does solve the hidden file problem, but the behaviour of the alphabetical sort is now inconsistent with how Thunar sorts files alphabetically.

Questions

So this raises the question: how do I get ls to sort in the same way as Thunar? Preferably, I want to avoid piping ls to another command like sort, since I'd like to alias this new command to "ls" itself.

And if this isn't possible, how can I get Thunar to sort files the way ls would sort files (the LC_ALL=C/LC_COLLATE=C method seems nice enough to me)?

By extension, I'd like to ask what the best practices are when sorting files alphabetically. The behaviour I've just described is just what seems sensible to me, but maybe it isn't after all?

4
  • 2
    These seem apropos: Why not parse ls (and what to do instead)?, along with Why you shouldn't parse the output of ls(1) Commented Jan 26 at 16:33
  • 1
    Why are those relevant, @AndrewHenle? Who is trying to parse ls here? The OP in fact explicitly stated they want to avoid passing to an external tool (i.e. parsing). Commented Jan 27 at 9:59
  • 1
    "set either LC_ALL or LC_COLLATE to C (I'm struggling to see the difference between the two, so an explanation there would be much appreciated)" -- info libc 'Locale Categories', but in summary LC_ALL overrides all other LC_*, including LC_COLLATE. Commented Jan 27 at 19:41
  • 1
    Your best option is to set Thunar to sort by date, then invoke ls as ls -t. Thunar's "sort by name" uses a "smart sort" algorithm that doesn't correspond to any locale collation system. Commented Jan 28 at 0:08

2 Answers

10

If you use zsh, you can define a sort-key function for the o+funcname glob qualifier that puts files whose tail (basename) starts with . first:

    thunarOrder() {
      # Exit status is 0 if the basename starts with a dot, 1 otherwise.
      [[ $REPLY:t = .* ]]
      # Prefix the sort key with that status so hidden names sort first.
      REPLY=$?$REPLY
    }

Then to print the names of the files in the current working directory raw on 1 Column in that order:

    $ print -rC1 -- *(Do+thunarOrder)
    .a
    .A
    .b
    .B
    a
    A
    b
    B

(Also note the D (dotglob) qualifier, without which dotfiles would be skipped by default.)

With the GNU implementation of ls, you can disable its own sorting with the -U/--sort=none option:

    $ ls -nUd -- *(Do+thunarOrder)
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 .a
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 .A
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 .b
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 .B
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 a
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 A
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 b
    -rw-rw-r-- 1 1000 1000 0 Jan 26 16:29 B
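Outside zsh, the hidden-first grouping can be approximated with ordinary globs, since the shell sorts each glob's matches in the current collation order. This is only a sketch under assumptions: the function name hidden_first is made up for illustration, it prints names rather than calling ls, and it assumes no dotglob-style option is enabled in the shell.

```shell
# hidden_first [DIR]: print the names in DIR, dotfiles first, one per line.
# Hypothetical helper, not part of ls or Thunar.
hidden_first() (
    cd -- "${1:-.}" || exit 1
    # The first two patterns match dotfiles (but never . or ..),
    # the last one matches the non-hidden names.
    for f in .[!.]* ..?* *; do
        # An unmatched pattern is left as literal text; skip such entries.
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf '%s\n' "$f"
    done
)

# Usage: hidden_first /some/dir
```

Within each group the order still follows LC_COLLATE, so the case handling depends on the locale in effect, just as with the zsh function.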

If you like to have uppercase characters before lowercase ones, you can use a Danish collation order where this appears to be the case:

    $ LC_COLLATE=da_DK.UTF-8
    $ print -rC1 -- *(Do+thunarOrder)
    .A
    .a
    .B
    .b
    A
    a
    B
    b

Or you could switch to the C locale which generally sorts based on byte value, and A-Z in the ASCII charset happens to come before a-z.
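To see the byte-value ordering in isolation, the example names can be fed through sort under the C locale. This is purely an illustration of the collation, not a replacement for ls; the variable name c_sorted is arbitrary.

```shell
# In ASCII, '.' is 0x2E, 'A'-'Z' are 0x41-0x5A and 'a'-'z' are 0x61-0x7A,
# so a byte-wise sort puts dotfiles first, then uppercase, then lowercase.
c_sorted=$(printf '%s\n' a A b B .a .A .b .B | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
# Prints, one per line: .A .B .a .b A B a b
```

This is the same order that LC_ALL=C ls -lA produced in the question.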

0
4

For your main question: The good solution is probably to create a new LC_COLLATE definition (which might have to be applied inside the ls function, as someone could call it with LC_ALL=... ls -lA...). I have no idea how to do this.

For the LC_ALL=C / LC_COLLATE=C part of the question: For sorting, there is usually no difference. LC_COLLATE=C changes only the collation order, and it has no effect if LC_ALL is already set to a different value, because LC_ALL overrides every individual LC_* category. LC_ALL=C has higher priority and also switches all the other locale categories (message language, date formats and so on) to C.
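The precedence can be written down as a small shell function. This is a sketch, not something ls itself runs: the name effective_collate is hypothetical, and falling back to C when nothing is set is an assumption (POSIX leaves that default to the implementation).

```shell
# Print the locale name that governs collation:
# LC_ALL wins if set and non-empty, then LC_COLLATE, then LANG.
effective_collate() {
    # "C" as the last fallback is an assumption about the system default.
    printf '%s\n' "${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}"
}

# LC_ALL overrides LC_COLLATE:
LC_ALL=C LC_COLLATE=POSIX effective_collate    # prints C
# With LC_ALL empty, LC_COLLATE is used:
LC_ALL= LC_COLLATE=POSIX effective_collate     # prints POSIX
```

On systems that provide the locale utility, running it with no arguments shows the effective value of every category, which is a quick way to check whether LC_ALL is already set.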
