Note that when using range expressions like [a-z], letters of the other case may be included, depending on the setting of LC_COLLATE.

LC_COLLATE is a variable which determines the collation order used when sorting the results of pathname expansion, and determines the behavior of range expressions, equivalence classes, and collating sequences within pathname expansion and pattern matching.
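To see which setting actually governs collation, it helps to remember the POSIX precedence: LC_ALL overrides LC_COLLATE, which in turn overrides LANG. The following is only a minimal sketch of that lookup; the function name effective_collate is made up for illustration, and the standard `locale` utility reports the same information more reliably.

```shell
# Print the locale name that governs collation, following the
# POSIX precedence LC_ALL > LC_COLLATE > LANG.
# effective_collate is a hypothetical helper, not a standard command.
effective_collate() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_COLLATE:-}" ]; then
        printf '%s\n' "$LC_COLLATE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing set: the implementation-defined default locale applies.
        printf '%s\n' "(implementation default)"
    fi
}

effective_collate
```

If this prints something like en_US.UTF-8, range expressions may follow the interleaved order described below rather than plain ASCII order.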


Consider the following:

$ touch a A b B c C x X y Y z Z
$ ls
a A b B c C x X y Y z Z
$ echo [a-z] # Note the missing uppercase "Z"
a A b B c C x X y Y z
$ echo [A-Z] # Note the missing lowercase "a"
A b B c C x X y Y z Z

Notice that when echo [a-z] is called, the expected output would be only the files with lowercase names. Likewise, with echo [A-Z], only the files with uppercase names would be expected.


Standard collations with locales such as en_US have the following order:

aAbBcC...xXyYzZ 
  • Between a and z (in [a-z]) are ALL uppercase letters, except for Z.
  • Between A and Z (in [A-Z]) are ALL lowercase letters, except for a.

See:

 aAbBcC[...]xXyYzZ
 |              |
 from a to z

 aAbBcC[...]xXyYzZ
  |              |
  from A to Z
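The same picture can be reproduced with a toy model. This is only a sketch: the order string below is a shortened, hand-written stand-in for the interleaved collation order (not read from any real locale), and between is a hypothetical helper that slices it from one endpoint to the other.

```shell
# Shortened stand-in for an interleaved collation order (an assumption
# for illustration; real locale data is far larger).
order=aAbBcCxXyYzZ

# Print every character of $order from $1 through $2, inclusive.
between() {
    rest=${order#*"$1"}    # drop everything up to and including $1
    mid=${rest%"$2"*}      # drop $2 and everything after it
    printf '%s%s%s\n' "$1" "$mid" "$2"
}

between a z    # prints aAbBcCxXyYz  (no Z)
between A Z    # prints AbBcCxXyYzZ  (no a)
```

The slices match what the glob returned earlier: [a-z] picks up every uppercase letter except Z, and [A-Z] every lowercase letter except a.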

If you set the LC_COLLATE variable to C, the ranges behave as expected:

$ export LC_COLLATE=C
$ echo [a-z]
a b c x y z
$ echo [A-Z]
A B C X Y Z
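Exporting the variable changes the rest of the session. If you only need C collation for a single expansion, one option is to confine the change to a subshell. The sketch below assumes a scratch directory created with mktemp (the name demo_dir is arbitrary) and sets LC_ALL rather than LC_COLLATE, because a pre-existing LC_ALL would otherwise take precedence.

```shell
# Create a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/A" "$demo_dir/b" "$demo_dir/B" \
      "$demo_dir/z" "$demo_dir/Z"

# Each $( ... ) runs in a subshell, so the cd and the locale change
# are gone again as soon as the expansion has been captured.
lower=$(cd "$demo_dir" && { LC_ALL=C; export LC_ALL; echo [a-z]; })
upper=$(cd "$demo_dir" && { LC_ALL=C; export LC_ALL; echo [A-Z]; })

printf 'lower: %s\nupper: %s\n' "$lower" "$upper"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The calling shell keeps whatever locale it had before, so sorting and other locale-aware behavior elsewhere in the session are unaffected.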

So, it's not a bug, it's a collation issue.


Instead of range expressions, you can use POSIX-defined character classes, such as [:upper:] or [:lower:]. They also work with different LC_COLLATE configurations and even with accented characters:

$ echo [[:lower:]]
a b c x y z à è é
$ echo [[:upper:]]
A B C X Y Z
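Character classes are not limited to pathname expansion; they work anywhere the shell does pattern matching, for example in a case statement. A minimal sketch, where classify_first_char is a made-up helper name:

```shell
# Report whether the first character of $1 is lowercase, uppercase,
# or something else, using character classes instead of ranges.
# classify_first_char is a hypothetical helper for illustration.
classify_first_char() {
    case $1 in
        [[:lower:]]*) echo lower ;;
        [[:upper:]]*) echo upper ;;
        *)            echo other ;;
    esac
}

classify_first_char apple    # prints lower
classify_first_char Apple    # prints upper
classify_first_char 9lives   # prints other
```

Because the classes are defined by character type rather than by position in the collation order, the result for plain ASCII letters does not change with LC_COLLATE.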

Fixed typo.
Source Link
dhag
Source Link
chaos