
I have this file.

If I open it in Total Commander with F3 and press S, the proper content is shown.

I tried to do the same thing in bash with iconv:

iconv -f ASCII -t UTF8 input.txt

but I got this:

iconv: illegal input sequence at position 0

If I convert from CP850 or CP852 instead:

iconv -f CP850 -t UTF8 input.txt

iconv -f CP852 -t UTF8 input.txt

I get some unwanted characters in the output:

̦ŮŢŮ

How can I get the correct content in a Linux terminal as well? What encoding does Total Commander use when it shows ASCII (DOS charset)? Or is this a bug in iconv?

  • It's not ASCII, so you can't convert from ASCII to anything else. Commented Apr 21, 2022 at 10:10
  • Are you sure that the file is ASCII? Note: ASCII has only 128 characters. Could you provide a copy in the question (e.g. copied from TC)? Commented Apr 21, 2022 at 10:11
  • Isn't the file already UTF-8? Commented Apr 21, 2022 at 10:19
  • Okay, so what do we call the charset from which the old Norton Commander's panels were constructed? Like █▓▒▒ ▀▌▄▀ ? I'm sure that the file is not in UTF-8 :) Commented Apr 21, 2022 at 10:30

2 Answers

It's not ASCII, so you can't convert the file from ASCII to anything else. After some investigation, the CP437 encoding appears to give a "good" visual representation. For future reference, here's how I determined this.

    # Workspace
    mkdir picture
    cd picture

    # Get the file
    curl http://tiborzsitva.szm.com/ascii/input.txt >x
    file x
    x: ISO-8859 text, with CRLF line terminators

    # Try and convert with every possible conversion
    for e in $(iconv -l | awk '{print $1}')
    do
        iconv -f "$e" -t utf8 <x >"x.$e" 2>"x.$e.error"
    done

    # Delete the failed conversion attempts (those with error reports)
    for f in x.*
    do
        [ -s "$f.error" ] && rm -f "$f"
        rm -f "$f.error"
    done

    # Link identical files together
    for f in x.*
    do
        c=$(cksum <"$f")
        cf="x.cksum.${c// /_}"
        [ -f "$cf" ] && ln -f "$cf" "$f" || ln -f "$f" "$cf"
    done
    rm -f x.cksum.*

    # See what each one looks like
    ls -l x.*
    less x.*

    # The first one (437) looks good so look for a nice encoding name
    iconv -l | grep -w 437
    437 CP437 IBM437 CSPC8CODEPAGE437

I would suggest that CP437 will do nicely.
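Once the encoding is known, the actual conversion is a one-liner. The following is a minimal sketch, assuming an iconv that knows CP437; `dos_to_utf8` is a hypothetical helper name, and stripping the carriage returns is an optional extra suggested by the "CRLF line terminators" that `file` reported.

```shell
#!/bin/sh
# Sketch: convert a CP437 text file to UTF-8 and drop the DOS carriage returns.
# dos_to_utf8 is a hypothetical helper name, not a standard utility.
dos_to_utf8() {
    # $1 = path to the CP437-encoded input file
    iconv -f CP437 -t UTF-8 <"$1" | tr -d '\r'
}

# Usage (input.txt is the file from the question):
#   dos_to_utf8 input.txt > input.utf8.txt
```

Dropping `tr -d '\r'` keeps the original line endings if something downstream still expects them.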

  • This is exactly what I was looking for... :) thx. But the bad thing here is that I had never heard of CP437 :( CP852 was always "enough" for me. Commented Apr 21, 2022 at 11:20
  • @user3719454 read man iconv and you'll find the option to list available character sets for conversion with iconv -l. After that it's pretty much down to visual inspection Commented Apr 21, 2022 at 11:43

ASCII is a 7-bit encoding, and your file starts with a bunch of bytes 0xdb, an 8-bit value.
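A quick way to check this for any file is to delete every 7-bit byte and see whether anything is left. This is only a sketch: `has_non_ascii` is a hypothetical helper name, and the sample file is made up for the demonstration (it just starts with two 0xdb bytes, like the file in the question).

```shell
#!/bin/sh
# Sketch: report whether a file contains any byte outside the 7-bit ASCII range.
# has_non_ascii is a hypothetical helper name, not a standard utility.
has_non_ascii() {
    # LC_ALL=C makes tr work on raw bytes; delete 0x00-0x7F and count what remains.
    n=$(LC_ALL=C tr -d '\000-\177' <"$1" | wc -c)
    [ "$n" -gt 0 ]
}

# Demo on a throwaway sample: two 0xDB bytes (octal 333) followed by ASCII text.
sample=$(mktemp)
printf '\333\333 hello\n' >"$sample"

if has_non_ascii "$sample"; then
    echo "not pure ASCII"
else
    echo "pure ASCII"
fi

# Dump the first four bytes in hex to see the offending values.
od -An -tx1 -N4 "$sample"
rm -f "$sample"
```

If the helper reports non-ASCII bytes, `iconv -f ASCII` is bound to fail at the first of them, which is what "illegal input sequence at position 0" means here.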

If it's (partly) graphical, it's probably one of the 8-bit DOS codepages. I tried with CP850 and CP437, and the latter seems to give a sensible picture.

Makes sense, since CP437 is the original IBM PC code page and CP850 the Latin-1 one. The former has more drawing characters, like the combined single/double lines, and vertically halved boxes, both of which are replaced with some accented letters in CP850.
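The difference can be seen on individual bytes. Below is a small sketch that decodes the same raw bytes under both code pages; `decode_as` is a hypothetical helper, and it assumes an iconv that knows both CP437 and CP850. The three bytes were picked as examples of a combined single/double line (0xB5) and the two vertical half blocks (0xDD, 0xDE).

```shell
#!/bin/sh
# Sketch: decode the same raw bytes under two DOS code pages and compare.
# decode_as is a hypothetical helper name, not a standard utility.
decode_as() {
    # $1 = source code page, $2 = printf-style octal escapes for the raw bytes
    printf "$2" | iconv -f "$1" -t UTF-8
}

# 0xB5, 0xDD, 0xDE written as octal escapes
bytes='\265\335\336'
printf 'CP437: %s\n' "$(decode_as CP437 "$bytes")"   # prints: CP437: ╡▌▐
printf 'CP850: %s\n' "$(decode_as CP850 "$bytes")"   # prints: CP850: Á¦Ì
```

The full block (0xDB) and the horizontal half blocks (0xDC, 0xDF) are the same in both code pages, which is why a CP850 conversion of such a picture looks only partly wrong.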

$ $ iconv -f cp437 -t utf8 < input.txt | head -10 █████████████████████████████████▀▀▀▀▀▀▀▀██▀▀▀▀▀▀▀▀████████████████████████████ ██████████████████████▀▀▀▀ ▄▄▄▄ ▄█▓▓▓▓█▌ ▄█▓▓▓▓█▄▄ ▀█████████████████████████ ███████████████▀▀ ▄▄▄▄▄▓█▓▓▓▒▒▐▌▐▓▓▒▒▒▒▓█▌▐█▓▒▒▒▒▒▒▀█ ▄▄▄▄ ▀██████████████████ ██████▀▀▀▀▀▀ ▄▄▄▀█▓██▓█▓▒▒▒▒░░░█ █▒░░░░▒▌░▓█▄░░░░░░▄█ █▓▒▒▀█▄ ▀▀▀██████████████ ██▀ ▄▄▄▓▒▄ █▓▓██▌▐▒█▓▒██░░░░░░▄█░▀█▄▄▄▄▀░ ░ ▀▀██▒▓▓█▌▐▌▒▒░░▓█▌▐█▄▄▄▄▄ ▀▀███████ ██ ███████ █▒▒▓▀▄▐░▓▒░█▀▄▄▄▀▀▀▀ ▀▀▀▀▀ ▐▓▓▓▒░▓█ █▓▒░░░▒▒▓▄ ▀█████ ██ ▓▓▓████▌▐░░▄▀▄ ▄▄▀▀ ░░ ░░░░ ▀▀▀█▒█ ▐█▄░░░░░░▒▓█ ▄▄ ██ ██ ▒▒▒▓████ ▓▄▀▀ ▀ ░ ░ ░░█▓▄▌ ▄░░ ░░██░░████░░ ░ ▀▀██▄▄▒▓█▌▐█▀ ██ ██ ░░░▒▓█▀ ░▒░ ░░░▒▓▒▓█ ▐▓▒░░▒▒▓█░░▓▓█▓░░█▓█▓ ▐▓░ ▀▀▀█▓ ██░ ██ ██ ▄▄▀▀ ▄▄▄█▓░ ░▒▓░ ░▒▓▒▓▒░▒▓ ▓▒▓▒░▓▓▒▓▒▒▒▒▓▒▒▒▓▒▓▒▌ ▀▄▄█▓███ ▀█ 

(Well, it doesn't seem to look that good here, on SE, but you get the idea.)

  • @roaima, nah, I should be saying that; you were faster. Commented Apr 21, 2022 at 10:46
  • @roaima, the creature lower down is waay more interesting than the stone arch at the top, but I don't want to copy more than that without permission. I think this far I can claim fair use for demonstration etc... Commented Apr 21, 2022 at 10:53
  • @ilkkachu thx a lot Commented Apr 21, 2022 at 11:08
