I have a .zip file whose filenames are not being correctly decoded and transcoded into Unicode.

While I do know what the filenames are meant to be, I don't know what encoding the .zip used for them. Is there an option I can pass to unzip to output the filenames as raw bytes (in hex)? Or is there any common programming language whose zip implementation would make this simple, rather than one which always uses its heuristics to try to transcode it for you?

1
  • 2
    The flag -U is meant for stopping the interpretation of UTF-8, but might work for you? Commented Jul 20, 2024 at 23:13

2 Answers

1

I have a program called zipdetails that dumps the full metadata of a zip file. The part that will interest you is that it includes an optional hex dump of every item of metadata, including the filename.

For example, create a zip file with a UTF-8 filename

$ echo abcd >Café
$ zip -X test.zip Café
  adding: Café (stored 0%)

Dump the contents

$ perl zipdetails -v test.zip

0000 0004 50 4B 03 04 LOCAL HEADER #1       04034B50
0004 0001 0A          Extract Zip Spec      0A '1.0'
0005 0001 00          Extract OS            00 'MS-DOS'
0006 0002 00 08       General Purpose Flag  0800
                      [Bit 11]              1 'Language Encoding'
0008 0002 00 00       Compression Method    0000 'Stored'
000A 0004 58 AD F5 58 Last Mod Time         58F5AD58 'Sun Jul 21 22:42:48 2024'
000E 0004 AC A4 8A 58 CRC                   588AA4AC
0012 0004 05 00 00 00 Compressed Length     00000005
0016 0004 05 00 00 00 Uncompressed Length   00000005
001A 0002 05 00       Filename Length       0005
001C 0002 00 00       Extra Length          0000
001E 0005 43 61 66 C3 Filename              'Café'
          A9
...

You can use the --encoding option to tell zipdetails what encoding to use when it displays the filename. Note that this will not change the ASCII hex values shown for the filename.
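
If you would rather stay in a general-purpose language, here is a minimal Python sketch of the same idea. It is not part of zipdetails. It relies on the fact that Python's zipfile module decodes a name as UTF-8 when general purpose flag bit 11 is set and as cp437 otherwise, so re-encoding with the same codec should give back the stored bytes. The helper names raw_zip_names and decode_candidates and the list of candidate encodings are made up for illustration, and the sketch assumes you do not pass metadata_encoding to ZipFile.

```python
import zipfile

UTF8_FLAG = 0x800  # general purpose flag bit 11 ("Language Encoding")


def raw_zip_names(zip_source):
    """Return a list of (raw_name_bytes, utf8_flag_set), one per entry.

    zip_source may be a path or a seekable binary file object.
    """
    results = []
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            # orig_filename holds the decoded name before zipfile tidies it;
            # fall back to filename if the attribute is not available.
            name = getattr(info, "orig_filename", info.filename)
            flagged = bool(info.flag_bits & UTF8_FLAG)
            # Undo zipfile's decoding step to recover the stored bytes.
            raw = name.encode("utf-8" if flagged else "cp437")
            results.append((raw, flagged))
    return results


def decode_candidates(raw, encodings=("utf-8", "cp437", "cp850", "cp1252", "shift_jis")):
    """Decode raw bytes with each candidate encoding (None if it fails)."""
    decoded = {}
    for enc in encodings:
        try:
            decoded[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            decoded[enc] = None
    return decoded


if __name__ == "__main__":
    import sys

    for raw, flagged in raw_zip_names(sys.argv[1]):
        print(raw.hex(" "), "(UTF-8 flag set)" if flagged else "(no UTF-8 flag)")
        for enc, text in decode_candidates(raw).items():
            print(f"    {enc}: {text!r}")
```

Saved under a name of your choice and run with the zip file as its only argument, this prints each filename as hex followed by what it would look like under each guessed encoding, so you can compare against the names you expect.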

1
  • That looks perfect, thanks! Commented Jul 21, 2024 at 23:49
0

This may be a little off topic, but I recently had a similar issue: I created an archive on Windows and then needed to unpack it on a Mac, and all the filenames came out as garbled characters.

There is also a related question about this problem: How can I correctly decompress a ZIP archive of files with Hebrew names?

What helped me personally was this app: https://en.bandisoft.com/bandizip.mac/

It is free to use for 7 days (no payment information needed; I just extracted everything). Nothing else worked, and this app autodetected the encoding.

If you have enough disk space, you could extract your .zip archive with the right encoding, just like I did.
