0

I have several CSV files in a 7z archive; some of them are compressed, others are not. I want to read the CSV files and save their content in a database. However, whenever py7zlib attempts to read the data from a CSV file that is not actually compressed, I get the error "data error during decompression".

import os
import py7zlib
scr = r'Y:\PathtoArchive'
z7file = 'ArchiveName.7z'
with open(os.path.join(scr, z7file), 'rb') as f:
    archive = py7zlib.Archive7z(f)
    names = archive.filenames
    for mem in names:
        obj = archive.getmember(mem)
        print obj.compressed  # prints None for uncompressed data
        try:
            data = obj.read()
        except Exception as er:
            print er  # prints "data error during decompression"
                      # whenever obj.compressed is None

The error happens in

File "C:\Anaconda\lib\site-packages\py7zlib.py", line 608, in read
  data = getattr(self, decoder)(coder, data, level)
File "C:\Anaconda\lib\site-packages\py7zlib.py", line 671, in _read_lzma
  return self._read_from_decompressor(coder, dec, input, level, checkremaining=True, with_cache=True)
File "C:\Anaconda\lib\site-packages\py7zlib.py", line 646, in _read_from_decompressor
  tmp = decompressor.decompress(data)
ValueError: data error during decompression

So, how can I extract uncompressed data from a 7z-Archive?

  • Does the compressed file have a password? Commented May 25, 2016 at 9:19
  • No, none of the files have a password. Commented May 25, 2016 at 9:25

2 Answers

2

Though I couldn't figure out what the underlying problem was, I found a workaround that achieves the ultimate goal of getting the data from the CSV files in the 7z archive. 7-Zip comes with a command-line tool. By calling that tool through the subprocess module, I could automatically extract the files I wished to extract without any problems.

import subprocess
import py7zlib
archiveman = r'c:\Program Files\7-zip\7z'  # 7z.exe comes with 7-zip
archivepath = r'C:\Path\to\archive.7z'
with open(archivepath, 'rb') as f:
    archive = py7zlib.Archive7z(f)
    names = archive.filenames
    for name in names:
        _ = subprocess.check_output([archiveman, 'e', archivepath, '-o{}'.format(r'C:\Destination\of\copy'), name])

The different commands that can be used with 7z are listed in the 7-Zip command-line documentation.
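
For anyone on Python 3, the same workaround could look roughly like the sketch below. It passes all member names to a single 7z call instead of starting one process per file. The helper names (build_extract_command, extract_members) and the SEVEN_ZIP path are my own assumptions for illustration, not part of 7-Zip or py7zlib; only the `e` command and the `-o` and `-y` switches come from the 7z command line.

```python
import subprocess
from pathlib import Path

# Assumed install location of the 7-Zip command-line tool; adjust as needed.
SEVEN_ZIP = r"C:\Program Files\7-Zip\7z.exe"


def build_extract_command(archive_path, dest_dir, members, exe=SEVEN_ZIP):
    """Return the argument list for `7z e` that extracts only `members`."""
    # 'e' extracts without recreating the directory structure,
    # '-o<dir>' sets the output directory (no space after -o),
    # '-y' answers "yes" to prompts such as overwrite confirmations.
    return [exe, "e", str(archive_path), "-o{}".format(dest_dir), "-y", *members]


def extract_members(archive_path, dest_dir, members, exe=SEVEN_ZIP):
    """Run 7z once for all requested members.

    Raises subprocess.CalledProcessError if 7z exits with a non-zero status.
    """
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_extract_command(archive_path, dest_dir, members, exe)
    subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
```

A call such as extract_members(r'C:\Path\to\archive.7z', r'C:\Destination\of\copy', names) would then replace the loop above, with names being the list of files to extract.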


1

You can try another library, py7zr, which also supports 7zip archive compression, decompression, encryption and decryption. https://pypi.org/project/py7zr
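
A minimal sketch of how that might look with py7zr, assuming it is installed (pip install py7zr). The helper names select_csv and extract_csv_members are made up for this example; SevenZipFile, getnames() and extract() are py7zr calls. I have not tested it against the archive from the question, so whether it handles those uncompressed entries is something you would need to check.

```python
import os


def select_csv(names):
    """Keep only member names ending in .csv (case-insensitive)."""
    return [n for n in names if n.lower().endswith(".csv")]


def extract_csv_members(archive_path, dest_dir):
    """Extract every CSV member of a 7z archive into dest_dir.

    Returns the expected paths of the extracted files.
    """
    # Imported here so select_csv() stays usable without py7zr installed.
    import py7zr  # third-party package

    with py7zr.SevenZipFile(archive_path, mode="r") as archive:
        targets = select_csv(archive.getnames())
        if targets:  # skip the call entirely when there is nothing to extract
            archive.extract(path=dest_dir, targets=targets)
    return [os.path.join(dest_dir, name) for name in targets]
```

The extracted files can then be read with the csv module (or whatever you use to load the data into the database).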
