113

I am trying to write a script that extracts all the .tar.gz files from the folders in one directory. For example, I have a file called testing.tar.gz. If I do it manually, I click "Extract here" and the .tar.gz file produces a new file called testing.tar. If I then click "Extract here" again, the .tar file gives me all the .pdf files.

How can I do this in Python? Here is my code, but it doesn't really work.

    import os
    import tarfile
    import zipfile

    def extract_file(path, to_directory='.'):
        if path.endswith('.zip'):
            opener, mode = zipfile.ZipFile, 'r'
        elif path.endswith('.tar.gz') or path.endswith('.tgz'):
            opener, mode = tarfile.open, 'r:gz'
        elif path.endswith('.tar.bz2') or path.endswith('.tbz'):
            opener, mode = tarfile.open, 'r:bz2'
        else:
            raise ValueError, "Could not extract `%s` as no appropriate extractor is found" % path

        cwd = os.getcwd()
        os.chdir(to_directory)

        try:
            file = opener(path, mode)
            try: file.extractall()
            finally: file.close()
        finally:
            os.chdir(cwd)
3
  • 2
    Unless there is a point in using Python, it sounds like a job best fit for a shell script. Commented Jun 17, 2015 at 9:48
  • 1
    extractall takes the target directory as a parameter, no need to chdir back and forth Commented Nov 29, 2016 at 18:15
  • If you chdir, a relative path to the compressed file may no longer be valid. Commented Sep 9, 2022 at 10:02
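The comments above note that `extractall` accepts a target directory, which would make the `chdir` calls unnecessary and keep relative archive paths valid. A minimal sketch of that idea follows; the names `extract_to`, `archive_path` and `dest_dir` are hypothetical and do not come from the question, and the destination directory is assumed to exist already.

```python
import tarfile


def extract_to(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir without changing the working directory."""
    # Mode "r:gz" reads a gzip-compressed tar archive.
    with tarfile.open(archive_path, "r:gz") as tar:
        # The target directory is passed directly, so no os.chdir() is needed.
        tar.extractall(path=dest_dir)
```

Because the working directory never changes, a relative `archive_path` keeps pointing at the same file. Note that `extractall` on an untrusted archive can write outside the destination; recent Python versions offer a `filter` argument for that case.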

7 Answers

177

Why do you want to "press" twice to extract a .tar.gz when you can easily do it in one step? Here is simple code that extracts both .tar and .tar.gz files in one go:

    import tarfile

    # fname is the path of the archive to extract (a string)
    if fname.endswith("tar.gz"):
        tar = tarfile.open(fname, "r:gz")
        tar.extractall()
        tar.close()
    elif fname.endswith("tar"):
        tar = tarfile.open(fname, "r:")
        tar.extractall()
        tar.close()
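As a hedged aside, `tarfile.open` can also detect the compression by itself: mode `"r:*"` (equivalent to the default `"r"`) reads plain and compressed tar files transparently, so the extension check may not be needed at all. A minimal sketch, where `untar` and `dest` are hypothetical names:

```python
import tarfile


def untar(fname, dest="."):
    """Extract a tar archive, letting tarfile detect the compression."""
    # "r:*" opens the archive with transparent compression detection.
    with tarfile.open(fname, "r:*") as tar:
        tar.extractall(path=dest)
```

The same call would then handle both `untar("testing.tar")` and `untar("testing.tar.gz")`.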

10 Comments

That's because the file I have is a .tar.gz. To unzip it, I have to extract the .tar.gz to a .tar first, and extracting once more then gives me the content I need, such as the .pdf files.
And your code isn't working: if (fname.endswith("tar.gz")): NameError: name 'fname' is not defined
@Alex fname would be a string that is your filename.
@Alex fname is the string of the filename that you are trying to un-tar.

    files = [f for f in os.listdir('.') if os.path.isfile(f)]
    for fname in files:
        # do something, e.g. the above "if-elif" code.
@Matthew You can use the path parameter in the extractall() command e.g. tar.extractall(path="/new/dir/location"). You can have more control too, e.g. if you need to extract only a few files inside the tar file using extract(). For more control, please take a look at the man page. docs.python.org/3/library/tarfile.html
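Putting the two comments above together, one possible sketch for the original question walks every folder under a directory, opens each .tar.gz, and extracts only the .pdf members next to the archive they came from. The function name `extract_pdfs_under` and the choice to filter on `.pdf` are assumptions made for illustration.

```python
import os
import tarfile


def extract_pdfs_under(root):
    """Extract only the .pdf members of every .tar.gz/.tgz found under root."""
    extracted = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".tar.gz", ".tgz")):
                continue
            archive = os.path.join(dirpath, name)
            with tarfile.open(archive, "r:gz") as tar:
                # Keep regular files whose name ends in .pdf.
                pdfs = [m for m in tar.getmembers()
                        if m.isfile() and m.name.lower().endswith(".pdf")]
                # Extract the selected members next to the archive.
                tar.extractall(path=dirpath, members=pdfs)
                extracted.extend(os.path.join(dirpath, m.name) for m in pdfs)
    return extracted
```

Passing `members=` to `extractall` restricts extraction to the listed entries; dropping that argument would extract everything in each archive.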
64

If you are using Python 3, you can use shutil.unpack_archive, which works for most common archive formats.

shutil.unpack_archive(filename[, extract_dir[, format]])

Unpack an archive. filename is the full path of the archive. extract_dir is the name of the target directory where the archive is unpacked. If not provided, the current working directory is used.

For example:

    import shutil

    def extract_all(archives, extract_path):
        for filename in archives:
            shutil.unpack_archive(filename, extract_path)

4 Comments

Is there any way to control the name of the extracted file?
when the user has no root permission, tarfile cannot run, but shutil can.
Finding the one line of python code that does what I need with minimum fuss sparks joy - thanks! I predict python will be the last programming language.
@suraj-subramanian, the extract path will contain the new name. For example, if filename was "hello.tar.gz", extract_path might be "/tmp/my_name_here"
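To illustrate the last comment, here is a minimal sketch that unpacks each archive into its own directory named after the archive, so "hello.tar.gz" would land in "<parent_dir>/hello". The function name `unpack_into_named_dirs`, the `parent_dir` parameter and the suffix list are assumptions for this example.

```python
import os
import shutil


def unpack_into_named_dirs(archives, parent_dir):
    """Unpack each archive into parent_dir/<archive name without its suffix>."""
    targets = []
    for filename in archives:
        base = os.path.basename(filename)
        # Strip a known archive suffix, e.g. "hello.tar.gz" -> "hello".
        for suffix in (".tar.gz", ".tgz", ".tar.bz2", ".tar", ".zip"):
            if base.endswith(suffix):
                base = base[: -len(suffix)]
                break
        target = os.path.join(parent_dir, base)
        os.makedirs(target, exist_ok=True)
        # unpack_archive picks the format from the file extension.
        shutil.unpack_archive(filename, target)
        targets.append(target)
    return targets
```

The name of the top-level directory is controlled this way; the names of the files inside the archive are still whatever the archive stores.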
9

Using context manager:

    import os
    import tarfile

    # ... other code ...

    with tarfile.open(os.path.join(os.environ['BACKUP_DIR'], f'Backup_{self.batch_id}.tar.gz'), "r:gz") as so:
        so.extractall(path=os.environ['BACKUP_DIR'])
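The snippet above depends on a surrounding class and on the BACKUP_DIR environment variable. A standalone sketch of the same context-manager pattern might look like this; `restore_backup`, `backup_dir` and `batch_id` are hypothetical names that take the place of the environment variable and `self.batch_id`.

```python
import os
import tarfile


def restore_backup(backup_dir, batch_id):
    """Extract Backup_<batch_id>.tar.gz inside backup_dir and return the member names."""
    archive = os.path.join(backup_dir, f"Backup_{batch_id}.tar.gz")
    # The with-block closes the archive even if extractall() raises.
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(path=backup_dir)
    return names
```

Compared with calling `tar.close()` by hand, the `with` form cannot leave the file open when extraction fails partway.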

Comments

5

If you are using Python in a Jupyter notebook on a Linux machine, the following will do:

!tar -xvzf /path/to/file.tar.gz -C /path/to/save_directory 

The leading ! runs the command in the system shell.
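Outside a notebook, roughly the same command could be run from plain Python with `subprocess`. This is only a sketch under the assumption that a `tar` executable is on the PATH; `build_tar_command` and `extract_with_tar` are hypothetical helper names, and the `-v` (verbose) flag is left out.

```python
import subprocess


def build_tar_command(archive, dest_dir):
    """Return the argument list for: tar -xzf <archive> -C <dest_dir>."""
    return ["tar", "-xzf", archive, "-C", dest_dir]


def extract_with_tar(archive, dest_dir):
    """Run tar and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_tar_command(archive, dest_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.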

Comments

1

The following worked for me for a .tar.gz file. It will extract files in your specified destination:

    import tarfile
    from os import mkdir
    from os.path import isdir

    src_path = 'path/to/my/source_file.tar.gz'
    dst_path = 'path/to/my/destination'

    # create destination dir if it does not exist
    if not isdir(dst_path):
        mkdir(dst_path)

    if src_path.endswith('tar.gz'):
        tar = tarfile.open(src_path, 'r:gz')
        tar.extractall(dst_path)
        tar.close()
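One caveat with `mkdir` is that it fails when a parent directory is missing, which can happen with a nested destination such as the one above. A possible variant uses `os.makedirs` with `exist_ok=True`, which creates any missing parents and does nothing if the directory already exists. The function name `extract_creating_dirs` is hypothetical.

```python
import os
import tarfile


def extract_creating_dirs(src_path, dst_path):
    """Extract src_path into dst_path, creating dst_path and its parents first."""
    # exist_ok=True makes this a no-op when the directory already exists.
    os.makedirs(dst_path, exist_ok=True)
    with tarfile.open(src_path, "r:gz") as tar:
        tar.extractall(dst_path)
```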

Comments

0

You can execute a shell script from Python using envoy:

    import envoy  # pip install envoy

    if (file.endswith("tar.gz")):
        envoy.run("tar xzf %s -C %s" % (file, to_directory))
    elif (file.endswith("tar")):
        envoy.run("tar xf %s -C %s" % (file, to_directory))
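Formatting paths straight into a command string can break when a path contains spaces or shell metacharacters. A hedged sketch of one way to guard against that uses the standard library's `shlex.quote`; the helper name `tar_command_string` is hypothetical, and whether envoy parses the resulting string exactly as a POSIX shell would is an assumption.

```python
import shlex


def tar_command_string(archive, to_directory):
    """Build a tar command line with both paths quoted for a POSIX shell."""
    # Use the gzip flag only for gzip-compressed archives.
    flags = "xzf" if archive.endswith((".tar.gz", ".tgz")) else "xf"
    return "tar %s %s -C %s" % (flags, shlex.quote(archive), shlex.quote(to_directory))
```

The returned string could then be handed to whichever runner is in use in place of the unquoted format string.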

Comments

-3

When I ran your program, it worked perfectly for a .tar.gz and a .tgz file. It didn't give me the correct items when I opened the .zip, and .tbz was the only format that raised an error. I think you used the wrong method to unpack a .tbz, because the error said I had an incorrect file type, but I didn't. One way you could solve the .zip issue is to use os.system() and unzip it with a command-line tool (depending on your OS), because the extraction returned a __MACOSX folder with nothing inside of it even though I entered the path correctly. The only other error I encountered was that you used improper syntax for raising an error.
This is what you should have used:

raise ValueError("Error message here") 

You used a comma and no parentheses. Hope this helps!
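For context, the comma form is Python 2 syntax and is rejected as a SyntaxError by Python 3. A minimal sketch of the extension check from the question with the Python 3 form of `raise` follows; the function name `pick_opener` is hypothetical and the message wording is adapted from the question.

```python
import tarfile
import zipfile


def pick_opener(path):
    """Return (opener, mode) for a supported archive extension."""
    if path.endswith(".zip"):
        return zipfile.ZipFile, "r"
    if path.endswith((".tar.gz", ".tgz")):
        return tarfile.open, "r:gz"
    if path.endswith((".tar.bz2", ".tbz")):
        return tarfile.open, "r:bz2"
    # Python 3 form: the message is passed as an argument to the exception.
    raise ValueError(f"Could not extract {path!r} as no appropriate extractor is found")
```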

Comments
