How to unzip a piped zip file (from "wget -qO-")?

Question

Any ideas on how to unzip a piped zip file like this:

wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip

I want to unzip the file to a directory, as we would with a normal file:

wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip | unzip -d ~/Desktop

While the question is valid, if you are using Git to work with WordPress, there is now a Git mirror of each plugin. Ignore my comment if it's not your case :) Otherwise, save yourself the trouble of figuring out how to use such a path to automate your installation and use Git submodules or Composer with github.com/wp-plugins.
– renoirb, Commented Dec 17, 2014 at 18:07
zip requires random access to work. It cannot read incrementally from a pipe, which is why the zsh-based answer creates a temporary file rather than trying to work as a pipe.
– Charles Duffy, Commented May 13, 2022 at 19:52
Usually you only want to write a successful response to stdout. See also: write http error body to stderr
– milahu, Commented Mar 9, 2024 at 19:31

ruario · Accepted Answer · 2014-04-16 11:41:03Z

The ZIP file format includes a directory (index) at the end of the archive. This directory records where each file is located within the archive and thus allows quick random access without reading the entire archive.

This would appear to pose a problem when reading a ZIP archive through a pipe: the index is not reached until the very end, so individual members cannot be correctly extracted until the whole stream has been read, by which point the earlier data is no longer available. It is therefore unsurprising that most ZIP decompressors simply fail when the archive is supplied through a pipe.

However, the directory at the end of the archive is not the only place where file metadata is stored. Each entry also carries this information in a local file header, for redundancy.

Although not every ZIP decompressor will use local file headers when the index is unavailable, the tar and cpio front ends to libarchive (a.k.a. bsdtar and bsdcpio) can and will do so when reading through a pipe, meaning that the following is possible:

wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip | bsdtar -xvf- -C ~/Desktop
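The layout described in this answer can be checked with a short Python sketch. It is not part of the original answer: it builds a tiny two-member archive in memory with the standard-library zipfile module (the member names a.txt and b.txt are made up) and looks at the raw bytes. It assumes an archive without a trailing comment, so that the end-of-central-directory record is exactly the last 22 bytes.

```python
import io
import struct
import zipfile

# Build a small archive in memory; a.txt and b.txt are illustrative names.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "alpha")
    zf.writestr("b.txt", "beta")
raw = buf.getvalue()

# The archive begins with the first member's local file header.
starts_with_local_header = raw[:4] == b"PK\x03\x04"

# With no archive comment, the end-of-central-directory record is the last
# 22 bytes. It records how many entries exist and where the index starts.
(eocd_sig, _disk, _cd_disk, _entries_here, total_entries,
 cd_size, cd_offset, _comment_len) = struct.unpack("<4s4H2IH", raw[-22:])

# The index (central directory) sits between the file data and that record.
index_signature = raw[cd_offset:cd_offset + 4]

print(starts_with_local_header)
print(eocd_sig, total_entries)
print(index_signature, cd_offset + cd_size == len(raw) - 22)
```

A reader that cannot seek only ever sees the local file headers in order, which is the information bsdtar falls back on when reading from a pipe.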

I have a .zip file here that contains files with executable permissions. When I download it and pipe it into bsdtar, the exec bits get thrown away. When I download it to disk and then extract it with bsdtar or unzip, the exec bits are honoured.
What is the rationale behind including a directory (index) at the end of the archive? Where can I read about that?
@pmor Look up the history of the ZIP filetype. It's because when creating a zip file, you may not know until the end where all the files have come from. Going back to insert a header at the start of a file you've already written is a challenge I suspect Phil Katz may have preferred to avoid.
@pmor It allows you to add, remove, or view individual files easily. Zip files are never solid archives like tar.* or the default options of RAR and 7Z: each file is compressed separately, and you can extract only the single file you need.

Saftever · Accepted Answer · 2018-10-11 12:07:59Z

28

BusyBox's unzip can take stdin and extract all the files.

wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip | busybox unzip -

The dash after unzip is to use stdin as input.

You can even do:

cat file.zip | busybox unzip -

But that is just a redundant way of writing unzip file.zip.

If your distro uses BusyBox by default (e.g. Alpine), just run unzip -.

answered Oct 11, 2018 at 12:07 by Saftever

4 Comments

mrts Over a year ago

Busybox 1.22.0 fails with Archive: - unzip: lseek: Illegal seek in Debian. What version of Busybox did you use?

Saftever Over a year ago

v1.27.2 on Ubuntu 18.10

ian Over a year ago

This didn't work for me on Alpine 3.10 (via Docker). (Not ragging on you, I think it's a useful answer and that comments about working/non-working versions are also helpful)

pts Over a year ago

unzip in some versions of BusyBox (e.g. 1.27.2) doesn't support Zip64, thus it works only for member files smaller than 4 GiB.

dungeon_master · Accepted Answer · 2018-06-26 03:06:59Z

19

Just use zcat:

wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip | zcat >> myfile.txt

This will only extract the first file. After the first file is extracted, you will see the error message "gzip: stdin has more than one entry--rest ignored".

edited Jun 26, 2018 at 3:06 by dungeon_master

answered Sep 14, 2017 at 9:58 by lanzalibre

4 Comments

SepGuest Over a year ago

This is an O <-- Remember

mhogerheijde Over a year ago

This was what I was looking for. Some files I curl now and again are just single files zipped (don't know why, they're not particularly large) and I don't have control over them being in this format. Using zcat was the solution for me here!

APaul Over a year ago

Annoying gotcha - this only works with GNU zcat/gzip, NOT BSD gzip

Salem Over a year ago

zcat works perfectly.

SebMa · Accepted Answer · 2021-12-05 12:07:10Z

14

While the following will not work in bash, it will work in zsh. Since many zsh users may end up here, it may still be useful:

% unzip =( wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip )
Archive:  /tmp/zshLCod6x
   creating: akismet/
  inflating: akismet/admin.php
  inflating: akismet/akismet.css
  inflating: akismet/akismet.gif
  inflating: akismet/akismet.js
  inflating: akismet/akismet.php
  inflating: akismet/legacy.php
  inflating: akismet/readme.txt
  inflating: akismet/widget.php
%

As you can see, the temporary downloaded zip file is deleted straight away:

% ls /tmp/zshLCod6x
ls: cannot access '/tmp/zshLCod6x': No such file or directory
%

edited Dec 5, 2021 at 12:07 by SebMa

answered Nov 14, 2013 at 22:14 by Ian Robertson

4 Comments

Raúl Salinas-Monteagudo Over a year ago

Note that this will anyway download the full file before running unzip, which is not the original question.

Ian Robertson Over a year ago

True. Unfortunately, the zip file format puts its "central directory" at the end of the file, and the unzipping algorithm first reads that directory before processing the files. Hence, a true piping solution that correctly unzips isn't really a possibility. (This is also a problem for web applications that want to process large uploaded zip files - it cannot be done in a streaming fashion.)

Raúl Salinas-Monteagudo Over a year ago

While it is true that there is an index at the end of the file, containing "authoritative" information on which files have been deleted from the archive (without the need to regenerate it at each deletion), I can successfully extract a simple ZIP in a pipelined way with bsdtar, because there are headers indeed preceding each file. bsdtar would probably give bad results in case the archive has been modified ("phantom" files would appear, since it is not known till the end of the archive which ones are the latest version).

APaul Over a year ago

Very neat. I had never seen that form of process substitution in zsh before: zsh.sourceforge.io/Intro/intro_7.html

Leon · Accepted Answer · 2019-11-01 02:01:37Z

11

wget -q -O tmp.zip http://downloads.wordpress.org/plugin/akismet.2.5.3.zip && unzip tmp.zip && rm tmp.zip
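The same download-then-extract idea can be sketched in Python for cases where the archive arrives on a pipe. This is an illustration, not part of the original answer: extract_zip_stream is a hypothetical helper name, and the demo feeds it an in-memory archive instead of a real download. The stream is spooled to an anonymous temporary file because zipfile needs a seekable file to read the index.

```python
import io
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_zip_stream(stream, dest_dir):
    """Spool a non-seekable byte stream to a temp file, then extract it.

    Returns the list of member names found in the archive.
    """
    # TemporaryFile is deleted automatically when the with-block exits.
    with tempfile.TemporaryFile() as spool:
        shutil.copyfileobj(stream, spool)   # the whole archive must arrive first
        spool.seek(0)
        with zipfile.ZipFile(spool) as zf:  # needs a seekable file for the index
            zf.extractall(dest_dir)
            return zf.namelist()


# Demo: an in-memory archive stands in for the downloaded data.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("akismet/readme.txt", "demo")
archive.seek(0)

with tempfile.TemporaryDirectory() as dest:
    names = extract_zip_stream(archive, dest)
    extracted = (Path(dest) / "akismet" / "readme.txt").read_text()

print(names, extracted)
```

In a real pipeline the stream would be sys.stdin.buffer, with the script placed at the end of the wget -qO- pipe. Like the shell version, this still waits for the full download before extracting anything.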

edited Nov 1, 2019 at 2:01

answered Aug 22, 2011 at 3:22 by Leon

4 Comments

Roger Over a year ago

Using && is better, since the next command only starts if the previous one finished successfully. Thanks

Raúl Salinas-Monteagudo Over a year ago

This does not extract the zip in a piped manner. With your proposal you need more disk space, and you wear the disk out (important on an SSD if the files are big). It is also more efficient to run the download and the extraction in parallel.

MarSoft Over a year ago

Also, -qO- -O tmp.zip is redundant: you pass -O - and then -O tmp.zip, which is pointless here.

Eugene Dounar Over a year ago

The question specifically asks for unzip from pipe. This answer uses temporary files instead, which may not work on read-only filesystems or other specific use-cases

Sam Cantrell · Accepted Answer · 2011-08-20 14:59:08Z

5

I'd take a look at funzip (http://www.info-zip.org/mans/funzip.html). The man page for it notes,

...filter for extracting from a ZIP archive in a pipe

Sorry I don't have an example, but it looks like it does come with the Linux unzip utility.

answered Aug 20, 2011 at 14:59 by Sam Cantrell

1 Comment

Raúl Salinas-Monteagudo Over a year ago

It only dumps the FIRST FILE. funzip without a file argument acts as a filter; that is, it assumes that a ZIP archive (or a gzip'd(1) file) is being piped into standard input, and it extracts the first member from the archive to stdout.
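What extracting only the first member from a pipe involves can be sketched in Python. This illustrates the idea under stated assumptions and is not funzip's actual implementation: read_first_member is a hypothetical helper, it handles only the stored and deflate methods, ignores encryption and Zip64, and does not verify the CRC. It reads strictly forward and never seeks, so it would also work on a pipe.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte part of a ZIP local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4s5H3I2H")


def read_first_member(stream):
    """Return (name, data) for the first archive member, reading forward only."""
    header = stream.read(LOCAL_HEADER.size)
    if len(header) < LOCAL_HEADER.size:
        raise ValueError("stream too short for a local file header")
    (sig, _version, flags, method, _time, _date, _crc,
     comp_size, _size, name_len, extra_len) = LOCAL_HEADER.unpack(header)
    if sig != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    # Flag bit 11 marks UTF-8 names; otherwise the legacy encoding is CP437.
    name = stream.read(name_len).decode("utf-8" if flags & 0x800 else "cp437")
    stream.read(extra_len)  # skip the extra field
    if method == 0:  # stored: the length must come from the header
        if flags & 0x08:
            raise ValueError("stored member with unknown size")
        return name, stream.read(comp_size)
    if method == 8:  # deflate: the compressed stream marks its own end
        inflater = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
        parts = []
        while not inflater.eof:
            chunk = stream.read(4096)
            if not chunk:
                raise ValueError("truncated deflate stream")
            parts.append(inflater.decompress(chunk))
        return name, b"".join(parts)
    raise ValueError(f"unsupported compression method {method}")


# Demo: a two-member archive built in memory; only the first member is read.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("first.txt", "hello\n" * 100)
    zf.writestr("second.txt", "never reached")
name, data = read_first_member(io.BytesIO(buf.getvalue()))
print(name, len(data))
```

Everything after the first member, including the index at the end, is simply ignored here, which matches the first-file-only behaviour described for funzip.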

pts · Accepted Answer · 2021-01-28 14:00:59Z

Reposting my answer:

I wrote a Python (2.x) script to do streaming extraction of ZIP archives, you can get it from here: https://raw.githubusercontent.com/pts/unzip_scan/master/unzip_scan.py . Usage: cat file.zip | sh unzip_scan.py -.

adamency · Accepted Answer · 2024-04-15 23:41:19Z

Another solution: if you already have unzip, you should also have funzip, which comes from the same [unzip] package.

This utility is made for reading from pipes/stdin. However it seems very primitive and can apparently extract only the first file from the archive (as per the manpage).

Anyway, for completeness in the context of the question, here is a way I found to do it with funzip for a single-file .zip archive:

curl -sL https://<url_to_my_archive>.zip | funzip - > <my_extracted_file>

Replace <url_to_my_archive> and <my_extracted_file> with your values.

because to extract other files you have to wait for the directory entry at the end. Just avoid piping zip files to unzip. And funzip was already answered previously
Where do you see anything piping zip files to unzip? Secondly, no: an actual funzip usage example was never provided in this thread; it was simply mentioned.
