9

initramfs archives on Linux can consist of a series of concatenated, gzipped cpio files.

Given such an archive, how can one extract all the embedded archives, as opposed to only the first one?

The following is an example of a pattern which, while it appears to have potential to work, extracts only the first archive:

while gunzip -c | cpio -i; do :; done <input.cgz 

I've also tried the skipcpio helper from dracut to move the file pointer past the first cpio image, but the following results in a corrupt stream (not at the correct point in the input) being sent to cpio:

# this isn't ideal -- presumably would need to rerun with an extra skipcpio in the pipeline
# ...until all files in the archive have been reached.
gunzip -c <input.cgz | skipcpio /dev/stdin | cpio -i
2
  • Perhaps you misunderstood or the verbiage is just misleading, but the CPIO archives are not each compressed. The overall set of concatenated archives is altogether gzip'd. Commented Sep 29, 2023 at 17:06
  • @sherrellbc, ...in hindsight (and as the answer I added to this question indicates I've learned since asking), that's a distinction without a difference. gunzip handles one stream and a series of streams concatenated together the exact same way. Commented Sep 29, 2023 at 17:16

5 Answers

8

gunzip needs to be run only once (consuming all input), whereas cpio should be run once per embedded archive, like so:

gunzip -c <input.cgz | while cpio -i; do :; done 
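The reason gunzip only needs to run once is that it decodes back-to-back gzip members as a single output stream. A minimal sketch that shows this with two tiny members (the temporary directory and the `joined.gz` file name are made up for the demo, and nothing here touches a real initramfs):

```shell
# Build a file out of two independently gzipped members.
tmp=$(mktemp -d)
printf 'first\n'  | gzip -c  > "$tmp/joined.gz"
printf 'second\n' | gzip -c >> "$tmp/joined.gz"

# A single gunzip invocation decodes both members back to back.
joined=$(gunzip -c < "$tmp/joined.gz")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Because the decompressed output is one continuous stream, only the cpio side of the pipeline has to be restarted for each embedded archive.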
8
/usr/lib/dracut/skipcpio $your-initrd-img | zcat | cpio -id --no-absolute-filenames

or else

/usr/lib/dracut/skipcpio $your-img | gunzip -c | cpio -id

(FreeBSD's cpio has no --no-absolute-filenames option)

The small skipcpio program is part of the dracut package, but you can also download the source (skipcpio.c) and compile it yourself, even on FreeBSD.

You need this when extracting dracut-created initrd images, at least on Red Hat-based distros such as Fedora. dracut places a file called "early_cpio" into the image, so extracting the initramfs the usual way will not work.

3
  • 1
    Ahh! You'll note that I mention skipcpio in my question -- it's one of the things I tried before asking -- but I assumed that it expected a cpio stream as input (as the name would imply), not a gzip stream. That said, I think the answer I already have is a better fit for the question ("[H]ow can one extract all the embedded archives[...]?"), since it extracts all cpio archives, as opposed to only the second one in the stream. Commented Apr 27, 2016 at 13:52
  • Of course, of course. Mine only relates to dracut-created archives, I even suspect, to RedHat-derived distros... Well, dracut IS powered by RH, is it not?? I've just put it here for further usage by those who might need this "weird" kind of thing. Like myself, when I needed to add modules to an existing initramfs to make a system boot in the first place, so running dracut to create one was out of question... Neither was it possible to chroot into that system from a Live CD, because I don't know a ZFS-aware Live CD distribution, and my Linux installation is ZFS-based. Commented Apr 28, 2016 at 10:54
  • How to get the early_cpio file? Commented Sep 9, 2023 at 22:50
8

You can do this manually with dd skip=. On my Ubuntu 20.04, I can look at the first part (offset 0 blocks) with

# dd if=/boot/initrd.img-5.4.0-45-generic skip=0 | file -
/dev/stdin: ASCII cpio archive (SVR4 with no CRC)

and then see the contents

# dd if=/boot/initrd.img-5.4.0-45-generic skip=0 | cpio -it
.
kernel
kernel/x86
kernel/x86/microcode
kernel/x86/microcode/AuthenticAMD.bin
62 blocks

The second part is 62 blocks farther

# dd if=/boot/initrd.img-5.4.0-45-generic skip=62 | file -
/dev/stdin: ASCII cpio archive (SVR4 with no CRC)

and again just a simple cpio archive, but larger this time

# dd if=/boot/initrd.img-5.4.0-45-generic skip=62 | cpio -it
kernel
kernel/x86
kernel/x86/microcode
kernel/x86/microcode/.enuineIntel.align.0123456789abc
kernel/x86/microcode/GenuineIntel.bin
5868 blocks

Now skip 5868 + 62 blocks into the initramfs

# dd if=/boot/initrd.img-5.4.0-45-generic skip=5930 | file -
/dev/stdin: LZ4 compressed data (v0.1-v0.9)

This time it is a compressed stream, so

# dd if=/boot/initrd.img-5.4.0-45-generic skip=5930 | lz4cat | file -
/dev/stdin: ASCII cpio archive (SVR4 with no CRC)

And again we found the next (and final) cpio archive

# dd if=/boot/initrd.img-5.4.0-45-generic skip=5930 | lz4cat | cpio -it
... lots of output
usr/share/plymouth/themes/spinner/watermark.png
usr/share/plymouth/ubuntu-logo.png
var
var/cache
var/cache/fontconfig
var/cache/fontconfig/383ee5b3-5437-4bdc-87f6-cf314658a7c0-le64.cache-7
var/cache/fontconfig/575cffd4-ae01-4067-914f-7545fe566c1b-le64.cache-7
var/cache/fontconfig/CACHEDIR.TAG
var/cache/fontconfig/c467a813-186f-476e-880a-3770402989a9-le64.cache-7
var/cache/fontconfig/d912fc4e-f5b6-456d-a86d-e4c3ccbbefe9-le64.cache-7
var/lib
var/lib/dhcp
450460 blocks
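The `file -` checks in the steps above come down to looking at the first few bytes of each part. If `file` is not at hand, a rough sketch of the same idea with `od` could look like this. The function name `sniff_format` is made up here, and the list of magic numbers is deliberately short (newc cpio, gzip, zstd, xz, lz4), so treat it as an illustration rather than a replacement for `file`:

```shell
# Hypothetical helper: guess the format of the data on stdin from its magic bytes.
sniff_format() {
    # First 6 bytes as lowercase hex with all whitespace removed.
    magic=$(od -An -tx1 -N6 | tr -d ' \t\n')
    case $magic in
        30373037303[12]*)    echo cpio ;;   # ASCII "070701" / "070702" (SVR4 newc)
        1f8b*)               echo gzip ;;
        28b52ffd*)           echo zstd ;;
        fd377a585a00*)       echo xz ;;
        04224d18*|02214c18*) echo lz4 ;;    # frame format / legacy format
        *)                   echo unknown ;;
    esac
}

# Small self-check on synthetic input.
printf '070701' | sniff_format
printf 'x' | gzip -c | sniff_format
```

On a real image it would be used the same way as `file -` above, for example `dd if=/boot/initrd.img-5.4.0-45-generic skip=5930 2>/dev/null | sniff_format`.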

Note that this only works if the leading archives are uncompressed. Otherwise, the block count cpio reports is the size of the decompressed data, not the distance to skip within the initramfs file.
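The skip values line up because cpio reports its totals in 512-byte blocks and dd also defaults to 512-byte blocks. A small worked example of the arithmetic, using the block counts from the listings above (the variable names are only for illustration):

```shell
# cpio counts 512-byte blocks; dd's default block size is 512 bytes as well.
BLOCK=512
first=62       # blocks reported for the first (AMD microcode) archive
second=5868    # blocks reported for the second (Intel microcode) archive

skip=$((first + second))         # value to pass to dd skip=
byte_offset=$((skip * BLOCK))    # the same position expressed in bytes

echo "skip=$skip blocks, byte offset $byte_offset"
```

This prints `skip=5930 blocks, byte offset 3036160`, matching the `skip=5930` used for the compressed part.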

3

Debian with the amd64-microcode / intel-microcode packages installed seems to use a rather messy layout: an uncompressed cpio archive containing the CPU microcode, followed by a gzip-compressed cpio archive with the actual initrd contents. The only way I've ever been able to extract it is by using binwalk (apt install binwalk), which can both correctly list the structure:

binwalk /path/to/initrd 

example output:

host ~ # binwalk /boot/initrd.img-5.10.0-15-amd64

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             ASCII cpio archive (SVR4 with no CRC), file name: "kernel", file name length: "0x00000007", file size: "0x00000000"
120           0x78            ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86", file name length: "0x0000000B", file size: "0x00000000"
244           0xF4            ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86/microcode", file name length: "0x00000015", file size: "0x00000000"
376           0x178           ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86/microcode/.enuineIntel.align.0123456789abc", file name length: "0x00000036", file size: "0x00000000"
540           0x21C           ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86/microcode/GenuineIntel.bin", file name length: "0x00000026", file size: "0x00455C00"
4546224       0x455EB0        ASCII cpio archive (SVR4 with no CRC), file name: "TRAILER!!!", file name length: "0x0000000B", file size: "0x00000000"
4546560       0x456000        gzip compressed data, has original file name: "mkinitramfs-MAIN_dTZaRk", from Unix, last modified: 2022-06-14 14:02:57
37332712      0x239A6E8       MySQL ISAM compressed data file Version 9

and extract the separate parts:

binwalk -e /path/to/initrd 

example output:

host ~ # binwalk -e /boot/initrd.img-5.10.0-15-amd64

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             ASCII cpio archive (SVR4 with no CRC), file name: "kernel", file name length: "0x00000007", file size: "0x00000000"
120           0x78            ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86", file name length: "0x0000000B", file size: "0x00000000"
244           0xF4            ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86/microcode", file name length: "0x00000015", file size: "0x00000000"
376           0x178           ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86/microcode/.enuineIntel.align.0123456789abc", file name length: "0x00000036", file size: "0x00000000"
540           0x21C           ASCII cpio archive (SVR4 with no CRC), file name: "kernel/x86/microcode/GenuineIntel.bin", file name length: "0x00000026", file size: "0x00455C00"
4546224       0x455EB0        ASCII cpio archive (SVR4 with no CRC), file name: "TRAILER!!!", file name length: "0x0000000B", file size: "0x00000000"
4546560       0x456000        gzip compressed data, has original file name: "mkinitramfs-MAIN_dTZaRk", from Unix, last modified: 2022-06-14 14:02:57
37332712      0x239A6E8       MySQL ISAM compressed data file Version 9

This'll give you the separate parts in separate files, and now you can finally extract the proper cpio archive:

host ~ # ls -l _initrd.img-5.10.0-15-amd64.extracted
insgesamt 187M
drwxr-xr-x 3 root root 4,0K 14. Jun 17:53 cpio-root/
-rw-r--r-- 1 root root 114M 14. Jun 17:53 mkinitramfs-MAIN_dTZaRk
-rw-r--r-- 1 root root  39M 14. Jun 17:53 0.cpio
-rw-r--r-- 1 root root  35M 14. Jun 17:53 mkinitramfs-MAIN_dTZaRk.gz

host ~/_initrd.img-5.10.0-15-amd64.extracted # mkdir extracted
host ~/_initrd.img-5.10.0-15-amd64.extracted # cd extracted
host ~/_initrd.img-5.10.0-15-amd64.extracted/extracted # cat ../mkinitramfs-MAIN_dTZaRk | cpio -idmv --no-absolute-filenames
[...]

host ~/_initrd.img-5.10.0-15-amd64.extracted/extracted # ll
insgesamt 28K
lrwxrwxrwx 1 root root    7 14. Jun 17:55 bin -> usr/bin/
drwxr-xr-x 3 root root 4,0K 14. Jun 17:55 conf/
drwxr-xr-x 7 root root 4,0K 14. Jun 17:55 etc/
lrwxrwxrwx 1 root root    7 14. Jun 17:55 lib -> usr/lib/
lrwxrwxrwx 1 root root    9 14. Jun 17:55 lib32 -> usr/lib32/
lrwxrwxrwx 1 root root    9 14. Jun 17:55 lib64 -> usr/lib64/
lrwxrwxrwx 1 root root   10 14. Jun 17:55 libx32 -> usr/libx32/
drwxr-xr-x 2 root root 4,0K 14. Jun 16:02 run/
lrwxrwxrwx 1 root root    8 14. Jun 17:55 sbin -> usr/sbin/
drwxr-xr-x 8 root root 4,0K 14. Jun 17:55 scripts/
drwxr-xr-x 8 root root 4,0K 14. Jun 17:55 usr/
-rwxr-xr-x 1 root root 6,2K 14. Jan  2021 init*
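Once binwalk has reported the byte offset of the gzip part (4546560 in the listing above), the compressed archive can probably also be streamed straight from that offset without writing the intermediate files. A sketch of that idea using `tail -c`; the helper name `from_offset` is made up, and the demo runs on a small synthetic file rather than a real initrd:

```shell
# Hypothetical helper: print everything in file $1 starting at zero-based byte offset $2.
from_offset() {
    # tail -c +N counts from 1, so add 1 to the zero-based offset.
    tail -c "+$(( $2 + 1 ))" "$1"
}

# Synthetic stand-in for an initrd: 16 bytes of padding, then gzipped data.
tmp=$(mktemp -d)
printf 'PADDINGPADDING16' > "$tmp/img"
printf 'payload\n' | gzip -c >> "$tmp/img"

recovered=$(from_offset "$tmp/img" 16 | gunzip -c)
echo "$recovered"

rm -rf "$tmp"
```

For the image above that would be something like `from_offset /boot/initrd.img-5.10.0-15-amd64 4546560 | gunzip -c | cpio -idmv --no-absolute-filenames`. Since 4546560 is exactly 8880 * 512, `dd skip=8880` should land on the same spot.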
1

If you have zstd and cpio, here's another alternative program to extract all:

cpio_all() (
  IFS="+ /-"
  o=${2:-0}
  dd "if=$1" skip=$o | zstd -dc --pass-through | cpio -idvm 2>.t
  set -- "$1" $o $(tail -n1 .t | cut "-d+ /-" -f1)
  rm .t
  [ -n "$3" ] && cpio_all "$1" $(( $2 + $3 ))
)

Running cpio_all /boot/initrd.img-5.4.0-45-generic will extract all cpio archives, in any compression format that your zstd build can decode.

1
  • I'd strongly suggest using mktemp to create guaranteed-unique temporary filenames rather than hardcoding something like .t -- otherwise one is prone to symlink attacks, and also can't safely run multiple copies of the function concurrently. (And separately, it'd be safer to quote all expansions -- no point to leaving out quotes and thus telling the shell to do word-splitting and globbing when those operations aren't intended/expected). Commented Dec 20, 2024 at 16:24
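A rough sketch of what that suggestion could look like: the stderr capture goes into a file created by `mktemp` and every expansion is quoted. The helper name `last_stderr_word` and the `fake_cpio` stand-in are made up for the demo and are not part of the answer above:

```shell
# Hypothetical helper: run a command, keep its stderr in a private temp file,
# and print the first word of the last stderr line (e.g. "62" from "62 blocks").
last_stderr_word() {
    _log=$(mktemp) || return 1
    "$@" 2>"$_log"
    tail -n 1 "$_log" | cut -d ' ' -f 1
    rm -f "$_log"
}

# Stand-in for "cpio -idvm": writes file names and a block count to stderr.
fake_cpio() {
    printf 'kernel\nkernel/x86\n62 blocks\n' >&2
}

count=$(last_stderr_word fake_cpio)
echo "$count"
```

In the function from the answer, the helper would take the place of the `cpio -idvm 2>.t` step, since it inherits stdin from the pipeline: `... | zstd -dc --pass-through | last_stderr_word cpio -idvm`.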
