cpio - ratfactor

After telling us that "GNU info rules and man drools," the man page goes on to give us this description:

GNU cpio copies files between archives and directories. It supports the following archive formats: old binary cpio, old portable cpio, SVR4 cpio with and without checksum, HP cpio, and various tar formats.

So I guess the question is: what’s the difference between tar and cpio?

Well, when I turned to "A System V Tape Archiver: cpio" in my 1994 printing of Unix Power Tools, it started with:

There was a time when people used to debate whether the BSD tar (tape archiver) or the System V cpio (copy in/out) was the better file archive and backup program. At this point, there’s no question. No one ships out cpio archives over the net. tar is widespread, and because there are free versions available, including GNU tar, there’s no reason why you should have to read a cpio archive from someone else.

That entry was written by Tim O’Reilly.

Clearly BSD’s tar won that one and is still the dominant file archiving solution on UNIX-likes.

But as the Wikipedia page for cpio points out:

The use of cpio by the RPM Package Manager, in the initramfs program of Linux kernel 2.6, and in Apple’s Installer (pax) make cpio an important archiving tool.

There is also some excellent information in the answers to this superuser.com question.

Invoking cpio

The synopsis section of the man page looks like ASCII art. But it boils down to -o for out (create), -i for in (extract), and -p for pass-through (which is apparently one of the remaining positive traits of cpio).

The difference between just -o and -p is that -o "reads a list of file names from the standard input and create on the standard output" whereas -p "reads a list of file names from the standard input and copy them to the specified directory."

Either way, it wants the file list on STDIN.

$ echo * | cpio -o > foo.cpio
cpio: foo foo.cpio foo2 foobar: Cannot stat: No such file or directory
1 block

Errr…oh, looks like cpio wants them newline separated? Let’s see, ls will give us that:

$ ls | cpio -o > foo.cpio
1 block

$ file foo.cpio
foo.cpio: cpio archive

That looks promising. And now can we extract it? Again, this old school tool expects the input file to come from STDIN.

$ mkdir foo_out
$ cd foo_out
$ cpio -i < ../foo.cpio
1 block
$ ls
foo  foo2  foo_out  foobar

There we go.
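
Extraction doesn't have to be all or nothing, either. As far as I can tell from the GNU manual, copy-in mode accepts optional shell-style patterns after the options, and only matching archive members get extracted. A rough sketch, again in a scratch directory with invented file names:

```shell
# Build a small archive to extract from (all names are illustrative).
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/out"
cd "$demo/src" || exit 1
touch foo1 foo2 bar1
ls | cpio -o > "$demo/all.cpio"

# Extract only the members matching foo*; -v names each file as it goes.
cd "$demo/out" || exit 1
cpio -i -v 'foo*' < "$demo/all.cpio"
ls
```

The pattern is quoted so that the shell hands it to cpio untouched instead of expanding it against the current directory.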

Now, let’s see what all of this "pass-through" business is about. For this, we’ve gotta use the info manual. Grrr. It’s very well written, but…grrrr. (Okay, I’m willing to admit that I simply haven’t put the time into learning Emacs keybindings (which info shares) and that’s something I should remedy.)

Here’s the online manual on gnu.org regarding the pass-through option:

cpio copies files from one directory tree to another, combining the copy-out and copy-in steps without actually using an archive. It reads the list of files to copy from the standard input; the directory into which it will copy them is given as a non-option argument.

Oh! So this option isn’t about archiving at all. This turns cpio into a copy tool in its own right. There are actually a ton of options we can use with the pass-through option to specify how we want to copy the files.

But I’m gonna set up the simplest example I can:

$ touch foo{1..3} bar{1..3}
$ mkdir archive
$ find -name 'foo*'
./foo1
./foo2
./foo3

$ find -name 'foo*' | cpio -p archive
0 blocks
$ ls archive
foo1  foo2  foo3

Okay, not super impressive, sure. But because it’s taking a file list, we can copy all found files in subdirectories as well.

First, I’ll set up a minimal multi-directory file structure to get our files from:

$ mkdir more_stuff
$ touch more_stuff/foo{4..9} more_stuff/bar{4..9}
$ tree
.
|-- archive2
|-- bar1
|-- bar2
|-- bar3
|-- foo1
|-- foo2
|-- foo3
`-- more_stuff
    |-- bar4
    |-- bar5
    |-- bar6
    |-- bar7
    |-- bar8
    |-- bar9
    |-- foo4
    |-- foo5
    |-- foo6
    |-- foo7
    |-- foo8
    `-- foo9

2 directories, 18 files

Now we’ll do the same thing as before, but this time we’ll need to add the -d option to make directories as needed.

$ mkdir archive2
$ find -name 'foo*' | cpio -p -d archive2
0 blocks
$ tree archive2
archive2
|-- foo1
|-- foo2
|-- foo3
`-- more_stuff
    |-- foo4
    |-- foo5
    |-- foo6
    |-- foo7
    |-- foo8
    `-- foo9

1 directory, 9 files

There we go, all found "foo" files safely archived in another directory. I love that any file list on STDIN will do, which makes this an extremely versatile ability.
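
One caveat with newline-separated lists: a file name that itself contains a newline would be misread. Here's a hedged sketch of the same pass-through copy using NUL-terminated names instead (find's -print0 paired with cpio's -0), plus -m to keep modification times. It assumes the GNU versions of both tools, and the directory layout is invented for the demo.

```shell
# Scratch source tree and destination (all names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/src/more_stuff" "$demo/dest"
cd "$demo/src" || exit 1
touch foo1 bar1 more_stuff/foo4 more_stuff/bar4

# -p: pass-through, -d: create directories as needed,
# -m: keep modification times, -0: read NUL-terminated names.
find . -name 'foo*' -print0 | cpio -p -d -m -0 "$demo/dest"

# Show what landed in the destination.
find "$demo/dest" -type f
```

Only the "foo" files should show up under the destination, with more_stuff recreated along the way.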

mt

You’ll be pleased to know that the cpio package also comes with the mt tool for controlling your magnetic tape drives!

The man page for mt even points to a current GitHub-hosted repo: https://github.com/iustin/mt-st.

A bug?

I also noticed that there is an outstanding bug noted for Slackware (slackware-current circa 2001) that cpio --sparse causes data corruption! A fix was submitted, but if it’s still in the TODO list…look out for that, I guess.

Note to self

To get tree to display only 7-bit ASCII characters (which I find much more dependable to store in portable documents like this long-term), I’m actually using this invocation:

$ tree --charset=ascii

Well, until next time, happy hacking!