Yay! A "simple" utility! This is more what I had in mind when I started this PkgBlog. :-)
Wikipedia’s Bzip2 page has some good general information about this utility. I also found some interesting discussion on lwn.net: bzip.org changes hands.
Let’s see what we have in the package: five executables, three of which are shell scripts, and all of which have man pages.
And - holy cow! - a comprehensive HTML manual (47 pages on my terminal in lynx) installed at /usr/doc/bzip2-1.0.6/manual.html. Nice!
Using it
After a quick glance at the man page, I determined that compressing, examining, and decompressing a text file would work like this:
$ du -b foo.txt
219106 foo.txt
$ bzip2 foo.txt
$ du -b foo.txt.bz2
60958 foo.txt.bz2
$ bzcat foo.txt.bz2 | grep cow
was a large statue of something that looked like a cow under a wooden bridge it looked like the cow was frozen in place
$ bunzip2 foo.txt.bz2
$ head foo.txt
One morning, as Lucy was playing with a mousetrap and a pair of tin soldiers...
Here we can see that the original text file was about 219 kB (219,106 bytes) and bzip2 compressed it down to about 61 kB (60,958 bytes), a bit under 28% of the original size.
I was able to decompress the file to STDOUT in order to search it with grep, and then decompressed the file back to its original name.
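For a more precise figure than eyeballing the two du results, the ratio can be worked out with a one-liner. This is only a minimal sketch: the ratio helper is a made-up name for this post, and the two byte counts are copied from the session above.

```shell
# ratio is a hypothetical helper, not part of the bzip2 package.
# Arguments: original size in bytes, compressed size in bytes.
ratio() {
  awk -v orig="$1" -v comp="$2" \
    'BEGIN { printf "%.1f%%\n", 100 * comp / orig }'
}

result=$(ratio 219106 60958)   # byte counts from the du -b output above
echo "compressed file is $result of the original size"
```

With these two numbers it reports 27.8%.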
Note that both bunzip2 and bzcat are symlinks to bzip2; when called by one of these names, the executable behaves accordingly.
(You can also just call bzip2 with the appropriate options: -d to decompress, or -dc to decompress to STDOUT.)
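To make that parenthetical concrete, here is a small sketch of the same round trip using only bzip2 itself. The sample file and the temporary directory are invented for the example; -k (keep the input), -d (decompress), and -c (write to standard output) are all described in the man page.

```shell
# Work in a throwaway directory; foo.txt here is made-up sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
awk 'BEGIN { for (i = 0; i < 1000; i++) print "line " i ": the cow was frozen in place" }' > foo.txt

bzip2 -k foo.txt                                # compress, but keep foo.txt around
matches=$(bzip2 -dc foo.txt.bz2 | grep -c cow)  # same job as bzcat
echo "lines mentioning a cow: $matches"

mv foo.txt foo.orig                             # get the original out of the way
bzip2 -d foo.txt.bz2                            # same job as bunzip2
cmp foo.txt foo.orig && echo "round trip OK"
# The temporary directory is left behind; remove "$workdir" when done.
```

Without -k, bzip2 replaces the input file with the compressed one, which is what happened in the session above.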
I see also that I could have used the bzgrep script to search in the compressed file (it is also smart enough to check first whether the file is actually compressed).
The bzmore and bzdiff scripts follow the exact same concept, and I imagine they’d be handy to have if you found yourself dealing with bzip2-compressed files on a daily basis.
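A quick sketch of that detection in action, assuming the bzgrep script from this package is on the PATH. The file name and its contents are invented for the example.

```shell
# notes.txt is made-up sample data in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' "a large statue of a cow" "a wooden bridge" "the cow was frozen" > notes.txt
bzip2 -k notes.txt                      # leaves notes.txt and notes.txt.bz2 side by side

packed=$(bzgrep -c cow notes.txt.bz2)   # searches the compressed copy
plain=$(bzgrep -c cow notes.txt)        # the plain copy is searched as-is
echo "compressed: $packed, plain: $plain"
```

Both counts should come out the same, since two of the three lines mention a cow in either copy.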
The last executable is interesting: bzip2recover.
bzip2recover
The man page for bzip2 explains that the compression is done in blocks and that file integrity is checked with CRCs.
If a file is big enough to span several blocks (the default block size is 900k of uncompressed data), you can recover the good blocks from a partially corrupted bzip2-compressed file.
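The CRC check is easy to watch in action with bzip2 -t, which tests an archive without writing any output. A rough sketch: the sample data, the file names, and the choice to overwrite 16 bytes in the middle of the archive are all arbitrary choices made for this example.

```shell
# foo.txt is made-up sample data in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
awk 'BEGIN { for (i = 0; i < 5000; i++) print "line " i ": it looked like a cow" }' > foo.txt
bzip2 -k foo.txt

bzip2 -t foo.txt.bz2 && intact="ok"     # exit status 0: the integrity check passed

# Simulate damage: overwrite 16 bytes in the middle of a copy.
cp foo.txt.bz2 damaged.bz2
size=$(wc -c < damaged.bz2)
printf 'corrupted bytes!' | dd of=damaged.bz2 bs=1 seek=$((size / 2)) conv=notrunc 2>/dev/null

if bzip2 -t damaged.bz2 2>/dev/null; then
  damaged="ok"
else
  damaged="corrupt"                     # non-zero exit status: bzip2 noticed the damage
fi
echo "original: $intact, damaged copy: $damaged"
```

The conv=notrunc option keeps dd from truncating the file after the overwritten bytes, so only those 16 bytes change.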
Out of curiosity, I made my text file larger by concatenating several copies of it with cat.
Then I compressed the result and tried the bzip2recover utility on it. (I didn’t bother trying to corrupt part of it first.)
$ cat foo.txt foo.txt foo.txt foo.txt foo.txt > foobig.txt
$ bzip2 foobig.txt
$ bzip2recover foobig.txt.bz2
bzip2recover 1.0.6: extracts blocks from damaged .bz2 files.
bzip2recover: searching for block boundaries ...
block 1 runs from 80 to 790000
block 2 runs from 790049 to 1228560
block 3 runs from 1228609 to 1228648 (incomplete)
bzip2recover: splitting into blocks
writing block 1 to `rec00001foobig.txt.bz2' ...
writing block 2 to `rec00002foobig.txt.bz2' ...
bzip2recover: finished
$ ls -l
total 524
-rw-r--r-- 1 dave users 153581 Jun 8 15:00 foobig.txt.bz2
-rw-r--r-- 1 dave users 98761 Jun 8 15:01 rec00001foobig.txt.bz2
-rw-r--r-- 1 dave users 54834 Jun 8 15:01 rec00002foobig.txt.bz2
Neat! The "incomplete" block 3 seems to have been a false alarm since the recovered second block contained the entire rest of the document.
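For the curious, here is a sketch of the experiment skipped above: damaging a multi-block archive and salvaging the rest. Everything in it is an assumption made for the demo, including the generated text, the -1 flag (100k blocks, so a modest file spans several of them), and overwriting 16 bytes near the start so the damage lands in the first block.

```shell
# big.txt is made-up sample data (roughly 650 kB) in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
awk 'BEGIN { for (i = 0; i < 12000; i++) print "line " i ": it looked like the cow was frozen in place" }' > big.txt
bzip2 -1 big.txt                        # -1 = 100k blocks, so the archive holds several

# Overwrite 16 bytes shortly after the stream header, inside the first block.
printf 'corrupted bytes!' | dd of=big.txt.bz2 bs=1 seek=64 conv=notrunc 2>/dev/null

bzip2recover big.txt.bz2 2>/dev/null    # writes rec00001big.txt.bz2, rec00002big.txt.bz2, ...

good=0
bad=0
for piece in rec*big.txt.bz2; do
  if bzip2 -t "$piece" 2>/dev/null; then
    bzip2 -dc "$piece" >> recovered.txt # keep the text from every intact block
    good=$((good + 1))
  else
    bad=$((bad + 1))
  fi
done
echo "intact blocks: $good, damaged blocks: $bad"
```

Only the first block should fail its test here, so recovered.txt ends up with everything except the text that lived in that block. Block boundaries do not line up with line endings, so expect a partial line where the recovered text begins.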
Conclusion
In true Unix fashion, this is a tool that does one thing and seems to do it well. I’ll let you decide its comparative merits vs. the other available compression tools.
At the very least, this project contains great documentation and seems to be quite complete.
Until next time, happy hacking, Slackers!