PkgBlog: btrfs-progs

Playing with the B-tree filesystem
Package series/name: a/btrfs-progs-v4.5.3-x86_64-1
Official release: source and package
Blog entry created: 2019-05-26
Tagged: series_a core-system filesystem storage
Note
This page should be named "btrfs-progs" to match the package name. I’ve fixed my script and the title, but I’m not changing the file name because Cool URIs don’t change.

The next package from "series a" is btrfs, which is a filesystem. This should be interesting.

I could just read about it from the official wiki on kernel.org:

"Btrfs is a modern copy on write (CoW) filesystem for Linux aimed at implementing advanced features while also focusing on fault tolerance, repair and easy administration."

But the whole point of this Slackware package blog is to get my hands dirty with the tools on this system. And I’m gonna get dirty right away by taking a bite out of the swap partition on my laptop to make room for an experimental partition for playing with file systems. (See the section at the bottom of this page for a long-winded explanation of why I’m going this route instead of using an alternative location.)

Feel free to skip the next two sections to get to where I actually use btrfs.

Using half of my swap space for file system experimentation

This turns out to be really easy. First, let’s disable swapping on the existing swap partition:

$ sudo -i
# fdisk -l | grep swap
/dev/sda2     206848   8595455   8388608     4G Linux swap
# swapoff /dev/sda2

swapoff probably took about 20 seconds to run.

Now I’ll use cfdisk, which is like fdisk but with a much friendlier TUI. I prefer it because I don’t need to deal with partitions very often.

screenshot of cfdisk as described below

In the above screenshot, you can see that I deleted the existing 4 GB swap partition and, in its place, created a new 3 GB swap partition (the new sda2) and a 1 GB partition (sda5).

Now I’ll get the new, smaller swap space set up. Make the swap space:

# mkswap /dev/sda2
mkswap: /dev/sda2: warning: wiping old swap signature.
Setting up swapspace version 1, size = 3.5 GiB (3794087936 bytes)
no label, UUID=2bf65dea-2104-4e3e-97dc-47dd643f0b64

And tell Linux to use it:

# swapon /dev/sda2

Now we can put a new file system on the 1 GB partition at /dev/sda5.

# fdisk -l /dev/sda
Disk /dev/sda: 238.5 GiB, 256060514304 bytes, 500118192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: AEB90E73-8FA3-4134-A42B-9F5D9DAEA751

Device         Start       End   Sectors   Size Type
/dev/sda1       2048    206847    204800   100M EFI System
/dev/sda2     206848   6498303   6291456     3G Linux swap
/dev/sda3    8595456 498008063 489412608 233.4G Linux filesystem
/dev/sda4  498008064 500105215   2097152     1G Windows recovery environment
/dev/sda5    6498304   8595455   2097152     1G Linux filesystem

Partition table entries are not in disk order.

Well, not quite.

What’s interesting is that although fdisk sees the sda5 partition, lsblk doesn’t, and it doesn’t show up when I use ls to view the contents of /dev. Hmmmm…

It turns out that the kernel doesn’t always know when you’ve repartitioned a drive.

This very helpful answer on superuser.com did the trick: just run partprobe. Sure enough, lsblk now shows /dev/sda5!
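If you want to check what the kernel currently believes before (and after) running partprobe, one option is to look at /proc/partitions. Here is a minimal sketch. The function name kernel_sees_partition and its optional second argument (a file to read instead of /proc/partitions) are my own inventions for illustration:

```shell
#!/bin/sh
# kernel_sees_partition NAME [PARTITIONS_FILE]
# Succeeds if NAME (e.g. sda5) is listed in the kernel's partition table.
# PARTITIONS_FILE defaults to /proc/partitions; the override is hypothetical,
# only there so the function can be tried against a sample file.
kernel_sees_partition() {
    name=$1
    table=${2:-/proc/partitions}
    # Columns are: major minor #blocks name
    awk -v n="$name" '$4 == n { found = 1 } END { exit !found }' "$table"
}

if kernel_sees_partition sda5; then
    echo "kernel knows about sda5"
else
    echo "kernel does not see sda5 yet; try running partprobe as root"
fi
```

This only reads a file under /proc, so it is safe to run as a normal user.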

I think I’m finally ready to make a btrfs file system!

About partition types

By the way, before we leave the subject of partitions, there is one thing I have often wondered: since creating a partition doesn’t actually create the filesystem on that partition, why do we need to set the partition type at all? Does it even matter what I set the partition type to?

So this led me to the Wikipedia entries for the GUID Partition Table (GPT, newer) and Master Boot Record Partition types (MBR, older).

Generally speaking, these partition types are just informational and it’s up to the operating system to do with this information as it pleases.

I learned that the original MBR partition types were defined by IBM, and then by Microsoft and SCO, for early DOS, XENIX, OS/2, and (later) Windows NT, etc.

If you’ve done a fair number of Linux installations, you might recognize the two main Linux partition types by their hex numbers: 82 for "Linux Swap" and 83 for "Linux File System".

You can see a big list on the Wikipedia entry Discoverable Partitions Specification.
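To make the "it’s just a label" idea concrete, here is a tiny lookup for a handful of well-known MBR type codes. This is only a sketch: the function name mbr_type_name is made up for this page, and the table is nowhere near complete.

```shell
#!/bin/sh
# mbr_type_name HEX
# Prints a human-readable name for a few common MBR partition type codes.
# Accepts "83", "0x83", or uppercase hex digits. Returns 1 for codes not in this small table.
mbr_type_name() {
    # Normalise: lowercase the hex digits, then drop an optional 0x prefix.
    code=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    code=${code#0x}
    case $code in
        05) echo "Extended" ;;
        07) echo "HPFS/NTFS/exFAT" ;;
        0b) echo "W95 FAT32" ;;
        82) echo "Linux swap" ;;
        83) echo "Linux" ;;
        8e) echo "Linux LVM" ;;
        ee) echo "GPT protective" ;;
        ef) echo "EFI System" ;;
        *)  echo "unknown"; return 1 ;;
    esac
}

for t in 82 83 0xEF; do
    printf '%s -> %s\n' "$t" "$(mbr_type_name "$t")"
done
```

Nothing here touches the disk. The code is just a one-byte tag in the partition table, and a table like this is all a tool needs to turn it into a name.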

To answer the question: the partition types are just informational. They let Windows say, "hey, that’s my partition!" and let a Linux installer say, "oh, look, I see a Linux swap partition, would you like to use it?"

I’d heard horror stories about Windows eating non-Windows partitions in dual-boot situations. Not setting correct partition types is a good way to make this more likely to happen.

Making the btrfs file system!

To make the filesystem, we use mkfs.btrfs:

# mkfs.btrfs /dev/sda5
btrfs-progs v4.5.3
See http://btrfs.wiki.kernel.org for more information.

Detected a SSD, turning off metadata duplication.  Mkfs with -m dup if you want to force metadata duplication.
Performing full device TRIM (1.00GiB) ...
Label:              (null)
UUID:               8ee0d705-3963-4a69-8382-e72f265fdaed
Node size:          16384
Sector size:        4096
Filesystem size:    1.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         single            8.00MiB
  System:           single            4.00MiB
SSD detected:       yes
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     1.00GiB  /dev/sda5

Looks good!

(For real usage, you’ll probably want to at least read man mkfs.btrfs to see what the options are - btrfs file systems can span devices and other neat things like that.)

Now I’ll create a mount point and mount it:

# mkdir /mnt/btrfs
# mount /dev/sda5 /mnt/btrfs/

Now I can use the btrfs command on the mounted filesystem:

# btrfs filesystem show /mnt/btrfs/
Label: none  uuid: 8ee0d705-3963-4a69-8382-e72f265fdaed
	Total devices 1 FS bytes used 192.00KiB
	devid    1 size 1.00GiB used 132.00MiB path /dev/sda5

Other informational commands are btrfs filesystem df for space usage and btrfs filesystem du.

Can we store files on it? I’ve heard that’s pretty important.

# cd /mnt/btrfs/
# cat > foo.txt
This is my text file. It is not perfect, but pretty nice.
Line two.
# cat foo.txt
This is my text file. It is not perfect, but pretty nice.
Line two.

Wooo, it reads and writes!

Features

I don’t plan to run any benchmarks or otherwise measure the performance and space benefits of btrfs, but what else can we do?

Subvolumes

One feature of btrfs is subvolumes. Let’s create one and see what it looks like:

# btrfs subvolume create subvolume1
Create subvolume './subvolume1'
# ls
foo.txt  subvolume1/
# cat > subvolume1/cheese.txt
This file is about cheese.
# tree
.
├── foo.txt
└── subvolume1
    └── cheese.txt

1 directory, 2 files

Interesting. So far, this subvolume looks and works just like a directory.

Heck, even file thinks it’s a directory:

# file subvolume1/
subvolume1/: directory

So what’s the point?

Well, you can actually mount these directory-looking subvolumes like any other filesystem. I have no doubt there are reasons you’d want to do that.
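For the curious, mounting a subvolume on its own goes through the subvol= mount option. The sketch below only prints the commands unless you set DRY_RUN=0 (and are root). The run and mount_subvolume helpers and the /mnt/cheese mount point are hypothetical names I picked for illustration:

```shell
#!/bin/sh
# run CMD...: print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# mount_subvolume DEVICE SUBVOLUME MOUNTPOINT
# Mounts just one subvolume of a btrfs filesystem at MOUNTPOINT.
mount_subvolume() {
    device=$1
    subvol=$2
    mountpoint=$3
    run mkdir -p "$mountpoint"
    run mount -o "subvol=$subvol" "$device" "$mountpoint"
}

mount_subvolume /dev/sda5 subvolume1 /mnt/cheese
```

Printing first and running second is just my own caution. I’d rather read a mount command before I let it loose as root.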

But the thing I’m interested in is subvolume snapshots.

Snapshots

So we have our subvolume called subvolume1 in our btrfs filesystem mounted at /mnt/btrfs.

We’re going to need a file more substantial than the magnificent cheese.txt. Let’s make a 200 MB file real quick (Stack Overflow to the rescue):

# dd if=/dev/zero of=subvolume1/megacheese bs=4k iflag=fullblock,count_bytes count=200M
51200+0 records in
51200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 0.0976606 s, 2.1 GB/s
# ls -l subvolume1/
total 204804
-rw-r--r-- 1 root root        61 Jun  1 19:10 cheese.txt
-rw-r--r-- 1 root root 209715200 Jun  1 19:24 megacheese

That’s one big cheese.
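The numbers dd printed are easy to sanity-check with shell arithmetic. This sketch assumes the usual binary meaning of the suffixes (4k = 4096 bytes, 200M = 200 × 1024 × 1024 bytes); the variable names are mine:

```shell
#!/bin/sh
# count=200M is interpreted as a byte count because of iflag=count_bytes.
bytes=$(( 200 * 1024 * 1024 ))
# bs=4k means each record is 4096 bytes.
block=$(( 4 * 1024 ))

echo "total bytes: $bytes"
echo "records:     $(( bytes / block ))"
```

That gives 209715200 bytes in 51200 records, matching the "51200+0 records" and byte count in the dd output above.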

Creating a snapshot is extremely simple (and also lightweight and fast):

# btrfs subvolume snapshot subvolume1 subvolume2
Create a snapshot of 'subvolume1' in './subvolume2'

Now we have a second subvolume that is a snapshot of the first. It contains exactly the same data and appears to take up exactly the same amount of space:

# du -a
4	./foo.txt
4	./subvolume1/cheese.txt
204800	./subvolume1/megacheese
204804	./subvolume1
4	./subvolume2/cheese.txt
204800	./subvolume2/megacheese
204804	./subvolume2
409628	.

Ah, but it’s not! The df utility isn’t fooled:

# df .
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/sda5        1048576 221808    715520  24% /mnt/btrfs

The second instance of our 200 MB file doesn’t actually take up another 200 MB.

The btrfs utility can tell us some additional information:

# btrfs filesystem df .
Data, single: total=336.00MiB, used=200.25MiB
System, single: total=4.00MiB, used=16.00KiB
Metadata, single: total=120.00MiB, used=352.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
# btrfs filesystem show .
Label: none  uuid: 8ee0d705-3963-4a69-8382-e72f265fdaed
	Total devices 1 FS bytes used 200.61MiB
	devid    1 size 1.00GiB used 460.00MiB path /dev/sda5

But though the data is clearly shared between the two subvolumes (that’s copy-on-write at work), they’re not the same file. We can modify the original and the snapshot will remain untouched (or vice versa):

# cat >> subvolume1/cheese.txt
This is a gouda new line with cheddar!

# cat subvolume1/cheese.txt
This file is about cheese.
This is a gouda new line with cheddar!

# cat subvolume2/cheese.txt
This file is about cheese.

The snapshot volume did not get the change.

Okay, fine, so that’s neat and all, but it’s hard to measure the effects of changing a two line text file.

Let’s alter the 200 MB megacheese:

# cat >> subvolume1/megacheese
some text at the bottom of this big file of zeroes

Clearly they are different files now:

# ls -l */megacheese
-rw-r--r-- 1 root root 209715251 Jun  1 20:13 subvolume1/megacheese
-rw-r--r-- 1 root root 209715200 Jun  1 19:24 subvolume2/megacheese
# md5sum */megacheese
50e211b41cdaed0ed7f51bb451e32433  subvolume1/megacheese
3566de3a97906edb98d004d6b947ae9b  subvolume2/megacheese

But if we examine the filesystem with df again, we can see that the disk usage has only increased from 221,808 KB to 221,812 KB, a rise of just 4 KB!

# df .
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/sda5        1048576 221812    715516  24% /mnt/btrfs

So clearly only tiny amounts of additional data are needed to store the changed portions of this file. This is really neat.
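Here is that arithmetic spelled out, using only numbers from the outputs on this page. One assumption to flag: df’s "Used" figure also includes metadata, so treating the whole difference as file data is a simplification on my part.

```shell
#!/bin/sh
before_kib=221808      # df "Used" before appending to megacheese
after_kib=221812       # df "Used" after appending
sector=4096            # "Sector size" reported by mkfs.btrfs
old_size=209715200     # megacheese size in the snapshot (ls -l)
new_size=209715251     # megacheese size after the append (ls -l)

grown_bytes=$(( (after_kib - before_kib) * 1024 ))

echo "bytes appended to the file: $(( new_size - old_size ))"
echo "extra space used on disk:   $grown_bytes"
echo "that many 4096-byte sectors: $(( grown_bytes / sector ))"
```

So 51 new bytes cost 4096 bytes on disk, which is exactly one sector, rather than a second 200 MB copy.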

One idea is to use a btrfs filesystem as your root (/) filesystem and take snapshots of it. If something goes wrong, you can roll your whole system back to any snapshotted point by mounting that snapshot as the new root filesystem.

SUSE Linux even uses btrfs as its default root filesystem because it supports using snapshots in this fashion.

It’s easy to see how this would be incredibly handy for all sorts of system experimentation tasks.
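If I were going to do this regularly, I’d probably want timestamped, read-only snapshots (the -r flag to btrfs subvolume snapshot). Here is a rough sketch that builds the name and prints the command instead of running it. The snapshot_name helper and the naming scheme are just my own convention, not anything btrfs requires:

```shell
#!/bin/sh
# snapshot_name PREFIX
# Prints PREFIX-YYYYMMDD-HHMMSS using the current local time.
snapshot_name() {
    printf '%s-%s\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

src=/mnt/btrfs/subvolume1
dest=/mnt/btrfs/$(snapshot_name subvolume1)

# Only print the command; run it yourself as root if it looks right.
echo "btrfs subvolume snapshot -r $src $dest"
```

Because the timestamp sorts alphabetically, a plain ls lists the snapshots in the order they were taken.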

What’s in the Slackware package?

It’s not big, clocking in at 4 MB uncompressed.

There are 11 binary tools including btrfs and mkfs.btrfs, which we’ve seen in action.

  • sbin/btrfs

  • sbin/btrfs-convert

  • sbin/btrfs-debug-tree

  • sbin/btrfs-find-root

  • sbin/btrfs-image

  • sbin/btrfs-map-logical

  • sbin/btrfs-select-super

  • sbin/btrfs-show-super

  • sbin/btrfs-zero-log

  • sbin/btrfstune

  • sbin/mkfs.btrfs

Developers can find C headers in usr/include/btrfs/.

There are man pages for all of the tools and more. In particular, check out the section 5 and section 8 entries:

man 5 btrfs
man 8 btrfs

Learning more about btrfs

The official Wiki on kernel.org is not a bad place to start. There are links to a ton of great material in the Articles, presentations, podcasts section on the main page.

Also check out the wiki’s Sysadmin Guide for an excellent practical overview.

I really enjoyed this excellent 2009 article, A short history of btrfs, on lwn.net.

I initially learned how to create snapshots from a linux.com article.

If you want to see the very latest btrfs stuff as it happens, check out the mailing list.

Why cannibalize my swap space for this?

I moved this section clear down here at the end because it’s completely tangential. :-)

Why not use a separate, dedicated drive for this experiment? Well, because I don’t have one.

We’re between houses at the moment, having moved across the United States (thus the four month pause in this blog) and most of my stuff is in storage. All I’ve got is my laptop running Slackware 14.2, a mechanical keyboard, and a trackball. It turns out this is all I actually need, though I do miss having the screen real estate of my multi-monitor desktop setup.

Anyway, the point is, I don’t have any spare drives sitting around.

I don’t see any reason I couldn’t use a USB flash drive, either (as far as I know, you can put any kind of file system on them, though vfat seems to be the most universal). I’ve got a handful of those sitting around, but they all have Really Important Things on them, so I would need to buy a new one for my experiments.

There happens to be a Best Buy (wikipedia) next to the comic book store we were going to visit today, so I figured maybe I could stop in there and pick up a cheap USB flash drive. We also needed some replacement USB micro cables for charging the family’s devices (I’m not going to name names, but somebody here likes to chew on them). Two birds with one stone.

Today may be the last time I step into a Best Buy.

USB cables were in an unlabeled aisle next to the kids’ toys at the back of the store. They were asking $26 for a single cable.

I shit you not. Twenty-six United States Dollars. These weren’t even gold-plated, hand-woven super-cables. Just plain black plastic things in generic-looking boxes. It took me a while to come to terms with what I was seeing.

I never even saw any USB flash drives.

I can buy affordable cables and flash drives all day long at my freaking grocery store. What the hell, Best Buy? Why do you even exist?

Oh well, at least my kid bought a toy. And the comic book store did not disappoint.

So no USB drive.

And I know virtual machines are popular for this sort of thing. But I say "bare metal or go home"! Ha ha, no, it’s just that setting up a VM is another project for another day. I want to keep this focused on the task at hand.

And why attack the swap partition and not the much larger space available on the rest of my SSD? Because, as I mentioned, this laptop is my only computer right now. I have backups, but if I screw up my main partition, I’m going to find myself re-installing Slackware and restoring from backup. No thanks. I’m not touching my main partition.