BTRFS

Most people are only familiar with the NTFS and FAT32 file systems, and don't realize that there are tons of alternatives to these.

In this article, I'm going to dive into what a filesystem is and some cool features of BTRFS that make it appealing to some people.

File Systems

The primary job of a file system is to control how data is written to disk. On a storage device, data is stored as ones and zeroes, so there needs to be some way to tell where one file's data ends and another's begins.

Data allocation is another huge thing file systems handle. When you write data to disk, the space the file is going to take needs to be allocated. When there isn't enough free room in one place, a file ends up split across multiple allocated areas on disk. "Defragmenting" is the process of looking for those files and putting each one's pieces back one after another.

Illustration of a file becoming "fragmented"
In this picture, you can see that file D on the disk is deleted. File B is then made, along with file G. Later on (5), when file F needs to become larger, it gets fragmented, as there's no free space on disk directly after it.
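
Here's a rough sketch of the same idea in Python. The disk is just a list of blocks, and the file names, sizes and layout are made up for the example (they only loosely follow the picture). It's a toy model, not how any real filesystem allocates space.

```python
def allocate(disk, name, count):
    """Give `count` blocks to file `name`, using the first free slots found."""
    free = [i for i, block in enumerate(disk) if block is None]
    if len(free) < count:
        raise RuntimeError("not enough free space")
    for i in free[:count]:
        disk[i] = name


def delete(disk, name):
    """Mark every block owned by `name` as free again."""
    for i, block in enumerate(disk):
        if block == name:
            disk[i] = None


def extents(disk, name):
    """Return the contiguous runs (start, length) that belong to `name`."""
    runs = []
    start = None
    for i, block in enumerate(disk + [None]):  # trailing None closes the last run
        if block == name and start is None:
            start = i
        elif block != name and start is not None:
            runs.append((start, i - start))
            start = None
    return runs


def defragment(disk):
    """Rewrite the disk so each file's blocks sit one after another."""
    used = sorted(block for block in disk if block is not None)
    return used + [None] * (len(disk) - len(used))


disk = [None] * 12
for name, size in [("A", 3), ("D", 2), ("E", 2), ("F", 3), ("G", 2)]:
    allocate(disk, name, size)

delete(disk, "D")            # frees blocks 3 and 4
allocate(disk, "F", 2)       # F grows, but the only free space is D's old spot
print(extents(disk, "F"))    # [(3, 2), (7, 3)] -> F is now in two pieces
print(extents(defragment(disk), "F"))  # [(5, 5)] -> one piece again
```

File F ends up in two separate runs because the only free space was nowhere near it, and defragmenting puts it back into a single run.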

Filesystems also define what is or isn't allowed in a filename, along with the maximum filename length.
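
As a small illustration, here's what a filename check could look like. The rules are an assumption on my part, modeled on what's typical for Linux filesystems (at most 255 bytes, no "/" and no NUL character); other filesystems have different rules, and is_valid_name is a made-up helper, not something the filesystem exposes.

```python
def is_valid_name(name, max_bytes=255):
    """Check a single filename against assumed, Linux-style rules."""
    encoded = name.encode("utf-8")
    if not encoded or len(encoded) > max_bytes:
        return False                  # empty, or too long once encoded
    if name in (".", ".."):
        return False                  # reserved directory entries
    return "/" not in name and "\0" not in name


print(is_valid_name("notes.txt"))     # True
print(is_valid_name("a/b"))           # False, "/" separates directories
print(is_valid_name("x" * 256))       # False, over the length limit
```

Note that the limit is counted in bytes, not characters, so names with non-ASCII characters hit it sooner.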

Furthermore, they also handle storing metadata about files. This could be the username of the person who made the file, the time the file was created, or the last time the file was opened.
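
You can peek at some of this metadata yourself. The snippet below is a minimal example using Python's standard os.stat on a throwaway temporary file. Which fields are meaningful depends on the operating system and filesystem (creation time in particular isn't exposed the same way everywhere, so I left it out).

```python
import os
import tempfile
import time

# Create a small temporary file so there is something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    path = handle.name

info = os.stat(path)
print("size in bytes:", info.st_size)
print("owner user id:", info.st_uid)
print("last modified:", time.ctime(info.st_mtime))
print("last accessed:", time.ctime(info.st_atime))

os.remove(path)  # clean up the temporary file
```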

Lastly, the file system handles the extremely important task of making sure that the data's integrity isn't jeopardized. For example, each program may try to write data to disk slightly differently. The file system must make sure all the data is stored consistently. If the power is cut to a device while it's on, the filesystem has routines to keep what's already on disk consistent and to limit how much data is lost.

What About BTRFS?

BTRFS is a new(ish) file system for Linux that is backed by many huge companies, including Facebook, Intel, Netgear and Oracle.

This filesystem is often mentioned alongside "journaling file systems". A journaling filesystem records what it intends to do, then makes the change to the file on disk, then deletes the journal entry it just completed. The reason for this is that if the power gets cut while it's attempting to write, it's normally possible to either undo or complete the write when turning the system back on. This is awesome, because it decreases the chances of data loss. Strictly speaking, BTRFS gets this kind of crash protection mainly from copy on write (covered below) rather than from a classic journal.
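
The journaling idea described above can be sketched in a few lines. JournaledStore is a made-up toy class that uses a dictionary in place of the disk, and the crash_before_apply flag is just my way of simulating a power cut. It shows the general pattern, not BTRFS's actual on-disk behavior.

```python
class JournaledStore:
    def __init__(self):
        self.data = {}      # stands in for file contents on disk
        self.journal = []   # stands in for the on-disk journal

    def write(self, key, value, crash_before_apply=False):
        self.journal.append((key, value))  # 1. record the intention
        if crash_before_apply:
            return                          # simulated power cut
        self.data[key] = value              # 2. make the change
        self.journal.pop()                  # 3. clear the completed entry

    def recover(self):
        """Replay any entries left in the journal after a crash."""
        while self.journal:
            key, value = self.journal.pop(0)
            self.data[key] = value


store = JournaledStore()
store.write("report.txt", "draft 1")
store.write("report.txt", "draft 2", crash_before_apply=True)
print(store.data["report.txt"])   # draft 1, the second write never landed
store.recover()
print(store.data["report.txt"])   # draft 2, completed from the journal
```

This version completes the interrupted write on recovery. A journal could just as well be used to throw the unfinished write away.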

Although BTRFS has a ton of cool features, I'm only going to cover some of the best ones here.

Copy On Write

Copy on write is a pretty cool feature, as it doesn't overwrite data in place, unlike some other file systems. The way that this works is that when a file is modified, the changed data is written to newly allocated space on disk, and only then is the file updated to point at it. The best part is that if the power is disconnected during a write, the old data is still sitting untouched in its old allocated space, so the file can fall back to it. This won't keep the latest save, as it may only be partially written, but only the most recent save would be lost.
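
Here's a toy model of copy on write. CowFile is a hypothetical class of mine where a dictionary plays the disk and a list of block ids plays the file's pointers. The crash flag simulates losing power after the new data is written but before the file points at it.

```python
class CowFile:
    def __init__(self, chunks):
        self.blocks = {}     # block id -> data, stands in for the disk
        self.next_id = 0
        self.layout = [self._store(chunk) for chunk in chunks]  # the file's pointers

    def _store(self, data):
        """Write data into a brand new block and return its id."""
        block_id = self.next_id
        self.blocks[block_id] = data
        self.next_id += 1
        return block_id

    def modify(self, index, data, crash=False):
        new_id = self._store(data)    # new data goes to fresh space
        if crash:
            return                     # power lost before the pointer swap
        new_layout = list(self.layout)
        new_layout[index] = new_id
        self.layout = new_layout      # switch to the new block in one step

    def read(self):
        return b"".join(self.blocks[i] for i in self.layout)


f = CowFile([b"aa", b"bb"])
f.modify(1, b"XX", crash=True)
print(f.read())   # b'aabb', the old version is still intact
f.modify(1, b"YY")
print(f.read())   # b'aaYY'
```

The old block is never touched, so an interrupted write leaves the previous version of the file readable.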

Snapshotting

BTRFS also supports writable and read-only snapshots. This is pretty cool because you could set up a computer using BTRFS and take a read-only snapshot right away. If anything ever keeps the OS from booting, that working, fresh install could just be restored. Snapshots are great for more everyday uses too. One could take a read-only snapshot of their disk daily, and if they get a virus that deletes (or encrypts, like WannaCry) all their data, they could just restore the disk back to how it was yesterday.
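
To show why snapshots are cheap, here's a small sketch. Volume is a made-up class where a snapshot only copies the name-to-data references, never the data itself, and wraps them in a read-only view. It's an analogy for how snapshots share data with the live filesystem, not BTRFS's real mechanism.

```python
from types import MappingProxyType


class Volume:
    def __init__(self):
        self.files = {}       # filename -> contents
        self.snapshots = {}   # snapshot label -> read-only view of the files

    def write(self, name, data):
        self.files[name] = data

    def delete(self, name):
        del self.files[name]

    def snapshot(self, label):
        # Copies only the references to the data, then freezes them.
        self.snapshots[label] = MappingProxyType(dict(self.files))

    def restore(self, label):
        self.files = dict(self.snapshots[label])


vol = Volume()
vol.write("notes.txt", b"important")
vol.snapshot("yesterday")

vol.delete("notes.txt")              # something nasty wipes the file...
vol.write("ransom.txt", b"pay up")   # ...and leaves a note

vol.restore("yesterday")
print(sorted(vol.files))             # ['notes.txt']
```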

Data Deduplication

This is a feature that I recently learned about. Data deduplication allows multiple copies of the same file to take up almost no extra physical space on disk.

Let's assume that I have a 1GB file on my disk, and I copy it. Normally, copying that one file would result in the 1GB original and a 1GB copy of it. BTRFS can avoid this by only copying the pointers (the information about where the file's data is on disk) instead of the data itself. On Linux, this kind of copy can be requested with cp --reflink, and separate deduplication tools can find identical data that's already on disk and merge it after the fact. Since the two files are the same, there's no need to copy the actual data (which could take a long time). Copying the far smaller pointers is the more efficient approach, as they still lead to the same data on disk. If one of the files is changed later, copy on write gives the changed part its own space, so the other file isn't affected.
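
Here's a toy version of the idea. One caveat up front: this sketch deduplicates by hashing each block as it's written, which is a simplification I picked because it's easy to show. It isn't how BTRFS does it, and DedupStore and its tiny 4-byte block size are made up for the example.

```python
import hashlib


class DedupStore:
    def __init__(self):
        self.blocks = {}   # content hash -> data, each unique block stored once
        self.files = {}    # filename -> list of content hashes (the "pointers")

    def write(self, name, data, block_size=4):
        hashes = []
        for start in range(0, len(data), block_size):
            chunk = data[start:start + block_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)  # store only if unseen
            hashes.append(digest)
        self.files[name] = hashes

    def read(self, name):
        return b"".join(self.blocks[h] for h in self.files[name])

    def physical_size(self):
        """Bytes actually stored, no matter how many files point at them."""
        return sum(len(chunk) for chunk in self.blocks.values())


store = DedupStore()
store.write("original", b"abcdefgh")
store.write("copy", b"abcdefgh")
print(store.physical_size())   # 8, not 16
print(store.read("copy"))      # b'abcdefgh'
```

Both files read back in full, but the data behind them only exists once.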

Compression

BTRFS also allows the user to turn on compression. With it enabled, data is compressed before it is written to disk. This costs some extra CPU time on reads and writes, which can slow down access, but more data can be stored on the disk.
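
To get a feel for the trade-off, here's a quick demonstration using Python's built-in zlib module. The sample text is made up and very repetitive, so it compresses far better than typical files would. Real results depend heavily on the data.

```python
import zlib

# Repetitive sample data, chosen because it compresses well.
text = b"BTRFS can compress data before writing it to disk. " * 200

compressed = zlib.compress(text)
print(len(text), "bytes before,", len(compressed), "bytes after")

# Reading the data back means decompressing it first.
restored = zlib.decompress(compressed)
print(restored == text)   # True
```

Data that's already compressed (like most photos and videos) won't shrink much, so the savings vary a lot from file to file.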

Wrapping Up

All in all, BTRFS is a pretty awesome file system in my opinion. I would like to mention that it's still being developed, and not all of its features are 100% stable. I'm personally hopeful that this file system gets more widely adopted.

Refer to the BTRFS Status page for an up-to-date listing of how stable each feature is.
