Disclaimer: I take no responsibility for any data loss resulting from following this guide. Try to understand what you are doing at all times and don’t copy-paste commands without reading carefully.
BTRFS is a great filesystem for holding data. It is a modern copy-on-write filesystem featuring enhanced data integrity mechanisms to avoid silent data corruption, snapshots that can be synchronized to remote machines, and transparent data compression.
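For instance, compression is enabled per mount, and snapshots are plain subvolumes that can be shipped to another machine with send/receive. A minimal sketch, assuming a filesystem mounted at /data and a reachable backup-host (both hypothetical names):

```shell
# Mount with transparent zstd compression (applies to newly written data).
mount -o compress=zstd /dev/sdb1 /data

# Take a read-only snapshot (read-only is required for btrfs send).
btrfs subvolume snapshot -r /data /data/.snapshot-1

# Replicate the snapshot to another machine over SSH.
btrfs send /data/.snapshot-1 | ssh backup-host btrfs receive /backup
```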
As with any other COW filesystem, the price to be paid is write performance. Because every write also entails a read and a copy operation, COW filesystems perform worse under write-heavy loads, such as some database applications.
The most robust way to be protected against a drive failure is to use RAID, which BTRFS supports natively. This obviously has a monetary cost and, for many humble setups, a big performance impact. For this reason, in this article we will focus on the non-RAID case.
When the drive fails
We self hosters and data hoarders sooner or later have to face the situation where we have to recover our data from a hard drive that is failing. All drives fail sooner or later.
This can be a dramatic situation, definitely not fun at all. BTRFS is complex and the restoration process can be confusing. It is important to keep calm, try not to get frustrated, and remember that BTRFS tries really hard to avoid losing or corrupting data. If BTRFS doesn’t let us mount the partition, or if the system becomes read-only, it is for a good reason.
Normally the situation is produced by hardware failure, so our first goal is to get the data out as soon as we can before the drive completely dies.
Depending on the scope of the damage, we can be in different situations:
- If the data itself is corrupted, parts of the files will hold bad or inaccessible data. If this is the case, we will see checksum errors in the log. To some extent, BTRFS is capable of recovering some of the information thanks to its inherent redundancy; we can attempt this by scrubbing.
[ 1901.435050] BTRFS error (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 11382, gen 0
[ 1901.435062] BTRFS error (device sda1): unable to fixup (regular) error at logical 13787648000 on dev /dev/sda1
- If the journal is inconsistent, BTRFS will not let us mount the drive with write permissions. This situation occurs whenever a write operation is interrupted and the journal log doesn’t have enough information to guarantee data integrity. This could be OK or very bad depending on what data we are talking about. If the hardware is not failing but we just had an unfortunate power cut, we have the option of accepting the data loss. In order to throw away the incomplete transactions we use
btrfs rescue zero-log.
parent transid verify failed on 31302336512 wanted 62455 found 62456
parent transid verify failed on 31302336512 wanted 62455 found 62456
parent transid verify failed on 31302336512 wanted 62455 found 62456
- In case that the superblock is affected, we will not be able to mount at all. The superblock is the root of the filesystem tree and contains the information that the operating system needs to mount the unit. This is normally easy to fix because BTRFS saves extra copies at different locations, so we might be able to mount using a copy of a superblock, or even run
btrfs rescue super-recover to try to fix it.
[11152.189762] BTRFS: failed to read chunk tree on sda
[11152.196224] BTRFS: open_ctree failed
- We could also be in a situation where the filesystem metadata is damaged even though the actual data is intact. This means that BTRFS doesn’t know about the existence of files, or parts of files, so we have no way to access them. In this scenario, if we react quickly, we can scan the whole disk and try to rebuild the filesystem metadata tree with
btrfs rescue chunk-recover. Needless to say, this is both risky and very slow.
[19409.487603] BTRFS error (device sda1): bad tree block start 11106478115207782198 875249664
[19412.395884] BTRFS error (device sda1): bad tree block start 11106478115207782198 875249664
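Whichever of these symptoms you see, the cumulative per-device error counters give a quick overview of what the kernel has recorded so far. Assuming the filesystem is mounted at /mnt:

```shell
# Print write/read/flush/corruption/generation error counters per device.
btrfs device stats /mnt

# The counters persist across reboots; after a repair they can be reset.
btrfs device stats --reset /mnt
```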
Let us see what a general procedure would look like.
Rule zero is of course to have backups. This will allow us to sleep well at night and handle a bad drive situation with a much cooler head. I can’t stress this enough: have at least three copies in two different locations. Everything will be easier and less stressful when a drive fails, which will happen.
Then, rule number one is to monitor your hard drives’ health. This is also critical because you will normally get a warning at least 24 or 48 hours before total failure, so you have a good chance of getting your data out before it is too late. Hard drives don’t completely fail from one day to the next, but we need to pay attention to them.
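One common way to do this is with the smartmontools package; the device name below is just an example:

```shell
# Full SMART report: watch the reallocated and pending sector counts.
smartctl -a /dev/sda

# Quick overall health verdict.
smartctl -H /dev/sda

# Kick off a short self-test; results appear in smartctl -a afterwards.
smartctl -t short /dev/sda
```

The smartd daemon from the same package can also be configured to email you when attributes start degrading.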
If you do not have a backup or you want to try to rescue data that was modified after the time of the last backup, keep reading.
Procedure if the drive can be mounted
So you received the warning. The first thing to do is unmount the drive and turn it off before it keeps degrading. Then, look for this post and study your strategy before starting the following steps. Your drive might have just a few hours left.
First, mount your drive in read-only mode and try to copy your data out normally. If it is the root filesystem, boot from a live CD and proceed to repair from there. Keep a terminal open with the kernel logs,
dmesg -w, and watch out for errors like these:
[386229.214384] BTRFS warning (device sda1): sda1 checksum verify failed on 568344576 wanted B77C6306 found D884C20D level 0
[386229.223445] BTRFS warning (device sda1): sda1 checksum verify failed on 568344576 wanted B77C6306 found BB7B11CF level 0
Some files might fail copying if they are in a damaged section, or some might appear to copy fine but throw errors in the kernel logs. Copy your main folders one by one. Try to copy first the most valuable stuff and take notes of what might be corrupted from the information in the logs.
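A minimal sketch of this phase, with hypothetical destination paths; rsync is handy here because an interrupted run can simply be repeated:

```shell
# Mount strictly read-only so nothing else is written to the failing drive.
mount -o ro /dev/sdXY /mnt

# Copy the most valuable folders first, keeping a log of errors
# to know which files might be corrupted.
rsync -a /mnt/documents/ /safe/documents/ 2>>/safe/rsync-errors.log
rsync -a /mnt/photos/ /safe/photos/ 2>>/safe/rsync-errors.log
```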
Next, let’s try to repair as much as we can. For this we will first try scrubbing with
btrfs scrub. This will check data integrity using checksums and will try to recover the damaged data. Scrubbing is considered safe and is usually the first thing to try.
Run it with
# btrfs scrub start /mnt
and follow the progress with
# btrfs scrub status /mnt
This typically takes a couple of hours.
This will fix as much as it can, but might not be able to fix all the issues. Try again to copy to safety whatever had errors before. Probably we will have fixed many files and maybe even all of them.
If you get errors referencing an inode in the logs, like these:
[ 5488.731343] BTRFS warning (device sda1): csum failed root 5 ino 40913 off 28815360 csum 0xf4702fd5 expected csum 0xfe7c816f mirror 1
[ 5488.731830] BTRFS warning (device sda1): csum failed root 5 ino 40913 off 28815360 csum 0xf4702fd5 expected csum 0xfe7c816f mirror 1
[ 5488.732189] BTRFS warning (device sda1): csum failed root 5 ino 40913 off 28815360 csum 0xf4702fd5 expected csum 0xfe7c816f mirror 1
you can see which file they correspond to, if the metadata is not damaged, with
# btrfs inspect-internal inode-resolve 40913 /mnt
If it does not mount
If the superblocks are damaged, the partition will not mount. Keep in mind that
btrfs scrub only works on a mounted filesystem, so if you manage to mount the drive by any of the means below, it is worth running a scrub as described above and then proceeding with the previous steps.
If scrubbing was not possible or not enough, we can try to mount using a backup of the root tree in read-only mode, which doesn’t alter the data and is completely safe.
# mount -o usebackuproot /dev/sdXY /mnt
Try to save as much as you can
Those steps normally work well enough. If the above didn’t suffice, there is no completely safe way of getting the data back. We have to try to get out as much as we can, keeping in mind that what we recover could very well be at least partially corrupted.
The best thing to do at this point is to run
# btrfs restore /dev/sdXY /mnt/
This is completely safe: it does not write to the damaged device, and it will try its best to copy a version of the data into
/mnt that is as sound as possible. For instance, a file could be restored without integrity errors but be an old version from an old snapshot. It is still worth getting this information out before trying potentially destructive tools.
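A couple of btrfs restore options are worth knowing; the destination directory below is just an example. A dry run shows what would be recovered without writing anything:

```shell
# List what restore would recover, without writing anything.
btrfs restore -D /dev/sdXY /safe/restored/

# Actually restore, including extended attributes (-x) and symlinks (-S).
btrfs restore -x -S /dev/sdXY /safe/restored/
```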
If we are still not able to mount normally, we can now run
btrfs rescue super-recover, which will try to restore the superblock from a good copy. This is not completely safe.
As mentioned before, if your metadata is corrupt, there is a chance that files, or parts of files, that are not damaged are invisible to the filesystem. In this scenario, we can use
btrfs rescue chunk-recover /dev/sdXY to scan the whole drive contents and try to rebuild the metadata trees. This will take a very long time, especially for big drives, and could result in some of the data being wrongly restored.
The absolute last resort
We are used to running
fsck, or filesystem check, as soon as we see something weird on ext4. Well, don’t do this on BTRFS.
btrfs check should be the last resort: it will try hard to repair the filesystem, and there is a very high chance that it will make things worse.
While the commands above will very rarely cause any more damage, and some, such as scrubbing or restore, are totally safe, this one will very likely mess things up. We have to understand this before following our ext4 instincts.
# btrfs check --repair /dev/sdXY
Verify your copies
After moving all the data to a safe place, we will probably want to compare it with our backups to see what information is missing from the backup.
In general, the information in our backups will be more trustworthy than the one we tried to save from the failing drive, so ideally we only want to update the backup with the new data that was added or modified since the last copy.
In order to do this, the following commands will be handy. The first one will only compare file names:
$ rsync --dry-run -ri --delete --ignore-existing /copy/ /old-backup/
and this one will compare the checksum of each file in both folders:
$ rsync --dry-run -ri --delete --checksum /copy/ /old-backup/
Naturally, the latter can take a while. Neither of these commands modifies any actual data; if you want to proceed, remove the --dry-run flag.
I hope this post helps you understand what a good strategy might look like, and helps you make sense of the variety of recovery tools and options that BTRFS has to offer.