btrfs, linux

How to recover a BTRFS partition

Disclaimer: I take no responsibility for any data loss resulting from following this guide. Try to understand what you are doing at all times and don’t copy-paste commands without reading carefully.

BTRFS is a great filesystem for storing data. It is a modern copy-on-write filesystem featuring enhanced data-integrity mechanisms that avoid silent data corruption, snapshots that can be synchronized to remote machines, and transparent data compression.

As with any other COW filesystem, the price to be paid comes in terms of write performance. Because every write also entails a read and a copy operation, COW filesystems perform worse under write-heavy loads, such as some database applications.

The most professional way to be protected against a drive failure is to use RAID, which BTRFS supports natively. This obviously has a monetary cost and, for many humble architectures, a big performance impact. For this reason, in this article we will focus on the non-RAID case.

When the drive fails

We self-hosters and data hoarders will sooner or later have to face the situation where we need to recover our data from a failing hard drive. All drives fail eventually.

This can be a dramatic situation, definitely not fun at all. BTRFS is complex and the restoration process can be confusing. It is important to keep calm, try not to get frustrated, and remember that BTRFS tries really hard to avoid losing or corrupting data. If BTRFS doesn’t let us mount the partition, or if the system becomes read-only, it is for a good reason.

Normally the situation is caused by hardware failure, so our first goal is to get the data out as soon as we can, before the drive dies completely.

Depending on the scope of the damage, we can be in different situations:

  • If the data itself is corrupted, parts of the files will have bad or inaccessible data. If this is the case, we will see checksum errors in the log. To some extent, BTRFS is capable of recovering some of the information thanks to its inherent redundancy; we can do this by scrubbing.
  • If the journal is inconsistent, BTRFS will not let us mount the drive with write permissions. This situation occurs whenever a write operation is interrupted and the journal log doesn’t have enough information to guarantee data integrity. This could be OK or very bad depending on what data we are talking about. If the hardware is not failing but we just had an unfortunate power cut, we have the option of accepting the data loss. In order to throw away the incomplete transactions we use btrfs rescue zero-log, as shown in the sketch after this list.
  • If the superblock is affected, we will not be able to mount at all. The superblock is the root of the filesystem tree and contains the information that the operating system needs to mount the unit. This is normally easy to fix because BTRFS saves extra copies at different locations, so we might be able to mount using a copy of the superblock, or even run btrfs rescue super-recover to try to fix it.
  • We could also be in the situation where the filesystem metadata is damaged even though the actual data might be intact. This means that BTRFS doesn’t know about the existence of files or parts of files, so we have no way to access them. In this scenario, if we react quickly, we can scan the whole disk and try to rebuild the filesystem metadata tree with btrfs rescue chunk-recover. Needless to say, this is both risky and very slow.
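
For the journal case, this is what discarding the incomplete transactions would look like; a sketch, with /dev/sdXY as a placeholder for the unmounted partition (this throws away the last unfinished writes, so only use it when you accept that loss):

    sudo btrfs rescue zero-log /dev/sdXY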

Let us see what a general procedure would look like.

Be prepared

Rule zero is of course to have backups. This will allow us to sleep well at night and handle a bad drive situation with a much cooler head. I can’t stress this enough: have at least three copies in two different locations. Everything will be easier and less stressful when a drive fails, which will happen.

Then, rule number one is to monitor your hard drives’ health. This is also critical, because you will normally get the warning at least 24 or 48 hours before total failure, so you have a good chance of getting your data out of there before it is too late. Hard drives don’t completely fail from one day to the next, but we need to pay attention to them.

SMART errors in NextCloudPi
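
One common way to do this (not the only one) is with the smartmontools package; a sketch, with /dev/sdX as a placeholder for the drive to monitor:

    sudo smartctl -H /dev/sdX   # quick overall health self-assessment
    sudo smartctl -a /dev/sdX   # full SMART attributes, self-test results and error log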

If you do not have a backup or you want to try to rescue data that was modified after the time of the last backup, keep reading.

Procedure if the drive can be mounted

So you received the warning. The first thing to do is to unmount the drive and power it off before it keeps degrading. Then, look for this post and study your strategy before starting the following steps: your drive might have just a few hours left.

First, mount your drive in read-only mode and try to copy your data out normally. If it is the root filesystem, boot from a live CD and proceed to repair from there. Keep a terminal open with the kernel logs (dmesg -w), and watch out for BTRFS errors such as checksum failures.
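
A sketch of this step, with /dev/sdXY as a placeholder for the failing partition and /mnt as the mount point:

    sudo mount -o ro /dev/sdXY /mnt   # mount read-only so nothing is written to the failing drive
    sudo dmesg -w                     # in a second terminal, follow the kernel logs while copying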

Some files might fail to copy if they sit in a damaged section, and some might appear to copy fine but throw errors in the kernel logs. Copy your main folders one by one, starting with the most valuable data, and take notes of what might be corrupted based on the information in the logs.

Next, let’s try to repair as much as we can. For this we will first try scrubbing with btrfs scrub. This will check data integrity using checksums and will try to recover the damaged data. Scrubbing is considered safe and is usually the first thing to try.

Run it with btrfs scrub start, and follow the progress with btrfs scrub status. This typically takes a couple of hours. Note that scrub operates on a mounted filesystem.
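
A sketch, assuming the filesystem is mounted at /mnt (adjust to your mount point):

    sudo btrfs scrub start /mnt    # verify all data against its checksums, repairing what it can
    sudo btrfs scrub status /mnt   # show progress and the number of errors found and corrected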

This will fix as much as it can, but it might not be able to fix all the issues. Try again to copy to safety whatever had errors before; we will probably have fixed many files, and maybe even all of them.

If the kernel logs report errors by inode number, and the metadata is not damaged, you can find out which file an inode corresponds to with btrfs inspect-internal inode-resolve.
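
A sketch, assuming the filesystem is mounted at /mnt and the logs mentioned inode 257 (both placeholders):

    sudo btrfs inspect-internal inode-resolve 257 /mnt   # print the file path(s) behind inode 257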


If it does not mount

If the superblocks are damaged, the partition will not mount. Note that btrfs scrub only runs on a mounted filesystem, so it cannot help us until we manage to mount again.
In this case, we can try to mount using a backup of the root tree in read-only mode, which doesn’t alter the data and is completely safe. If it mounts, proceed with the steps above to copy the data out and scrub.
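
A sketch, again with /dev/sdXY and /mnt as placeholders; the usebackuproot option tells the kernel to try the backup copies of the root tree if the main one is unreadable:

    sudo mount -o ro,usebackuproot /dev/sdXY /mnt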

Try to save as much as you can

Those steps normally work well enough. If the above didn’t suffice, there is no completely safe way of getting the data back. We have to try to rescue as much as we can, keeping in mind that what we recover could very well be at least partially corrupted.

The best thing to do at this point is to run btrfs restore.
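
A sketch, with /dev/sdXY as the damaged partition and /mnt as the destination directory for the recovered files, which should live on a healthy drive (both placeholders):

    sudo btrfs restore /dev/sdXY /mnt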

This is completely safe: it does not write to the damaged partition, and it will do its best to extract a copy of the data into /mnt that is as sound as possible. For instance, a file could be recovered without integrity errors but turn out to be an old version from an old snapshot. It is still worth getting this information out before trying potentially destructive tools.

If we are still not able to mount normally, we can now run btrfs rescue super-recover, which will try to restore the superblock from one of its good copies. This is not completely safe.
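
Again, a sketch with /dev/sdXY as a placeholder for the unmounted partition:

    sudo btrfs rescue super-recover /dev/sdXY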

As mentioned before, if your metadata is corrupt, there is a chance that files, or parts of files, that are not damaged are not seen by the filesystem. In this scenario, we can use btrfs rescue chunk-recover /dev/sdXY to scan the whole drive contents and try to rebuild the metadata trees. This will take a very long time, especially for big drives, and could result in some of the data being wrongly restored.

The absolute last resort

We are used to running fsck, the filesystem check, as soon as we see something weird on ext4. Well, don’t do this on BTRFS. btrfs check --repair should be the last resort: it will try hard to restore the filesystem, and there is a very high chance that it will make things worse.

While the commands above will very rarely cause any more damage, and some, such as scrubbing or restore, are totally safe, this one will very likely mess things up. We have to understand this before following our ext4 instincts.
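
If everything else failed and everything recoverable has already been copied out, this is what that last resort would look like, with /dev/sdXY as a placeholder for the unmounted partition:

    sudo btrfs check --repair /dev/sdXY   # DANGEROUS: only after all other options are exhausted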

Verify your copies

After moving all the data to a safe place, we will probably want to compare it with our backups to see what information is missing from the backup.

In general, the information in our backups will be more trustworthy than what we rescued from the failing drive, so ideally we only want to update the backup with the new data that was added or modified since the last copy.

In order to do this, the following commands will be handy. The first command will only compare file names (plus size and modification time), and the second one will compare the checksum of each file in both folders.
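
A sketch using rsync in dry-run mode, with the recovered data in /mnt/recovered and the backup in /backup (both placeholders):

    # compare which files exist and differ on each side, without reading their contents
    rsync -av --dry-run /mnt/recovered/ /backup/

    # compare the actual contents of each file by checksum (much slower)
    rsync -avc --dry-run /mnt/recovered/ /backup/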

Naturally, the latter can take a while. Neither of these commands modifies any actual data; if you want to proceed with the actual synchronization, remove the --dry-run parameter.

Conclusion

I hope this post helps you understand what a good strategy might look like, and helps make sense of the variety of recovery tools and options that BTRFS has to offer.

Author: nachoparker

Humbly sharing things that I find useful [ github dockerhub ]

8 Comments on “How to recover a BTRFS partition”

  1. Thanks for your very good article. Unfortunately I found this too late. I already restored my broken btrfs from a backup.

    I saw that you recommend running “btrfs scrub” if you can’t mount the fs. But scrub only runs on a mounted filesystem. At least on my Debian unstable with kernel 4.19 and btrfs-progs 4.20.2.

  2. I was in a panic and found this article. Your suggestion fixed what I thought was the end of my partition:
    btrfs rescue zero-log

  3. I found this article at the last moment, just before reformatting my disk.
    btrfs restore /dev/sdXY /mnt/ worked for me!! Great.
    Now I have doubts about how to copy /mnt to the original hard disk.
    Can you help me?
    Thanks, You saved me more than 2 months of work.

  4. Wow! This should be documented as part of the btrfs manual. I would like to ask about a condition, though. In my current case, there’s no error when I scrub or check --repair, and I can even balance perfectly. However, on every mount the corrupt value is never 0, while wr, rd and flush are (gen is also not 0 sometimes, but it’s a very small value, under 10). Is that a signal of an actual problem or something I can just ignore? Thanks.

  5. Thank you a lot.
    In my case ‘mount -o usebackuproot’ helped resolve my problem with unmountable file system.
    I could not fix it with any of ‘btrfs check’ and ‘btrfs rescue’ options.
    Upgrade of btrfs-progs from 4.15.1 to 5.1 (Ubuntu 18.04 Bionic to 19.10 Eoan) also did not help.
    After the umount, the system had become mountable in the usual way, and I reinserted this ‘crashed’ volume back into Synology DSM, which happily saw it green again.

  6. I have been experimenting with btrfs on a bunch of loop devices for two hours now. After a few mounts with -o degraded, scrub showed a few thousand ‘uncorrectable errors’. Guys, you should definitely try ZFS instead. Last week I finally replaced two failing disks of a two-way mirror setup with over a thousand corrected errors on both disks, with no data loss at all.

    And for massive storage I use LizardFS with chunkservers backed by ZFS storage, with ec(4,2) for personal data and xor5 for other (non-critical) data.

    Please keep in mind that Btrfs was originally an Oracle filesystem, developed to compete with ZFS.

    Take care,
    Marcin.
