Good morning all, in today’s episode of “What I learned during work hours”…
I was playing around with wxHexEditor and realised that if something catastrophic happened, I would really struggle with any data recovery if I lost the inode tables for any drive.
A quick duckle pointed me to e2image, whose man page says:
It is a very good idea to create image files for all file systems on a system and save the partition layout (which can be generated using the fdisk -l command) at regular intervals — at boot time, and/or every week or so.
I couldn’t find any prebuilt solutions for this online, so I wrote a systemd service and timer to do it for me. I save the fdisk output to a text file, run e2image on a couple of drives, and compress it all together into a dated 7z that can be uploaded via rsync or Mega or Dropbox etc.
The metadata image from a 500 GB drive is 8 GB, but compresses down to 40 MB. The backup takes a couple of minutes.
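If you just want to try the idea by hand before setting up the automation, it’s roughly this (device and file names here are only examples):

# save the partition layout, dump the ext4 metadata, archive both
sudo fdisk -l > partition_layout.txt
sudo e2image /dev/sdb1 sdb1.e2i
7z a drive_meta.7z partition_layout.txt sdb1.e2i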
Unfortunately this does not work with my RAID drives, but they are RAID1 so already resilient.
Apparently I was being a derp somehow. …Anyways,
My RAID drives are 16 TB; the e2image of those is 125 GB, and 7z’d it comes down to just 63 MB.
I’ll post the service, timer, and backup script in a comment, let me know if you can spot anywhere for improvements!
How important this is will depend on your file system of choice.
testdisk can recover many broken partition tables, and most file systems keep multiple copies of the necessary tables on disk just in case something goes wrong. There’s a good chance you can recover an ext4 partition with mkfs.ext4 -S if you know the offset on disk (and didn’t specify any alternative options). It’ll recreate the file system structure on disk but leave most of the data intact. I’ve had to resort to doing this, and it took a little fine-tuning of the options (make backups of the drives you recover!), but the data came back as if nothing ever happened.
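For what it’s worth, that recovery looked roughly like this; treat it as a sketch, not a recipe (the device name is an example, and the block size and any other creation-time options have to match whatever the filesystem was originally made with):

# DANGER: work on a copy of the damaged partition!
# -S rewrites only the superblock and group descriptors, leaving data blocks alone
sudo mkfs.ext4 -S -b 4096 /dev/sdX1
# then let fsck pick up the pieces
sudo e2fsck -fy /dev/sdX1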
If you’re afraid of data loss, mkfs.ext4 will also allow you to enable up to two backups of the superblock, right in the middle of the file system.

Other filesystems like btrfs and NTFS also have backup superblocks, of course. Things become more complicated when you start spreading your filesystem across drives (in a non-redundant fashion), but all in all I don’t think taking snapshots of a drive’s structure is that important, as long as you have your normal backup procedures.
Now, your LUKS headers, those you may want to back up! If you fat-finger a dd and overwrite the LUKS header, there’s basically no recovering that data. If the system is still running you could dump the key and try to manually reconstruct the header, but if you turn your computer off after clobbering the LUKS header there’s no getting your data back, no matter how many recovery phrases and key files you may have saved!

This can also be used to your advantage, e.g. by storing the LUKS header on a removable drive so that there is no way to decrypt your drive even if someone knows your password (and neither can you, if you lose the flash drive!).
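If you want that safety net, cryptsetup can dump and restore the header for you (device and file names are just examples):

# save a copy of the LUKS header somewhere off the disk itself
sudo cryptsetup luksHeaderBackup /dev/sdX2 --header-backup-file sdX2-header.img
# restore it if the on-disk header ever gets clobbered
sudo cryptsetup luksHeaderRestore /dev/sdX2 --header-backup-file sdX2-header.img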
Great tips, thanks!
I’m using ext4 across everything I think.
Can you enable backup superblocks after you’ve already formatted the drive?
fdisk reports the offsets, so keeping a record of those at least sounds like a good idea.
There’s a good chance you have plenty of backup superblocks already.
Try running
sudo dumpe2fs /dev/your-partition-here
or, lacking that,
mke2fs -n /dev/your-partition-here
(make sure to specify -n so you don’t overwrite your filesystem) and look for backup superblock offsets in the output. There’s a good chance you have a whole bunch of them spread throughout the disk.
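And if you ever need to use one of them, e2fsck can be pointed at a backup superblock directly (32768 below is just the usual first backup on a 4 KiB-block filesystem; use an offset from your own dumpe2fs output):

# list just the superblock locations
sudo dumpe2fs /dev/your-partition-here | grep -i superblock
# repair using a backup if the primary superblock is toast
sudo e2fsck -b 32768 /dev/your-partition-here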
This is why I love having LUKS covering my entire system disk. If I want to upgrade the system with a new drive, move the drive to a different PC, sell it, or dispose of it, I just dd over the first couple of gigs to obliterate the LUKS header.
It’s obviously essential to have a backup strategy, but full disk encryption is the only way to go for me.
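For reference, the nuke is something like this; irreversible, obviously, and the device name is an example:

# overwrite the start of the disk, taking the LUKS header
# (and any hope of decryption) with it
sudo dd if=/dev/urandom of=/dev/sdX bs=1M count=2048 status=progress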
Fantastic. I’m following!
The script takes the drives as arguments:
$ pwd
/usr/lib/systemd/system
$ cat drive_backup.service
[Unit]
Description=backup fdisk + e2image
Wants=drive_backup.timer

[Service]
Type=oneshot
ExecStart=/usr/bin/backup_meta_data.sh /dev/sdc1 /dev/sdb1

[Install]
WantedBy=multi-user.target
Set to run at 3:40am every day, but probably could be once weekly really.
$ cat drive_backup.timer
[Unit]
Description=timer to run drive backup
Requires=drive_backup.service

[Timer]
Unit=drive_backup.service
OnCalendar=*-*-* 03:40:00

[Install]
WantedBy=timers.target
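If you do want it weekly instead, I think the only change needed is the OnCalendar line, something like:

# run Mondays at 03:40 instead of daily
OnCalendar=Mon *-*-* 03:40:00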
Should be fairly self-explanatory.
$ cat /usr/bin/backup_meta_data.sh
#!/bin/bash
working_dir=/home/st/drive_recovery/working
backup_dir=/home/st/drive_recovery
backup_date=$(date +%Y%m%d-%H%M)

mkdir -p "$working_dir"

# record the partition layout alongside the metadata images
sudo fdisk -x > "$working_dir/$backup_date.fdisk"

# dump ext metadata for each device passed as an argument
for var in "$@"
do
    clean=$(echo "$var" | sed 's;/;-;g')   # /dev/sdb1 -> -dev-sdb1
    sudo e2image "$var" "$working_dir/$backup_date.$clean"
done

# compress everything into one dated archive, then clean up
sudo 7z a "$backup_dir/$backup_date.archive" "$working_dir/$backup_date"*
sudo rm "$working_dir/$backup_date"*
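For anyone dropping these in as-is: it’s the timer that gets enabled, not the service.

sudo systemctl daemon-reload
sudo systemctl enable --now drive_backup.timer
systemctl list-timers drive_backup.timer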
May I point out that all a RAID1 does is sync the blocks between two drives. It won’t protect against writing something dumb that would mess up the filesystem, it will just dutifully sync it.
You should be able to back up ext data from a filesystem on a RAID array, unless I’m confused about what e2image actually does. Are you trying to use it on the underlying drive devices by any chance? You have to point it at the RAID device on top of them, something like /dev/md1 rather than /dev/sda1.
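Something like this, assuming the array shows up as /dev/md1 (check cat /proc/mdstat or lsblk for yours):

# dump metadata from the assembled array, not from its member disks
sudo e2image /dev/md1 md1-metadata.e2i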
This sounds like a good extra backup to have, but don’t let it lull you into a false sense of security. It may help you recover from a very specific kind of mistake, and the recovery itself will be just as specific. It’s not a file backup.
Oh you’re right it does work… well fuck knows what I was doing wrong before.
Yeah, this is a backup for the case where I, like, mv a file to /dev/sda1 or something.
Not a backup of the files, but a backup of the structure.
I’m really curious as to why you’d go to all this trouble instead of using a proper file level backup and restore solution.
instead of using a proper file level backup
Backups do not solve everything.
For example, I once had a bad cable, and it did a kind of sneaking, silent damage: let’s say 5 or 50 broken files every day. Only after some weeks did I notice some of them, and there was hardly a chance to identify them each day. Sometimes there was damage to the file system, too. It took a while to find the root cause.
Today I use ZFS with redundancy and it does the recovery all by itself and my sleep is so much better :-)
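Roughly what that looks like, if anyone’s curious (pool name and devices are just examples):

# two-way mirror: checksums catch silent corruption, the mirror heals it
sudo zpool create tank mirror /dev/sda /dev/sdb
# periodically read and verify every block, repairing what it can
sudo zpool scrub tank
sudo zpool status -v tank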
Ok time to investigate ZFS
“Proper backups” imply that you have multiple backups and a backup strategy. That could mean, for instance, doing a full backup, then an incremental/differential backup each week, and keeping one backup for each month. A bad cable would cause you trouble, no doubt, but the impact would be lessened by having multiple backup points spread over months.
Redundancy is not backup. Read that again.
Redundancy is important for system resilience, but backup is crucial for continuity. Every filesystem is subject to bugs and ZFS is not special. Here’s an article from a couple of days ago. If you’re comfortable with no backups just because you have redundancy, more power to you. I wouldn’t be.
I wasn’t saying backups are useless or something.
I was saying there are situations that backups can’t solve.
Sure, all the work you do between the moment of the filesystem failure and the last backup is gone. There’s nothing that can be done to mitigate that fact, other than more frequent backups and/or a synchronized (mirror) system.
Backups are just a simple way to keep you from having to explain to your partner that you lost all the pictures and videos you took over the years.
For fun and learning. It’s just another tool to go with file level backup.
And the backup for this is 40 MB and really fast, but backing up files even when compressed would be hundreds of GB, maybe terabytes, and then you’re paying for that amount of storage online somewhere, uploading for hours…
Picture this: you open and edit one of your documents and save it.
The filesystem promptly allocates some blocks and updates the inodes. Maybe the inode table changed, maybe not. Repeat for some other files. Now your “inode backup” has a completely different picture of what is going on on your disk. If you try to recover the disk using it, all you will achieve is further corruption of the filesystem.
e2image
There’s a good reason for the 2 in the name.
Today we have ext4, and ZFS of course.
e2image - Save critical ext2/ext3/ext4 file system metadata to a file