This part is either very easy or distinctly less so, depending on whether the file you are trying to recover is more than 12 blocks long.
10.1 Short files
If the file was no more than 12 blocks long, then the block numbers of all its data are stored in the inode: you can read them directly out of the stat output for the inode. Moreover, debugfs has a command which performs this task automatically. To take the example we had before, repeated here:
debugfs: stat <148003>
Inode: 148003 Type: regular Mode: 0644 Flags: 0x0 Version: 1
User: 503 Group: 100 Size: 6065
File ACL: 0 Directory ACL: 0
Links: 0 Blockcount: 12
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x31a9a574 -- Mon May 27 13:52:04 1996
atime: 0x31a21dd1 -- Tue May 21 20:47:29 1996
mtime: 0x313bf4d7 -- Tue Mar 5 08:01:27 1996
dtime: 0x31a9a574 -- Mon May 27 13:52:04 1996
BLOCKS:
594810 594811 594814 594815 594816 594817
TOTAL: 6
This file has six blocks. Since this is less than the limit of 12, we get debugfs to write the file into a new location, such as /mnt/recovered.000:
debugfs: dump <148003> /mnt/recovered.000
Of course, this can also be done with fsgrab; I’ll present it here as an example of using it:
# fsgrab -c 2 -s 594810 /dev/hda5 > /mnt/recovered.000
# fsgrab -c 4 -s 594814 /dev/hda5 >> /mnt/recovered.000
With fsgrab, there will be some garbage at the end of /mnt/recovered.000, but that’s fairly unimportant. If you want to get rid of it, the simplest method is to take the Size field from the inode, and plug it into the bs option in a dd command line:
# dd count=1 if=/mnt/recovered.000 of=/mnt/resized.000 bs=6065
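To see how much trailing garbage to expect before trimming, the arithmetic can be sketched in shell. This is only a sketch, assuming the 1024-byte blocksize used throughout; the function name garbage_bytes is made up for this example and is not part of any tool.

```sh
# Bytes of padding left after the real data when whole blocks are copied.
# Usage: garbage_bytes SIZE_IN_BYTES [BLOCKSIZE]
# (hypothetical helper; BLOCKSIZE defaults to the assumed 1024 bytes)
garbage_bytes() {
    size=$1
    bs=${2:-1024}
    # Round the size up to a whole number of blocks.
    nblocks=$(( (size + bs - 1) / bs ))
    # Whatever lies beyond the Size field in the last block is garbage.
    echo $(( nblocks * bs - size ))
}

garbage_bytes 6065    # Size field from the inode above; prints 79
```

For the example inode, six blocks hold 6144 bytes, so the dd command above discards the final 79 bytes.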
Of course, it is possible that one or more of the blocks that made up your file has been overwritten. If so, then you’re out of luck: that block is gone forever. (But just imagine if you’d unmounted sooner!)
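The two fsgrab invocations in this section come from grouping the inode’s block list into contiguous runs. A minimal sketch of that grouping follows; the function name runs_to_fsgrab is hypothetical, the -c and -s options are used exactly as in the commands above, and the commands are only printed for review, never executed.

```sh
# Print one fsgrab command per contiguous run of block numbers.
# Usage: runs_to_fsgrab DEVICE OUTFILE BLOCK...
# (hypothetical helper; it echoes commands and does not touch the device)
runs_to_fsgrab() {
    dev=$1; out=$2; shift 2
    [ $# -gt 0 ] || return 0
    start=$1; count=1; redirect='>'
    shift
    for blk in "$@"; do
        if [ "$blk" -eq $(( start + count )) ]; then
            count=$(( count + 1 ))        # block extends the current run
        else
            echo "fsgrab -c $count -s $start $dev $redirect $out"
            redirect='>>'                 # append after the first run
            start=$blk; count=1
        fi
    done
    # Emit the final run.
    echo "fsgrab -c $count -s $start $dev $redirect $out"
}

runs_to_fsgrab /dev/hda5 /mnt/recovered.000 \
    594810 594811 594814 594815 594816 594817
```

Fed the BLOCKS line of the example inode, this prints the same two commands as above. It is only meaningful for short files, where every listed block is a data block.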
10.2 Longer files
The problems appear when the file has more than 12 data blocks. It pays here to know a little of how UNIX filesystems are structured. The file’s data is stored in units called `blocks’. These blocks may be numbered sequentially. A file also has an `inode’, which is the place where information such as owner, permissions, and type are kept. Like blocks, inodes are numbered sequentially, although they have a different sequence. A directory entry consists of the name of the file and an inode number.
But with this state of affairs, it is still impossible for the kernel to find the data corresponding to a directory entry. So the inode also stores the location of the file’s data blocks, as follows:
- The block numbers of the first 12 data blocks are stored directly in the inode; these are sometimes referred to as the direct blocks.
- The inode contains the block number of an indirect block. An indirect block contains the block numbers of 256 additional data blocks.
- The inode contains the block number of a doubly indirect block. A doubly indirect block contains the block numbers of 256 additional indirect blocks.
- The inode contains the block number of a triply indirect block. A triply indirect block contains the block numbers of 256 additional doubly indirect blocks.
Read that again: I know it’s complex, but it’s also important.
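If it helps, the scheme can be restated as a small shell sketch that reports which part of the inode’s block map covers a given data block of a file. The function name block_level is invented for illustration, data blocks are counted from 0, and the figure of 256 block numbers per indirect block is taken from the list above.

```sh
# Report which level of the inode's block map holds the number of
# data block INDEX (counting from 0).
# Usage: block_level INDEX [NUMBERS_PER_INDIRECT_BLOCK]
# (hypothetical helper; the second argument defaults to 256)
block_level() {
    idx=$1
    per=${2:-256}
    if [ "$idx" -lt 12 ]; then
        echo "direct"
    elif [ "$idx" -lt $(( 12 + per )) ]; then
        echo "indirect"
    elif [ "$idx" -lt $(( 12 + per + per * per )) ]; then
        echo "doubly indirect"
    elif [ "$idx" -lt $(( 12 + per + per * per + per * per * per )) ]; then
        echo "triply indirect"
    else
        echo "too large"
        return 1
    fi
}

block_level 11     # last block listed directly in the inode
block_level 12     # first block reached through the indirect block
block_level 268    # first block reached through the doubly indirect block
```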
Now, the current kernel implementation (certainly for all versions up to and including 2.0.30) unfortunately zeroes all indirect blocks (and doubly indirect blocks, and so on) when deleting a file. So if your file was longer than 12 blocks, you have no guarantee of being able to find even the numbers of all the blocks you need, let alone their contents.
The only method I have been able to find thus far is to assume that the file was not fragmented: if it was, then you’re in trouble. Assuming that the file was not fragmented, there are several layouts of data blocks, according to how many data blocks the file used:
0 to 12
The block numbers are stored in the inode, as described above.
13 to 268
After the direct blocks, count one for the indirect block, and then there are 256 data blocks.
269 to 65804
As before, there are 12 direct blocks, a (useless) indirect block, and 256 data blocks. These are followed by one (useless) doubly indirect block, and 256 repetitions of one (useless) indirect block and 256 data blocks.
65805 or more
The layout of the first 65804 blocks is as above. Then follow one (useless) triply indirect block and 256 repetitions of a `doubly indirect sequence’. Each doubly indirect sequence consists of a (useless) doubly indirect block, followed by 256 repetitions of one (useless) indirect block and 256 data blocks.
Of course, even if these assumed data block numbers are correct, there is no guarantee that the data in them is intact. In addition, the longer the file was, the less chance there is that it was written to the filesystem without appreciable fragmentation (except in special circumstances).
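Under the same no-fragmentation assumption, the number of (useless) indirect blocks interleaved with the data follows from the layouts above. The sketch below counts them for a file with a given number of data blocks; the name overhead_blocks is hypothetical, and 256 block numbers per indirect block is assumed unless a second argument says otherwise.

```sh
# Count the indirect, doubly indirect and triply indirect blocks that sit
# between the data blocks of an unfragmented file with N data blocks.
# Usage: overhead_blocks N [NUMBERS_PER_INDIRECT_BLOCK]
# (hypothetical helper following the layouts described above)
overhead_blocks() {
    n=$1
    per=${2:-256}
    meta=0
    rest=$(( n - 12 ))                   # data blocks beyond the direct ones
    if [ "$rest" -gt 0 ]; then
        meta=$(( meta + 1 ))             # the single indirect block
        rest=$(( rest - per ))
    fi
    if [ "$rest" -gt 0 ]; then
        dbl=$rest
        if [ "$dbl" -gt $(( per * per )) ]; then dbl=$(( per * per )); fi
        # One doubly indirect block, plus one indirect block per group
        # of up to `per` data blocks.
        meta=$(( meta + 1 + (dbl + per - 1) / per ))
        rest=$(( rest - per * per ))
    fi
    if [ "$rest" -gt 0 ]; then
        # One triply indirect block, then the doubly indirect and
        # indirect blocks needed for what remains.
        meta=$(( meta + 1 + (rest + per * per - 1) / (per * per) + (rest + per - 1) / per ))
    fi
    echo "$meta"
}

overhead_blocks 300    # indirect + doubly indirect + one more indirect
```

Adding the result to the number of data blocks gives the total span the file should occupy on disk if it really was written contiguously.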
You should note that I assume throughout that your blocksize is 1024 bytes, as this is the standard value. If your blocks are bigger, some of the numbers above will change. Specifically: since each block number is 4 bytes long, blocksize/4 is the number of block numbers that can be stored in each indirect block. So every time the number 256 appears in the discussion above, replace it with blocksize/4. The `number of blocks required’ boundaries will also have to be changed.
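As a rough sketch of that substitution, the following prints the `number of blocks required’ boundaries for a given blocksize. The function name layout_boundaries is made up for this example; the only facts used are the 12 direct blocks and the blocksize/4 rule just described.

```sh
# Print the data-block ranges for each layout, given the blocksize.
# Usage: layout_boundaries [BLOCKSIZE]
# (hypothetical helper; BLOCKSIZE defaults to 1024)
layout_boundaries() {
    bs=${1:-1024}
    per=$(( bs / 4 ))                    # block numbers per indirect block
    direct=12
    single=$(( direct + per ))           # direct + one indirect block
    double=$(( single + per * per ))     # ... + one doubly indirect block
    echo "0 to $direct"
    echo "$(( direct + 1 )) to $single"
    echo "$(( single + 1 )) to $double"
    echo "$(( double + 1 )) or more"
}

layout_boundaries 1024    # reproduces the ranges listed above
layout_boundaries 4096    # 0 to 12, 13 to 1036, 1037 to 1049612, ...
```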
Let’s look at an example of recovering a longer file.
debugfs: stat <148004>
Inode: 148004 Type: regular Mode: 0644 Flags: 0x0 Version: 1
User: 503 Group: 100 Size: 1851347
File ACL: 0 Directory ACL: 0
Links: 0 Blockcount: 3616
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x31a9a574 -- Mon May 27 13:52:04 1996
atime: 0x31a21dd1 -- Tue May 21 20:47:29 1996
mtime: 0x313bf4d7 -- Tue Mar 5 08:01:27 1996
dtime: 0x31a9a574 -- Mon May 27 13:52:04 1996
BLOCKS:
8314 8315 8316 8317 8318 8319 8320 8321 8322 8323 8324 8325 8326 8583
TOTAL: 14
There seems to be a reasonable chance that this file is not fragmented: certainly, the first 12 blocks listed in the inode (which are all data blocks) are contiguous. So, we can start by retrieving those blocks:
# fsgrab -c 12 -s 8314 /dev/hda5 > /mnt/recovered.001
Now, the next block listed in the inode, 8326, is an indirect block, which we can ignore. But we trust that it will be followed by 256 data blocks (numbers 8327 through 8582).
# fsgrab -c 256 -s 8327 /dev/hda5 >> /mnt/recovered.001
The final block listed in the inode is 8583. Note that we’re still looking good in terms of the file being contiguous: the last data block we wrote out was number 8582, which is 8327 + 255. This block 8583 is a doubly indirect block, which we can ignore. It is followed by up to 256 repetitions of an indirect block (which is ignored) followed by 256 data blocks. So doing the arithmetic quickly, we issue the following commands. Notice that we skip the doubly indirect block 8583, and the indirect block 8584 immediately (we hope) following it, and start at block 8585 for data.
# fsgrab -c 256 -s 8585 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 8842 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9099 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9356 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9613 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9870 /dev/hda5 >> /mnt/recovered.001
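The start blocks in these commands advance by 257 each time: 256 data blocks plus the one indirect block that is skipped. A small loop can do that arithmetic; this is a sketch only, the function name repeat_runs is hypothetical, and it prints the commands rather than running them.

```sh
# Print REPETITIONS fsgrab commands, each covering 256 data blocks and
# skipping the single indirect block that precedes the next group.
# Usage: repeat_runs FIRST_DATA_BLOCK REPETITIONS DEVICE OUTFILE
# (hypothetical helper; assumes 1024-byte blocks and a contiguous file)
repeat_runs() {
    start=$1; reps=$2; dev=$3; out=$4
    i=0
    while [ "$i" -lt "$reps" ]; do
        echo "fsgrab -c 256 -s $start $dev >> $out"
        start=$(( start + 256 + 1 ))     # 256 data blocks + 1 indirect block
        i=$(( i + 1 ))
    done
}

repeat_runs 8585 6 /dev/hda5 /mnt/recovered.001
```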
Adding up, we see that so far we’ve written 12 + (7 * 256) blocks, which is 1804. The `stat’ results for the inode gave us a `blockcount’ of 3616; unfortunately these blocks are 512 bytes long (as a hangover from UNIX), so we really want 3616/2 = 1808 blocks of 1024 bytes. That means we need only four more blocks. The last data block written was number 10125. As we’ve been doing so far, we skip an indirect block (number 10126); we can then write those last four blocks.
# fsgrab -c 4 -s 10127 /dev/hda5 >> /mnt/recovered.001
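The block count behind that last command can be cross-checked against the Size field. The sketch below does both calculations; the function names are invented for this example, and the 512-byte unit for Blockcount and the 1024-byte blocksize are the figures used in the text above.

```sh
# Convert the inode's Blockcount (512-byte units) to filesystem blocks.
# Usage: blocks_from_blockcount BLOCKCOUNT [BLOCKSIZE]   (hypothetical helper)
blocks_from_blockcount() {
    echo $(( $1 * 512 / ${2:-1024} ))
}

# Number of blocks needed to hold SIZE bytes, rounding up.
# Usage: blocks_from_size SIZE [BLOCKSIZE]   (hypothetical helper)
blocks_from_size() {
    bs=${2:-1024}
    echo $(( ($1 + bs - 1) / bs ))
}

total=$(blocks_from_blockcount 3616)     # 1808 blocks of 1024 bytes
check=$(blocks_from_size 1851347)        # also 1808 for this inode
written=$(( 12 + 7 * 256 ))              # 1804 blocks recovered so far
echo "total=$total check=$check remaining=$(( total - written ))"
```

If the two totals disagreed, the figure derived from Size would be the one to use when trimming the recovered file with dd.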
Now, with some luck the entire file has been recovered successfully.
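The whole sequence for this example can be generated from just the first block number and the number of data blocks. What follows is a sketch under the assumptions used throughout (1024-byte blocks, an unfragmented file, nothing beyond the doubly indirect range); the name contiguous_fsgrab is hypothetical, and the commands are printed for inspection rather than run.

```sh
# Print the fsgrab commands for an unfragmented file, skipping the
# indirect and doubly indirect blocks interleaved with the data.
# Usage: contiguous_fsgrab FIRST_BLOCK NDATA DEVICE OUTFILE
# (hypothetical helper; assumes 256 block numbers per indirect block)
contiguous_fsgrab() {
    pos=$1; left=$2; dev=$3; out=$4
    per=256
    if [ "$left" -gt $(( 12 + per + per * per )) ]; then
        echo "triply indirect files are not handled by this sketch" >&2
        return 1
    fi
    redirect='>'
    run=0
    while [ "$left" -gt 0 ]; do
        # The first run is the 12 direct blocks; later runs hold 256 each.
        if [ "$run" -eq 0 ]; then chunk=12; else chunk=$per; fi
        if [ "$chunk" -gt "$left" ]; then chunk=$left; fi
        echo "fsgrab -c $chunk -s $pos $dev $redirect $out"
        redirect='>>'
        pos=$(( pos + chunk ))
        left=$(( left - chunk ))
        # After the second run, skip the doubly indirect block and the
        # indirect block following it; otherwise skip one indirect block.
        if [ "$run" -eq 1 ]; then skip=2; else skip=1; fi
        pos=$(( pos + skip ))
        run=$(( run + 1 ))
    done
}

contiguous_fsgrab 8314 1808 /dev/hda5 /mnt/recovered.001
```

For inode 148004 this prints the same nine commands that were worked out by hand above, ending with the four blocks starting at 10127.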