Discoverable Partitions Specification: Provide per-filesystem-type GUIDs #132
Comments
This information is already in the super block, there's no need to duplicate it |
The problem is that different filesystems have their superblocks at different places. What prevents the same image from having a BTRFS superblock, an XFS superblock, and an ext4 superblock? |
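To make the point about differing superblock locations concrete, here is a minimal Python sketch that checks only the primary magic values of three filesystems. The offsets are the commonly documented ones (XFS magic at byte 0, the ext2/3/4 magic 0xEF53 at byte 56 of the superblock that starts at byte 1024, btrfs magic at byte 64 of the superblock that starts at 64 KiB). The names `SIGNATURES`, `probe`, and `make_ambiguous_image` are hypothetical. A real prober such as libblkid validates far more than a magic value, so this shows only that the signatures do not overlap, not that such an image would mount.

```python
import struct

# (name, byte offset of the magic within the partition, magic bytes)
# Only primary superblocks are considered; "ext4" here really means ext2/3/4,
# which share the same magic.
SIGNATURES = [
    ("xfs", 0, b"XFSB"),
    ("ext4", 1024 + 0x38, struct.pack("<H", 0xEF53)),
    ("btrfs", 0x10000 + 0x40, b"_BHRfS_M"),
]


def probe(image: bytes) -> list[str]:
    """Return every filesystem whose magic appears at its expected offset."""
    hits = []
    for name, offset, magic in SIGNATURES:
        if image[offset:offset + len(magic)] == magic:
            hits.append(name)
    return hits


def make_ambiguous_image() -> bytes:
    """Build a zero-filled 128 KiB buffer carrying all three magics at once."""
    image = bytearray(128 * 1024)
    for _name, offset, magic in SIGNATURES:
        image[offset:offset + len(magic)] = magic
    return bytes(image)


if __name__ == "__main__":
    print(probe(make_ambiguous_image()))
```

Because the three magic locations are disjoint, nothing at the byte level stops one buffer from matching all three checks; whether each would also pass full superblock validation is a separate question.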
If a filesystem never writes to data before its first superblock, a rule like “first superblock wins” would be an option. |
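As a rough illustration of the "first superblock wins" idea, the sketch below contrasts it with a policy that refuses to choose when more than one filesystem type is detected. Both functions and the `(name, offset)` input format are hypothetical, and the sketch assumes some earlier probing step has already produced the list of candidate superblocks.

```python
def first_superblock_wins(hits):
    """Pick the filesystem whose superblock has the lowest byte offset.

    hits: list of (fs_name, superblock_offset) tuples from a prior probe.
    Returns None when nothing was detected.
    """
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def refuse_if_ambiguous(hits):
    """Return the single detected filesystem, or raise if there are several."""
    names = {name for name, _offset in hits}
    if len(names) > 1:
        raise ValueError(f"ambiguous filesystem signatures: {sorted(names)}")
    return next(iter(names), None)


if __name__ == "__main__":
    hits = [("btrfs", 65536), ("ext4", 1024)]
    print(first_superblock_wins(hits))
```

The first policy only makes sense under the stated condition that a filesystem never places attacker-influenced data before its own first superblock.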
That's not a valid image, so why would it matter? Just don't build broken images, such errors have nothing to do with the specification |
What about mutable partitions like |
@tytso am I correct about ext4? |
Again, all of that is out of scope. The purpose of this spec is to establish common ways to identify the purpose of a GPT partition, as in, its mountpoint, essentially. What constitutes a broken filesystem, how to detect that, etc etc, is all out of scope, implementation-specific details and belongs somewhere else. |
The gpt-auto logic already has a policy concept (see systemd.image-policy(7) man page). It won't allow you to restrict the set of file systems we'll accept, but it does allow you to restrict the encryption/authentication requirements. And that's what you have to do anyway for a properly secure system: because kernel file systems are not really validated against rogue fs images you have to have some form of block-level authentication logic in place, and if you have that the ambiguity issue goes away. Also note that the gpt-auto logic is very restrictive, it has a short allowlist of file systems it looks for (which are ext4, btrfs, xfs, erofs, f2fs, squashfs, vfat) which should be reasonably well maintained. It won't remove the ambiguity, but it does lock down the attack surface. There's a TODO list item somewhere to extend the policy language to make further restrictions on fs type when probing. Happy to take a patch for that. |
One perspective on this is to contrast with traditional fstab, which while it does support
It's not just about pre-generated images but whatever we happen to find on disk for mutable data partitions, right? I guess the question though really is: Is ambiguity possible today accidentally? It doesn't seem impossible, but before we become concerned about it some investigation is probably needed. |
Exactly!
Indeed so!
I’m concerned about both accidental ambiguity (util-linux/util-linux#1305, caused by a previous ZFS filesystem that hadn’t been overwritten) and intentional ambiguity introduced by a malicious, unprivileged user who has full read-write access to some directory on the filesystem. An unprivileged user can’t directly access the disk, but they can exert substantial influence over its contents, especially if the filesystem doesn’t do encryption. For instance, if I write a 900GiB file to a 1TiB ext4 filesystem that repeats the same sector over and over, most sectors on the disk will have that value. I expect that a more clever attacker can use knowledge of the allocation algorithms to exert even more control. Attackers have a lot of experience doing exactly this with memory allocators. My understanding is that libblkid will refuse to pick a filesystem if there is ambiguity. In this case, a fairly nasty persistent DoS could result, which might not be recoverable without data loss. There are a few countermeasures I can think of:
Personally, I think that this specification should implement either option 1 or option 2, and that other tools should also implement option 3. |
option 3 is not so easy. modern file systems maintain additional copies of the superblock at various offsets, and the first one is not necessarily on sector 0. btrfs does that for example, it puts it a few MiB inside the disk, and then adds copies at logarithmically growing distances. if you'd declare that these file systems should never consider the other superblocks then they kinda lose the reason they exist in the first place... i am not convinced that placing any info in the gpt partition metadata would be wise, because that's not protected cryptographically. you'd make things worse by trying to make them better. dm-verity/dm-crypt/dm-integrity is the root of trust on disk for us, and that only covers partition contents, not partition metadata. I guess what you could do is define a "secure envelope" partition type or something like that, that would take the first and last sector of a dm-verity/dm-crypt/dm-integrity partition (i.e. inside of it, covered by the protections) and would carry the info you are looking for. but that would probably be a hard sell, since you'd then have to use dm-linear or so on top of the dm-verity/dm-crypt/dm-integrity to chop off the beginning and the end sector again before you can mount things. I wonder if this is actually really a problem. i.e. can you actually create an fs image that both qualifies as valid ext4 and valid btrfs or so? And moreover, could an unpriv user actually create that just by writing files to the fs? |
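For reference, the btrfs superblock placement mentioned above can be sketched as follows. As far as I know, the primary copy sits at 64 KiB and the mirrors at 64 MiB and 256 GiB, following the rule `16 KiB << (12 * mirror)`. The function name is hypothetical, and the bounds check is a simplification of what the kernel does.

```python
KIB = 1024
SUPERBLOCK_SIZE = 4096  # size of one btrfs superblock copy


def btrfs_superblock_offsets(device_size: int) -> list[int]:
    """Return byte offsets of btrfs superblock copies that fit on the device."""
    offsets = [64 * KIB]  # primary superblock
    for mirror in (1, 2):
        # Mirrors are spaced by a factor of 4096: 64 MiB, then 256 GiB.
        offsets.append((16 * KIB) << (12 * mirror))
    # Simplified check: keep only copies that fit entirely on the device.
    return [off for off in offsets if off + SUPERBLOCK_SIZE <= device_size]


if __name__ == "__main__":
    for off in btrfs_superblock_offsets(1 << 40):  # 1 TiB device
        print(off)
```

On small devices only the first one or two copies exist, so a prober that consults mirrors has to account for the device size.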
For encrypted mutable partitions, I think it would be best to store the filesystem type in the LUKS metadata. This is about unencrypted partitions only, which are common in e.g. virtualized workloads where encryption is done by the host. |
One can at least create an image that is sufficiently close to this to fool blkid. |
Perhaps, but that's blkid's problem to deal with, and the kernel's to refuse such images. If one wants to create malicious images, they can create malicious GPT GUIDs too. This is wildly out of scope, and there's no real use case. |
To the best of my knowledge, ext4 and XFS have no idea where the other puts their superblocks, much less where e.g. ZFS puts its superblocks. If I write something that looks like a ZFS superblock to a file on my ext4 filesystem, the only thing I know of preventing it from being placed where it looks like an actual ZFS superblock is chance. Ideally, all filesystems would use the same locations for their superblocks to prevent collisions like this, but they do not. |
Any corruption that could clobber the first superblock could just as easily clobber the partition tables, other filesystem metadata, or both. The other superblocks are indeed useful for data recovery scenarios, but for filesystem probing I don’t think it is safe or necessary to use them. If additional fault tolerance is needed, storing the data in the partition table (which has a backup copy) or the LUKS header (without which the device is undecryptable and unrecoverable) are other options. |
Sure, but again: completely irrelevant for anything happening here. The kernel has a filesystem development mailing list, I'd suggest to raise these issues there, and filesystem developers should be able to help clarify those concerns: [email protected] |
last time i looked luks metadata is not integrity protected in any way. hence about as trustable as the gpt partition table. tbh, I find the problem not particularly interesting if the system doesn't do encryption or integrity protection. If you don't do that then most security guarantees are gone anyway, at least in my view of the world. |
What about the virtualized case? |
The Discoverable Partitions Specification provides specific GUIDs for each role a filesystem plays and for the architecture of the filesystem. However, it does not specify the filesystem type. This means that a tool must probe the filesystem images to determine which filesystem is present, instead of being able to just call the appropriate mount tool directly. Not only does this make it harder to write tools that dissect disk images, it also leads to potential ambiguity. Different filesystems have superblocks at different locations, so the same image might be valid for more than one filesystem. This is extremely unlikely to happen by accident, but it is not impossible, and it is conceivable that an attacker could somehow be able (through ordinary filesystem operations and unprivileged ioctls) to cause a filesystem image of one type to become at least probe-able as another.