Physical block size detection incorrect for AF SAT drives #13

Closed

GreatEmerald opened this issue Aug 4, 2015 · 83 comments

@GreatEmerald

Currently the script uses blockdev --getpbsz to detect the physical block size. However, this is incorrect for Advanced Format drives using SCSI-to-ATA Translation (SAT; includes USB enclosures and such), as detailed here: http://nunix.fr/index.php/linux/7-astuces/65-too-hard-to-be-above-2tb

The correct way to determine it is by using either hdparm -I or smartctl -d sat -a. Since the former doesn't need explicit specification that it's a SAT drive, it's probably better to use that, like so:

BLOCK_SIZE=$(sudo hdparm -I /dev/$DEVICE | grep "Physical Sector size" | awk '{print $(NF-1)}')
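For completeness, the same value could probably also be pulled from smartctl; this is only a sketch, assuming the "Sector Sizes:" line format that smartctl prints for drives with distinct logical/physical sizes:

# Sketch only: assumes smartctl prints a line like
#   "Sector Sizes:     512 bytes logical, 4096 bytes physical"
# (drives with a single sector size print "Sector Size:" instead)
BLOCK_SIZE=$(sudo smartctl -d sat -a /dev/$DEVICE | awk '/Sector Sizes:/ {print $(NF-2)}')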
@JElchison
Owner

Thanks for the report! This is helpful.

This is reminiscent of issue #5, in which @barak suggested that we move (from lshw) to blockdev. I like his argument that blockdev is an essential package on Ubuntu, so it's pretty much guaranteed to be present (though it's not part of the POSIX standard). Right now, the only other non-POSIX commands used are printf and xxd, I believe.

Do you have any ideas on how to solve your problem using a more portable solution than hdparm? It seems undesirable to make users install the hdparm package even if they don't have an AF SAT drive. A hacky solution would be lazy environmental validation, such that the script doesn't exit until (1) it knows that it needs hdparm; and (2) it doesn't have hdparm (vs. failing at script init if it doesn't have hdparm). Ideas?

Also, philosophical question: Does your defect report better belong against blockdev instead of format-udf? If blockdev is reporting an incorrect value, then it seems most proper to pursue a fix there, instead of here.

@GreatEmerald
Author

Honestly, I think the real bug is in the kernel. I think the canonical way to get the block size on Linux should be via:

cat /sys/class/block/$DEVICE/queue/physical_block_size

However, for AF SAT it, too, reports 512. So clearly the kernel itself is confused as well. I'll see if I can open a ticket in the kernel bug tracker for this.
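For reference, a quick way to see the mismatch on an affected drive (sketch; sdX is a placeholder):

# Both of these may report 512 on an AF SAT drive, even though
# hdparm -I shows a 4096-byte physical sector size
cat /sys/class/block/sdX/queue/physical_block_size
cat /sys/class/block/sdX/queue/logical_block_size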

@JElchison
Owner

That's great. In the meantime, I'd like to keep this issue open as a placeholder.

Once a kernel ticket has been opened, would you mind adding a link to it in this thread?

Many thanks.

@GreatEmerald
Author

Filed a bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=102271

@ksterker

ksterker commented Jan 6, 2016

Just for the record, I ran into the same issue. I guess it's fine to fix the root cause of the error, but a word of caution in the Readme or other prominent location would have saved me a lot of trouble.

FWIW, I looked up the correct sector size using hdparm manually and verified that blockdev indeed reported it wrong, then set BLOCKSIZE to 512 in the script. Worked like a charm!

@JElchison
Owner

Thanks, all.

FYI, I plan on getting back to addressing this (and other outstanding issues) in the upcoming weeks. I just finished graduate school in December. Thanks for your patience!

@JElchison
Owner

FYI, 107cf1c adds a new option -b BLOCK_SIZE which allows manually overriding the OS-detected block size. It doesn't fix the root issue here (still waiting on the Linux kernel fix), but it does make things slightly less painful.

Also, @ksterker, thanks for the tip: The new usage (and README) highlights the kernel bug.

@GreatEmerald
Author

The kernel devs said that they can't do anything about it, because they'd have to use libusb, and that's not available from the kernel.
So I guess another tool is the way to go in any case. Or maybe someone could write to the linux-usb mailing list and see if they can come up with a solution.

Although, in any case, as per #12 there's a reason to just hardcode it to 512 for Windows (or use logical size?).

@pali

pali commented Jan 9, 2017

The kernel devs said that they can't do anything about it, because they'd have to use libusb, and that's not available from the kernel.

Can you give me a link to this discussion, specifically the part about needing libusb? This seems like a misunderstanding to me, as libusb is just a userspace library that uses the kernel API for direct access to the USB bus. It exists because only the kernel can access hardware. The kernel can, of course, access USB hardware.

Although, in any case, as per #12 there's a reason to just hardcode it to 512 for Windows (or use logical size?).

No, there is no good reason to hardcode the value to any specific constant.

@pali

pali commented Jan 9, 2017

Anyway, UDF uses the logical block size, not the physical one. So the physical block size is not needed for formatting.

@JElchison
Owner

Physical block size is required to enable UDF support on Windows XP. See Pieter's article linked on the readme (you may need the Wayback Machine). The block size calculated by format-udf is used for both the formatting operation and the creation of the false MBR.

@pali

pali commented Jan 9, 2017

UDF 2.60 specification says:

http://www.osta.org/specs/pdf/udf260.pdf

Logical Block Size - The Logical Block Size for a Logical Volume shall be set to the logical sector size of the volume or volume set on which the specific logical volume resides.

Uint32 LogicalBlockSize in struct LogicalVolumeDescriptor - Interpreted as specifying the Logical Block Size for the logical volume identified by this Logical Volume Descriptor. This field shall be set to the largest logical sector size encountered amongst all the partitions on media that constitute the logical volume identified by this Logical Volume Descriptor. Since UDF requires that all Volumes within a Volume Set have the same logical sector size, the Logical Block Size will be the same as the logical sector size of the Volume.

@pali

pali commented Jan 9, 2017

Physical block size is required to enable UDF support on Windows XP.

Ah :-( Is that really true? And not the logical block size?

See Pieter's article linked on the readme (you may need the Wayback Machine).

http://web.archive.org/web/20151103171649/http://sipa.ulyssis.org/2010/02/filesystems-for-portable-disks/

I do not see any requirement there about physical block size.

The block size calculated by format-udf is used for both the formatting operation and the creation of the false MBR.

New versions of mkudffs autodetect the block size (via BLKSSZGET) if it is not passed via a command-line option.

@JElchison
Owner

JElchison commented Jan 10, 2017

Sorry, @pali, I don't think I'm explaining myself very well. Let me try again.

There are 3 different block sizes we're discussing here:

  1. Disk physical block size. This is an artifact of the disk, governed by the disk controller. Not controlled by the user. Reported by blockdev --getpbsz or /sys/block/sda/queue/physical_block_size.
  2. Disk logical block size. This is an artifact of the disk, governed by the kernel. Not controlled by the user. Reported by blockdev --getss or /sys/block/sda/queue/logical_block_size. Usually 512 bytes.
  3. File system block size. This is an artifact of the file system (not the underlying disk), specified by the user at file system format time.

All of these 3 values can be different, even for the same disk.
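As a concrete illustration (a sketch only; DEVICE is a placeholder), the three values map to these commands:

blockdev --getpbsz /dev/$DEVICE    # 1. disk physical block size
blockdev --getss   /dev/$DEVICE    # 2. disk logical block size
mkudffs -b 4096 /dev/$DEVICE       # 3. file system block size, chosen by the user at format time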

format-udf.sh uses UDF v2.01 (not the latest v2.60), for reasons specified in the README. However, both v2.01 and v2.60 use the same term (LogicalBlockSize) to describe item 3 above. It's important to note that the use of the adjective Logical here is with respect to the file system, not the disk. In other words, it's important not to confuse the UDF LogicalBlockSize with the disk's logical block size. Those can be (and often are) different.

The udftools package provides mkudffs, which is the de facto tool for formatting a UDF file system in Linux. The -b,--blocksize option lets you specify item 3 above when formatting a disk. It has no bearing on items 1 or 2. The version of udftools on my current Ubuntu 16.04.1 LTS machine is 1.0.0b3-14.4. That version defaults to a block size of 2048 bytes unless otherwise specified. It's completely possible that newer versions default to something else.

I fully acknowledge that a UDF drive formatted to be fully compliant with the spec has a file system block size (item 3) set to the disk logical block size (item 2). Your quotes above from the spec are accurate (though I'm targeting v2.01).

Be reminded that the goal of format-udf.sh is to output a drive that can be used for reading/writing across multiple operating system families. If a user is interested in a fully spec-compliant format for use on a single OS, then he/she should use the native formatting tool (mkudffs or newfs_udf) and specify the file system block size (item 3) set to the disk logical block size (item 2). However, this will be insufficient for a cross-platform UDF drive that works on Linux, OS X, and Windows.

From Pieter's article, for Windows XP,

the UDF block size must match the block size of the underlying device

I interpreted this to refer to the disk physical block size (item 1). (I believe this is the point you're contesting, submitting that it should instead be the disk logical block size, which is item 2.) I verified the use and behavior of the disk physical block size (item 1) in my lab over 2 years ago when I first authored format-udf.sh. It's completely possible that I made (and repeated) this mistake. However, with the number of tests that I ran at the time, I find it unlikely. Unfortunately, I have no Windows XP servers in my lab at the moment, so I'm unable to re-validate. Thus, I cannot refute your claim.

However, there is another (more significant) reason that format-udf.sh relies on disk physical block size (item 1) instead of disk logical block size (item 2). UDF v2.01 itself has a limit of 2^32 blocks. If the disk logical block size (item 2, which is usually 512 bytes) is used in formatting and in the partition table, then the maximum disk size supported by format-udf.sh will most often be 2 TiB. This is unacceptable for many modern (larger) drives. Out of convention, many disk manufacturers still respect the 2^32-block limit, which means their only practical way for crafting larger disks is--you guessed it--increasing the disk physical block size (item 1).
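To put rough numbers on that (shell arithmetic, just for illustration):

# UDF v2.01 addresses on the order of 2^32 blocks, so the maximum capacity
# scales with the block size chosen at format time
echo $(( 2**32 * 512  / 2**40 ))    # 512-byte blocks  -> 2 (TiB)
echo $(( 2**32 * 4096 / 2**40 ))    # 4096-byte blocks -> 16 (TiB)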

Therefore, for format-udf.sh to accomplish its goal of providing a true cross-platform read/write disk (and not artificially capped at 2 TiB), it must use the (sometimes larger) disk physical block size (item 1), which comes at a cost (as you correctly point out) that it technically isn't fully spec-compliant. format-udf.sh is a tool for pragmatists. Theorists who prefer spec compliance will need to give up dreams of a true cross-platform read/write disk, and stick to the native formatting tools.

Sorry for the verbosity, but I hope this explains my position more clearly.

@pali

pali commented Jan 10, 2017

Understood now.

Anyway, my understanding of the UDF specification is that the block size must be the disk logical block size (2), if the disk needs to be compliant with the UDF spec. Correct or not?

mkudffs from udftools prior to version 1.1 (or 1.0) has a lot of problems and generates invalid UDF filesystems. I can imagine that this can cause problems for other operating systems. So it could be good to use mkudffs version 1.1 (or better, the latest version from git) and retest all operating systems without "hacks" like the MBR, or with different block sizes.

Anyway, if you have a testing environment for different operating systems, can you test the newfs_udf tool from the UDFClient project? http://www.13thmonkey.org/udfclient/ I remember that some disks formatted by newfs_udf from UDFClient were recognized correctly by Win7, but not when formatted by mkudffs. Maybe we are really facing problems in mkudffs and not always in other operating systems.

About the number of blocks you are right; the limit is, I think, 2^32-1 (not the full 2^32). But increasing the block size also has disadvantages (as in other file systems): some data (or metadata) will always occupy a full block on disk. Maybe some threshold could be used to start increasing the block size only when the number of blocks would exceed the maximum, and not use a big block size on smaller disks (to not waste disk space)?

@JElchison
Owner

My understanding of the UDF specification is that the block size must be the disk logical block size (2), if the disk needs to be compliant with the UDF spec. Correct or not?

To be honest, I'm still not 100% sure. The -b option in mkudffs sets the volume's logicalBlockSize to the disk's disc->blocksize, which is set to the value passed in by the user. Note how the v2.01 spec uses slightly different terminology than you and I have used in this discussion:

Logical Sector Size - The Logical Sector Size for a specific volume shall be the same as the physical sector size of the specific volume.
Logical Block Size - The Logical Block Size for a Logical Volume shall be set to the logical sector size of the volume or volume set on which the specific logical volume resides.

The way that I read that makes it sound like the spec is calling for the volume's block size to be set to the disk's logical sector size, which should be set to the disk's physical sector (block) size.

mkudffs from udftools prior to version 1.1 (or 1.0) has a lot of problems and generates invalid UDF filesystems.

Correct. I've been glad to see you pick up maintenance of udftools. And I'm even more glad that Debian/Ubuntu package maintainers have picked up your edits.

So it could be good to use mkudffs version 1.1 (or better, the latest version from git) and retest all operating systems without "hacks" like the MBR, or with different block sizes.

Agreed. I have always conducted my testing against mainstream packages for udftools and OS X. Debian stable still uses 1.0.0b3-14.3, but (as you know) Ubuntu has picked up 1.2-1build1 as of Yakkety.

My testing resources are currently allocated for other projects (and also I'm about to have a baby), but I agree that it would be good to test format-udf.sh with udftools 1.2. I captured this in #33.

I should mention that all of my initial testing was conducted without any MBR/hacks. In fact, the addition of the MBR was the outcome of having performed my initial testing.

can you test the newfs_udf tool from the UDFClient project?

Is this the same suite included in OS X? (I would conjecture yes, based on the number of BSD references.) If so, then I've already conducted implicit testing. I've observed minor differences, but largely consistent behavior. See #11.

I'm particularly interested in udfclient's ongoing release where they claim to be working on a functional fsck. That would be huge for the credibility of UDF. I had barely started porting the Solaris implementation, but haven't gotten very far yet.

Maybe some threshold could be used to start increasing the block size only when the number of blocks would exceed the maximum, and not use a big block size on smaller disks (to not waste disk space)?

If users of format-udf.sh are concerned about block-level efficiency, they're always welcome to use the -b BLOCK_SIZE switch to specify their own. I don't recall having seen any disks <= 2 TiB with a physical block size > 512. Most folks, I've found, are more interested in using their entire disk vs. truncating it but having better block efficiency.

@pali

pali commented Jan 10, 2017

In the 2.01 spec it is also written:

physical sector - A sector [1/5.9] given by a relevant standard for recording [1/5.10]. In this specification, a sector [1/5.9] is equivalent to a logical sector [3/8.1.2].

So it is not so clear! [3/8.1.2] is a reference to the ECMA-167 standard, which says:

1/5.5 logical sector - The unit of allocation of a volume.
3/8.1.2 Logical sector - The sectors of a volume shall be organised into logical sectors of equal length. The length of a logical sector shall be referred to as the logical sector size and shall be an integral multiple of 512 bytes. The logical sector size shall be not less than the size of the smallest sector of the volume. Each logical sector shall begin in a different sector, starting with the sector having the next higher sector number than that of the last sector constituting the previous, if any, logical sector of the volume. The first byte of a logical sector shall be the first byte of the sector in which it begins, and if the size of this sector is smaller than the logical sector size, then the logical sector shall comprise a sequence of constituent sectors with consecutive ascending sector numbers.

So still I'm not sure...

Is this the same suite [newfs_udf] included in OS X?

No, from what I saw, OS X has its own closed implementation of UDF and does not use UDFClient. So UDFClient's newfs_udf implementation should be different.

Debian stable packages will never be updated (stable means that package versions are stable). But both Ubuntu (in some version) and Debian (testing) have packages for newfs_udf (in udfclient) and mkudffs (in udftools). Currently mkudffs 1.2 does not work on 32-bit systems for formatting disks above 4GB (a problem with Large File Support); this will be fixed in mkudffs 1.3 (64-bit systems do not have this problem).

I'm particularly interested in udfclient's ongoing release where they claim to be working on a functional fsck.

The author is interested in that but has not had much time to implement it yet... Last year I was contacted by a student who wants to implement fsck as part of a thesis, so maybe there will be something...

I had barely started porting the Solaris implementation, but haven't gotten very far yet.

That is useless. I already looked at it years ago; it supported only UDF 1.2 and used Solaris kernel drivers where the functionality was implemented... So for systems without the Solaris kernel it means reimplementing the whole functionality, and that is probably more work than implementing fsck from scratch. (And UDF 1.2 is not enough!)

I don't recall having seen any disks <= 2 TiB with a physical block size > 512.

All my 512GB and 1TB disks have physical block size of 4096, so they are not rare. (But I'm using ext4 on them...)

@JElchison
Owner

Thanks for your comments and additional data points, @pali. I will terminate my udf-fsck project given your guidance.

I am still leaving this issue (#13) open as a placeholder to track https://bugzilla.kernel.org/show_bug.cgi?id=102271. We're also waiting for @GreatEmerald to respond to your request.

@GreatEmerald
Author

I was referring to the bug tracker, and @pali has already seen that. I don't feel like starting a thread about it in the Linux mailing list, since I don't think I know as much as you two about the whole issue.

From what I recall of my own testing, Windows 10 simply did not work with UDF volumes with FS block size other than 512. I tested that with an external HDD which has a physical block size of 4096, and thus by some logic should have used 4096 block size, but no dice. I also haven't really used any MBR hacks, and Win10 worked fine with that.

Maybe it would be a good idea to revive the wiki page with test results, but also make certain to add the tested OS versions. Because "Windows" can mean 10 or 95, with wildly different results. Same with macOS etc. And not everyone cares about XP support, or macOS support, etc., one may just want to have something that works on a given Windows version and a given Linux version.

@pali

pali commented Jan 10, 2017

I forwarded the bug to the mailing list and CCed you so you will be informed about its status.

@JElchison
Owner

@pali Is there a mailing list link that you can post here in this thread?

@pali

pali commented Jan 10, 2017

Discussion is available e.g. in this archive:
http://www.spinics.net/lists/linux-usb/index.html#151780

@pali

pali commented May 20, 2017

Now I have looked at this logical vs. physical block size problem again, and I think the whole block addressing in UDF should be according to LBA. Which means that the logical block size of the disk should be used, and not the physical block size! Basically all read/write operations in disk implementations work with LBA, and the physical block size is just a hint for disk tools to align partitions/buffers for better performance...

This comment from @GreatEmerald just proves it:

From what I recall of my own testing, Windows 10 simply did not work with UDF volumes with FS block size other than 512. I tested that with an external HDD which has a physical block size of 4096, and thus by some logic should have used 4096 block size, but no dice.

As of today, basically all disks have a logical block size of 512 (and a physical one of 4096). In the past there were HDDs which operated with a logical block size of 4096, so the LBA unit was 4096, and it caused problems because disk partition utilities, simple bootloaders and other small programs had the logical block size hardcoded to 512.

And I bet that in Pieter's article "block size" is meant to be the logical block size, for LBA addressing.

Therefore I think that the default block size for mkudffs should be the logical block size (blockdev --getss). And for 2TB+ disks it would make sense to set it to at least 1024 to be able to format the whole disk.

New mkudffs already takes the default block size (if not specified on the command line) from the logical block size of the disk.

Note that the MBR table (and also GPT structures) works with LBA and therefore depends on the logical block size, not the physical one!
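A sketch of what that default amounts to in practice (sdX is a placeholder; newer mkudffs picks this value up automatically when -b is omitted):

# Use the disk's logical sector size (BLKSSZGET, i.e. blockdev --getss)
# as the UDF file system block size
mkudffs -b "$(blockdev --getss /dev/sdX)" --media-type=hd /dev/sdX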

@GreatEmerald
Author

Also of note is that in Windows 10 Creators update, MS finally implemented mounting multiple partitions in one device, just like it is in Linux. So for 2 TiB+ drives, it might make sense to make multiple partitions at 2 TiB in size.

Oh, and last I tried, my UDF-partitioned SD card contents were shown in Windows 10, but it was read-only. Performing the disk check resulted in all files being relocated to lost+found and the card became read-write. Not sure why.

@pali

pali commented May 20, 2017

Also of note is that in Windows 10 Creators update, MS finally implemented mounting multiple partitions in one device

Is it really confirmed by MS?

Because the current MS implementation is very special. For removable disks MBR/GPT is optional, but if present, only the first partition is used. For non-removable disks MBR/GPT is required, but then all partitions are used.

So if you do not create an MBR/GPT and format the whole disk to some filesystem, then it is recognized by Windows only if the disk itself is marked as "removable".

And that "removable" flag is part of the USB mass storage protocol, so some flash disks announce "removable" and some do not.

So for 2 TiB+ drives, it might make sense to make multiple partitions at 2 TiB in size.

The MBR stores 32-bit pairs (first LBA, number of LBA blocks). So for disks with 512-byte LBAs you can have at most a 2TB-long partition, and every partition must start before the 2TB offset. This means you can use at most 4TB disks: the first partition would be 2TB long and start somewhere near the beginning, and the second partition would be 2TB long and start just before the 2TB offset.

PS: It is not exactly 2TB, but just (2^32-1)*512B ~= 2046GB, and due to the MBR header and alignment the first partition would be smaller...
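A quick sanity check of that number (shell arithmetic, just for illustration):

# MBR stores 32-bit (start LBA, length in LBAs) pairs, so with 512-byte
# logical sectors a single partition can span at most (2^32 - 1) sectors
echo $(( (2**32 - 1) * 512 ))    # 2199023255040 bytes, i.e. just under 2 TiB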

GPT stores pairs of (first LBA, last LBA), but those are 64-bit, which means that for 512-byte LBA disks every partition must start and also end within the first 8ZB (1ZB = 2^70B).

Oh, and last I tried, my UDF-partitioned SD card contents were shown in Windows 10, but it was read-only. Performing the disk check resulted in all files being relocated to lost+found and the card became read-write. Not sure why.

Make sure you are formatting with mkudffs version 1.3; older versions do not prepare UDF partitions correctly.

@GreatEmerald
Author

Pretty sure. It was a removable drive I tried, listed in the safely remove hardware list. Clicking to remove one removes both at the same time. I usually use GPT for everything these days.

@JElchison
Owner

The logical sector size of the drive governs the internal structure of the file system, and the file system driver must adapt to this.

In theory, you are correct. In practice, operating systems have done silly (non-adaptive) things like hard-coding 512-byte file system block sizes when mounting HDDs. This is what @pali's XP research indicated.

a 2048 sector size UDF file system is not applicable to hard drives

In theory, you are correct. In practice, udftools defaulted to a block size of 2048 until @pali made a recent update to it. Plenty of people have contacted me over the years asking questions why their 2048-byte UDF HDD doesn't mount on their choice OS.

It's particularly bad for Windows, which is both a closed-source OS and doesn't provide a CLI method for specifying mount parameters (such as file system block size). Both *nix and macOS systems provide a way to specify the file system block size you wish to use.

To cover for cases when operating systems don't do the theoretically optimal thing, I still think there's no better substitute for testing.

@jmyreen

jmyreen commented May 28, 2017

Did the 2048-byte UDF HDD ever work even on Linux? I don't see how that is possible, unless the Linux UDF driver does something non-standard. All the guides I have come across say that you should use the --blocksize=512 switch with mkudffs when formatting an HDD.

In theory, you are correct

I don't think @pali's XP findings negate anything I said about the requirement that the file system driver and the drive must agree upon the sector size. XP just didn't adapt. It didn't follow the rules, and that's because Windows XP doesn't know how to handle 4k sectors.

Don't get me wrong. I do think testing is important, and should be done as thoroughly as possible. On the other hand, I don't think it's worth the trouble to test something that's not according to spec.

@pali

pali commented May 28, 2017

Did the 2048-byte UDF HDD ever work even on Linux?

In the past (before the 2.6.30 kernel) only UDF with a 2048-byte block size worked automatically, independent of the disk sector size. Currently the Linux kernel can mount and use UDF with probably any block size on an HDD with a 512-byte sector size. IIRC udf.ko detects the UDF block size based on the VRS and AVDP blocks.

@pali

pali commented Jun 13, 2017

I did some investigation of UDF support in the Linux kernel.

Prior to 2.6.30 only a UDF block size of 2048 is tried (independent of the logical disk sector size). Since 2.6.30 and prior to 4.11 it tries the logical sector size and 2048 (as a fallback). Since 4.11 it tries the logical sector size and falls back to any valid block size between the logical sector size and 4096.

In any case (also prior to 2.6.30) it is possible to manually specify the UDF block size via the bs= mount parameter.
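For reference, a hedged example of that mount parameter (device and mount point are placeholders):

# Tell the UDF driver to use a 4096-byte block size instead of probing
mount -t udf -o bs=4096 /dev/sdX1 /mnt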

@pali

pali commented Jun 13, 2017

So UDF in the Linux kernel fully ignores the physical block size of the device.

@jmyreen

jmyreen commented Jun 14, 2017

Since 4.11 [Linux] tries the logical sector size and falls back to any valid block size between the logical sector size and 4096.

The Linux kernel allows the UDF "internal block size" to be a multiple of the disk's logical sector size, and employs this heuristic to detect what the block size is. We cannot expect any other OS (Windows) to do so, because it is against the UDF specification. If a UDF formatted disk is to be portable across operating systems, the formatting tool should not use any other block size than the disk's logical sector size. Unfortunately, this means the maximum UDF file system size for a typical hard disk is 2 TB.

@pali

pali commented Jun 14, 2017

If somebody has a free 2TB+ HDD, please try to format it as UDF on Windows (a quick format is enough). Then we would see what Windows generates... I tried this on Windows in qemu, but after qemu had written 117MB (yes, megabytes!) to the disk image, qemu paused and crashed (yes, qemu! not Windows).

@jmyreen

jmyreen commented Jun 14, 2017

Of course, if a 2TB+ disk has 4k logical sectors (i.e. it is a 4k native Advanced Format disk, not 512e), then the maximum size of the UDF file system is 16 TB. Note that FAT32 is affected by the same problem: the maximum volume size is 2 TB with 512 byte sectors, and 16 TB with 4k sectors. It seems the vendors of external drives have adopted two strategies to deal with this, either

  • using 4k disks (possibly faking 4k with the USB chipset in the enclosure), or
  • pre-formatting the disk with NTFS and including an NTFS file system driver for Mac

It's quite hard to find information on whether big disks are 4k native or not. My guess is (based on web search) that they are actually quite rare, and mainly SCSI disks.

@pali

pali commented Jun 14, 2017

I mean classic disks with 512-byte logical sectors. At least Windows in qemu allowed me to start formatting a 4TB partition (GPT scheme) to UDF. But as I wrote, qemu paused and crashed, so I do not know what the result from Windows would have been.

Yes, when the UDF block size is 512 and the logical sector size is also 512, then the upper limit is 2TB (minus one sector...). But the question is what Windows does with larger disks using the GPT scheme (which supports larger partitions) and UDF?

@jmyreen

jmyreen commented Jun 15, 2017

@pali I created a ~3 TB virtual disk in VirtualBox and attached it to a Windows 10 virtual machine. The result is not encouraging: the format program says "QuickFormatting 2,9 TB", but the end result is a file system of size 881 GB, which is suspiciously close to 2.9 T modulo 2 T. It seems the Windows format program is not at all prepared for big disks; it really should have printed an error message saying that the partition is too large, instead of doing this strange thing. Formatting to NTFS works, though.

When I tried reducing the volume size on the disk to 1.9 TB (leaving the rest of the disk unpartitioned), everything worked OK: "Format complete. 1.91 TB total disk space."

[screenshot: udf-3tb]

@pali

pali commented Jun 15, 2017

So... after formatting a 3TB disk, the UDF filesystem has only 881GB total space (also in Explorer)? Then it just proves that 2TB+ disks (with 512b sectors) are unsuitable for UDF on Windows. Which also means that GPT (fake header) is not needed at all, as MBR is enough.

@JElchison
Owner

Thanks to everyone for continuing this conversation!

On @jmyreen's latest tests: My understanding (which may be incorrect) is that UDF itself uses 32-bit block numbers internally, which seems to be the limiting factor in play here.

If UDF is limited to 32-bit block numbers, then I don't think there's any partition format (such as GPT) that can extend the max capacity beyond 2^32 blocks.

Source: https://sites.google.com/site/udfintro/

@pali

pali commented Jun 15, 2017

Yes, UDF uses 32-bit block numbers; this is a limit. But if you use a larger block size, then you can increase the maximal size of the formatted filesystem. I just wanted to see what Windows would do in this case...

Anyway, there is another tool for formatting disks under Windows which also supports UDF. See this blog: https://fsfilters.blogspot.com/2011/10/testing-minifilter-on-more-filesystems.html

@jmyreen

jmyreen commented Jun 15, 2017

@pali I tried the Diskpart tool mentioned in the blog post you linked to. The result is the same: no error message ("DiskPart successfully formatted the volume."), but the final file system size is the partition size modulo 2 T. My guess is that both Format.exe and Diskpart take as input the number of sectors on the disk, truncate it to 32 bits, and use the truncated (unsigned) result as input to the formatting code. Maybe both tools use the same library code internally?

@pali

pali commented Jun 15, 2017

It is possible that both tools use the same library... Anyway, both format and diskpart support specifying the block/sector size (I do not know the correct parameter name). Can you try to test whether it is possible to specify it also for UDF and whether it works?

@jmyreen

jmyreen commented Jun 15, 2017

I tried with unit sizes 4096, 2048, and 1024, but all failed. Specifying unit=512 worked (not shown in the screenshot). This is in line with the UDF spec, which only allows a block size equal to the logical sector size of the disk. I couldn't find any error message supposedly giving more information in Event Viewer.

[screenshot: udf-blocksize]
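For the record, the diskpart sequence in question is along these lines (a sketch; the volume number is a placeholder):

select volume 3
format fs=udf unit=512 quick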

@pali

pali commented Jun 15, 2017

OK, thanks for testing! I think now we can definitely say that Windows does not support UDF with a block size different from the logical sector size of the disk, which means it does not support UDF on a partition which has more than 2^32-1 sectors (for 512/4096 disks, more than 2TB).

The original bug report is about incorrect detection of the physical block size, and that is a problem in the kernel (see the relevant mailing list discussion). Moreover, for both Windows and Linux it is irrelevant, as UDF needs to have a block size equal to the logical sector size of the disk.

Therefore this bug can be closed, and the script needs to be fixed to use the logical sector size instead of the physical one...

@GreatEmerald
Author

Well, the purpose of this issue has rather shifted from physical block size detection to defaulting to logical block size + updating documentation about the change.

@JElchison
Owner

Please review pull request #35

@pali

pali commented Jun 18, 2017

Meanwhile I have implemented a new option --bootarea=preserve|erase|mbr in mkudffs for specifying how to fill the UDF boot area: https://github.com/pali/udftools/commits/master For hard disks the default would be erase, to clean all headers of all previous filesystems. The mbr option would then put in a similar "fake" partition. I have also experimented with gpt; it is working, but I do not see any reason to use it, as it has no benefits for now (and just causes problems with the last UDF anchor)...
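A hedged usage sketch of that option (sdX is a placeholder; this requires the development udftools described above):

# Erase any stale filesystem headers in the boot area (proposed default for hard disks)
mkudffs --media-type=hd --bootarea=erase /dev/sdX
# ...or write the fake MBR into the boot area instead
mkudffs --media-type=hd --bootarea=mbr /dev/sdX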

@jmyreen

jmyreen commented Jun 18, 2017

From the new README:

Many operating systems will only attempt one block size (usually whatever the mount utility defaults to). For example, in order to mount a UDF device, Windows seems to require that the UDF file system use a block size equal to the logical block size. If your block size isn't the OS's default, then auto-mounting likely will not work on your OS.

It's the UDF specification that requires that the UDF file system use a block size equal to the logical block size. You can't really blame Microsoft for adhering to the UDF specification – it's the right thing to do for the Windows implementation to enforce this. That Linux supports a file system block size that differs from the block size of the disk is a non-standard extension. That older Linux UDF kernel driver versions only supported 2048 byte blocks on hard disks was clearly a bug.

It would make life much easier for the users of the UDF formatting and mount utilities if the programs just did the right thing, and we would simply forget about the -b option. Formatting and mounting should use the (logical) block size of the device, because otherwise we really can't call the resulting file system UDF anymore. Has anybody ever been able to come across a case where deviating from this rule has been necessary? That is, using a file system internal block size equal to the block size of the drive would cause an incompatibility between systems? Why would you want to create a file system that doesn't follow the specification?

There may be situations where the -b option is needed: if the operating system is unable to report the block size of the drive. Note that what started this whole discussion, Linux reporting ("lying") the wrong physical block size doesn't count, because the physical block size is irrelevant here. There is a mention that "macOS is known to report the incorrect block size in certain scenarios as well." Is this a documented fact, and if it is, is there something that could be done to work around the problem?

If the -b option were to stay, it should (in my opinion) be hidden in "expert" sections of the documentation, with ample warnings that the user should really understand what he or she is doing.

@pali

pali commented Jun 18, 2017

Specifying the block size is needed when you are creating/formatting a disk image stored in a local file (which is later "flashed" or cloned to a real disk device).
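For example (a sketch; names and sizes are arbitrary), when the target is an image file there is no device to query, so the block size of the eventual target disk has to be supplied explicitly:

# Create a sparse 8 GiB image and format it for a disk with 512-byte logical sectors
truncate -s 8G udf.img
mkudffs -b 512 --media-type=hd udf.img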

@jmyreen

jmyreen commented Jun 18, 2017

Good point. Though this also falls into the "really know what you are doing" category.

@JElchison mentioned this issue Jun 18, 2017
@JElchison
Owner

Great catch, @jmyreen. The discussion about the Linux kernel bug is now irrelevant. I have removed references to it from the README and from the usage, and have added an "expert warning" to the usage.

I attempted to find my message history describing macOS reporting a suspect block size, but I cannot find it. Perhaps I handled that discussion outside of a GitHub Issue. Thus, I have removed the macOS reference as well.

Let me know if these changes meet your liking. (You can review it again on the same pull request, #35.)

@Wowfunhappy

Wowfunhappy commented May 4, 2021

Is this a documented fact, and if it is, is there something that could be done to work around the problem?

...am I misunderstanding Pali's table above? It seems to me that if you had a 4K native disk, you'd want to set the UDF block size to 512 for compatibility with Windows XP, since according to the table, XP needs 512 but doesn't mind if the physical size doesn't match.

What am I missing?

@pali

pali commented May 4, 2021

if you had a 4K native disk

It means that the disk's logical block size is 4096.

And according to that (my) table there is no way to use such a disk with UDF on Windows XP. From that test it can be seen that Windows XP does not support UDF on a 4K-native disk.
