You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, we seem to have discovered today when validating a bag generated by another BagIt implementation (bagit-python) that hidden files -- or those starting with ., such as .keep -- are not handled by this library. Essentially, the bag included hidden files in the manifest and payload, but the library didn't find them, resulting in a failed completeness check. I believe the following line is the source of the issue. Because it's also used by the manifest! method, hidden files would also not be included in the manifest when the bag is generated by the library.
A fix would likely involve using Dir.glob with a specific pattern that would include these; the Ruby docs suggest using '{*,.*}', so for this library we might use '**/{*,.*}'. Dir[] might also still work with the updated pattern. There is a File::FNM_DOTMATCH flag, but it would also cause . and .. to be included (which we don't want).
In reviewing the spec, I don't see any indication that these files should be ignored or not included. If people seem to agree this is an issue, I'm happy to open a PR with some added/updated tests. If I'm incorrect somewhere in this analysis, I'm also happy to be pointed in the right direction.
Update: In thinking more about this, this may be considered a breaking change for current users of the library, as the bags they previously generated using the library would now be considered invalid by the "fixed" version. That would have to be handled somehow, assuming this change is a wanted improvement.
The text was updated successfully, but these errors were encountered:
@ssciolla Thank you for reporting this! I agree with your reasoning and a PR would be awesome.
Maybe we could add some metadata indicating that the bag has been created by a new version of the library in bag-info.txt and change the validation operation if it is the newer version?
Thanks for the response, @little9! I will start working on a PR.
Your idea about keying off the version could work. My colleague also found a discussion in bagit-java about hidden files from years ago. I think this comment is interesting: LibraryOfCongress/bagit-java#31 (comment)
We could play it safe for the immediate future by adding an option to Bag that indicates whether to consider or look for hidden files, defaulting to false. Then you could have a minor release with this change, and then maybe change the default in a later major release (if there's a will to do that). You may also need some logic to handle validation when the setting is false, but there are hidden files in the manifest, maybe just raising warnings there. I'm going to start with this approach, but keep your idea in mind as well.
Hi, we seem to have discovered today when validating a bag generated by another BagIt implementation (bagit-python) that hidden files -- or those starting with
.
, such as.keep
-- are not handled by this library. Essentially, the bag included hidden files in the manifest and payload, but the library didn't find them, resulting in a failed completeness check. I believe the following line is the source of the issue. Because it's also used by themanifest!
method, hidden files would also not be included in the manifest when the bag is generated by the library.bagit/lib/bagit/bag.rb
Lines 39 to 41 in 4a7fb6d
A fix would likely involve using
Dir.glob
with a specific pattern that would include these; the Ruby docs suggest using'{*,.*}'
, so for this library we might use'**/{*,.*}'
.Dir[]
might also still work with the updated pattern. There is aFile::FNM_DOTMATCH
flag, but it would also cause.
and..
to be included (which we don't want).In reviewing the spec, I don't see any indication that these files should be ignored or not included. If people seem to agree this is an issue, I'm happy to open a PR with some added/updated tests. If I'm incorrect somewhere in this analysis, I'm also happy to be pointed in the right direction.
Resources
Update: In thinking more about this, this may be considered a breaking change for current users of the library, as the bags they previously generated using the library would now be considered invalid by the "fixed" version. That would have to be handled somehow, assuming this change is a wanted improvement.
The text was updated successfully, but these errors were encountered: