Implement ZstdZarrCompressor #149

Open: wants to merge 6 commits into base: master

mkitti
Member

@mkitti mkitti commented Jun 25, 2024

This implements ZstdZarrCompressor, which wraps CodecZstd, as a package extension.

Part of the complication of using package extensions is getting a reference to new types defined in the extension. I created a mechanism by which you can specify the compressor as a string, which is then used to look up the type in a dictionary.
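
The string-lookup idea can be sketched in plain Julia. This is a minimal illustration under stated assumptions, not the code from this PR: the names AbstractCompressor, LevelCompressor, COMPRESSOR_REGISTRY, register_compressor!, and lookup_compressor are all hypothetical, and a real extension would presumably perform the registration from its `__init__` function once the trigger package is loaded.

```julia
# Hypothetical stand-ins; none of these names are taken from Zarr.jl or this PR.
abstract type AbstractCompressor end

struct LevelCompressor <: AbstractCompressor
    level::Int
end

# Registry mapping a string id to a constructor that accepts keyword arguments.
const COMPRESSOR_REGISTRY = Dict{String,Any}()

function register_compressor!(id::AbstractString, ctor)
    COMPRESSOR_REGISTRY[String(id)] = ctor
    return nothing
end

function lookup_compressor(id::AbstractString; kwargs...)
    ctor = get(COMPRESSOR_REGISTRY, String(id), nothing)
    if ctor === nothing
        throw(ArgumentError("no compressor registered under \"$id\"; is the extension loaded?"))
    end
    return ctor(; kwargs...)
end

# What an extension might do when it is loaded (assumption).
register_compressor!("level", (; level = 3) -> LevelCompressor(level))

# The caller never needs a binding to the extension's type, only the string.
c = lookup_compressor("level"; level = 5)
```

The caller trades compile-time checking for decoupling: a misspelled id only fails at run time, which is why the sketch throws a descriptive error for unknown ids.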

I'm also wondering if there might be a general way to wrap TranscodingStreams codecs into Zarr compressors.

@coveralls

coveralls commented Jun 25, 2024

Pull Request Test Coverage Report for Build 9654302116

Details

  • 27 of 34 (79.41%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.3%) to 88.316%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| src/Compressors.jl | 4 | 11 | 36.36% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| src/Compressors.jl | 1 | 83.08% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 8981180163 | -0.3% |
| Covered Lines | 839 |
| Relevant Lines | 950 |

💛 - Coveralls

@coveralls

coveralls commented Jun 25, 2024

Pull Request Test Coverage Report for Build 9655786902

Details

  • 25 of 32 (78.13%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.4%) to 88.291%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| src/Compressors.jl | 4 | 11 | 36.36% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| src/Compressors.jl | 1 | 83.08% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 8981180163 | -0.4% |
| Covered Lines | 837 |
| Relevant Lines | 948 |

💛 - Coveralls

@coveralls

coveralls commented Jun 25, 2024

Pull Request Test Coverage Report for Build 9656097799

Details

  • 26 of 32 (81.25%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.3%) to 88.397%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| src/Compressors.jl | 5 | 11 | 45.45% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| src/Compressors.jl | 1 | 84.62% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 8981180163 | -0.3% |
| Covered Lines | 838 |
| Relevant Lines | 948 |

💛 - Coveralls

@coveralls

coveralls commented Jun 25, 2024

Pull Request Test Coverage Report for Build 9657292182

Details

  • 32 of 32 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.5%) to 89.135%

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 8981180163 | 0.5% |
| Covered Lines | 845 |
| Relevant Lines | 948 |

💛 - Coveralls

@mkitti
Member Author

mkitti commented Jun 25, 2024

An alternative to the string lookup for the compressor would be to pass in a CodecZstd.ZstdCompressor directly, specifically an instance created by CodecZstd.ZstdFrameCompressor(). Via a conversion mechanism, we could wrap that into a Zarr.Compressor.

Comment on lines +16 to +19
    struct ZstdZarrCompressor <: Zarr.Compressor
        compressor::CodecZstd.ZstdCompressor
        decompressor::CodecZstd.ZstdDecompressor
    end
Member

This doesn't support multithreaded use IIUC. I think this should be like

Zarr.jl/src/Compressors.jl

Lines 129 to 131 in f436713

    struct ZlibCompressor <: Compressor
        clevel::Int
    end
where the struct only contains the parameters of the codec.
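
To illustrate the parameters-only design, here is a rough sketch in which the struct stores just the level and a fresh codec is created for every call, so no mutable codec state is shared between threads. It assumes CodecZstd and TranscodingStreams are installed; ZstdParams, compress_chunk, and decompress_chunk are hypothetical names, not part of Zarr.jl or this PR.

```julia
using CodecZstd: ZstdCompressor, ZstdDecompressor
using TranscodingStreams: TranscodingStreams

# Hypothetical type: holds only the codec parameter, no codec state.
struct ZstdParams
    clevel::Int
end

function compress_chunk(p::ZstdParams, data::Vector{UInt8})
    codec = ZstdCompressor(level = p.clevel)  # fresh codec per call
    TranscodingStreams.initialize(codec)
    try
        return transcode(codec, data)
    finally
        TranscodingStreams.finalize(codec)    # release the native context
    end
end

# Decompression takes no parameters; the type-based method manages the codec.
decompress_chunk(::ZstdParams, data::Vector{UInt8}) = transcode(ZstdDecompressor, data)

# Each task builds its own codec, so one immutable ZstdParams can be shared.
p = ZstdParams(3)
chunks = [rand(UInt8, 1024) for _ in 1:8]
compressed = Vector{Vector{UInt8}}(undef, length(chunks))
Threads.@threads for i in eachindex(chunks)
    compressed[i] = compress_chunk(p, chunks[i])
end
```

The cost of this approach is one codec allocation per chunk; a struct that caches a codec instance avoids that allocation but then needs either a lock or one instance per task.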

Member Author

I initially wrote it like this, but then I was thinking about all the other potential parameters, even if they do not need to be serialized. I think what we should implement is the ability to copy a compressor.

Frankly, I'm somewhat confused about why one actually needs to serialize the compression level into the array metadata. You do not need that information to decompress the data.
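
The point that the level is not needed for decompression can be checked with a small round trip. This is only a sketch, assuming CodecZstd and TranscodingStreams are installed; zstd_at_level is a hypothetical helper written for this example.

```julia
using CodecZstd: ZstdCompressor, ZstdDecompressor
using TranscodingStreams: TranscodingStreams

# Hypothetical helper: compress `data` at the given zstd level.
function zstd_at_level(data::Vector{UInt8}, level::Integer)
    codec = ZstdCompressor(level = level)
    TranscodingStreams.initialize(codec)
    try
        return transcode(codec, data)
    finally
        TranscodingStreams.finalize(codec)
    end
end

data = repeat(collect(codeunits("zarr chunk payload ")), 200)
fast = zstd_at_level(data, 1)
small = zstd_at_level(data, 19)

# The decompressor is created without any level and handles both outputs.
roundtrip_fast = transcode(ZstdDecompressor, fast)
roundtrip_small = transcode(ZstdDecompressor, small)
```

Both round trips recover the original bytes, so the level stored in the array metadata plays no role when reading.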

@lazarusA

bump

@nhz2
Member

nhz2 commented Dec 19, 2024

I've been working on this in https://github.com/nhz2/ChunkCodecs.jl/tree/main/ChunkCodecLibZstd

4 participants