Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Buffer::bit_slice losing length with byte-aligned offsets #6707

Merged
merged 7 commits into from
Nov 17, 2024

Conversation

itsjunetime
Copy link
Contributor

Which issue does this PR close?

A part of #3478; necessary for #6690, which closes the aforementioned issue.

Rationale for this change

If bit_slice is called with a given length, the returned buffer should have the specified length.

What changes are included in this PR?

The bit_slice function itself is updated with the fix, along with a unit test to ensure the fix works. I've ensured the test fails without my changes.

I also changed the 'minimum overhead' value in an encoding test that went down due to this change.

This also updates the MSRV of this crate in Cargo.toml to 1.75, as that was (and still is, with this PR) the effective MSRV of these crates (as found by cargo-msrv). This was necessary since I wanted to use usize::div_ceil, and that was stabilized in 1.73, but clippy was complaining that I couldn't use it since the crate's msrv was 1.62.

Are there any user-facing changes?

Yes, a bug fix that could change user behavior.

@github-actions github-actions bot added arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate labels Nov 8, 2024
@itsjunetime
Copy link
Contributor Author

Ah, it looks like cargo msrv find tries to find the minimum available version that would work for all the crates in your workspace, including e.g. the parquet bin target and its clap dependency. I'll fix that and use bit_util::ceil instead.

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the MSRV changes, especially to object_store which is a separate workspace entirely, but the buffer change looks good

Cargo.toml Outdated
@@ -74,7 +74,7 @@ include = [
"Cargo.toml",
]
edition = "2021"
rust-version = "1.62"
rust-version = "1.70"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this change because that is the effective MSRV of the majority of the crates in this workspace. It was outdated and when I disabled rust-version requirements and ran cargo msrv find in each of the crates in this repo, these were the values I found.

I know it's not directly related to what this PR was intended for, but since it tripped me up when I tried to use something outside the MRSV, I figured it was worth the time to go through each crate and make sure the MSRVs are actually accurate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry for being dense but I still don't understand this change

I made this change because that is the effective MSRV of the majority of the crates in this workspace. It was outdated and when I disabled rust-version requirements and ran cargo msrv find in each of the crates in this repo, these were the values I found.

Since the msrv CI check is passing on main

https://github.com/apache/arrow-rs/actions/runs/11862785735/job/33062922708

Does that mean the CI check is not running accurately?

@github-actions github-actions bot added parquet Changes to the parquet crate and removed object-store Object Store Interface labels Nov 12, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @itsjunetime -- the code in this PR looks good to me, but I am not at all sure about the MSRV changes

I am feeling quite bad about the amount of time it takes to review these PRs. It would really help I think to try and keep them as focused as possible so each PR / review had a single set of changes so that if we get hung up on / have issues with one change (e.g. the MSRV) we don't also delay the rest of the PR getting in

@@ -122,8 +122,6 @@ jobs:
uses: ./.github/actions/setup-builder
- name: Install cargo-msrv
run: cargo install cargo-msrv
- name: Downgrade arrow dependencies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to have been added as part of #5385

Cargo.toml Outdated
@@ -74,7 +74,7 @@ include = [
"Cargo.toml",
]
edition = "2021"
rust-version = "1.62"
rust-version = "1.70"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry for being dense but I still don't understand this change

I made this change because that is the effective MSRV of the majority of the crates in this workspace. It was outdated and when I disabled rust-version requirements and ran cargo msrv find in each of the crates in this repo, these were the values I found.

Since the msrv CI check is passing on main

https://github.com/apache/arrow-rs/actions/runs/11862785735/job/33062922708

Does that mean the CI check is not running accurately?

@itsjunetime itsjunetime force-pushed the june/fix_buffer_bitslice branch from 2219445 to 5c49b6f Compare November 17, 2024 00:19
@github-actions github-actions bot removed the parquet Changes to the parquet crate label Nov 17, 2024
@itsjunetime
Copy link
Contributor Author

I've removed the MSRV changes - I thought that was going to be a simple change initially, then it ended up branching out to a lot of other things. I'm going to make a separate issue and PR here to fix that specifically.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @itsjunetime -- this makes sense to me

@tustvold tustvold merged commit b1f5c25 into apache:master Nov 17, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants