-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking Issue for enum access in offset_of #120141
Comments
Noting one potential limitation to offset_of with enums: without a way to set the discriminant, it cannot be used to manually initialize an enum. Blog post explainer of what I mean: https://onevariable.com/blog/pods-from-scratch/ Zulip thread for discussion: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Prior.20discussions.20for.20manually.20building.20an.20enum.3F/near/441992612 |
Stabilization Report - Abstract
Feature SummaryThe macro computes field offsets for both This macro accepts two argument. The first is the source type that // crate a
#![feature(offset_of_enum)]
#![feature(offset_of_nested)]
struct AStruct {
pub field: u8,
field2: u32,
}
enum AEnum {
Variant1(u8),
Variant2 {
field: u8
},
Variant3(AStruct),
}
struct ANestingStruct(AStruct);
// readings as of Rust 1.78.0
// DO NOT EXPECT THESE ASSERTIONS TO WORK ACROSS VERSIONS
fn main() {
assert_eq!(offset_of!(AStruct, field), 4);
assert_eq!(offset_of!(AStruct, field2), 0);
assert_eq!(offset_of!(AEnum, Variant2.field), 1);
assert_eq!(offset_of!(AEnum, Variant3.0.field), 8);
assert_eq!(offset_of!(ANestingStruct, 0.field), 4);
} By respecting privacy, suppose that in another crate // crate b
#![feature(offset_of_enum)]
#![feature(offset_of_nested)]
fn main() {
// these still work
assert_eq!(offset_of!(AStruct, field), 4);
assert_eq!(offset_of!(AEnum, Variant2.field), 1);
assert_eq!(offset_of!(AEnum, Variant3.0.field), 8);
// this does not compile because field2 is private here
// assert_eq!(offset_of!(AStruct, field2), 0);
// this does not compile because 0 is private here
// assert_eq!(offset_of!(ANestingStruct, 0.field), 4);
} DocumentationThe rustdoc of our Test cases and edge casesThe basic test cases for
Unresolved questionsThere has been discussion around in-place construction of |
I discovered that in Given that |
@rfcbot fcp merge |
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
I don't think this is a blocker: the discriminant isn't encoded as a real field with an offset, so it can't be represented with |
@rfcbot concern consistent-future-syntax Given that this item (unlike nested fields) isn't as urgent, and that we don't have any syntax for this elsewhere in the language (unlike nested fields), I'd like to make sure that either:
(We currently have pattern syntax for extracting fields, but the syntax proposed for |
The same syntax could be used for the proposed field_of! macro, RFC 3318. It could possibly be used for the proposed Outside of reflection-related macros, the syntax |
Marking lang nominated to discuss how we want the future landscape to look here, and how we want to provide this functionality. (There's enough of a use case for this; the question is syntax.) |
Hey @joshtriplett, is this scheduled for a meeting agenda? If possible, I'd love to observe that meeting! I'm still very interested in the ability to construct enums in-place for deserialization reasons |
It'll be in an upcoming meeting, but we don't have it scheduled for a particular date. cc @traviscross who would have a better idea of timing. |
With At See: https://rust.godbolt.org/z/jz8qc1Pqv Basically, we can emulate In-place construction is done by initializing fields in-place using /// initializes `dst` to be valid AnEnum::X containing the fields already initialized in `dst`
/// # Safety
/// - `dst` must be properly aligned pointing to an `AnEnum`
/// - `dst` must be valid for writes
/// - The value pointed at by `dst` must have all fields of variant `X` initialized and valid.
unsafe fn set_discriminant_x(dst: NonNull<()>) {
// All the reads that immediately write the same value to the same location
// get eliminated with alteast `opt-level=1` leaving only the discriminant write, if any.
let value = AnEnum::X {
a: dst.byte_add(offset_of!(AnEnum, X.a)).cast::<u32>().read(),
b: dst.byte_add(offset_of!(AnEnum, X.b)).cast::<u64>().read(),
c: dst.byte_add(offset_of!(AnEnum, X.c)).cast::<Box<str>>().read(),
};
dst.cast::<AnEnum>().write(value);
}
enum AnEnum {
X{ a: u32, b: u64, c: Box<str> },
Y([u8; 12]),
} Note: Ideally a derive macro is generating the above function. By using a function pointer you can set the discriminant in a type erased fashion as well. Of course, it would still be great to have dedicated support for manipulating discriminant. Full Zulip thread about the workaround: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/.E2.9C.94.20Workarounds.20for.20setting.20enum.20discriminants/near/470331454 |
My take: I don't know any other place in the language where we might want to use this kind of syntax, though I can imagine it (e.g., MIR has similar paths of this kind). If we want certainty, though, the only syntax that we can be SURE will not conflict with anything else would be to use pattern matching. Have folks explored pattern matching and what that might look like? I was imagining something like "supply a pattern with exactly one binding" (e.g., |
(Personally though I feel reasonably good about the proposed |
Just confirming what @exrok said - I don't think "set discriminant" functionality should block this FCP, I'm working on a follow-up RFC for that, and that RFC would build on top of this one. Edit: I've opened a PR for this RFC: rust-lang/rfcs#3727 |
I think for stabilization, In particular, I seems like this macro (if applicable to #![feature(offset_of_enum)]
use std::convert::Infallible;
use std::mem::offset_of;
use std::mem::MaybeUninit;
fn main() {
let mut u = MaybeUninit::<Option<(u8, Infallible)>>::uninit();
let p: *mut Option<(u8, Infallible)> = u.as_mut_ptr();
unsafe {
let p: *mut u8 = p
.byte_add(offset_of!(Option<(u8, Infallible)>, Some.0.0))
.cast();
p.write(42);
}
} Footnotes |
Also, perhaps someone could clarify the general process that is being followed here:
|
To be clear, the compiler already optimizes away uninhabited variants to some extent (specifically, they are skipped when assigning niched tags), but currently it does ensure there is sufficient storage even for uninhabited variants. This is the enum version of the fact that So the question is whether one can assume that Regarding the procedural question, I think such extensions have been done with team FCPs in the past. However, it is a user-visible change, and in particular given the subtle question around whether the offsets are guaranteed to be in-bounds, an RFC might not be a bad idea -- it'd probably make sense to discuss that together with rust-lang/rfcs#3727. |
If in-place initialization of enums is desirable then declaring usage involving uninhabited types to be UB seems too costly to me. It would greatly complicate most uses of in-place enum construction and make it much harder to do soundly in the first place, for dubious benefit. The problematic cases are in generic code (e.g., construction of Once discovered, the only proper fix is to add a special case code path to every function that constructs an enum in-place, detecting annd avoiding the out-of-bounds writes. I’m aware of two options and they both suck:
Not only do both options kinda suck when they’re triggered, but they also add considerable extra complexity which should never be exercised by a correct program (one which does not panic or get stuck in an unproductive infinite loop). It’s a pure gotcha that makes already subtle unsafe code more verbose and thus harder to audit. |
Agreed, that's what I meant to say but failed to make explicit. :) |
IMO theres a huge difference between the two. For structs, normal users don’t really care, because the struct itself will be uninhabited, too, anyway. Thus it’s okay to just make the unsafe code easier, make For union Foo {
unit: (),
other: (u8, Infallible),
}
fn main() {
let mut x = Foo { unit: () };
x.other.0 = 42;
} which doesn’t even require unsafe (since Rust 1.37). Furthermore, for tuples, unions, structs, existing For enums, when the resulting type is inhabited, there is an actual cost to the increased size and alignment, and the vast majority of users – especially all existing users – have zero benefits from this reserved extra space in – say – a
I believe this is completely solvable. For example: I also believe that .byte_add(offset_of!(Option<(u8, Infallible)>, Some.0.0)).cast() above I was already quite dissatisfied because I had no way of connecting the type of the (nested) field with the result type of the But Also, coming back to the process, all of this discussion seems quite out of place in a tracking issue. Footnotes
|
Better ways to express manual in-place construction would be an interesting addition to the language. However, it's definitely out of scope for this tracking issue, and it doesn't really help use cases such as postcard-forth that have good reasons to work in terms of offsets. Regarding More generally, the proposed enum layout optimization would complicate far more than just manual in-place construction. As already discussed way back in #49298 (comment), there's useful compiler optimizations (on polymorphic MIR) that would want to do essentially the same thing, and these aren't allowed to introduce an early panic to prevent the out-of-bounds writes. It would also be annoying if |
I think this is simply false, in the same way that is false that
I agree that it can complicate things… I’m just not convinced that any of it would necessarily become too complicated. |
With size_of and align_of, it’s typically possible and even easy to do whatever you want to do without caring what the exact value is (transmuted can break, but it’s hard to do something with transmutes that relies only on the size and isn’t also broken because other aspects of layout are undefined). Similarly, But a fallible |
Hopefully you only ever call this inline assembly if by then you know that the variant is inhabited. You can do
Maybe In such a system, there’s an even better answer to
Const evaluation already has certain mechanisms for delayed panics, powering cases like e.g. fn main() {
foo(0);
}
/// panics if index > 0
fn foo(index: usize) {
if index > 0 {
let s: &'static String = &([][index]);
println!("{s}");
}
} so a similar mechanism could power a use-case of inserting a |
I'm not sure I understand all of the offset_of discussion above so I won't comment on those parts, but at least for manual enum construction, as discussed in rust-lang/rfcs#3727, we could perhaps make the use of If the only sound way to construct the enum "from scratch" requires
In the case of |
imo any kind of workaround by making trait MyReflect {
// this should reflect all variants, since sometimes you are
// using the variants for things other than constructing enum
// values, e.g. generating a JSON schema for serde.
const DISCRIMINANTS: &'static [Discriminant<Self>];
}
#[derive(MyReflect, MyConstructInPlace)]
pub enum E {
// the proc-macros *can't see* the definition of SomeType,
// so have no way of knowing if it's uninhabited, so they can't
// generate code that does `discriminant_of` or `offset_of` since
// those would be compile-time errors.
A(SomeType),
B,
} |
@programmerjake but that's kind of my point - it shouldn't be able to generate that. If a variant shouldn't ever exist, we can't receive it on the wire, which means we don't need the schema anyway. You wouldn't want to be able to construct the uninhabited variant through the reflection system either. I think derives like MyReflect would be able to have attrs like |
well, then what's it supposed to do for things like where there's no way to tell at macro expansion time or even at compilation time pre-monomorphization from the crate defining pub enum E2<T: Trait> {
A(T),
B(<T as Trait>::Type),
} imo this severely restricts what proc-macros and general generic code can do for not much gain. |
i think it would be preferable to have enums not make the delete-uninhabited-variants'-fields optimization and, for when people want uninhabited types to not take any space, instead have a built-in type #![feature(specialization)]
/// implemented by the compiler for all uninhabited types
trait Uninhabited {}
trait ShrinkIfUninhabitedSpec {
type Type: ?Sized + BorrowMut<Self>;
const fn new(self) -> Self::Type
where
Self: Sized,
Self::Type: Sized;
fn into_inner(v: Self::Type) -> Self
where
Self: Sized,
Self::Type: Sized;
}
default impl<T: ?Sized> ShrinkIfUninhabitedSpec for T {
type Type = T;
fn new(self) -> T
where
Self: Sized,
{
self
}
fn into_inner(v: Self::Type) -> Self
where
Self: Sized,
{
v
}
}
impl<T: ?Sized + Uninhabited> ShrinkIfUninhabitedSpec for T {
type Type = !;
fn new(self) -> T
where
Self: Sized,
{
unreachable!()
}
fn into_inner(v: Self::Type) -> Self
where
Self: Sized,
{
unreachable!()
}
}
#[derive(Copy, Clone, Ord, PartialOrd, Eq, PartialEq, Hash, Debug, Default)]
#[repr(transparent)]
pub struct ShrinkIfUninhabited<T: ?Sized>(<T as ShrinkIfUninhabitedSpec>::Type);
// accessor/constructor/into_inner/etc. impls |
As @programmerjake said, the problem is that any type parameter can be uninhabited. If One “solution” is to say the compiler rejects As a third alternative, there could a marker trait for “ |
Ah, thank you @hanna-kruppe, that's indeed what I was suggesting, and I wasn't aware of the desire to avoid doing that. |
This seems like a very reasonable assumption - it's an assumption I would make, and I don't see how the definition given of
This is guaranteed for the Footnotes
|
@steffahn On procedure - the original It looks like rust-lang/unsafe-code-guidelines#443 will need to be decided one way or another before any further process can occur here, since enum support in |
I believe the layout guarantees of
@GKFX To clarify, I asked mainly out of curiosity, not to criticize anybody in any way. I’m also not an expert on Rust procedures. Thank you for your efforts to help get a more capable I’d argue that probably this stabilization with full enum support (including |
Feature gate:
#![feature(offset_of_enum)]
This is a tracking issue for using enum variants in offset_of. Enum variants themselves do not have an offset within their type, so the macro will not give an offset for them, but their fields do. For example, the standard library uses the offset of the content of the
Some
variant ofOption
to implementOption::as_slice
. The original RFC for offset_of was rust-lang/rfcs#3308.Public API
Steps / History
Unresolved Questions
Footnotes
https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html ↩
The text was updated successfully, but these errors were encountered: