Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data object encoding #692

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

nevans
Copy link

@nevans nevans commented Oct 30, 2024

This adds basic encoding/decoding for ruby 3.2 Data objects, using !ruby/data:ClassName (a mapping of member names to member values) and !ruby/data-with-ivars:ClassName tags (a mapping with members and ivars mappings). Like Marshal, this uses rb_struct_initialize to assign member values to new Data objects.


Originally (in the first commit), I updated Visitor::ToRuby to simply call data.send(:initialize, **members). But #initialize freezes the object, so we can't set instance variables afterwards.

The second commit sets instance variables before calling #initialize, which I think will lead to unhappy surprises for some users. Unfortunately, Data doesn't give us any other mechanism for setting the members... except for the C API.

The third commit works around the above issue by copying how Marshal loads Data objects, delegating ToRuby#init_struct(obj, members) to rb_struct_initialize. I think this is the least surprising behavior for users. This also bypasses any user-defined #initialize method (which may be seen as a feature or a bug).

@nevans
Copy link
Author

nevans commented Nov 1, 2024

FYI: I haven't put any guards in yet to make this work for older versions of ruby (before Data was defined).

@nevans
Copy link
Author

nevans commented Nov 5, 2024

I'll update this to pass for earlier versions of ruby.

But I'm also looking for feedback on the approach. The more I think about it, the more I'm thinking this should use rb_struct_initialize from the C API, then copy over any ivars, then freeze. Unless I'm misreading the C code (very possible), that will skip any user-defined #initialize override, and just use the underlying Struct implementation... and, although that's not identical to rb_data_initialize_m, its close enough and it's exported and in the headers. That's what Marshal does, since Marshal just treats Data like Struct... which also means that Marshal does not freeze Data objects!

@nevans nevans force-pushed the data-object-encoding branch from 3110d41 to 3a845fd Compare November 8, 2024 14:35
@nevans
Copy link
Author

nevans commented Nov 8, 2024

For what it's worth, I've updated the PR to:

  1. pass with ruby versions down to 2.5 (at least on my machine)
  2. use rb_struct_initialize from the C API

@nevans nevans force-pushed the data-object-encoding branch from b6fcb35 to 2292f0b Compare November 26, 2024 14:21
This sets the ivars _before_ calling initialize, which feels wrong.  But
Data doesn't give us any mechanism for setting the members other than 1)
initialize, or 2) drop down into the C API.  Since initialize freezes
the object, we need to set the ivars before that.  I think this is a
reasonable compromise—if users need better handling, they can implement
their own `encode_with` and `init_with`.  But it will lead to unhappy
surprises for some users.

Alternatively, we could use the C API, similarly to Marshal.  Psych _is_
already using the C API for path2class and build_exception.  This would
be the least surprising behavior for users, I think.
@nevans nevans force-pushed the data-object-encoding branch from 2292f0b to 325b073 Compare December 2, 2024 19:00
@nevans
Copy link
Author

nevans commented Dec 2, 2024

Sorry. I thought I had it passing before, but I must have made a mistake in my earlier tests with ruby 2.5 (perhaps I had somehow disabled warnings locally)? I've changed the guards from defined?(::Data.define) to RUBY_VERSION >= "3.1.0". Unfortunately, it's impossible to check if a method is defined without triggering deprecated constant warnings.

@nevans
Copy link
Author

nevans commented Dec 5, 2024

Any thoughts about this approach? (Is it too close to the 3.4.0 release to hope for getting support for Data in time for 3.4?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant