-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deserializing strings in place. #79
Conversation
9a06f6b
to
1cafd85
Compare
Not to nag, but I've now fixed the tests, they run successfully on my fork, and thus it should now be ready to test and merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this notification got buried in my Github notifications so I didn't see it until now.
In general, I'm not currently inclined to support merging the changes in this PR as implemented for the following reasons:
- Changing the API can have a very large ripple for downstream users, where this dependency may be injected in the middle of their dependency stacks. I recognize this would be a semver breaking change so it wouldn't break peoples code, but I suspect adoption would be low of the newer version because users may not have control of the mutability of their input data. (I know I have a number of dependencies like this personally)
- I have concerns about maintability and reliability of allowing the deserialization input to be mutated. We open ourselves up to a whole class of defects that werent previously possible by mutating the data that we're processing. For example, we could introduce bugs where the first serialization works, but the second doesn't. Or, we could cause the first deserialization result to differ from a second pass.
I don't believe that the benefit of supporting string escapes outweighs the downsides above.
That being said, I'd be more than happy to review a change where the input was copied into some buffer and then deserialized (i.e. the input is not mutable, but we do induce a copy of the data), but the user should have to specifically opt-in to this mechanism.
How about the following design?:
I believe that this design will also solve #74 |
That definitely sounds like a nice approach to me! It may be useful to try and look at some cases where string escaping is useful to see if this design is onerous for the end user. Do you have a sample use case in mind? That may be helpful in figuring out if this is a good approach. |
I've written a bare-metal HTTP server framework, and would like it to be able to deserialize JSON encoded POST bodies into arbitary data structures, and on soundness grounds and to have separation of concerns, would like all decoding and unescaping to happen before the data is passed to the request handler. |
Is this open-sourced and/or something I could look at to get an idea of how you're interested in using it? What I'm really trying to figure out is what this API would look like in a real-world example with actual code |
In case the message got lost in the post (hurrah for email), I've closed this pull request and opened a new one at #83 with my new design. |
De-escaping JSON strings will always produce a shorter (or equal length) string, so it's safe to de-escape strings in place, thus allowing types to borrow plaintext (not escaped) strings from the buffer after deserialization.
This is a semver breaking change as it requires the JSON input to be passed in mutably to allow for the de-escaping in place.
This is safe because once the serialization has passed the string, it never reads that part of the buffer again.