Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved table encoding code to handle sparse arrays. #28

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

aryajur
Copy link

@aryajur aryajur commented Jul 19, 2020

By going through the table before to get the maximum index we can handle sparse arrays. Also tostring seems to work better by maintaining float and integer number conversion.

@Mysak0CZ
Copy link

Hello! This would be nice change, but table.pack() creates an array with property n. Would it be possible to add check to line 71, that if k == "n" and type(v) == "number" then length = v else treat as object?
Lua documentation: https://www.lua.org/manual/5.2/manual.html#pdf-table.pack
Current behaviour:

> json.encode({1, nil, 3})
"[1,null,3]"
> json.encode(table.pack(1, nil, 3))
"{1:1,3:3,\"n\":3}"

Why is this beneficial?
If lua array ends with nil(s), they are trimmed. If we however use table.pack, the original count of objects is stored in property n allowing the array to end with nil/null.

… The array detection code now allows for the array to have a key n with a number value to detect the array length.
@aryajur
Copy link
Author

aryajur commented Aug 19, 2020

Fixed that. Now arrays can be specified through table.pack.

json.lua Outdated
@@ -108,7 +119,7 @@ local function encode_number(val)
if val ~= val or val <= -math.huge or val >= math.huge then
error("unexpected number value '" .. tostring(val) .. "'")
end
return string.format("%.14g", val)
return tostring(val)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreliable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand. Perhaps a more useful/productive comment? Thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was indeed too short to be useful. The documentation of tostring says:

Receives an argument of any type and converts it to a string in a reasonable format. For complete control of how numbers are converted, use string.format.

It doesn't guarantee anything. It could output the number with arbitrarily poor precision, or it could use an arbitrary format (which need not be valid JSON). (In practice, it does indeed behave quite reasonable, but you shouldn't rely on this; it should also be noted that it is not "lossless" in practice, it rounds (albeit neither is the current method - 14 digits aren't enough).)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explaining it. I see. I remember I had changed it to make a certain case work properly when I was using it in the oauth2 module. Unfortunately I did not document why I changed it. I think I switched it back in the repository for now.

json.lua Outdated
-- Encode
for i, v in ipairs(val) do
table.insert(res, encode(v, stack))
for i=1,length do

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix your formatting

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes fixed it for this function only. Tab structure in the original file is different than my tool. Didn't change the whole file.

json.lua Outdated
end
stack[val] = nil
return "[" .. table.concat(res, ",") .. "]"

else
-- Treat as an object
for k, v in pairs(val) do
--[[

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dirty

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand. Perhaps a more useful/productive comment?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outcommented code is dirty; it's effectively leaving "trash" around in the codebase. If it isn't needed anymore, remove it; it'll still be in the Git history.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks. I usually keep some portions in my code to see some history immediately but you are right it is in the version history. I cleaned it up.

-- Check whether to treat as a array or object
local array = true
local length = 0
local nLen = 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered the worst-case this might create: What if I serialize {[1e9] = true}?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed all these. Revisiting after a long time. Thanks for catching that. Had not considered it at all. Been using it, did not have issues. Fixed it in the next commit by preventing sparse arrays greater than 100 (arbitrarility) length getting through. Seems like other json modules also handle sparse arrays in a undefined way. 100 should be good for many cases I suppose.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a tricky issue, and behavior varies wildly. I believe ideally it should be left up to the user. Perhaps use something like a "density" factor as a threshold?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a variable called SPARSELIMIT in the module that the user can change after requiring the module to change it. Default set it to 100.

@ninjamonkey198206
Copy link

@appgurueu
Is there a reason for your abrupt and occasionally hostile behavior?

You're coming off as unnecessarily rude.

json.lua Outdated
local nLen = 0
local count = 0
for k,v in pairs(val) do
if (type(k) ~= "number" or k<=0) and not (k == "n" and type(v) == "number") then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about fractional keys, like 4.2? I think you also need a check like k % 1 == 0 or math.floor(k) == k to check that k is (not) an integer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, added k%1 ~= 0 in the condition.

@appgurueu
Copy link

appgurueu commented Nov 25, 2023

@appgurueu Is there a reason for your abrupt and occasionally hostile behavior?

You're coming off as unnecessarily rude.

I'm not sure what you consider to be rude about my review. I made three comments:

  • Pointing out a possible worst-case scenario.
  • Commenting on the bad formatting.
  • Labeling a clear code smell (outcommented code) as "dirty". If it isn't dirty, is it clean? I believe(d) this to be self-evident; it does not need elaboration. Maybe I should have phrased this differently, as "oops, looks like you forgot some outcommented code here"?
  • Pointing out that using tostring on numbers is unreliable (if a certain format is expected). I should have elaborated on this; I didn't because it is something I am used to.

This was intended just as very "direct" feedback from a quick look.
In hindsight, some of it would have needed some elaboration.

If I came off as rude to @aryajur, that was not my intention, and I apologize for that.

@aryajur
Copy link
Author

aryajur commented Nov 28, 2023

@appgurueu Is there a reason for your abrupt and occasionally hostile behavior?
You're coming off as unnecessarily rude.

I'm not sure what you consider to be rude about my review. I made three comments:

  • Pointing out a possible worst-case scenario.
  • Commenting on the bad formatting.
  • Labeling a clear code smell (outcommented code) as "dirty". If it isn't dirty, is it clean? I believe(d) this to be self-evident; it does not need elaboration. Maybe I should have phrased this differently, as "oops, looks like you forgot some outcommented code here"?
  • Pointing out that using tostring on numbers is unreliable (if a certain format is expected). I should have elaborated on this; I didn't because it is something I am used to.

This was intended just as very "direct" feedback from a quick look. In hindsight, some of it would have needed some elaboration.

If I came off as rude to @aryajur, that was not my intention, and I apologize for that.

I appreciate your feedback. Would always want to improve my work.

…to control the length threshold of sparse arrays. Some bug fixes and improvements as suggested by @appgurueu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants