-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limited sqs size #3620
base: master
Are you sure you want to change the base?
Limited sqs size #3620
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. 👍
packages/aws-client/src/SQS.ts
Outdated
const logger = logOverride || log; | ||
let messageBody; | ||
if (isString(message)) messageBody = message; | ||
else if (isObject(message)) messageBody = JSON.stringify(message); | ||
else throw new Error('body type is not accepted'); | ||
const maxLength = 262144; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How'd we come to this size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is the max size of an sqs message as documented here https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/quotas-messages.html#:~:text=The%20maximum%20is%20262%2C144%20bytes,message%20payload%20in%20Amazon%20S3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm so that's in bytes 262,144 bytes
. JS's length
returns UTF-16 "units".
In UTF-16 encoding, every code unit is exact 16 bits long.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#utf-16_characters_unicode_code_points_and_grapheme_clusters:~:text=In%20UTF%2D16%20encoding%2C%20every%20code%20unit%20is%20exact%2016%20bits%20long
Is there a better way to determine the message size? If I'm thinking about this correctly (big if), the length
JS would return is roughly twice as large as the max size SQS allows:
If length
returns e.g. 262144 units, which are each 16 bits long, and a byte is 8 bits...
It seems like 'length' may not be appropriate here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like you're right. I've tested this with 262144 length vs 262144, but I think they're using the standard ascii/unicode split where most characters are 1 byte, but bytes >=128 are interpreted as the first of two in a unicode. I'll do some testing here to be sure and figure out how to correctly measure this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a fix for this is up now, thanks for that great catch!
there's an additional caveat that if a unicode character gets bifurcated, javascript encoding adds two more padding bytes that need to be dealt with. this latest change accounts for those things and tests them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm is there another way to handle this besides inspecting the character length of the message? It's starting to feel like an anti-pattern if we have to continue to use string.length and accommodate for the difference between units.
I'm also a little concerned about not storing the original, untruncated message anywhere.
We handle similar limitations with messages via the CMA by dumping the full message to S3 and referencing it in the message passed between functions in a step function. Is there something similar we could do here with SQS messages?
How was the issue initially encountered? Did a user report it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @etcart, just a couple questions.
|
||
let recievedMessage = await receiveSQSMessages(queueUrl); | ||
let messageBody = recievedMessage[0].Body; | ||
t.true(messageBody.endsWith('...TruncatedForLength')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you normally add negative test cases? Like if the message is under the max size, then verify the '...TruncatedForLength' message does not appear. In the event it adds it to all the messages regardless of size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point, I'll add it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wait, the lower part of this test is doing exactly that. it could be separated though to be clear that that's wha's being tested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe at the end add: t.false(messageBody.endsWith('...TruncatedForLength'));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, we're checking implicitly just by checking that "it's unchanged", but verbosity==good in testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved the latest commit for a negative test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest test changes appear good.
prevent possibility of sqs message overflow causing messages to fail.
Addresses CUMULUS-3678: limit sqs message size
Changes
PR Checklist