Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limited sqs size #3620

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

Limited sqs size #3620

wants to merge 14 commits into from

Conversation

etcart
Copy link
Contributor

@etcart etcart commented Apr 2, 2024

prevent possibility of sqs message overflow causing messages to fail.

Addresses CUMULUS-3678: limit sqs message size

Changes

  • sqs messages over documented maximum size are truncated to 256kb

PR Checklist

  • Update CHANGELOG
  • Unit tests
  • Ad-hoc testing - Deploy changes and test manually
  • Integration tests

@vpnguye2
Copy link
Contributor

vpnguye2 commented Apr 4, 2024

Unit test passed locally

image

@vpnguye2 vpnguye2 removed the in review label Apr 5, 2024
Copy link
Contributor

@vpnguye2 vpnguye2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. 👍

const logger = logOverride || log;
let messageBody;
if (isString(message)) messageBody = message;
else if (isObject(message)) messageBody = JSON.stringify(message);
else throw new Error('body type is not accepted');
const maxLength = 262144;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How'd we come to this size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm so that's in bytes 262,144 bytes. JS's length returns UTF-16 "units".

In UTF-16 encoding, every code unit is exact 16 bits long.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#utf-16_characters_unicode_code_points_and_grapheme_clusters:~:text=In%20UTF%2D16%20encoding%2C%20every%20code%20unit%20is%20exact%2016%20bits%20long

Is there a better way to determine the message size? If I'm thinking about this correctly (big if), the length JS would return is roughly twice as large as the max size SQS allows:

If length returns e.g. 262144 units, which are each 16 bits long, and a byte is 8 bits...

It seems like 'length' may not be appropriate here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like you're right. I've tested this with 262144 length vs 262144, but I think they're using the standard ascii/unicode split where most characters are 1 byte, but bytes >=128 are interpreted as the first of two in a unicode. I'll do some testing here to be sure and figure out how to correctly measure this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a fix for this is up now, thanks for that great catch!
there's an additional caveat that if a unicode character gets bifurcated, javascript encoding adds two more padding bytes that need to be dealt with. this latest change accounts for those things and tests them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm is there another way to handle this besides inspecting the character length of the message? It's starting to feel like an anti-pattern if we have to continue to use string.length and accommodate for the difference between units.

I'm also a little concerned about not storing the original, untruncated message anywhere.

We handle similar limitations with messages via the CMA by dumping the full message to S3 and referencing it in the message passed between functions in a step function. Is there something similar we could do here with SQS messages?

How was the issue initially encountered? Did a user report it?

Copy link
Contributor

@npauzenga npauzenga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @etcart, just a couple questions.


let recievedMessage = await receiveSQSMessages(queueUrl);
let messageBody = recievedMessage[0].Body;
t.true(messageBody.endsWith('...TruncatedForLength'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you normally add negative test cases? Like if the message is under the max size, then verify the '...TruncatedForLength' message does not appear. In the event it adds it to all the messages regardless of size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good point, I'll add it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wait, the lower part of this test is doing exactly that. it could be separated though to be clear that that's wha's being tested

Copy link
Contributor

@BWex-NASA BWex-NASA Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe at the end add: t.false(messageBody.endsWith('...TruncatedForLength'));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, we're checking implicitly just by checking that "it's unchanged", but verbosity==good in testing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Contributor

@BWex-NASA BWex-NASA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved the latest commit for a negative test case.

Copy link
Contributor

@BWex-NASA BWex-NASA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest test changes appear good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants