
Use external storage for large data attributes [Large Feature] #506

Open
longquanzheng opened this issue Dec 5, 2024 · 1 comment
@longquanzheng
Contributor

longquanzheng commented Dec 5, 2024

When a data attribute exceeds a configurable threshold (e.g. 100KB), iWF can store it in external storage such as S3 instead of writing it into Temporal history, keeping only the key and the S3 object ID in history.

With only keys and S3 object IDs in Temporal history, the iWF server will load the values from S3 before sending them to the application, and write to S3 on updates. As an optimization, the server could also load from S3 lazily, only when the application tries to read an attribute.

This is possible because the iWF server workflow never actually reads the values of the data attributes (DAs) -- they are opaque to the iWF server.

Offloading large data attributes to S3 makes it much easier for users to work with large datasets, and makes using Cadence/Temporal more cost effective.
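
A minimal sketch of this write/read path, assuming a hypothetical `ExternalStore` interface (S3 would be one implementation); the type and function names here are illustrative, not the actual iWF server code:

```go
package offload

import (
	"context"
	"fmt"
)

const offloadThreshold = 100 * 1024 // configurable, e.g. 100KB

// ExternalStore abstracts S3 (or any blob store) for data attribute values.
type ExternalStore interface {
	Put(ctx context.Context, objectID string, value []byte) error
	Get(ctx context.Context, objectID string) ([]byte, error)
}

// StoredAttribute is what gets written into Temporal history: either the
// inline value, or just the key plus a pointer to the external object.
type StoredAttribute struct {
	Key      string
	Inline   []byte // set when the value is small enough to keep in history
	ObjectID string // set when the value was offloaded to external storage
}

// writeAttribute offloads the value when it exceeds the threshold, so only
// the key and the object ID land in Temporal history.
func writeAttribute(ctx context.Context, store ExternalStore, wfID, key string, value []byte) (StoredAttribute, error) {
	if len(value) <= offloadThreshold {
		return StoredAttribute{Key: key, Inline: value}, nil
	}
	objectID := fmt.Sprintf("%s/%s", wfID, key) // could be deterministic, see the follow-up comment
	if err := store.Put(ctx, objectID, value); err != nil {
		return StoredAttribute{}, err
	}
	return StoredAttribute{Key: key, ObjectID: objectID}, nil
}

// readAttribute resolves the pointer before the value is sent to the
// application; this is also the point where loading could be deferred
// until the application actually reads the attribute.
func readAttribute(ctx context.Context, store ExternalStore, attr StoredAttribute) ([]byte, error) {
	if attr.ObjectID == "" {
		return attr.Inline, nil
	}
	return store.Get(ctx, attr.ObjectID)
}
```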

@longquanzheng
Contributor Author

longquanzheng commented Dec 18, 2024

Probably we don't need to store the S3 object ID at all. The ID can just be derived from workflowID + attributeKey,

so in the history we only store a flag like "s3" indicating that the DA lives externally.
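
A minimal sketch of this follow-up, reusing the hypothetical `ExternalStore` interface from the sketch above; because the object ID is a pure function of workflowID + attributeKey, it never has to be persisted:

```go
package offload

import (
	"context"
	"fmt"
)

// storageFlagS3 is the only marker that needs to be recorded in history.
const storageFlagS3 = "s3"

// externalObjectID derives the object ID deterministically from identifiers
// that are already known, so history never needs to store it.
func externalObjectID(workflowID, attributeKey string) string {
	return fmt.Sprintf("%s/%s", workflowID, attributeKey)
}

// resolveFlagged recomputes the ID from the flag plus identifiers and loads
// the value, instead of reading a stored object ID out of history.
func resolveFlagged(ctx context.Context, store ExternalStore, flag, workflowID, attributeKey string) ([]byte, error) {
	if flag != storageFlagS3 {
		return nil, fmt.Errorf("unknown storage flag %q", flag)
	}
	return store.Get(ctx, externalObjectID(workflowID, attributeKey))
}
```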
