Adds SQL support for Configurable Table Snapshot History #262
Thanks @Will-Lo. Overall looks good. Added some comments.
Can we please add details regarding the SQL API contract? For example, what is expected if both versions and max age are present?
Minor comments, but overall LGTM.
Thanks for refactoring the tests and answering the questions. Added minor comment, otherwise LGTM!
lgtm
Summary
Adds Spark SQL support for configurable table snapshot history, which controls the versioning of Openhouse tables.
The syntax is similar to retention but is instead defined as HISTORY. History configuration supports both MAX_AGE and VERSIONS, where we retain all table snapshots that fall within MAX_AGE and within VERSIONS.
Example:
A table with MAX_AGE = 1d will retain all snapshots that are within 1 day of when the snapshot retention job last ran.
A table with VERSIONS = 5 will retain the last 5 snapshots of the table without considering the age of the snapshots.
If both MAX_AGE = 1d and VERSIONS = 5 are defined, keep the last 5 snapshots within the last day. Note: If there are fewer than 5 snapshots, then fewer than 5 commits were made in the past day.
MAX_AGE and VERSIONS cannot be defined as less than 1.
The default maximums of MAX_AGE and VERSIONS defined in #259 are 3 days and 100 versions respectively.
Examples:
Testing Done
Tested on a local Docker setup running Spark:
Setting both policies
Setting only versions
Setting only max age
Also tested negative cases (invalid numbers, and values past the maximums defined in #259)
e.g.
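The negative cases can be pictured with a small validation sketch. This is an assumption-laden model of the limits quoted in the summary (values below 1 are rejected, as are values past 3 days or 100 versions), not the actual Openhouse validation code or its error output; `validate_history`, `is_valid`, and the choice to express MAX_AGE in hours are hypothetical.

```python
from typing import Optional

# Limits quoted in the PR summary (attributed there to #259).
# Expressing MAX_AGE in hours is an assumption made for this sketch.
MAX_AGE_LIMIT_HOURS = 3 * 24
VERSIONS_LIMIT = 100


def validate_history(
    max_age_hours: Optional[int] = None, versions: Optional[int] = None
) -> None:
    """Raise ValueError for a HISTORY policy outside the described bounds."""
    if max_age_hours is not None and not 1 <= max_age_hours <= MAX_AGE_LIMIT_HOURS:
        raise ValueError(f"MAX_AGE out of range: {max_age_hours}h")
    if versions is not None and not 1 <= versions <= VERSIONS_LIMIT:
        raise ValueError(f"VERSIONS out of range: {versions}")


def is_valid(**policy: int) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        validate_history(**policy)
        return True
    except ValueError:
        return False


for policy in (
    {"max_age_hours": 24, "versions": 5},  # both set, within bounds
    {"versions": 0},                       # below 1
    {"versions": 101},                     # past the versions maximum
    {"max_age_hours": 96},                 # past the 3-day maximum
):
    print(policy, is_valid(**policy))
```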