-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compaction verification enhancements #60
Conversation
Allows the `producer_worker` and `repeater_worker` to generate tombstone records.
@@ -108,7 +109,10 @@ func (pw *ProducerWorker) newRecord(producerId int, sequence int64) *kgo.Record | |||
pw.Status.AbortedTransactionMessages += 1 | |||
} | |||
|
|||
payload := make([]byte, pw.config.messageSize) | |||
payload := pw.config.valueGenerator.Generate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why valueGenerator
wasn't used here before instead of an empty make([]byte)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😨
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make a PR to redpanda and check that it works as expect by pointing kgo at this commit sha before merging this PR. Otherwise if we merge and detect a problem we get a "head of line blocking" on making concurrent changes to kgo.
Suggestion: For testing compacted topics I think we can do more. kgo-verifier should keep track of the keys and and the end checkpoint the expected set of keys and associated values (last values produced for a key minus tombstones). Then during consumption we can track the same and compare at the end.
Or maybe we have a test like this already?
@@ -108,7 +109,10 @@ func (pw *ProducerWorker) newRecord(producerId int, sequence int64) *kgo.Record | |||
pw.Status.AbortedTransactionMessages += 1 | |||
} | |||
|
|||
payload := make([]byte, pw.config.messageSize) | |||
payload := pw.config.valueGenerator.Generate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😨
@@ -53,7 +53,7 @@ type ValidatorStatus struct { | |||
lastLeaderEpoch map[int32]int32 | |||
} | |||
|
|||
func (cs *ValidatorStatus) ValidateRecord(r *kgo.Record, validRanges *TopicOffsetRanges) { | |||
func (cs *ValidatorStatus) ValidateRecord(r *kgo.Record, validRanges *TopicOffsetRanges, compacted bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall this be a field on the ValidatorStatus
struct instead? it doesn't change for the lifetime entire lifetime anyway why pass it as a param then?
Ack, will do with an upcoming PR.
Yes, this is possible. Will push some follow up commits to this PR. |
Offset gaps exist in compacted topics. Suppress a warning log in the read workers if the user passes a `--compacted` flag as a parameter.
b13a42a
to
28a4a05
Compare
Force push to:
|
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters, and `LatestValueProduced`/`LatestValueConsumed` as return values in producer/worker status structs.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters, and `LatestValueProduced`/`LatestValueConsumed` as return values in producer/worker status structs.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters, and `LatestValueProduced`/`LatestValueConsumed` as return values in producer/worker status structs.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters.
28a4a05
to
937e857
Compare
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters. Also, use `errors='replace'` in `kafka_cat.py`, to avoid UTF-8 decoding issues with randomly generated bytes in `kgo-verifier` records.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters. Also, use `errors='replace'` in `kafka_cat.py`, to avoid UTF-8 decoding issues with randomly generated bytes in `kgo-verifier` records.
--tombstone-probability
and --compacted
937e857
to
3d37b2b
Compare
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters. Also, use `errors='replace'` in `kafka_cat.py`, to avoid UTF-8 decoding issues with randomly generated bytes in `kgo-verifier` records.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters. Also, use `errors='replace'` in `kafka_cat.py`, to avoid UTF-8 decoding issues with randomly generated bytes in `kgo-verifier` records.
0d2c91a
to
df9a29a
Compare
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
For verification of compacted topics, we can track the last expected key-value pair that will be seen after the log has been fully compacted.
As a means to verify the results for a compacted topic. Use the flag `--validate-latest-values` to trigger validation.
ca5a89b
to
b552221
Compare
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
5ef813d
to
65fa3e8
Compare
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
And use it in the verifier workers. In the case that the topic being consumed from had tombstones produced, the high watermark may be given by a tombstone record that has been removed. In trying to consume until this point, sequential readers will become stuck polling for new records. Persist the last consumable offset in order to adjust the offset we attempt to read up to in the read workers.
There was a race-y access to a map within the producer worker. Add `OnAcked()` to allow for proper access of the producer's lock during reading and writing.
65fa3e8
to
300f32e
Compare
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters.
See redpanda-data/kgo-verifier#60, which added `--tombstone-probability`, `--compacted`, `--validate-latest-values` as input parameters. (cherry picked from commit 85bea2b)
Adds new flags and validation enhancements to
kgo-verifier
:--tombstone-probability
, which allows for random production of tombstone records per the input parameter.--compacted
, which indicates that the topic to be verified is compacted. A warning about gaps in consumed offsets will not be shown if this istrue
--validateLatestValues
, which if true, performs verification of the consumed keys against the latest values produced by a worker. The log should be fully compacted before the consumer is started if this flag is passed astrue
.