Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve fwdctl dump #764

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Improve fwdctl dump #764

wants to merge 1 commit into from

Conversation

zdf1230
Copy link

@zdf1230 zdf1230 commented Dec 21, 2024

Purpose

This change is to address issue #348.

Detail

This added flags such as --start-key, --stop-key, --max-key-count and --output-format with csv, json as options and stdout as default behavior.

Test

Tested locally & unit tests added

@zdf1230 zdf1230 marked this pull request as ready for review December 21, 2024 07:15
Copy link
Collaborator

@rkuris rkuris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good! A few things we need before landing:

  • Error handling when writing is very important here. We can't drop IO errors, or we risk having a bad dump with no warning
  • Continuation has to support hash key values, not just text. If the next key is binary, the lossy UTF-8 inputs won't work correctly. Maybe we need start-key-hex?
  • Small bug in the stop key

Otherwise, this looks fantastic! Great test coverage!

The remaining comments are mostly style preferences and are not mandatory.

long,
required = false,
value_name = "OUTPUT_FORMAT",
value_parser = ["csv", "json", "stdout"],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdout should just be "plain" or "text" or "plain-text". Stdout implies that it's going to be printed to stdout.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, stdout is the default behavior which would print outputs into stdout.


key_count += 1;

if key == stop_key || key_count_exceeded(opts.max_key_count, key_count) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the stop_key does not exist, it looks like we keep going. We should also stop if key > stop_key.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. This makes sense, stop key might not always exist.

fn u8_to_string(data: &[u8]) -> Cow<'_, str> {
String::from_utf8_lossy(data)
const fn key_count_exceeded(max: Option<u32>, key_count: u32) -> bool {
matches!(max, Some(max) if key_count >= max)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
matches!(max, Some(max) if key_count >= max)
max.map(|max| value > max).unwrap_or(false)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, your version is easier to read

match next_key {
Some(Ok((key, _))) => {
println!(
"Next key is {0}, resume with \"--start-key={0}\".",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be ugly for non-printable keys, which is the norm. Keys are often hashes, which contain lots of unprintable characters.

}
}

trait OutputHandler {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🏆

Comment on lines +212 to +214
let _ = self
.writer
.write(format!(",\n \"{}\": \"{}\"", key_str, value_str).as_bytes());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is nearly identical to the other case, so I would probably separate the is_first case to before printing:

if self.is_first {
    self.writer.write(b",\n")?;
    self.is_first = true;
}


trait OutputHandler {
fn handle_record(&mut self, key: &[u8], value: &[u8]);
fn flush(&mut self);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debated implementing flush as a Drop for those implementations that require it, but that limits your error handling to panic.

let hex = opts.hex;
match opts.output_format.as_str() {
"csv" => {
let file = File::create("dump.csv")?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be hardcoded. Just write to stdout, or provide a way to specify the filename.

default_value = "stdout",
help = "Output format of database dump, default to stdout. CSV and JSON formats are available."
)]
pub output_format: String,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use an enum here and let clap handle the errors?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when declaring value_parser in clap, it would work just like using an Enum.

@@ -210,6 +210,189 @@ fn fwdctl_dump() -> Result<()> {
Ok(())
}

#[test]
#[serial]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🏆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants