Writing manifest files outside of the write thread #3456

Closed
2 tasks done
evenyag opened this issue Mar 7, 2024 · 2 comments

Comments

@evenyag
Contributor

evenyag commented Mar 7, 2024

What type of enhancement is this?

Performance

What does the enhancement do?

As mentioned in #3447 (comment), we write manifest files inside the region worker thread (the write thread), which is likely to block the thread for a long time if there are many flush jobs.

We should move this operation to a background thread if possible, which should reduce the time the worker is blocked.

Implementation challenges

Once we write manifests outside of the worker thread, we must find a way to serialize the following operations:

  • checking whether a region is active (not closed or dropped)
  • writing the manifest file
  • getting the current region version
  • applying edits to the region version
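
One way to picture the serialization requirement is a single lock that guards all four steps. The sketch below is only an illustration, not the mito2 implementation: `ManifestContext`, `Inner`, `Version`, and `apply_edit` are hypothetical names, and the manifest file is replaced by an in-memory log.

```rust
use std::sync::Mutex;

/// Hypothetical, simplified region version: just a list of file names.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Version {
    pub files: Vec<String>,
}

/// Everything that must change together lives behind one lock (assumption).
#[derive(Debug, Default)]
struct Inner {
    closed: bool,
    /// Stand-in for the manifest file on storage.
    manifest_log: Vec<String>,
    version: Version,
}

#[derive(Debug, Default)]
pub struct ManifestContext {
    inner: Mutex<Inner>,
}

impl ManifestContext {
    /// Runs the four steps as one critical section, so it can be called
    /// from a background thread instead of the worker thread.
    pub fn apply_edit(&self, files_to_add: &[&str]) -> Result<Version, String> {
        let mut inner = self.inner.lock().unwrap();
        // 1. Check whether the region is still active.
        if inner.closed {
            return Err("region is closed".to_string());
        }
        // 2. Write the manifest (here: append to the in-memory log).
        inner.manifest_log.push(format!("add {:?}", files_to_add));
        // 3. Get the current region version.
        let mut next = inner.version.clone();
        // 4. Apply the edit and install the new version.
        next.files.extend(files_to_add.iter().map(|f| f.to_string()));
        inner.version = next.clone();
        Ok(next)
    }

    /// Marks the region as closed; later edits are rejected.
    pub fn close(&self) {
        self.inner.lock().unwrap().closed = true;
    }
}
```

Because the check, the write, and the version update happen under the same guard, a close that lands between two edits is observed by the second edit rather than racing with it.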

Steps

@evenyag evenyag added the C-performance Category Performance label Mar 7, 2024
@evenyag evenyag added this to the v0.8 milestone Mar 7, 2024
@evenyag evenyag self-assigned this Mar 7, 2024
@evenyag evenyag added this to mito2 Mar 7, 2024
@evenyag evenyag moved this to Todo in mito2 Mar 25, 2024
@evenyag evenyag self-assigned this Apr 9, 2024
@evenyag evenyag moved this from Todo to In Progress in mito2 Apr 9, 2024
@evenyag
Contributor Author

evenyag commented Apr 11, 2024

If we update the manifest in the background, we should ensure that background flush and compaction jobs don't corrupt the manifest. For example, a flush job might update the manifest after a truncate action has been written to it. Then we might see data that still exists after truncation.

Fortunately, flush and compaction don't break most operations:

  • ALTER
    • We always flush all memtables before altering, so the flush job is already finished
    • Compaction doesn't interfere with an ongoing alteration, as it only modifies existing files
  • DROP
    • We write the drop marker so changing the manifest doesn't affect it
  • TRUNCATE
    • The only operation we need to consider is truncating the region
    • We can detect whether a region has been truncated and whether a RegionEdit was created before the truncation. Then we can ignore such edits.
  • SET READONLY
    • We can set and check the state of the region inside the manifest lock
    • We don't update the manifest if the region is read-only
  • EDIT
    • This is protected by manifest lock
    • We only expect the Edit action to add files
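
The TRUNCATE case above can be sketched as follows. This is a minimal illustration under assumptions, not the actual mito2 code: the `truncate_epoch` counter, the `created_at_epoch` field, and the simplified `RegionEdit` and `RegionManifest` types are all hypothetical. The idea is that each edit remembers which truncation epoch it was created in, and an edit from an older epoch is ignored.

```rust
/// Hypothetical edit produced by a background flush or compaction job.
#[derive(Debug, Clone)]
pub struct RegionEdit {
    pub files_to_add: Vec<String>,
    /// Truncate epoch observed when the job that built this edit started.
    pub created_at_epoch: u64,
}

/// Hypothetical, simplified manifest state of one region.
#[derive(Debug, Default)]
pub struct RegionManifest {
    pub files: Vec<String>,
    /// Incremented every time the region is truncated.
    pub truncate_epoch: u64,
}

impl RegionManifest {
    /// Builds an edit tagged with the current truncate epoch.
    pub fn new_edit(&self, files: &[&str]) -> RegionEdit {
        RegionEdit {
            files_to_add: files.iter().map(|f| f.to_string()).collect(),
            created_at_epoch: self.truncate_epoch,
        }
    }

    /// Removes all files and starts a new epoch.
    pub fn truncate(&mut self) {
        self.files.clear();
        self.truncate_epoch += 1;
    }

    /// Applies the edit unless it was created before the latest truncation.
    /// Returns `true` if applied, `false` if ignored as stale.
    pub fn apply(&mut self, edit: RegionEdit) -> bool {
        if edit.created_at_epoch != self.truncate_epoch {
            return false;
        }
        self.files.extend(edit.files_to_add);
        true
    }
}
```

With this check, a flush job that started before a truncation cannot bring its files back into the manifest afterwards.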

To alter, drop, edit, or truncate a region, we set the state in the worker loop. After that, the region won't be able to process other requests until the background manifest job is done.
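
A rough sketch of that state gate, assuming a hypothetical `RegionState` enum and `Region` type (the names and the simplification that every operation returns to `Writable` are assumptions; a dropped region would not become writable again in practice):

```rust
/// Hypothetical region states set by the worker loop.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegionState {
    Writable,
    Altering,
    Dropping,
    Editing,
    Truncating,
}

#[derive(Debug)]
pub struct Region {
    state: RegionState,
}

impl Region {
    pub fn new() -> Self {
        Region { state: RegionState::Writable }
    }

    pub fn state(&self) -> RegionState {
        self.state
    }

    /// Called in the worker loop before handing the manifest update to a
    /// background job. Fails if another manifest job is still in flight.
    pub fn begin(&mut self, target: RegionState) -> Result<(), String> {
        if self.state != RegionState::Writable {
            return Err(format!(
                "region is {:?}, cannot switch to {:?}",
                self.state, target
            ));
        }
        self.state = target;
        Ok(())
    }

    /// Called in the worker loop when the background job reports completion.
    /// Simplification: every operation goes back to `Writable`.
    pub fn finish(&mut self) {
        self.state = RegionState::Writable;
    }
}
```

Since only the worker loop changes the state, no extra locking is needed for the state itself; the background job only reports back when it is done.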

@evenyag
Contributor Author

evenyag commented Apr 25, 2024

There is a remaining issue: if a region is editing the manifest, sending another RegionEdit to the region returns a RegionState error. A more user-friendly approach would be to queue the request and execute it after the current edit request is done.
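
The queueing idea could look roughly like this. It is a sketch only: `EditQueue`, `submit`, and `on_edit_done` are hypothetical names, and edits are represented as plain strings to keep the example small.

```rust
use std::collections::VecDeque;

/// Hypothetical per-region queue of edit requests.
#[derive(Debug, Default)]
pub struct EditQueue {
    /// True while an edit is being written to the manifest.
    editing: bool,
    /// Edits that arrived while another edit was running.
    pending: VecDeque<String>,
}

impl EditQueue {
    /// Returns `Some(edit)` if the caller should start it now.
    /// Otherwise the edit is queued and `None` is returned.
    pub fn submit(&mut self, edit: String) -> Option<String> {
        if self.editing {
            self.pending.push_back(edit);
            None
        } else {
            self.editing = true;
            Some(edit)
        }
    }

    /// Called when the running edit is done. Returns the next queued
    /// edit to start, if any.
    pub fn on_edit_done(&mut self) -> Option<String> {
        let next = self.pending.pop_front();
        self.editing = next.is_some();
        next
    }
}
```

The worker would call `submit` when a RegionEdit request arrives and `on_edit_done` when the background manifest job finishes, so callers wait in order instead of getting an error.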

@github-project-automation github-project-automation bot moved this from In Progress to Done in mito2 May 24, 2024