Greatly improve database performance for esgpull update
#47
New
- Database.commit_context() for easier bulk transactions

Changed
- query_file table, instead of relying on the ORM

Example
I created and filled this database in a few minutes using large queries that previously took hours to run. A good chunk of the time is now spent fetching metadata from index nodes:
Running a similar test over a non-empty database (~1.4GB) produces no significant difference:
Before this PR, a non-empty database took longer to update. For several reasons, adding a new relation to a query for a file that already had relations to other queries produced very inefficient SQL. This is now a single insert in all cases, so it no longer matters whether the database is empty.
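To illustrate the idea, here is a minimal sketch using only the standard-library sqlite3 module. It is not the code from this PR: the `commit_context` function below is a hypothetical stand-in for `Database.commit_context()`, and the `query_file` column names (`query_sha`, `file_sha`) and the `link_files` and `make_db` helpers are assumptions made for the example. It shows many relation rows being written with plain inserts inside one transaction, independent of what the table already contains.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterable, Iterator


@contextmanager
def commit_context(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Hypothetical helper: run a block of statements as one transaction."""
    try:
        yield conn
        conn.commit()  # one commit for the whole block
    except Exception:
        conn.rollback()  # discard everything written in the block
        raise


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed query_file schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE query_file ("
        " query_sha TEXT NOT NULL,"
        " file_sha TEXT NOT NULL,"
        " PRIMARY KEY (query_sha, file_sha))"
    )
    return conn


def link_files(
    conn: sqlite3.Connection, query_sha: str, file_shas: Iterable[str]
) -> None:
    """Insert relation rows directly, skipping pairs that already exist."""
    rows = [(query_sha, sha) for sha in file_shas]
    conn.executemany(
        "INSERT OR IGNORE INTO query_file (query_sha, file_sha) VALUES (?, ?)",
        rows,
    )


if __name__ == "__main__":
    db = make_db()
    with commit_context(db):
        link_files(db, "q1", ["f1", "f2"])
        link_files(db, "q2", ["f1"])  # f1 already linked to q1: same cost
    print(db.execute("SELECT COUNT(*) FROM query_file").fetchone()[0])
```

Because each relation is a row keyed by the (query, file) pair, linking a file that already belongs to other queries costs the same single insert as linking a new one, and no existing relations need to be loaded first.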