(EAI-152) check for change to chunkAlgoHash when updating embeddings #580

Draft
wants to merge 8 commits into
base: main
from

Conversation

yakubova92
Collaborator

@yakubova92 yakubova92 commented Dec 16, 2024

Jira: https://jira.mongodb.org/browse/EAI-152

Changes

  • Re-chunk a page if its content changed OR the chunkAlgoHash changed.
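
The re-chunking condition described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `PageState` and `StoredChunkInfo` shapes, their field names, and the `needsRechunk` helper are all hypothetical, and treating a page with no stored chunks as needing chunking is an added assumption.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
interface PageState {
  url: string;
  contentHash: string; // hash of the page content as it is now
}

interface StoredChunkInfo {
  pageContentHash: string; // hash of the page content when its chunks were created
  chunkAlgoHash: string; // hash of the chunking algorithm/options used at that time
}

// Returns true when the page should be re-chunked:
// its content changed OR the chunking algorithm hash changed.
function needsRechunk(
  page: PageState,
  stored: StoredChunkInfo | undefined,
  currentChunkAlgoHash: string
): boolean {
  // Assumption: a page that has never been chunked always needs chunking.
  if (stored === undefined) {
    return true;
  }
  const contentChanged = page.contentHash !== stored.pageContentHash;
  const algoChanged = stored.chunkAlgoHash !== currentChunkAlgoHash;
  return contentChanged || algoChanged;
}
```

With this shape, a change to the chunking options alone is enough to trigger re-chunking even when the page text is untouched.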

Notes

  • This uses a $lookup to find pages based on the chunkAlgoHash field of their embedded_content documents. If we are open to changing the model, we could avoid the $lookup by also recording chunkAlgoHash on the page itself, essentially saying "this page was chunked with this hash". I think that information is relevant to the page, and I don't see much downside to duplicating it there.
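
To make the trade-off in the note above concrete, here is a rough sketch of the two query shapes. It is not the pipeline from this PR: the function names, the join on a `url` field, and the `chunks` alias are assumptions for illustration. Only the `embedded_content` collection name and the `chunkAlgoHash` field come from the description.

```typescript
type Stage = Record<string, unknown>;

// Option 1 (current approach, sketched): join each page to its
// embedded_content documents and keep pages that have at least one
// chunk produced with a different chunkAlgoHash.
// Assumption: pages and embedded_content documents share a `url` field.
function buildStalePagesPipeline(
  currentChunkAlgoHash: string,
  embeddedCollection: string = "embedded_content"
): Stage[] {
  return [
    {
      $lookup: {
        from: embeddedCollection,
        localField: "url",
        foreignField: "url",
        as: "chunks", // hypothetical alias for the joined documents
      },
    },
    {
      $match: {
        chunks: {
          $elemMatch: { chunkAlgoHash: { $ne: currentChunkAlgoHash } },
        },
      },
    },
    // Drop the joined chunks so only page fields are returned.
    { $project: { chunks: 0 } },
  ];
}

// Option 2 (proposed model change, sketched): if chunkAlgoHash were also
// stored on the page, a plain filter on the pages collection would suffice.
function buildStalePagesFilter(currentChunkAlgoHash: string): Stage {
  return { chunkAlgoHash: { $ne: currentChunkAlgoHash } };
}
```

The second form avoids the join entirely, at the cost of keeping the hash in sync in two places.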

@yakubova92 yakubova92 changed the title from "check for change in chunkAlgoHash" to "(EAI-152) check for change to chunkAlgoHash when updating embeddings" Dec 16, 2024
@yakubova92 yakubova92 requested a review from mongodben December 16, 2024 16:09
Collaborator

@mongodben mongodben left a comment

see comment about caching.

and beyond that, sorry if i'm dense, but what changes are you making?

2 participants