feat: Integrate Weaviate DB #642

Closed
wants to merge 1 commit into from

Conversation

Dev-Khant
Member

Description

Add support for Weaviate database.

Fixes #436

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • Documentation update

How Has This Been Tested?

Not yet

  • Unit Test

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have checked my code and corrected any misspellings

@Dev-Khant
Member Author

I'm confused about how we will add data to Weaviate. It needs data stored in dictionary format, whereas we chunk the data given by users and store it as a list of documents.

So I need help here :)

@taranjeet
Member

Hey @Dev-Khant, can you tell me more about what help you need here?

@Dev-Khant
Member Author

Hi @taranjeet,

Every data source is chunked into documents. In ChromaDB we push those documents directly, but in Weaviate a proper structure, i.e. key-value pairs, needs to be maintained. https://weaviate.io/developers/weaviate/manage-data/create

So here I'm confused about how to define a schema for every use case.
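One way to sidestep a per-use-case schema, sketched here as an assumption and not as anything this PR implements, is a single generic class that holds the chunk text plus its metadata serialized to a string. The dictionary below follows the shape of a Weaviate class definition (`class`, `vectorizer`, `properties`); the class name `EmbedchainChunk`, the property names, and the helper `to_properties` are all hypothetical.

```python
import json

# Hypothetical generic schema: one class for every data source.
# The class name and property names are illustrative, not from this PR.
GENERIC_CHUNK_CLASS = {
    "class": "EmbedchainChunk",
    "vectorizer": "none",  # assumes embeddings are computed outside Weaviate
    "properties": [
        {"name": "text", "dataType": ["text"]},
        {"name": "source", "dataType": ["text"]},
        {"name": "metadata_json", "dataType": ["text"]},
    ],
}


def to_properties(chunk_text, metadata):
    """Map one chunk and its free-form metadata onto the fixed schema above."""
    return {
        "text": chunk_text,
        "source": str(metadata.get("source", "")),
        # Free-form metadata is flattened to a JSON string so that the
        # schema does not have to change per data source.
        "metadata_json": json.dumps(metadata, sort_keys=True),
    }
```

The trade-off is that fields buried in `metadata_json` cannot be filtered on directly; any field that needs filtering would have to be promoted to its own property.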

@rupeshbansal
Contributor

Hey @Dev-Khant, I'm not sure I follow the issue here. In ChromaDB, we do chunking, and each chunk has a unique ID, so a single document can have multiple chunks in the DB. Weaviate seems quite similar to me in that sense, or maybe I have completely missed something. Do you mind elaborating? Happy to hop on a call to discuss further.

@Dev-Khant
Member Author

Dev-Khant commented Sep 29, 2023

Hi @rupeshbansal,
From what I have read about Weaviate, we need to store data as key-value pairs, and Weaviate uses GraphQL.

If you follow https://weaviate.io/developers/weaviate/manage-data/create and https://weaviate.io/developers/weaviate/manage-data/import, then I don't think a single document can be pushed. And even if we somehow push it, I'm not sure it will be able to query relevant data. And sure, let me know if you want to discuss over a call.

@rupeshbansal
Contributor

That's correct: a single document cannot be pushed, which is why the data has to be chunked before pushing. You are also correct that each chunk will then be mapped to a different ID. But all of this holds true for any database. Embedchain has a concept of chunkers, which use a basic technique of recursively breaking the context into chunks of equal size and pushing them.
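To make the chunk-then-push idea concrete, here is a minimal sketch in plain Python. It is not Embedchain's actual chunker: `split_recursively` and `to_objects` are hypothetical helpers, the separators and the SHA-256 ID scheme are assumptions, and the chunks are capped at `chunk_size` characters instead of being exactly equal in size. Each chunk ends up as a key-value object with its own ID, which is the shape a store such as Weaviate or ChromaDB would receive.

```python
import hashlib


def split_recursively(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Split text into pieces of at most chunk_size characters.

    Coarse separators are tried first; a piece that is still too long is
    split again with the remaining, finer separators.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks, current = [], ""
        for part in text.split(sep):
            candidate = part if not current else current + sep + part
            if len(candidate) <= chunk_size:
                current = candidate
                continue
            if current:
                chunks.append(current)
            if len(part) > chunk_size:
                # The part alone is too long: recurse with finer separators.
                chunks.extend(split_recursively(part, chunk_size, separators[i + 1:]))
                current = ""
            else:
                current = part
        if current:
            chunks.append(current)
        return chunks
    # No separator applies: fall back to a hard cut every chunk_size characters.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def to_objects(doc_id, text, chunk_size=500):
    """Turn one document into a list of key-value objects, one per chunk."""
    objects = []
    for index, chunk in enumerate(split_recursively(text, chunk_size)):
        # Hypothetical ID scheme: hash of document ID, position, and content.
        raw = f"{doc_id}:{index}:{chunk}".encode("utf-8")
        objects.append({
            "id": hashlib.sha256(raw).hexdigest(),
            "properties": {"text": chunk, "doc_id": doc_id, "chunk_index": index},
        })
    return objects
```

A single document therefore maps to several objects that share a `doc_id` but carry distinct IDs, regardless of which database they are pushed to.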

Do you mind if I take over the task of integrating Weaviate, and maybe you can try integrating some other databases? Let me know if you want to continue with this.

@Dev-Khant
Member Author

Yes, you can pick this up. I wanted to learn how Weaviate works with this type of data, but I'll go through your code once you build it. I'll look into other databases.

@Dev-Khant Dev-Khant closed this Oct 4, 2023
@Dev-Khant Dev-Khant deleted the weaviate-db branch June 7, 2024 05:07
Development

Successfully merging this pull request may close these issues.

Add support for weaviate database
3 participants