Integrate vector databases with RetrievalResults #586

Open
AlekseySh opened this issue Jun 9, 2024 · 0 comments

AlekseySh commented Jun 9, 2024

We already have an example where the gallery set is fixed, but queries arrive online in batches: link.

It would be great to improve its performance with vector databases.
The reworked example might look something like this:

...

# gallery is huge and fixed, so we only process it once
dataset_gallery = ImageBaseDataset(galleries, transform=transform)
embeddings_gallery = inference(extractor, dataset_gallery, batch_size=4, num_workers=0)

# pick ONE of the following backends (each would be a child of the proposed IVectorIndex):
index = SklearnKNNIndex(embeddings_gallery)
# index = FaissIndex(embeddings_gallery)
# index = QdrantIndex(embeddings_gallery)

for queries in [queries1, queries2]:
    dataset_query = ImageBaseDataset(queries, transform=transform)
    embeddings_query = inference(extractor, dataset_query, batch_size=4, num_workers=0)

    rr = RetrievalResults.from_index(
        index=index,
        embeddings_query=embeddings_query,
        dataset_query=dataset_query,
        dataset_gallery=dataset_gallery,
    )
    rr = ConstantThresholding(th=80).process(rr)
    rr.visualize_qg([0, 1], dataset_query=dataset_query, dataset_gallery=dataset_gallery, show=True)
    print(rr)

I think we should start by working out what the IVectorIndex interface should include so that it can handle different backends.
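As a starting point for that discussion, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption for illustration: the method names `add` and `search`, the `top_n` parameter, the return layout, and the `BruteForceIndex` class are hypothetical and not part of the library. It is deliberately dependency-free (plain lists instead of tensors) so that it only shows the shape of the contract that the sklearn, Faiss, or Qdrant backends would each have to satisfy.

```python
import heapq
import math
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Vector = Sequence[float]


class IVectorIndex(ABC):
    """Hypothetical minimal contract for a gallery index (an assumption, not an existing API)."""

    @abstractmethod
    def add(self, embeddings: Sequence[Vector]) -> None:
        """Append gallery embeddings to the index."""

    @abstractmethod
    def search(
        self, queries: Sequence[Vector], top_n: int
    ) -> Tuple[List[List[float]], List[List[int]]]:
        """Return (distances, gallery ids) per query, sorted by ascending distance."""

    @abstractmethod
    def __len__(self) -> int:
        """Number of gallery vectors currently stored."""


class BruteForceIndex(IVectorIndex):
    """Reference backend: exact search with Euclidean distance, no external libraries."""

    def __init__(self, embeddings: Sequence[Vector] = ()) -> None:
        self._vectors: List[List[float]] = []
        self.add(embeddings)

    def add(self, embeddings: Sequence[Vector]) -> None:
        self._vectors.extend([float(x) for x in v] for v in embeddings)

    def search(self, queries, top_n):
        all_dists, all_ids = [], []
        for q in queries:
            # Score every gallery vector, then keep the top_n closest ones.
            scored = ((math.dist(q, g), i) for i, g in enumerate(self._vectors))
            best = heapq.nsmallest(top_n, scored)
            all_dists.append([d for d, _ in best])
            all_ids.append([i for _, i in best])
        return all_dists, all_ids

    def __len__(self) -> int:
        return len(self._vectors)


# Usage: the gallery is indexed once, query batches are searched as they arrive.
index = BruteForceIndex([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
dists, ids = index.search([[0.0, 0.0]], top_n=2)
```

With a contract of this shape, `RetrievalResults.from_index` would only need the distances and gallery ids returned by `search`, regardless of the backend. Open questions the sketch leaves out on purpose: which distance metrics to support, whether removal or update of gallery items is needed, and how to persist an index.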
