Integrate vector databases with `RetrievalResults` #586

AlekseySh · 2024-06-09T20:56:53Z

We already have an example where a gallery set is fixed, but queries come online in batches: link.

It would be great to improve its performance with vector databases.
The reworked example might look something like this:

...

# gallery is huge and fixed, so we only process it once
dataset_gallery = ImageBaseDataset(galleries, transform=transform)
embeddings_gallery = inference(extractor, dataset_gallery, batch_size=4, num_workers=0)

# ONE OF:
index = SklearnKNNIndex(embeddings_gallery)  # a child of IVectorIndex
index = FaissIndex(embeddings_gallery)  # a child of IVectorIndex
index = QdrantIndex(embeddings_gallery)  # a child of IVectorIndex

for queries in [queries1, queries2]:
    dataset_query = ImageBaseDataset(queries, transform=transform)
    embeddings_query = inference(extractor, dataset_query, batch_size=4, num_workers=0)

    rr = RetrievalResults.from_index(
        index=index,
        embeddings_query=embeddings_query,
        dataset_query=dataset_query,
        dataset_gallery=dataset_gallery,
    )
    rr = ConstantThresholding(th=80).process(rr)
    rr.visualize_qg([0, 1], dataset_query=dataset_query, dataset_gallery=dataset_gallery, show=True)
    print(rr)

I think we should start by working out what the `IVectorIndex` interface should include so that it can handle different backends.
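As a starting point for that discussion, here is a minimal sketch of what such an interface could look like. Everything below is an assumption made for illustration, not an existing OML API: the method names `add` and `search`, the `(distances, ids)` return shape, and the `BruteForceIndex` class are all hypothetical. Plain Python lists stand in for embedding tensors to keep the sketch dependency-free.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Vector = Sequence[float]


class IVectorIndex(ABC):
    """Hypothetical minimal interface; method names are assumptions."""

    @abstractmethod
    def add(self, embeddings: Sequence[Vector]) -> None:
        """Append gallery embeddings to the index."""

    @abstractmethod
    def search(
        self, queries: Sequence[Vector], top_k: int
    ) -> Tuple[List[List[float]], List[List[int]]]:
        """Return (distances, gallery ids) for each query, nearest first."""

    @abstractmethod
    def __len__(self) -> int:
        """Number of gallery vectors stored in the index."""


class BruteForceIndex(IVectorIndex):
    """Exact Euclidean search; a stand-in for a real backend."""

    def __init__(self, embeddings: Sequence[Vector] = ()) -> None:
        self._vectors: List[Tuple[float, ...]] = []
        self.add(embeddings)

    def add(self, embeddings: Sequence[Vector]) -> None:
        self._vectors.extend(tuple(v) for v in embeddings)

    def search(self, queries, top_k):
        # Never ask for more neighbours than the gallery holds.
        top_k = min(top_k, len(self._vectors))
        all_distances, all_ids = [], []
        for q in queries:
            # Score every gallery vector, then keep the top_k closest.
            scored = sorted(
                (math.dist(q, g), i) for i, g in enumerate(self._vectors)
            )[:top_k]
            all_distances.append([d for d, _ in scored])
            all_ids.append([i for _, i in scored])
        return all_distances, all_ids

    def __len__(self) -> int:
        return len(self._vectors)


index = BruteForceIndex([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
distances, ids = index.search([[0.0, 0.0]], top_k=2)
```

If the interface stays this small, a backend wrapper would only need to translate `add` and `search` into its own calls. Open questions a real design would still have to settle include the distance metric, persistence, and whether removal of gallery items is needed.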


AlekseySh added the new feature label Jun 9, 2024

AlekseySh added the good first issue label Jun 9, 2024

github-project-automation bot added this to OML-planning Aug 30, 2024

github-project-automation bot moved this to To do in OML-planning Aug 30, 2024
