Investigate if OpenSearch is an option for combined sparse and dense vector search. txtai is another option.
Alternatively, we can use PostgreSQL with pgvector. A very simple SQL setup can look like this:
CREATE DATABASE document_embeddings;
\c document_embeddings
-- The vector type comes from the pgvector extension, which must be enabled in this database first.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE document_chunks (id SERIAL PRIMARY KEY, title TEXT NOT NULL, content TEXT NOT NULL, url VARCHAR(512), embedding vector(384), sparse_embedding vector(1024));
CREATE TABLE documents (id SERIAL PRIMARY KEY, title TEXT NOT NULL, content TEXT NOT NULL, url VARCHAR(512));
CREATE INDEX ON document_chunks USING hnsw (embedding vector_l2_ops);
CREATE INDEX ON document_chunks USING hnsw (sparse_embedding vector_l2_ops);
CREATE INDEX idx_documents_url_btree ON documents (url);
CREATE INDEX idx_chunks_url_btree ON document_chunks (url);
CREATE USER document_embeddings WITH ENCRYPTED PASSWORD 'CHANGEME';
GRANT ALL PRIVILEGES ON DATABASE document_embeddings TO document_embeddings;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO document_embeddings;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO document_embeddings;
And a retrieval can then be done with the following query:
SELECT *, ({a} * distance + (1 - {a}) * sparse_distance) AS total_distance
FROM (
    SELECT d.url, d.title, d.content,
           MIN(c.embedding <-> '{embedding}') AS distance,
           MIN(c.sparse_embedding <-> '{sparse_vector}') AS sparse_distance
    FROM document_chunks c
    LEFT JOIN documents d ON c.url = d.url
    GROUP BY d.url, d.title, d.content
) AS ranked
ORDER BY total_distance ASC
LIMIT 10;
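To make the scoring in the query above easier to reason about, here is a minimal pure-Python sketch of the same computation: per URL, take the smallest dense and sparse L2 distance over all chunks, then blend them with the weight `a`. The function names (`to_pgvector_literal`, `rank_documents`) and the in-memory `chunks` layout are hypothetical and only serve as an illustration; in the real setup the database does this work. The helper for the vector literal assumes pgvector's text input format `[x1,x2,...]`.

```python
import math


def to_pgvector_literal(values):
    """Format numbers as a pgvector text literal, e.g. '[1.0,2.5]' (hypothetical helper)."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def l2(u, v):
    """Euclidean distance, the metric behind pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def rank_documents(chunks, query_dense, query_sparse, a, limit=10):
    """Reference version of the retrieval query.

    chunks: iterable of (url, dense_vector, sparse_vector) tuples.
    Returns up to `limit` (url, total_distance) pairs, smallest distance first.
    """
    best = {}  # url -> (min dense distance, min sparse distance)
    for url, dense, sparse in chunks:
        d = l2(dense, query_dense)
        s = l2(sparse, query_sparse)
        prev_d, prev_s = best.get(url, (math.inf, math.inf))
        # Like the two MIN() aggregates, the minima are tracked independently.
        best[url] = (min(prev_d, d), min(prev_s, s))
    scored = [(url, a * d + (1 - a) * s) for url, (d, s) in best.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:limit]


if __name__ == "__main__":
    chunks = [
        ("u1", [0.0, 0.0], [0.0, 0.0]),
        ("u1", [1.0, 0.0], [3.0, 4.0]),
        ("u2", [3.0, 4.0], [0.0, 0.0]),
    ]
    print(rank_documents(chunks, [0.0, 0.0], [0.0, 0.0], a=0.5))
    print(to_pgvector_literal([1, 2.5]))
```

Note that, as in the SQL, the best dense distance and the best sparse distance for a document may come from two different chunks. Also, `{a}`, `{embedding}` and `{sparse_vector}` look like string-format placeholders; passing them as bound query parameters instead of interpolating them into the SQL text would be the safer choice.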