Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pagination in Hybrid query #1048

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

vibrantvarun
Copy link
Member

@vibrantvarun vibrantvarun commented Dec 31, 2024

Description

This PR contains changes for enabling support for pagination in hybrid query.
The highlight of this PR are

  1. Introduction of a new parameter "pagination_depth" to set a reference of hybrid query search results on which pagination can be applied.
  2. Handling of single shard scenario where fetch phase can run before the normalization process.
  3. Handling of from parameter conditions in Normalization processor.
  4. Disabling scroll operation in hybrid query.

Related Issues

#280

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@vibrantvarun
Copy link
Member Author

Integ test and BWC tests are failing due to opensearch-project/ml-commons#3321

// get fetched hit content by doc_id
SearchHit searchHit = docIdToSearchHit.get(scoreDoc.doc);
// update score to normalized/combined value (3)
searchHit.score(scoreDoc.score);
return searchHit;
}).toArray(SearchHit[]::new);
updatedSearchHitArray[i - fromValueForSingleShard] = searchHit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to check if i - fromValueForSingleShard is a valid index, within [0, array.length- 1] boundaries

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if you look at the code it is just putting values in the fresh array updateSearchHitArray. i starts from fromValueForSingleShard. So i-fromValueForSingleShard will be 0. i can maximum go to array.length- 1 so it has already been taken care off. Do you still want me to put extra assert condition here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, my intention is that we have array index iterator that goes from 0 to the length -1, we can adjust the logic for the for loop temp variable, something like this:

for (int i = 0; i < Math.min(topDocs.scoreDocs.length - topDocs.scoreDocs.length, updatedSearchHitArray.length); i++) {
ScoreDoc scoreDoc = topDocs.scoreDocs[i + fromValueForSingleShard];
....
updatedSearchHitArray[i] = searchHit;

@AllArgsConstructor
@Builder
@Getter
public class HybridQueryContext {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add class level java doc

/**
* Extract hybrid query from generic query object retrieved from search context
*/
private static HybridQuery extractHybridQueryFromAbstractQuery(Query query) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static HybridQuery extractHybridQueryFromAbstractQuery(Query query) {
/**
* Unwraps a HybridQuery from either a direct query or a nested BooleanQuery
*/
private static HybridQuery unwrapHybridQuery(Query query) {

import lombok.Builder;
import lombok.Getter;

@AllArgsConstructor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use either constructor or builder, but not both. You need to either remove constructor annotation or make it private.
To enforce setting of required pagination param use NonNull annotation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants