Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-16835: Add approximate-/select to OAS #2079

Merged

Conversation

gerlowskija
Copy link
Contributor

@gerlowskija gerlowskija commented Nov 15, 2023

https://issues.apache.org/jira/browse/SOLR-16835

Description

Now that Solr uses its OAS to generate client bindings in multiple languages (Java and JavaScript, with Python coming soon in #1681 ), users of these clients may wish to use them to run primitive searches.

Normally, this would require converting the /select endpoint to JAX-RS so that its inputs and outputs can be described comprehensively in our spec. However, doing this for /select will take some time due to the complexity and large degree of configurability the endpoint offers.

Solution

This commit works around this limitation by creating an 'api' module interface approximating the /select endpoint. This allows an incomplete /select endpoint to appear in our OAS (and the generated clients, by extension) until the API can be converted to JAX-RS in earnest. This will give generated-client users access to at least some query functionality in this interim.

The /select query parameters supported in this commit were chosen mostly arbitrarily. They may be added to freely as generated-client users run into particular needs (e.g. for setting 'faceting' parameters).

Tests

Manual testing of the generated 'QueryingApi' request and response classes. Inspection of the generated Javascript (and Python - when combined with #1681) client code. All existing tests continue to pass.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Now that Solr uses its OAS to generate client bindings in multiple
languages (Java and JavaScript, so far), users of these clients may wish
to use them to run searches.

Normally, this would require converting the `/select` endpoint to JAX-RS
so that its inputs and outputs can be described comprehensively in our
OAS.  However, doing this for `/select` will take some time due to the
complexity and large degree of configurability the endpoint offers.

This commit works around this limitation by creating an approximation of
the `/select` endpoint that can appear in our OAS until the API can be
converted to JAX-RS in earnest.  This will give generated-client users
access to at least some query functionality in this interim.

The `/select` query parameters supported in this commit were chosen
mostly arbitrarily.  They may be added to freely as generated-client
users run into particular needs (e.g. for setting 'faceting'
parameters).
@gerlowskija
Copy link
Contributor Author

This PR was motivated primarily by the desire to "dogfood" the Python client we're looking to generate in #1681. We have limited Python code in our repo, and very little of the code we do have actually makes Solr API calls. But several Python scripts query Solr - so giving our generated clients a way to do basic querying gives us options as "dogfooding" goes.

@epugh
Copy link
Contributor

epugh commented Nov 15, 2023

interesting idea! Our Admin UI handling of /select covers almost ever property so probably couldn't use the JS version.. However, for the V2 Query syntax, is that actually simpler from a API perspective since it's mostly "generate a lot of JSON and pass it to this end point"? Would that be easier to approximate?

@janhoy
Copy link
Contributor

janhoy commented Nov 15, 2023

Would that be easier to approximate?

I've been thinking about how to approach that part. Whether we can model each and every feature in the JSON DSL, such as query parsers, attributes, builders etc in a manner that lets us generate all those domain objects, which in turn converts the object graph to JSON, for each language?

I've tried generating py code for another OpenAPI spec and indeed it generated tons of domain classes representing the request/responses for that API. But modelling nested nature of JSON DSL, i.e. a facet, nested within a facet etc, sounds like a challenge.. Perhaps if we did it all with strongly typed and annotated Java classes first, it would work?

@gerlowskija
Copy link
Contributor Author

gerlowskija commented Nov 16, 2023

Our Admin UI handling of /select covers almost ever property so probably couldn't use the JS version.
Yep definitely. The "approximation" in this PR falls pretty far short of what the Admin UI exposes.

If we cared to, we could close a lot of that gap with very little effort. Most parameters would be trivial to add. But there's a few sticking points that are going to require a good bit more effort I think: in particular the arbitrary query-params that /select supports as placeholders/variables, and the JSON DSL.

That's one of the reasons I didn't go beyond the absolute basics in this PR - I wanted the approximation to be driven by actual dogfood-able usage, and the Admin UI query screen felt out of reach for an initial pass.

However, for the V2 Query syntax, is that actually simpler from a API perspective since it's mostly "generate a lot of JSON and pass it to this end point"? Would that be easier to approximate?

By the "V2 Query syntax" I assume you're talking about the JSON DSL?

I guess it depends how we wanted to represent that in the OAS. For the approximation, we could definitely have a POST that takes an arbitrary map in the request body (without trying to dictate exactly what properties/fields are in that map). I'll add that and see how it looks.

I've been assuming I guess that we'll eventually want something more DSL-syntax aware, but given Jan's point above, maybe that's not even true. Maybe the "approximate" solution is the best one here for clients, even in the long term. Users might just prefer to specify a "raw" String rather than finding whatever obscure qparser model class they would otherwise need. And you could imagine an approximate solution might be more flexible for supporting user's server-side plugins, etc. I'm not sure one way or another, but it's definitely worth thinking through...

@epugh
Copy link
Contributor

epugh commented Nov 16, 2023

Definitly referring to the JSON DSL. I suspect that many people will craft their JSON independent of the JS code or Python code, like via some string replacement, and then just pass it in.... Now, if we wanted to actually VALIDATE the query, that might be cool, and would presumably require all those typed objects ;-). I know that I tend to craft my query using Quepid or Curl, and then cut n paste it into whatever is going to execute that big old blob o JSON. I had been thinking about adding a JSON DSL Admin UI screen, and I would have just started with an html textarea for you to paste your JSON in... No syntax highlihgitng beyond whatever a JSON reneder does, and no understanding of the JSON DSL....

@gerlowskija
Copy link
Contributor Author

I suspect that many people will craft their JSON independent of the JS code or Python code, like via some string replacement, and then just pass it in.

Yeah, I imagine that is popular. The trick I think is going to be supporting that without precluding something more structured. AFAIK, without customizing the template at all, we can only have 1 client method per HTTP endpoint, since the clients are generated from an OAS and specs can't contain multiple entries for the same path/verb combo. Maybe the OAS just needs to be incredibly generic, and the client templates will handle things correctly whether users pass in a map, string, etc? I'll play with it a little bit and see if I can get something reasonable working.

I had been thinking about adding a JSON DSL Admin UI screen, and I would have just started with an html textarea for you to paste your JSON in... No syntax highlihgitng beyond whatever a JSON reneder does, and no understanding of the JSON DSL....

The "Query" screen does have a textarea called "JSON DSL", assuming you've seen that? It doesn't have syntax highlighting or any other niceties, but sounds like what you've been thinking about...

SelectApi.jsonQuery now takes a specially-annotated InputStream rather
than the slightly more structured `Map<String, Object>`.  This
ultimately results in client code that's a little more flexible in how
the JSON query is specified.

With this change, the generated Java client can take either a `String`
or any more structured Jackson-serializable object.  The Javascript
client can take a similar union of structured and unstructured types.

In support of this, this commit introduces a "x-genericEntity" extension
to our OAS, which the Java generation template required.
Using the standard 'SolrJerseyResponse' type, unknown top-level
properties cause a deserialization error.  We can get around this for
our query "approximation" by using Jackson's `@AnyGetter` support for
handling unknown properties.

Eventually we'll want to provide stronger typing for query responses,
but this is sufficient for a first-pass approximation.
@gerlowskija
Copy link
Contributor Author

Progress! I was able to make some tweaks to the Java interfaces the generate the OAS and clients so that users can specify their query JSON a few different ways in the generated clients. Things mostly "just worked" in JS with the generated Java code needing a good bit more massaging to get working. I also caught a few bugs in the process, so it was a nice diversion to take.

With the most recent changes client usage now looks like:

Javascript w/ String-JSON

var queryingApi = new V2Api.QueryingApi();
var opts = {
    'body': "{\"query\":\"*:*\"}"
}
queryingApi.jsonQuery("collections", "techproducts", opts, (error, data, response) => {
  ....handle the response...
});

Javascript w/ Typed-JSON

var queryingApi = new V2Api.QueryingApi();
var opts = {
    'body': {"query": "*:*"}
}
queryingApi.jsonQuery("collections", "techproducts", opts, (error, data, response) => {
  ....handle the response...
});

Java w/ String-JSON

final var rawRequest = new QueryingApi.JsonQuery(StoreType.COLLECTION, "techproducts", "{\"query\": \"*:*\"}");
final var rawRequestResponse = rawRequest.process(client).getParsed();
System.out.println("Response status was: " + rawRequestResponse.responseHeader.status);

Java w/ Typed-JSON (used a map here, but any Jackson-serializable object will work)

final var typedRequest = new QueryingApi.JsonQuery(StoreType.COLLECTION, "techproducts", Map.of("query", "*:*"));
final var typedRequestResponse = typedRequest.process(client).getParsed();
System.out.println("Response status was: " + typedRequestResponse.responseHeader.status);

@gerlowskija
Copy link
Contributor Author

Alright- I'm going to merge in our "approximation" for now and use it to dogfood the generated Python client over in #1681.

If anyone has any more feedback on this, lmk and we can iterate in a subsequent PR.

@gerlowskija gerlowskija merged commit fb9b6fd into apache:main Nov 29, 2023
2 of 3 checks passed
@gerlowskija gerlowskija deleted the SOLR-16397-add-jaxrs-shim-for-select branch November 29, 2023 16:53
gerlowskija added a commit that referenced this pull request Nov 29, 2023
Now that Solr uses its OAS to generate client bindings in multiple
languages (Java and JavaScript, so far), users of these clients may wish
to use them to run searches.

Normally, this would require converting the `/select` endpoint to JAX-RS
so that its inputs and outputs can be described comprehensively in our
OAS.  However, doing this for `/select` will take some time due to the
complexity and large degree of configurability the endpoint offers.

This commit works around this limitation by creating an approximation of
the `/select` endpoint that can appear in our OAS until the API can be
converted to JAX-RS in earnest.  This will give generated-client users
access to at least some query functionality in this interim.

The `/select` query parameters supported in this commit were chosen
mostly arbitrarily.  They may be added to freely as generated-client
users run into particular needs (e.g. for setting 'faceting'
parameters).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants