[BUG] Term query on a wildcard field does not perform exact match #16754

Open
n9 opened this issue Dec 2, 2024 · 4 comments · May be fixed by #16827
Labels: bug (Something isn't working), Search (Search query, autocomplete ...etc)

Comments

@n9

n9 commented Dec 2, 2024

Describe the bug

Term query on a wildcard field does not perform exact match.

Related component

Search

To Reproduce

Based on #15855.

  1. Create a simple index containing a wildcard field
    PUT case_test
    {
        "mappings": {
            "properties": {
                "wildcard": {
                    "type": "wildcard"
                }
            }
        }
    }
    
  2. Bulk insert documents containing capital letters
    PUT _bulk?refresh=true
    {"index": {"_index": "case_test"}}
    {"wildcard": "TtAa"}
    {"index": {"_index": "case_test"}}
    {"wildcard": "ttaa"}
    {"index": {"_index": "case_test"}}
    {"wildcard": "TTAA"}
    
  3. Perform a term search for a single asterisk character:
    POST case_test/_search
    {
        "query": {
            "term": {
                "wildcard": "*"
            }
        }
    }
    
  4. Check results

Expected behavior

The term query should not return any documents, because no document has the literal value *. Instead, it returns all three documents.
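To make the expectation concrete, here is a minimal Python sketch of the two semantics over the three documents from step 2. It is only an illustration, not OpenSearch code: `term_match` and `wildcard_match` are hypothetical helper names, and `fnmatchcase` from the standard library is used as a rough stand-in for wildcard matching.

```python
from fnmatch import fnmatchcase

# The three values indexed in step 2.
docs = ["TtAa", "ttaa", "TTAA"]

def term_match(value: str, term: str) -> bool:
    """Exact match: what a term query is expected to do."""
    return value == term

def wildcard_match(value: str, pattern: str) -> bool:
    """Glob-style match: roughly what a wildcard query does (stand-in only)."""
    return fnmatchcase(value, pattern)

# Expected result of the term query in step 3: no hits.
expected_hits = [d for d in docs if term_match(d, "*")]

# Observed result: every document, as if "*" were a wildcard pattern.
observed_hits = [d for d in docs if wildcard_match(d, "*")]

print(expected_hits)
print(observed_hits)
```

The observed result matches the second list, which suggests the term value is being interpreted as a pattern rather than as a literal.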

Additional Details

Host/Environment (please complete the following information):

  • Version 2.18.0
@n9 added the bug and untriaged labels Dec 2, 2024
@github-actions bot added the Search label Dec 2, 2024
@n9 changed the title from "[BUG] Term query on a wildcard field does perform exact match" to "[BUG] Term query on a wildcard field does not perform exact match" Dec 2, 2024
@msfroh
Collaborator

msfroh commented Dec 2, 2024

Thanks @n9! That's definitely a bug.

I'm not exactly sure what I was thinking when I wrote this line. I think I must have only been thinking about term queries that don't contain wildcard characters.

I think it might be worthwhile to refactor WildcardFieldType to use a nice unified code path built around the regexp use-case.

Specifically:

  1. A term query can be turned into a simple regular expression (i.e. any syntax is escaped).
  2. A wildcard query can be converted to a regexp by replacing (unescaped) ? with . and (unescaped) * with .*. (Any other regexp characters would need to be escaped.)
  3. A prefix query can be converted to a regexp by escaping any syntax and appending .*.
  4. A terms query is just a bunch of escaped term expressions joined with the alternative (|) operator.

Then we can make sure that one code path works well with case-insensitive flags and appropriate approximation and automaton matching. As a bonus, that solution would address this issue and #16755.
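The four conversions listed above can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not the WildcardFieldType implementation: the function names are hypothetical, Python's `re` syntax stands in for Lucene's regexp syntax (the two differ in details), and a backslash is assumed to be the escape character in wildcard patterns.

```python
import re

def term_to_regex(term: str) -> str:
    # 1. Term query: escape everything so the value matches literally.
    return re.escape(term)

def wildcard_to_regex(pattern: str) -> str:
    # 2. Wildcard query: unescaped ? becomes ., unescaped * becomes .*,
    #    and everything else is escaped.
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            # Assumed escape rule: backslash makes the next character literal.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if c == "*":
            out.append(".*")
        elif c == "?":
            out.append(".")
        else:
            out.append(re.escape(c))
        i += 1
    return "".join(out)

def prefix_to_regex(prefix: str) -> str:
    # 3. Prefix query: escaped prefix followed by .*
    return re.escape(prefix) + ".*"

def terms_to_regex(terms) -> str:
    # 4. Terms query: escaped terms joined with the alternation operator.
    return "|".join(re.escape(t) for t in terms)

def matches(regex: str, value: str) -> bool:
    # The whole value must match, as it would for a keyword-like field.
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

docs = ["TtAa", "ttaa", "TTAA"]
print([d for d in docs if matches(term_to_regex("*"), d)])
print([d for d in docs if matches(wildcard_to_regex("*"), d)])
```

With this shape, the term query for `*` from this issue compiles to a regexp that only matches a literal asterisk, while the wildcard query for `*` still matches every value.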

Are you able to work on a fix? If not, would you be able to contribute a failing test case to https://github.com/opensearch-project/OpenSearch/blob/aaa92aeeb75a745ee7b058cc4b30b734e861cfc4/rest-api-spec/src/main/resources/rest-api-spec/test/search/270_wildcard_fieldtype_queries.yml?

@HUSTERGS
Contributor

HUSTERGS commented Dec 5, 2024

@msfroh
I'm wondering whether you are already working on this problem. If not, I'm willing to try : )

@msfroh
Collaborator

msfroh commented Dec 5, 2024

Hey @HUSTERGS! I wanted to give @n9 the chance to take it first, and then I was planning to reach out to you to see if you were interested.

I'll go ahead and assign it to you. Thanks a lot!

@msfroh msfroh assigned HUSTERGS and unassigned msfroh Dec 5, 2024
@n9
Author

n9 commented Dec 5, 2024

Hi @msfroh @HUSTERGS, sorry for the delay. I don't have time to work on this at the moment.

@HUSTERGS HUSTERGS linked a pull request Dec 11, 2024 that will close this issue
3 tasks
Projects
Status: 🆕 New
4 participants