[fix](search) support minus/plus prefix modifiers in search DSL #60893
Closed
airborne12 wants to merge 2 commits into apache:master
…field expansion

In lucene mode with multi-field queries like '"phrase" OR *', the standalone wildcard '*' is parsed as MATCH_ALL_DOCS with occur=SHOULD. However, during the cross-field expansion (expandNodeCrossFields, deepCopyWithField, setFieldOnLeaves), new MATCH_ALL_DOCS nodes were created without preserving the occur attribute. This caused the BE to default to MUST, changing the query semantics from "phrase OR match_all = all docs" to "phrase AND match_all = only phrase matches", producing results inconsistent with ES.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
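The bug described in this commit message can be illustrated with a small, self-contained sketch. The `Node` class, its string-typed `type` field, and the body of `deepCopyWithField` below are hypothetical stand-ins and not the actual Doris FE classes; only the method name and the idea of carrying `occur` over during the copy come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

public class OccurCopySketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** Hypothetical query node; the real Doris node type is assumed to differ. */
    static final class Node {
        final String type;   // e.g. "OR", "PHRASE", "MATCH_ALL_DOCS"
        final String field;  // null when the node is not bound to a field
        final String value;  // term or phrase text, null for non-leaf nodes
        final Occur occur;   // null means unset; per the commit message the BE then defaults to MUST
        final List<Node> children = new ArrayList<>();

        Node(String type, String field, String value, Occur occur) {
            this.type = type;
            this.field = field;
            this.value = value;
            this.occur = occur;
        }
    }

    /** Copies a subtree for one target field, carrying occur over on every node. */
    static Node deepCopyWithField(Node src, String field) {
        Node copy;
        if (src.type.equals("MATCH_ALL_DOCS")) {
            // MATCH_ALL_DOCS takes no field, but its occur still has to be copied.
            copy = new Node(src.type, null, null, src.occur);
        } else if (src.children.isEmpty()) {
            // Leaf: bind it to the target field.
            copy = new Node(src.type, field, src.value, src.occur);
        } else {
            // Inner boolean node: no field of its own.
            copy = new Node(src.type, null, null, src.occur);
        }
        for (Node child : src.children) {
            copy.children.add(deepCopyWithField(child, field));
        }
        return copy;
    }

    public static void main(String[] args) {
        // Models '"phrase" OR *' before cross-field expansion.
        Node or = new Node("OR", null, null, null);
        or.children.add(new Node("PHRASE", null, "phrase", Occur.SHOULD));
        or.children.add(new Node("MATCH_ALL_DOCS", null, null, Occur.SHOULD));

        Node copy = deepCopyWithField(or, "title");
        System.out.println(copy.children.get(1).occur); // prints SHOULD
    }
}
```

Dropping `src.occur` in the `MATCH_ALL_DOCS` branch would leave the copied node with an unset occur, which is the situation the commit message says made the BE fall back to MUST.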
Add support for '-' (prohibited/MUST_NOT) and '+' (required/MUST) prefix modifiers in the search DSL, matching ES query_string syntax.

Changes:
- SearchLexer.g4: Add PLUS and MINUS lexer tokens
- SearchParser.g4: Extend notClause rule with MINUS and PLUS alternatives
- SearchDslParser.java: Handle MINUS in both standard and lucene mode visitNotClause; handle PLUS as isRequired in collectTermsFromNotClause and applyLuceneBooleanLogic; add isRequired field to TermWithOccur

Examples:
"-apple" → MUST_NOT(apple) (same as "NOT apple")
"+apple fox" → MUST(apple) SHOULD(fox)
"-apple -fox" → MUST_NOT(apple) MUST_NOT(fox) + MATCH_ALL_DOCS
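The mapping in the examples above can be sketched without ANTLR. This is a simplified illustration, not the grammar or visitor code from this PR: the `PrefixModifierSketch` class, the `Clause` type, and the whitespace tokenizer are all hypothetical. The occur value given to the injected MATCH_ALL_DOCS clause (MUST here) is also an assumption, since the commit message only says that the clause is added.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixModifierSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** Hypothetical clause type: a term plus its boolean occurrence. */
    static final class Clause {
        final Occur occur;
        final String term;

        Clause(Occur occur, String term) {
            this.occur = occur;
            this.term = term;
        }

        @Override
        public String toString() {
            return occur + "(" + term + ")";
        }
    }

    /** Splits on whitespace and maps a leading '-' or '+' to an occur value. */
    static List<Clause> parse(String query) {
        List<Clause> clauses = new ArrayList<>();
        boolean hasPositive = false;
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty query
            }
            Occur occur = Occur.SHOULD; // bare terms are optional
            String term = token;
            if (token.length() > 1 && token.charAt(0) == '-') {
                occur = Occur.MUST_NOT;  // prohibited
                term = token.substring(1);
            } else if (token.length() > 1 && token.charAt(0) == '+') {
                occur = Occur.MUST;      // required
                term = token.substring(1);
            }
            if (occur != Occur.MUST_NOT) {
                hasPositive = true;
            }
            clauses.add(new Clause(occur, term));
        }
        // A purely negative query has nothing to subtract from, so add a
        // match-all clause. Using MUST for it is an assumption of this sketch.
        if (!clauses.isEmpty() && !hasPositive) {
            clauses.add(new Clause(Occur.MUST, "MATCH_ALL_DOCS"));
        }
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(parse("-apple"));      // [MUST_NOT(apple), MUST(MATCH_ALL_DOCS)]
        System.out.println(parse("+apple fox"));  // [MUST(apple), SHOULD(fox)]
        System.out.println(parse("-apple -fox")); // [MUST_NOT(apple), MUST_NOT(fox), MUST(MATCH_ALL_DOCS)]
    }
}
```

The sketch only treats '-' and '+' as modifiers at the start of a token, so a hyphen inside a word such as `foo-bar` is left as part of the term.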
Closing: decided not to introduce raw Lucene +/- prefix syntax into search DSL.
What problem does this PR solve?
Related PR: #60814
Problem Summary:
The search() function's DSL parser does not support '-' (minus/prohibited) and '+' (plus/required) prefix modifiers, which are standard Lucene query_string syntax supported by ES. Queries like '-apple' or '+apple fox' fail to parse or produce incorrect results.

Fix: Add PLUS and MINUS lexer tokens and parser grammar alternatives. Handle '-' as NOT/MUST_NOT and '+' as MUST in both standard and lucene mode visitors.
Examples:
"-apple" → MUST_NOT(apple) (same as "NOT apple")
"+apple fox" → MUST(apple) SHOULD(fox)
"-apple -fox" → MUST_NOT(apple) MUST_NOT(fox) + MATCH_ALL_DOCS injection

Release note
Support '-' (prohibited) and '+' (required) prefix modifiers in search() DSL, matching ES query_string syntax.

Check List (For Author)
Test
Behavior changed: '-term' is now parsed as NOT/MUST_NOT; '+term' is parsed as required/MUST.
Does this need documentation?
Check List (For Reviewer who merge this PR)