[fix](search) support minus/plus prefix modifiers in search DSL #60893

Closed
airborne12 wants to merge 2 commits into apache:master from airborne12:fix-search-minus-plus-prefix

Conversation

@airborne12
Member

What problem does this PR solve?

Related PR: #60814

Problem Summary:
The search() function's DSL parser does not support - (minus/prohibited)
and + (plus/required) prefix modifiers, which are standard Lucene
query_string syntax supported by ES. Queries like -apple or +apple fox
fail to parse or produce incorrect results.

Fix: Add PLUS and MINUS lexer tokens and parser grammar alternatives.
Handle - as NOT/MUST_NOT and + as MUST in both standard and lucene
mode visitors.

Examples:

  • -apple → MUST_NOT(apple) (same as NOT apple)
  • +apple fox → MUST(apple) SHOULD(fox)
  • -apple -fox → MUST_NOT(apple) MUST_NOT(fox), plus an injected MATCH_ALL_DOCS clause
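
To make the mapping in the examples above concrete, here is a minimal sketch in Java. It is not the parser from this PR: the class name `PrefixModifierSketch`, the whitespace tokenizer, and the string rendering of clauses are all assumptions made for illustration. It also assumes that MATCH_ALL_DOCS is injected only when a query has more than one clause and every clause is prohibited, which mirrors the three examples but is not confirmed for other inputs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real logic lives in SearchDslParser.java.
class PrefixModifierSketch {

    // Splits a query on whitespace and renders each term as OCCUR(term).
    static List<String> parse(String query) {
        List<String> clauses = new ArrayList<>();
        int prohibited = 0;
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            if (token.length() > 1 && token.charAt(0) == '-') {
                // '-' prefix: the term is prohibited
                clauses.add("MUST_NOT(" + token.substring(1) + ")");
                prohibited++;
            } else if (token.length() > 1 && token.charAt(0) == '+') {
                // '+' prefix: the term is required
                clauses.add("MUST(" + token.substring(1) + ")");
            } else {
                // no prefix: the term is optional
                clauses.add("SHOULD(" + token + ")");
            }
        }
        // Assumption: a multi-clause, purely negative query needs a match-all
        // clause so there is a result set to subtract the prohibited terms from.
        if (clauses.size() > 1 && prohibited == clauses.size()) {
            clauses.add("MATCH_ALL_DOCS");
        }
        return clauses;
    }

    public static void main(String[] args) {
        for (String q : new String[] {"-apple", "+apple fox", "-apple -fox"}) {
            System.out.println(q + " -> " + parse(q));
        }
    }
}
```

The sketch treats an unprefixed term next to a required one as SHOULD, as in the second example. How the real visitors combine prefixes with explicit AND/OR/NOT operators is outside what this sketch covers.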

Release note

Support - (prohibited) and + (required) prefix modifiers in search() DSL, matching ES query_string syntax.

Check List (For Author)

  • Test

    • Unit Test
    • Regression test
    • Manual test
    • No need to test
  • Behavior changed:

    • Yes. -term is now parsed as NOT/MUST_NOT; +term is now parsed as required/MUST.
  • Does this need documentation?

    • No.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

airborne12 and others added 2 commits February 27, 2026 12:33
…field expansion

In lucene mode with multi-field queries like '"phrase" OR *', the standalone
wildcard '*' is parsed as MATCH_ALL_DOCS with occur=SHOULD. However, during
the cross-field expansion (expandNodeCrossFields, deepCopyWithField,
setFieldOnLeaves), new MATCH_ALL_DOCS nodes were created without preserving
the occur attribute. This caused the BE to default to MUST, changing the
query semantics from "phrase OR match_all = all docs" to "phrase AND
match_all = only phrase matches", producing results inconsistent with ES.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

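The occur-preservation problem described in the commit message above can be sketched as follows. This is a simplified model, not the Doris code: the `Node` class, its fields, and the signature of `deepCopyWithField` are assumptions (only the method name comes from the commit message). The point it illustrates is that the MATCH_ALL_DOCS branch of the copy must carry `occur` over, because a receiver that sees no occur value falls back to MUST.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the query tree used during cross-field expansion.
class OccurPreservingCopySketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Node {
        final String type;   // e.g. "OR", "PHRASE", "MATCH_ALL_DOCS"
        final String field;  // target field; null when the node is not field-bound
        final String value;  // query text; null for operators and match-all
        final Occur occur;   // null means unset, which the backend would treat as MUST
        final List<Node> children = new ArrayList<>();

        Node(String type, String field, String value, Occur occur) {
            this.type = type;
            this.field = field;
            this.value = value;
            this.occur = occur;
        }
    }

    // Copies the tree, binding every field-bound leaf to the given field.
    static Node deepCopyWithField(Node src, String field) {
        if ("MATCH_ALL_DOCS".equals(src.type)) {
            // Match-all is not bound to a field, but its occur must still be copied.
            // Dropping it here is the kind of omission the commit describes.
            return new Node("MATCH_ALL_DOCS", null, null, src.occur);
        }
        boolean leaf = src.children.isEmpty();
        Node copy = new Node(src.type, leaf ? field : src.field, src.value, src.occur);
        for (Node child : src.children) {
            copy.children.add(deepCopyWithField(child, field));
        }
        return copy;
    }

    public static void main(String[] args) {
        // Models the query '"phrase" OR *' before expansion to a concrete field.
        Node or = new Node("OR", null, null, null);
        or.children.add(new Node("PHRASE", null, "phrase", Occur.SHOULD));
        or.children.add(new Node("MATCH_ALL_DOCS", null, null, Occur.SHOULD));

        Node expanded = deepCopyWithField(or, "title");
        System.out.println(expanded.children.get(1).type + " occur=" + expanded.children.get(1).occur);
    }
}
```
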
Add support for '-' (prohibited/MUST_NOT) and '+' (required/MUST) prefix
modifiers in the search DSL, matching ES query_string syntax.

Changes:
- SearchLexer.g4: Add PLUS and MINUS lexer tokens
- SearchParser.g4: Extend notClause rule with MINUS and PLUS alternatives
- SearchDslParser.java: Handle MINUS in both standard and lucene mode
  visitNotClause; handle PLUS as isRequired in collectTermsFromNotClause
  and applyLuceneBooleanLogic; add isRequired field to TermWithOccur

Examples:
  "-apple" → MUST_NOT(apple)      (same as "NOT apple")
  "+apple fox" → MUST(apple) SHOULD(fox)
  "-apple -fox" → MUST_NOT(apple) MUST_NOT(fox) + MATCH_ALL_DOCS
@Thearas
Contributor

Thearas commented Feb 27, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

Closing: decided not to introduce raw Lucene +/- prefix syntax into search DSL.

@airborne12 airborne12 closed this Feb 27, 2026