@b41sh b41sh commented Dec 31, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR fixes a bug in the inverted index implementation for JSON array fields, where incorrect scorer logic led to wrong query results. When a search keyword matched all rows, the tantivy query constructed an AllScorer instead of the expected TermScorer for JSON array fields. Selecting the wrong scorer broke the path resolution logic for JSON arrays, because AllScorer does not retain the path information needed for JSON array elements.

For example:

CREATE TABLE t (id int, body variant, INVERTED INDEX idx (body));

INSERT INTO t VALUES
(1, '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "mp4" },{ "name": "codecB", "type": "jpg" }]}}'),
(2, '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "jpg" },{ "name": "codecA", "type": "mp4" }]}}'),
(3, '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "jpg" },{ "name": "codecB", "type": "mp4" }]}}');

SELECT * FROM t WHERE QUERY('body.videoInfo.extraData.name:codecB AND body.videoInfo.extraData.type:jpg');
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│        id       │                                             body                                            │
│ Nullable(Int32) │                                      Nullable(Variant)                                      │
├─────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────┤
│               1 │ {"videoInfo":{"extraData":[{"name":"codecA","type":"mp4"},{"name":"codecB","type":"jpg"}]}} │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
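
The expected result above can be cross-checked outside the database. The following is a minimal Python sketch, not Databend or tantivy code: it assumes, based only on the example output, that both terms are meant to match within the same `extraData` array element. The helper names `matches_same_element` and `matches_any_element` are hypothetical. The looser any-element interpretation is included only for contrast; the PR does not state what the incorrect output was.

```python
import json

# Rows copied from the INSERT statement in the example.
ROWS = {
    1: '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "mp4" },{ "name": "codecB", "type": "jpg" }]}}',
    2: '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "jpg" },{ "name": "codecA", "type": "mp4" }]}}',
    3: '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "jpg" },{ "name": "codecB", "type": "mp4" }]}}',
}


def elements_of(body_json):
    """Return the objects stored under videoInfo.extraData."""
    return json.loads(body_json)["videoInfo"]["extraData"]


def matches_same_element(body_json, wanted):
    """Hypothetical helper: True if ONE array element satisfies every key/value pair."""
    return any(
        all(elem.get(key) == value for key, value in wanted.items())
        for elem in elements_of(body_json)
    )


def matches_any_element(body_json, wanted):
    """Hypothetical helper: True if each pair is satisfied by SOME element,
    not necessarily the same one (shown for contrast only)."""
    elems = elements_of(body_json)
    return all(
        any(elem.get(key) == value for elem in elems)
        for key, value in wanted.items()
    )


wanted = {"name": "codecB", "type": "jpg"}

# Per-element matching: only row 1 has an element with name=codecB and type=jpg.
same = [row_id for row_id, body in ROWS.items() if matches_same_element(body, wanted)]

# Any-element matching: row 3 also qualifies, since codecB and jpg sit in different elements.
loose = [row_id for row_id, body in ROWS.items() if matches_any_element(body, wanted)]

# The term type:jpg alone occurs in every row, which is the "matches all rows" case.
jpg_rows = [row_id for row_id, body in ROWS.items() if matches_any_element(body, {"type": "jpg"})]

print(same, loose, jpg_rows)
```

In this data set the `type:jpg` term occurs in every row, which is the situation the summary describes as producing an AllScorer. Telling row 1 apart from row 3 requires knowing which array element each term matched, which is why the lost path information matters.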
  • fixes: #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-bugfix label (this PR patches a bug in codebase) Dec 31, 2025
@b41sh b41sh requested a review from sundy-li December 31, 2025 03:46
@bohutang bohutang merged commit 0adba06 into databendlabs:main Jan 3, 2026
108 of 110 checks passed