RERANK_CONTEXT_SIZE (2048) too small — qmd query crashes on CJK content #291

@SDNAnyFlow

Summary

qmd query crashes during reranking when the combined input (query + document chunk + Qwen3 template overhead) exceeds RERANK_CONTEXT_SIZE = 2048. The error is deterministic and reproducible.

Environment

  • QMD version: 1.1.0 (also reproduced on 1.0.7)
  • OS: Rocky Linux 9 (x86_64)
  • Node.js: v22.22.0
  • GPU: NVIDIA RTX 3090 (24GB VRAM)
  • Content: ~345 markdown files, primarily CJK (Chinese) text
  • Index: 1386 chunks from 338 documents

Error

$ qmd query "test" --json
Searching 6 queries...
Reranking 40 chunks...

Error: The input lengths of some of the given documents exceed the context size.
Try to increase the context size to at least 2099
    at LlamaRankingContext.rankAll (...LlamaRankingContext.js:50:19)
    at LlamaCpp.rerank (...llm.js:751:82)

Root Cause

In src/llm.ts:

static RERANK_CONTEXT_SIZE = 2048;

The reranker input is query tokens + chunk tokens + Qwen3 template overhead (~200 tokens). The code comment says chunks are capped at ~800 tokens, so a total of ~1100 should fit, but that assumption does not hold here:

  1. CJK tokenization produces different token counts across tokenizers: a chunk of ~900 tokens in the embedding tokenizer may be longer in the Qwen3 reranker tokenizer.
  2. Query expansion generates HyDE documents of 100+ tokens, pushing the total past 2048.
  3. The error asks for a context size of at least 2099, only 51 tokens over the limit.
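
The budget arithmetic behind these points can be written out as a small sketch. The constant values are the figures quoted in this issue (the ~200 token template overhead is an estimate); the helper names `requiredContext` and `contentBudget` are hypothetical and do not exist in QMD.

```typescript
// Figures below come from this issue; the template overhead is an estimate.
const RERANK_CONTEXT_SIZE = 2048;
const TEMPLATE_OVERHEAD = 200;

// Hypothetical helper: total tokens a single reranker input needs.
function requiredContext(
  queryTokens: number,
  chunkTokens: number,
  overhead: number = TEMPLATE_OVERHEAD,
): number {
  return queryTokens + chunkTokens + overhead;
}

// Hypothetical helper: tokens left for query + chunk once the template is paid for.
function contentBudget(
  contextSize: number,
  overhead: number = TEMPLATE_OVERHEAD,
): number {
  return contextSize - overhead;
}

// The error message asked for a context size of at least 2099.
const reportedRequirement = 2099;
const overBy = reportedRequirement - RERANK_CONTEXT_SIZE;
const impliedContentTokens = reportedRequirement - TEMPLATE_OVERHEAD;

console.log({
  budget: contentBudget(RERANK_CONTEXT_SIZE),
  overBy,
  impliedContentTokens,
});
```

If the ~200 token overhead estimate holds, the failing pair carried roughly 1899 tokens of query plus chunk, well above the ~1100 total the code comment expects.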

Workaround

Changing RERANK_CONTEXT_SIZE to 4096 in dist/llm.js resolves the issue.

Suggested Fix

  1. Increase default to 4096 (safest, modest VRAM cost)
  2. Dynamic sizing: compute required context from actual longest (query + chunk) pair
  3. Graceful fallback: skip oversized chunks during reranking instead of crashing (use retrieval score)

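Option 3 is the most robust, since it avoids the crash regardless of how large a chunk turns out to be in the reranker tokenizer.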
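
A minimal sketch of options 2 and 3 follows, not a patch against QMD. The `Chunk` shape, the `TokenCounter` callback, and both function names are assumptions made for illustration; in practice the counter would need to use the reranker's own tokenizer, which is stubbed here with a character count.

```typescript
// Hypothetical shapes for illustration; these are not QMD's actual types.
interface Chunk {
  id: string;
  text: string;
  retrievalScore: number;
}
type TokenCounter = (text: string) => number;

// Option 2: context size needed for the longest (query + chunk) pair.
function requiredContextSize(
  query: string,
  chunks: Chunk[],
  overhead: number,
  countTokens: TokenCounter,
): number {
  const queryTokens = countTokens(query);
  let longestChunk = 0;
  for (const chunk of chunks) {
    longestChunk = Math.max(longestChunk, countTokens(chunk.text));
  }
  return queryTokens + longestChunk + overhead;
}

// Option 3: separate chunks that fit from those that do not, so the
// oversized ones can keep their retrieval score instead of crashing the run.
function partitionByFit(
  query: string,
  chunks: Chunk[],
  contextSize: number,
  overhead: number,
  countTokens: TokenCounter,
): { fits: Chunk[]; oversized: Chunk[] } {
  const queryTokens = countTokens(query);
  const fits: Chunk[] = [];
  const oversized: Chunk[] = [];
  for (const chunk of chunks) {
    const total = queryTokens + countTokens(chunk.text) + overhead;
    (total <= contextSize ? fits : oversized).push(chunk);
  }
  return { fits, oversized };
}

// Stand-in tokenizer for the demo only: one token per UTF-16 code unit.
const charCounter: TokenCounter = (text) => text.length;

const demoChunks: Chunk[] = [
  { id: "a", text: "x".repeat(10), retrievalScore: 0.9 },
  { id: "b", text: "x".repeat(100), retrievalScore: 0.7 },
];

const needed = requiredContextSize("test", demoChunks, 20, charCounter);
const split = partitionByFit("test", demoChunks, 64, 20, charCounter);
console.log(needed, split.fits.map((c) => c.id), split.oversized.map((c) => c.id));
```

How the retrieval scores of skipped chunks should be merged with reranker scores is left open here; the sketch only shows where the split would happen.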

Related

  • Changelog v1.0.0: right-sized reranker context (40960 to 2048, 17x less memory)
  • The reduction was too aggressive for CJK content with long query expansions

Thank you for building QMD!
