fix(search): PathScope filter returns zero results on Volcengine VikingDB backend #1119

@gaomind

Description

When using the Volcengine VikingDB cloud service as the vector database backend, POST /api/v1/search/find with a target_uri parameter returns zero results, while the same query without target_uri returns correct results.

Environment

  • OpenViking version: v0.2.15
  • Vector DB backend: volcengine (Volcengine VikingDB cloud service)
  • Storage backend: S3 (TOS)

Steps to Reproduce

  1. Configure ov.conf with Volcengine VikingDB backend:
    {
      "storage": {
        "vectordb": {
          "backend": "volcengine",
          "volcengine": { "region": "cn-beijing", "ak": "...", "sk": "..." }
        }
      }
    }
  2. Store a memory via session commit (POST /sessions/{id}/messages + POST /sessions/{id}/commit)
  3. Wait for memory extraction and embedding to complete (queue empty, no circuit breaker)
  4. Search without target_uri:
    curl -X POST http://localhost:1933/api/v1/search/find \
      -H 'Content-Type: application/json' \
      -d '{"query": "basketball", "limit": 5}'
    # Returns results (e.g., total: 5, score ~0.5)
  5. Search with target_uri:
    curl -X POST http://localhost:1933/api/v1/search/find \
      -H 'Content-Type: application/json' \
      -d '{"query": "basketball", "target_uri": "viking://user/memories", "limit": 5}'
    # Returns 0 results

Expected Behavior

Both queries should return the same memory results: target_uri should narrow the scope, not eliminate all results.

Actual Behavior

Any query with target_uri returns total: 0.

Root Cause Analysis

Traced the code path:

  1. target_uri_build_scope_filter() in viking_vector_index_backend.py (line ~1023)
  2. Creates PathScope("uri", target_dir, depth=-1) filter expression
  3. Compiled to DSL in vectordb_adapters/base.py (line ~313):
    {"op": "must", "field": "uri", "conds": ["/user/<user_id>/memories"], "para": "-d=-1"}
  4. Passed to coll.search_by_vector(filters=...) via Volcengine SDK

The "para": "-d=-1" parameter (hierarchical depth for path-scoped queries) does not appear to be supported by the Volcengine VikingDB cloud service SDK. The search_by_vector method returns zero matches for this filter format without raising an error.

Evidence: Even an exact URI match ({"op": "must", "field": "uri", "conds": ["/exact/path/to/file.md"]}) returns 0 results when passed through the same filter pipeline, confirming the issue is in how the Volcengine SDK processes the "must" + "para" filter.
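
To make the two filter shapes easier to compare, the DSL from step 3 can be reproduced with a small stand-alone helper. This is only a sketch: compile_path_scope is a hypothetical name, not the function in vectordb_adapters/base.py, and the "-d=<depth>" string format is inferred from the single example shown above.

```python
from typing import Any, Dict, Optional


def compile_path_scope(field: str, path: str, depth: Optional[int] = -1) -> Dict[str, Any]:
    """Build a 'must' filter shaped like the DSL shown in step 3.

    Hypothetical helper for illustration; the real compiler lives in
    vectordb_adapters/base.py and may differ.
    """
    cond: Dict[str, Any] = {"op": "must", "field": field, "conds": [path]}
    if depth is not None:
        # Assumed format, copied from the observed payload: "-d=-1"
        cond["para"] = f"-d={depth}"
    return cond


if __name__ == "__main__":
    # Path-scoped variant (with "para") and plain exact-match variant (without).
    print(compile_path_scope("uri", "/user/<user_id>/memories"))
    print(compile_path_scope("uri", "/user/<user_id>/memories", depth=None))
```

Passing each variant to coll.search_by_vector(filters=...) separately may help isolate whether the empty result comes from the "para" argument or from the "must" filter on uri itself.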

Impact

  • POST /api/v1/search/find with target_uri is non-functional on Volcengine VikingDB
  • The OpenClaw plugin's auto-recall uses target_uri to scope searches, so memory recall is broken
  • Workaround: Remove target_uri from find requests (searches all scopes globally)

Suggested Fix

Option A (recommended): Override _compile_filter in VolcengineCollectionAdapter to convert PathScope into a filter format that the Volcengine SDK supports.

Option B: Fall back to unscoped search when PathScope filter yields zero results, then filter results client-side by URI prefix.
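
A minimal sketch of Option B, assuming a hypothetical search callable that takes (query, scope, limit) and returns a list of dicts carrying a "uri" key. None of these names come from the OpenViking codebase, and scope_prefix must be in the same form as the stored uri field (for example "/user/<user_id>/memories").

```python
from typing import Any, Callable, Dict, List, Optional

Hit = Dict[str, Any]
# Hypothetical signature: search(query, scope_or_None, limit) -> list of hits
SearchFn = Callable[[str, Optional[str], int], List[Hit]]


def uri_in_scope(uri: str, scope: str) -> bool:
    """True if uri equals scope or lies beneath it on a path boundary."""
    scope = scope.rstrip("/")
    return uri == scope or uri.startswith(scope + "/")


def find_with_fallback(search: SearchFn, query: str, scope_prefix: str,
                       limit: int = 5, overfetch: int = 4) -> List[Hit]:
    """Try the scoped search first; if it is empty, search unscoped and
    filter the hits client-side by URI prefix."""
    hits = search(query, scope_prefix, limit)
    if hits:
        return hits
    # Fetch extra candidates, since client-side filtering discards some.
    unscoped = search(query, None, limit * overfetch)
    return [h for h in unscoped if uri_in_scope(h.get("uri", ""), scope_prefix)][:limit]
```

The path-boundary check keeps a sibling such as "/user/<user_id>/memories_old" out of the "/user/<user_id>/memories" scope. The overfetch factor is an arbitrary choice here. A scope that legitimately has no matches would also trigger the fallback, at the cost of one extra query.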

Minimal Reproduction

import httpx

BASE = "http://localhost:1933"
HEADERS = {"Authorization": "Bearer <root_key>",
           "X-OpenViking-Account": "default",
           "X-OpenViking-User": "<user_id>",
           "X-OpenViking-Agent": "<agent_id>"}

# Works (returns results):
r1 = httpx.post(f"{BASE}/api/v1/search/find", headers=HEADERS,
    json={"query": "test", "limit": 5})
print(r1.json()["result"]["total"])  # > 0

# Broken (returns 0):
r2 = httpx.post(f"{BASE}/api/v1/search/find", headers=HEADERS,
    json={"query": "test", "target_uri": "viking://user/memories", "limit": 5})
print(r2.json()["result"]["total"])  # Always 0
