Description
System Info
git main branch
Information
- The official example scripts
- My own modified scripts
🐛 Describe the bug
A regression introduced in commit 3de9ad0a breaks Ollama model listing when inference recording mode is enabled; see the reproduction log below (and the sketch of the suspected mechanism after it).
Resetting HEAD to the commit before 3de9ad0a resolves the problem.
(base) derekh@laptop:~/workarea/llama-stack$ git reset --hard 3de9ad0a
HEAD is now at 3de9ad0a chore(recorder, tests): add test for openai /v1/models (#3426)
(base) derekh@laptop:~/workarea/llama-stack$
(base) derekh@laptop:~/workarea/llama-stack$ LLAMA_STACK_TEST_INFERENCE_MODE=record OLLAMA_URL='http://0.0.0.0:11434' SAFETY_MODEL='ollama/llama-guard3:1b' llama stack run --image-type venv --image-name ci-tests llama_stack/distributions/ci-tests/run.yaml
INFO 2025-09-16 17:15:42,874 llama_stack.core.utils.config_resolution:45 core: Using file path: llama_stack/distributions/ci-tests/run.yaml
INFO 2025-09-16 17:15:42,876 llama_stack.cli.stack.run:129 cli: Using run configuration:
/home/derekh/workarea/llama-stack/llama_stack/distributions/ci-tests/run.yaml
Using virtual environment: ci-tests
+ '[' -n /home/derekh/workarea/llama-stack/llama_stack/distributions/ci-tests/run.yaml ']'
+ yaml_config_arg=/home/derekh/workarea/llama-stack/llama_stack/distributions/ci-tests/run.yaml
+ python -m llama_stack.core.server.server /home/derekh/workarea/llama-stack/llama_stack/distributions/ci-tests/run.yaml --port 8321
INFO 2025-09-16 17:15:44,331 llama_stack.core.utils.config_resolution:45 core: Using file path:
/home/derekh/workarea/llama-stack/llama_stack/distributions/ci-tests/run.yaml
INFO 2025-09-16 17:15:44,351 __main__:618 core::server: Run configuration:
INFO 2025-09-16 17:15:44,359 __main__:621 core::server: apis:
- agents
- batches
- datasetio
- eval
- files
- inference
- post_training
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
datasets: []
image_name: ci-tests
inference_store:
db_path: /home/derekh/.llama/distributions/ci-tests/inference_store.db
type: sqlite
metadata_store:
db_path: /home/derekh/.llama/distributions/ci-tests/registry.db
type: sqlite
models: []
providers:
agents:
- config:
persistence_store:
db_path: /home/derekh/.llama/distributions/ci-tests/agents_store.db
type: sqlite
responses_store:
db_path: /home/derekh/.llama/distributions/ci-tests/responses_store.db
type: sqlite
provider_id: meta-reference
provider_type: inline::meta-reference
batches:
- config:
kvstore:
db_path: /home/derekh/.llama/distributions/ci-tests/batches.db
type: sqlite
provider_id: reference
provider_type: inline::reference
datasetio:
- config:
kvstore:
db_path: /home/derekh/.llama/distributions/ci-tests/huggingface_datasetio.db
type: sqlite
provider_id: huggingface
provider_type: remote::huggingface
- config:
kvstore:
db_path: /home/derekh/.llama/distributions/ci-tests/localfs_datasetio.db
type: sqlite
provider_id: localfs
provider_type: inline::localfs
eval:
- config:
kvstore:
db_path: /home/derekh/.llama/distributions/ci-tests/meta_reference_eval.db
type: sqlite
provider_id: meta-reference
provider_type: inline::meta-reference
files:
- config:
metadata_store:
db_path: /home/derekh/.llama/distributions/ci-tests/files_metadata.db
type: sqlite
storage_dir: /home/derekh/.llama/distributions/ci-tests/files
provider_id: meta-reference-files
provider_type: inline::localfs
inference:
- config:
url: http://0.0.0.0:11434
provider_id: ollama
provider_type: remote::ollama
- config:
api_key: '********'
url: https://api.fireworks.ai/inference/v1
provider_id: fireworks
provider_type: remote::fireworks
- config:
api_key: '********'
url: https://api.together.xyz/v1
provider_id: together
provider_type: remote::together
- config: {}
provider_id: bedrock
provider_type: remote::bedrock
- config:
api_key: '********'
base_url: https://llama-3-2-3b-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443
provider_id: openai
provider_type: remote::openai
- config:
api_key: '********'
provider_id: anthropic
provider_type: remote::anthropic
- config:
api_key: '********'
provider_id: gemini
provider_type: remote::gemini
- config:
api_key: '********'
url: https://api.groq.com
provider_id: groq
provider_type: remote::groq
- config:
api_key: '********'
url: https://api.sambanova.ai/v1
provider_id: sambanova
provider_type: remote::sambanova
- config: {}
provider_id: sentence-transformers
provider_type: inline::sentence-transformers
post_training:
- config:
checkpoint_format: meta
provider_id: torchtune-cpu
provider_type: inline::torchtune-cpu
safety:
- config:
excluded_categories: []
provider_id: llama-guard
provider_type: inline::llama-guard
- config: {}
provider_id: code-scanner
provider_type: inline::code-scanner
scoring:
- config: {}
provider_id: basic
provider_type: inline::basic
- config: {}
provider_id: llm-as-judge
provider_type: inline::llm-as-judge
- config:
openai_api_key: '********'
provider_id: braintrust
provider_type: inline::braintrust
telemetry:
- config:
service_name: "\u200B"
sinks: console,sqlite
sqlite_db_path: /home/derekh/.llama/distributions/ci-tests/trace_store.db
provider_id: meta-reference
provider_type: inline::meta-reference
tool_runtime:
- config:
api_key: '********'
max_results: 3
provider_id: brave-search
provider_type: remote::brave-search
- config:
api_key: '********'
max_results: 3
provider_id: tavily-search
provider_type: remote::tavily-search
- config: {}
provider_id: rag-runtime
provider_type: inline::rag-runtime
- config: {}
provider_id: model-context-protocol
provider_type: remote::model-context-protocol
vector_io:
- config:
kvstore:
db_path: /home/derekh/.llama/distributions/ci-tests/faiss_store.db
type: sqlite
provider_id: faiss
provider_type: inline::faiss
- config:
db_path: /home/derekh/.llama/distributions/ci-tests/sqlite_vec.db
kvstore:
db_path: /home/derekh/.llama/distributions/ci-tests/sqlite_vec_registry.db
type: sqlite
provider_id: sqlite-vec
provider_type: inline::sqlite-vec
scoring_fns: []
server:
port: 8321
shields:
- provider_id: llama-guard
provider_shield_id: ollama/llama-guard3:1b
shield_id: llama-guard
tool_groups:
- provider_id: tavily-search
toolgroup_id: builtin::websearch
- provider_id: rag-runtime
toolgroup_id: builtin::rag
vector_dbs: []
version: 2
INFO 2025-09-16 17:15:44,594 llama_stack.core.stack:330 core: Inference recording enabled: mode=record
WARNING 2025-09-16 17:15:44,650 llama_stack.core.distribution:149 core: Failed to import module prompts: No module named
'llama_stack.providers.registry.prompts'
INFO 2025-09-16 17:15:44,683 llama_stack.providers.remote.inference.ollama.ollama:120 inference::ollama: checking connectivity to Ollama at
`http://0.0.0.0:11434`...
INFO 2025-09-16 17:15:46,592 llama_stack.providers.utils.inference.inference_store:74 inference_store: Write queue disabled for SQLite to avoid
concurrency issues
ERROR 2025-09-16 17:15:47,051 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider ollama: 'tuple'
object has no attribute 'model'
╭──────────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────────╮
│ /home/derekh/workarea/llama-stack/llama_stack/core/routing_tables/models.py:34 in refresh │
│ │
│ 31 │ │ │ │ continue │
│ 32 │ │ │ │
│ 33 │ │ │ try: │
│ ❱ 34 │ │ │ │ models = await provider.list_models() │
│ 35 │ │ │ except Exception as e: │
│ 36 │ │ │ │ logger.exception(f"Model refresh failed for provider {provider_id}: │
│ {e}") │
│ 37 │ │ │ │ continue │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/providers/utils/telemetry/trace_protocol.py:101 in async_wrapper │
│ │
│ 98 │ │ │ │
│ 99 │ │ │ with tracing.span(f"{class_name}.{method_name}", span_attributes) as span: │
│ 100 │ │ │ │ try: │
│ ❱ 101 │ │ │ │ │ result = await method(self, *args, **kwargs) │
│ 102 │ │ │ │ │ span.set_attribute("output", serialize_value(result)) │
│ 103 │ │ │ │ │ return result │
│ 104 │ │ │ │ except Exception as e: │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/providers/remote/inference/ollama/ollama.py:132 in list_models │
│ │
│ 129 │ │
│ 130 │ async def list_models(self) -> list[Model] | None: │
│ 131 │ │ provider_id = self.__provider_id__ │
│ ❱ 132 │ │ response = await self.client.list() │
│ 133 │ │ │
│ 134 │ │ # always add the two embedding models which can be pulled on demand │
│ 135 │ │ models = [ │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/testing/inference_recorder.py:425 in patched_ollama_list │
│ │
│ 422 │ │ ) │
│ 423 │ │
│ 424 │ async def patched_ollama_list(self, *args, **kwargs): │
│ ❱ 425 │ │ return await _patched_inference_method( │
│ 426 │ │ │ _original_methods["ollama_list"], self, "ollama", "/api/tags", *args, │
│ **kwargs │
│ 427 │ │ ) │
│ 428 │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/testing/inference_recorder.py:336 in _patched_inference_method │
│ │
│ 333 │ │ │ return replay_recorded_stream() │
│ 334 │ │ else: │
│ 335 │ │ │ response_data = {"body": response, "is_streaming": False} │
│ ❱ 336 │ │ │ _current_storage.store_recording(request_hash, request_data, response_data) │
│ 337 │ │ │ return response │
│ 338 │ │
│ 339 │ else: │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/testing/inference_recorder.py:151 in store_recording │
│ │
│ 148 │ │ # If this is an Ollama /api/tags recording, include models digest in filename to │
│ distinguish variants │
│ 149 │ │ endpoint = request.get("endpoint") │
│ 150 │ │ if endpoint in ("/api/tags", "/v1/models"): │
│ ❱ 151 │ │ │ digest = _model_identifiers_digest(endpoint, response) │
│ 152 │ │ │ response_file = f"models-{short_hash}-{digest}.json" │
│ 153 │ │ │
│ 154 │ │ response_path = self.responses_dir / response_file │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/testing/inference_recorder.py:209 in _model_identifiers_digest │
│ │
│ 206 │ │ idents = [m.model if endpoint == "/api/tags" else m.id for m in items] │
│ 207 │ │ return sorted(set(idents)) │
│ 208 │ │
│ ❱ 209 │ identifiers = _extract_model_identifiers() │
│ 210 │ return hashlib.sha256(("|".join(identifiers)).encode("utf-8")).hexdigest()[:8] │
│ 211 │
│ 212 │
│ │
│ /home/derekh/workarea/llama-stack/llama_stack/testing/inference_recorder.py:206 in _extract_model_identifiers │
│ │
│ 203 │ │ Returns a list of unique identifiers or None if structure doesn't match. │
│ 204 │ │ """ │
│ 205 │ │ items = response["body"] │
│ ❱ 206 │ │ idents = [m.model if endpoint == "/api/tags" else m.id for m in items] │
│ 207 │ │ return sorted(set(idents)) │
│ 208 │ │
│ 209 │ identifiers = _extract_model_identifiers() │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'tuple' object has no attribute 'model'
INFO 2025-09-16 17:15:48,123 llama_stack.core.stack:400 core: starting registry refresh task
INFO 2025-09-16 17:15:48,334 __main__:583 core::server: Listening on ['::', '0.0.0.0']:8321
INFO 2025-09-16 17:15:48,349 uvicorn.error:84 uncategorized: Started server process [3008061]
INFO 2025-09-16 17:15:48,350 uvicorn.error:48 uncategorized: Waiting for application startup.
INFO 2025-09-16 17:15:48,351 __main__:170 core::server: Starting up
INFO 2025-09-16 17:15:48,352 uvicorn.error:62 uncategorized: Application startup complete.
INFO 2025-09-16 17:15:48,353 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
ERROR 2025-09-16 17:15:48,355 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider ollama: 'tuple'
object has no attribute 'model'
(traceback identical to the one above)
AttributeError: 'tuple' object has no attribute 'model'
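For what it's worth, the AttributeError looks consistent with iterating directly over the object returned by the Ollama client's `list()` call: ollama >= 0.4 returns a pydantic `ListResponse` model, and iterating a pydantic model yields `(field_name, value)` tuples rather than the individual model entries. A minimal, self-contained sketch of that mechanism (the `Fake*` classes below are stand-ins, not the real ollama types):

```python
from pydantic import BaseModel


class FakeModelEntry(BaseModel):
    # stand-in for one entry of ollama's ListResponse.models
    model: str


class FakeListResponse(BaseModel):
    # stand-in for the pydantic ListResponse returned by AsyncClient.list()
    models: list[FakeModelEntry]


resp = FakeListResponse(models=[FakeModelEntry(model="llama-guard3:1b")])

# Iterating a pydantic model yields (field_name, value) tuples, not the entries:
print(next(iter(resp)))  # ('models', [FakeModelEntry(model='llama-guard3:1b')])

# So a comprehension shaped like inference_recorder.py:206 fails the same way:
try:
    [m.model for m in resp]
except AttributeError as e:
    print(e)  # 'tuple' object has no attribute 'model'

# The entries actually live under the .models attribute:
print([m.model for m in resp.models])  # ['llama-guard3:1b']
```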
Error logs
/
Expected behavior
The llama-stack server should start normally.
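Not a verified fix, but one possible direction, sketched against the `_extract_model_identifiers` shape visible in the traceback: unwrap the provider response before iterating, falling back to the raw body when it is already a plain list. The `.models` / `.data` attribute names are assumptions based on the Ollama and OpenAI client response shapes, not the project's actual patch:

```python
# Hypothetical reworking of the _extract_model_identifiers helper from
# llama_stack/testing/inference_recorder.py (shape taken from the traceback
# above). The .models / .data unwrapping is an assumption, not the actual fix.
def _extract_model_identifiers(endpoint: str, response: dict) -> list[str]:
    body = response["body"]
    # ollama's AsyncClient.list() returns a pydantic ListResponse whose entries
    # sit under .models; an OpenAI-style /v1/models list keeps them under .data.
    # Fall back to the body itself if it is already a plain list.
    items = getattr(body, "models", None) or getattr(body, "data", None) or body
    idents = [m.model if endpoint == "/api/tags" else m.id for m in items]
    return sorted(set(idents))
```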