
feat(knowledge): Redis memory graph schema — entity/relation indexes and chat integration (#3385) #3608

Closed
mrveiss wants to merge 2 commits into Dev_new_gui from issue-3385

mrveiss (Owner) commented Apr 6, 2026

Closes #3385

Implements Week 1 foundation from docs/database/REDIS_MEMORY_GRAPH_SPECIFICATION.md.

Summary

  • schema.py — Redis key patterns (memory:entity:, memory:relations:out/in:), index names, ENTITY_TYPES, RELATION_TYPES, ensure_indexes() (issues FT.CREATE memory_entity_idx and FT.CREATE memory_fulltext_idx matching spec exactly)
  • graph_store.py — create_entity(), get_entity(), create_relation() (bidirectional), get_outgoing_relations(), get_incoming_relations(), traverse_relations() (BFS, depth-limited, cycle-safe)
  • __init__.py — Package re-exports public API
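
The traversal described for traverse_relations() (BFS, depth-limited, cycle-safe) can be sketched in memory as follows. This is a minimal illustration and not the PR's code: the `outgoing` dict stands in for the memory:relations:out: lookups, and the function name `traverse` and its signature are assumptions made for the example.

```python
from collections import deque


def traverse(outgoing, start, max_depth):
    """Breadth-first walk over `outgoing`, a dict of entity id -> list of
    related entity ids (a stand-in for the Redis outgoing-relation docs).

    Returns (entity_id, depth) pairs in visit order, excluding `start`.
    """
    visited = {start}            # cycle guard: each entity is queued at most once
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:   # depth limit: do not expand this node
            continue
        for neighbour in outgoing.get(node, []):
            if neighbour in visited:
                continue
            visited.add(neighbour)
            found.append((neighbour, depth + 1))
            queue.append((neighbour, depth + 1))
    return found


# "b" points back to "a", so the graph contains a cycle.
graph = {"a": ["b", "c"], "b": ["c", "a"], "c": ["d"], "d": []}
print(traverse(graph, "a", max_depth=2))
```

The visited set is what makes the walk cycle-safe; the depth check bounds the work even on large graphs.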

Tests

28 unit tests, all passing (graph_store_test.py). Redis fully mocked.

Design decisions

  • Uses database="knowledge" (DB 1) — consistent with existing autobot_memory_graph package
  • All functions standalone async for testability
  • Key patterns exactly match spec

Out of scope (Weeks 2–4)

Multi-hop query patterns, conversation migration, and chat integration will be addressed in follow-up issues.

mrveiss (Owner, Author) commented Apr 6, 2026

Review fixes applied ✓

  • Critical — hardcoded "autobot" user_id removed from create_entity() base metadata
  • Critical — _init_relation_doc replaced the EXISTS+SET two-step with an atomic JSON.SET ... NX; test mock updated to handle the NX flag
  • Critical — __init__.py unified to export both graph_store/schema symbols and query_processor/hybrid_scorer symbols (with a try/except ImportError guard for the latter until #3609, feat(knowledge): memory graph semantic search — query processor and hybrid scoring (#3384), merges)
  • Minor — docstring corrected from "Redis DB 2" to "Redis DB 1 (knowledge)"
  • Bonus — isinstance(entity, Exception) widened to isinstance(entity, BaseException) to fix a type-checker warning in traverse_relations

All 28 tests passing.
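
One plausible source of the type-checker warning in the last bullet is asyncio.gather(..., return_exceptions=True), whose results are typed as "value or BaseException". The PR text does not say how traverse_relations collects entities, so the sketch below is an assumption throughout; `fetch` and `fetch_all` are hypothetical names.

```python
import asyncio


async def fetch(entity_id):
    """Hypothetical stand-in for get_entity(); fails for one id."""
    if entity_id == "missing":
        raise KeyError(entity_id)
    return {"id": entity_id}


async def fetch_all(entity_ids):
    # With return_exceptions=True, failures are returned in place instead of
    # being raised, so every item is either a value or a BaseException.
    results = await asyncio.gather(
        *(fetch(e) for e in entity_ids), return_exceptions=True
    )
    # Filtering on BaseException leaves only real values for the type checker.
    return [r for r in results if not isinstance(r, BaseException)]


print(asyncio.run(fetch_all(["a", "missing", "b"])))
```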

mrveiss added a commit that referenced this pull request Apr 6, 2026
Sync tasks lacked notify — service ran stale code after rsync, causing 502
on /api/health and login failures. All three sync tasks now notify handlers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

mrveiss (Owner, Author) commented Apr 6, 2026

Superseded by #3616 which consolidates all memory graph code (including semantic search) into the canonical autobot_memory_graph package. A compat shim at knowledge/memory_graph/__init__.py re-exports the unified API.

mrveiss closed this Apr 6, 2026

github-actions bot commented Apr 7, 2026

✅ SSOT Configuration Compliance: Passing

🎉 No hardcoded values detected that have SSOT config equivalents!

mrveiss added a commit that referenced this pull request Apr 8, 2026
* feat(ansible): add sync-code-source.yml to push code from controller to SLM (#3604)

Adds a new playbook that rsyncs the dev machine's local checkout to
/opt/autobot/code_source on the SLM manager.  Integrated into
deploy-slm-manager.yml as a no-op step when controller_repo_root is
not set, and active when passed via -e:

  ansible-playbook ... -e "controller_repo_root=/path/to/AutoBot-AI"

Closes the offline-deployment gap where the pre-flight GitHub pull
silently falls back to stale code_source, causing fixes that exist in
the local repo to never reach the SLM.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): replace fragile group-based SLM detection in Phase 4c with filesystem stat

The _is_slm_manager check relied on inventory group membership
(slm_server/slm) or node_roles values, both of which are absent in
temp inventories generated by per-node provisioning from the SLM UI.
This caused Phase 4c to skip the nginx co-location re-render entirely,
leaving / → /slm/ redirect intact even when the user frontend was
deployed on the same host.

Replace with filesystem-based detection:
  1. Stat /etc/nginx/sites-available/autobot-slm to identify the SLM manager
  2. Stat autobot-frontend/package.json to confirm co-location

Both stats work regardless of inventory shape (wizard, per-node, static).
Also remove the role_path fallback in the template src — the playbook_dir
relative path is always correct and avoids a None-based string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): use set_fact to override slm_colocated_frontend — include_vars beats task vars

include_vars has Ansible precedence 18 while task-level vars: has 17.
Loading slm_manager/defaults/main.yml (which has slm_colocated_frontend: false)
via include_vars was silently overriding vars: slm_colocated_frontend: true
on the template task, so the template always rendered in non-co-located mode.

Replace vars: on the template task with an explicit set_fact (precedence 19)
that runs after include_vars, ensuring the template sees true.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): fix code_source ownership before git reset to unblock pre-flight sync

Earlier sudo cp commands left provision-fleet-roles.yml owned by root:root
in code_source, causing git reset --hard to fail with EPERM. Add a
pre-flight chown (become: true) that runs before the git pull so force: yes
can always overwrite any root-owned file regardless of how it got there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): preserve wizard's slm_colocated_frontend in role set_fact; fix include_vars precedence in Phase 4c

slm_manager role set_fact was unconditionally overwriting slm_colocated_frontend
with the file-detection result, discarding the True value the wizard set in the
inventory.  Change to OR expression so a wizard-supplied True is preserved.

Phase 4c had include_vars (precedence 18) loading defaults with false, then
task vars: (precedence 17) trying to set true — always losing.  Add an explicit
set_fact after include_vars so the correct value wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): restart backend+celery when code is synced (#3608)

Sync tasks lacked notify — service ran stale code after rsync, causing 502
on /api/health and login failures. All three sync tasks now notify handlers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): correct systemd startup ordering for all core services (#3609)

- autobot-backend: redis.service → redis-stack-server.service (was silently
  ignored by systemd); add Wants= + network-online.target
- autobot-celery: same redis fix; keep After=autobot-backend.service
- autobot-slm-backend: add redis-stack-server + postgresql dependencies
  (had none — could race both on every boot)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(rag): per-user annotation signals for personalized RAG retrieval (#3240) (#3610)

* feat(rag): per-user annotation signals for personalized RAG retrieval (#3240)

* fix(rag): export GLOBAL_USER publicly, fix query_text field, Literal validation, move imports (#3240)

* fix(backend): replace naive datetime.now() with UTC-aware calls (#3613) (#3615)

* fix(backend): replace naive datetime.now() with UTC-aware calls throughout (#3613)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): fix aware/naive datetime comparisons in agent_client + auth_middleware (#3613)

- agent_client.py: default sentinel for last_health_check was datetime.min (naive);
  cold-start subtraction from datetime.now(tz=utc) raised TypeError — use
  datetime.min.replace(tzinfo=timezone.utc) instead
- auth_middleware.py: fromisoformat() of pre-migration Redis strings returns naive;
  normalize to UTC before comparing against now(tz=utc)
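
A minimal sketch of the two fixes above, using only the standard library. The helper name `ensure_utc` and the constant `NEVER_CHECKED` are hypothetical; treating naive values as UTC follows the commit message and is an assumption about how the old data was written.

```python
from datetime import datetime, timezone


def ensure_utc(value):
    """Attach UTC to a naive datetime; return aware values unchanged."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value


# Aware sentinel: can be subtracted from an aware "now" without a TypeError.
NEVER_CHECKED = datetime.min.replace(tzinfo=timezone.utc)

now = datetime.now(tz=timezone.utc)
legacy = datetime.fromisoformat("2026-04-01T12:00:00")  # naive, pre-migration style

elapsed_since_check = now - NEVER_CHECKED   # works: both operands are aware
is_in_past = ensure_utc(legacy) < now       # works after normalization
```

Subtracting or comparing `legacy` directly against `now` would raise TypeError, which is the failure both bullets describe.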

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory-graph): consolidate into single autobot_memory_graph with semantic search (#3612) (#3616)

* feat(memory-graph): add semantic search and hybrid scoring to autobot_memory_graph (#3612)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory-graph): fix _parse_ft_results, SECRET_USAGE pattern, ssot_config.get, asyncio.run (#3612)

- _parse_ft_results now extracts keys and _redis_search calls _fetch_entities_by_keys
- Add SECRET_USAGE to secret/credential pattern in _ENTITY_TYPE_PATTERNS
- Replace ssot_config.get() with getattr() (Pydantic model, not dict)
- Replace asyncio.get_event_loop().run_until_complete() with asyncio.run() in all 6 test sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(constants): add TTL_1_HOUR/TTL_5_MINUTES and replace raw cache TTL literals (#3614) (#3617)

* refactor(constants): add TTL_1_HOUR/TTL_5_MINUTES and replace raw cache TTL literals (#3614)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(constants): fix orphan TTL import, residual literals, noqa suppressions in #3617

- router.py: replace raw 300 with TTL_5_MINUTES (resolves orphan import)
- advanced_cache_manager.py: add TTL_1_HOUR import, replace all 8 raw 300/3600 literals
- embedding_cache.py: remove incorrect # noqa: F401 comment
- token_optimizer.py: remove incorrect # noqa: F401 comment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(slm): add admin password reset to user management UI (#3625)

* feat(slm): add admin password reset to user management UI (#3625)

- POST /slm-users/{id}/change-password and /autobot-users/{id}/change-password
  backend endpoints using UserService.change_password(require_current=False)
- changeSlmUserPassword() + changeAutobotUserPassword() in useSlmUserApi.ts
- PasswordChangeForm: apiEndpoint prop replaces hardcoded /api/users/ URL
- UserManagementSettings: key (lock) icon in each user row opens reset modal;
  selectedUserType tracks slm/autobot/legacy to route to correct endpoint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(slm): remove unreachable InvalidCredentialsError handler; use getSlmApiBase() for password change URLs (#3625)

- Drop except InvalidCredentialsError blocks — require_current=False means the
  service never raises it; dead code flagged in code review
- passwordChangeApiEndpoint now uses getSlmApiBase() from ssot-config instead
  of hardcoded /api/ prefix, fixing co-located /slm/api deployments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): add getApiBase() to ssot-config for configurable API prefix (#3628)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): use local time in is_business_hours and is_weekend (#3619)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): normalize naive datetime from Redis in knowledge_manager (#3618)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): normalize naive datetimes from Redis in stats and analytics (#3620)

Closes #3620

Pre-#3615 Redis data stores naive datetime strings. After fromisoformat()
parsing, add UTC tzinfo when tzinfo is None to prevent TypeError when
comparing with UTC-aware datetime.now(tz=timezone.utc).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(slm-frontend): replace hardcoded /api/ with getSlmApiBase() in 3 remaining files (#3627)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in composables (#3630)

- Added getApiBase() to ssot-config.ts (prerequisite from #3628)
- Replaced ~133 hardcoded '/api/' occurrences across 36 composable files
- All replacements use template literals: `${getApiBase()}/path`
- Imports added or extended in each modified file

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in service and utils layer (#3629)

* chore(frontend): adopt getApiBase() in service and utils layer (#3629)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): remove duplicate getApiBase() — defined in #3632 (#3629)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in stores, models, and components (#3631) (#3641)

- Added getApiBase() to ssot-config.ts (prerequisite from #3628)
- Replaced 110 hardcoded /api/ occurrences across 8 files:
  - stores/useKnowledgeStore.ts (7 paths)
  - stores/usePermissionStore.ts (10 paths)
  - stores/useUserStore.ts (2 paths)
  - models/repositories/KnowledgeRepository.ts (36 paths)
  - models/repositories/ChatRepository.ts (12 paths)
  - models/repositories/SystemRepository.ts (37 paths)
  - models/repositories/ApiRepository.ts (5 paths incl. cache keys)
  - components/ChatInterface.ts (1 path)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: replace asyncio.get_event_loop() with asyncio.run() (#3605)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(helpers): add extra_data support to TaskResult (#3564)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(helpers): remove duplicate raise_not_found_error in catalog_http_exceptions (#3566)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(helpers): cap exponential backoff via TimingConstants (#3565)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(backend): remove dead AgentThresholds import and replace magic 0.8 in orchestrator.py (#3607)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(agents): declare _get_system_prompt @abstractmethod in StandardizedAgent (#3606)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(ci): align isort line_length with Black at 120 in pyproject.toml (#3408)

Align pyproject.toml isort line_length from 100 to 120 to match Black's
line-length and the explicit --line-length=120 args in pre-commit and CI.
Black and isort hooks were already present in both pre-commit-config.yaml
and code-quality.yml; this fix resolves the config inconsistency that
caused isort --settings-path=. to use a conflicting line_length value.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): sync standalone autobot_shared on backend deploy (#3649)

The backend role only synced code_source/autobot_shared/ into
backend_install_dir/autobot_shared/ (the in-backend copy). The separate
/opt/autobot/autobot_shared/ directory — resolved via PYTHONPATH by
Celery workers and other services — was never updated, causing
ModuleNotFoundError for newly added modules (pagination.py,
task_result.py, error_boundaries.py, alert_cooldown.py).

Add a synchronize task immediately after the existing backend sync that
also pushes to the standalone path, plus a file ownership fix task.
Introduce backend_shared_standalone_dir default (/opt/autobot/autobot_shared)
so the path is a named variable, not a hardcoded literal.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): remove duplicate getApiBase() declarations from ssot-config.ts (#3631)

All target files (stores, model repositories, ChatInterface) already had
getApiBase() imported and in use from prior development. Fixed the only
remaining issue: three duplicate getApiBase() function declarations in
ssot-config.ts (added by parallel PRs #3628-#3630) — reduced to a single
canonical declaration.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(tests): replace hardcoded localhost:8001 with get_test_backend_url() helper

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(slm-frontend): replace hardcoded /autobot-api/ with getBackendUrl()

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): guard context_window_manager against list-format llm_models.yaml (#3647)

PR #3588 changed models: to a list-based registry; ContextWindowManager
still accessed it as a dict, crashing all workers on startup.
Detect list format and fall back to default context-window config.
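
The guard can be sketched as a type check on the parsed YAML. The function names and the placeholder default below are assumptions for illustration, not the project's actual values.

```python
# Placeholder default; the real default context-window config is not shown in the commit.
DEFAULT_CONTEXT_CONFIG = {"context_window_tokens": 4096}


def load_model_windows(config):
    """Return the dict-keyed model table, or {} when `models:` is not a dict."""
    models = config.get("models")
    if isinstance(models, dict):
        return models
    # List-based registry (or missing key): the dict schema is unavailable.
    return {}


def window_for(config, model_name):
    """Look up one model, falling back to the default config."""
    return load_model_windows(config).get(model_name, DEFAULT_CONTEXT_CONFIG)
```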

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(slm): pass auth token to PasswordChangeForm — was always sending empty Bearer

PasswordChangeForm read localStorage.getItem('authToken') which is never set;
SLM auth store uses key 'slm_access_token' in sessionStorage.
Added authToken prop; UserManagementSettings passes authStore.token.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): guard groups['database'] lookup in deploy-backend-local.yml (#3651)

Wrap the hostvars lookup with an inline if-guard so AnsibleUndefinedVariable
is never raised when the 'database' inventory group is absent or empty.
Falls back to 127.0.0.1 in both cases.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): default SLM_URL to localhost and deduplicate startup warning (#3655)

On co-located deployments SLM_URL is not set, so the backend now defaults
to http://127.0.0.1:8000 instead of None, preventing silent SLM client
failures.  The startup warning is deferred to init_slm_client() and gated
by a module-level flag, so it fires at most once per process instead of 3×.
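
A minimal sketch of the default-plus-warn-once pattern described above. The function name `resolve_slm_url` and the flag name are hypothetical; only SLM_URL and the default address come from the commit message.

```python
import logging
import os

logger = logging.getLogger(__name__)

DEFAULT_SLM_URL = "http://127.0.0.1:8000"  # default named in the commit message
_warned_missing_slm_url = False            # module-level flag: one warning per process


def resolve_slm_url(environ=os.environ):
    """Return SLM_URL, falling back to the local default and warning once."""
    global _warned_missing_slm_url
    url = environ.get("SLM_URL")
    if url:
        return url
    if not _warned_missing_slm_url:
        logger.warning("SLM_URL not set; defaulting to %s", DEFAULT_SLM_URL)
        _warned_missing_slm_url = True
    return DEFAULT_SLM_URL
```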

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(backend): give context_window_manager its own config YAML (#3650)

Decouple ContextWindowManager from llm_models.yaml (which uses a list
schema for the YAML-model-registry feature) by introducing a dedicated
config/context_windows.yaml with the dict-keyed schema the manager has
always expected.  Covers all 40+ models from llm_models.yaml with
realistic context_window_tokens, max_output_tokens, and message_budget
values.  Default config path updated from config/llm_models.yaml to
config/context_windows.yaml.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): add TTL_24_HOURS to chat_workflow/manager.py import (#3646)

PR #3617 added TTL_24_HOURS usage in manager.py but the import from
constants.ttl_constants was never updated — NameError crashes all workers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(agents): cap exponential backoff in agent_client.py via exponential_backoff_delay (#3658)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
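
For reference, capped exponential backoff can be sketched as below. This is not the project's exponential_backoff_delay helper, whose signature is not shown here; the name `capped_backoff_delay` and all parameter defaults are assumptions.

```python
def capped_backoff_delay(attempt, base=1.0, factor=2.0, max_delay=60.0):
    """Delay in seconds before retry number `attempt` (0-based), capped at max_delay."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    try:
        delay = base * (factor ** attempt)
    except OverflowError:
        # Very large attempts overflow a float; the cap applies anyway.
        return max_delay
    return min(delay, max_delay)


print([capped_backoff_delay(n) for n in range(8)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Without the cap, the delay doubles indefinitely and a long outage can push retries hours apart.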

* refactor(backend): remove dead doc_generation_threshold attribute (#3656)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): fix REDIS_HOST → AUTOBOT_REDIS_HOST in chat history modules

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): remove duplicate getApiBase() declarations from ssot-config.ts (#3628)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(slm-frontend): replace hardcoded /api/ with getSlmApiBase() (#3627) (#3676)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ansible): add pre-flight code_source sync to deploy playbooks (#3604)

Create shared pre-flight-code-sync.yml extracted from provision-fleet-roles.yml
Play 0; import it at the top of deploy-full.yml, deploy-aiml.yml,
deploy-backend-local.yml, and deploy-backend-remote.yml so standalone
deployments always pull the latest code before roles run.  GitHub sync uses
ignore_errors so air-gapped SLMs fall back to existing code_source with a
visible warning and staleness report.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(slm): add component selector to File Drift Check UI (#3433)

Add a <select> dropdown bound to selectedDriftComponent (default: autobot-slm-backend)
in the File Drift Check card. The selected value is passed to fetchDrift() so the
API call includes the correct ?component= query param. A watcher clears stale drift
results when the user switches components.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(ci): add doc linting to catch stale file references in developer guides (#3425)

- Add pipeline-scripts/check-doc-references.py that validates backtick-quoted
  file path references in the 9 canonical developer docs resolve to existing
  files; searches by full path then by filename across all source trees.
- Wire the script as a new step in code-quality.yml before the summary step.
- Fix 13 stale references in AUTOBOT_REFERENCE, DEVELOPER_SETUP,
  REDIS_CLIENT_USAGE, LOGGING_STANDARDS, and ERROR_CODE_CONVENTIONS that the
  new linter caught: removed deleted redis_database_manager.py entries, updated
  tls.yml → nginx/tasks/main.yml, corrected troubleshooting/security doc paths,
  updated logging config path, and replaced non-existent error_migration_map.yaml
  and test_error_catalog.py references with current file locations.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in Vue components and views (Phase 4)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(analytics): scope codebase analytics caches and live scans to source_id (#3685)

Five cross-project data leakage vectors fixed:
1. _duplicate_cache keyed by source_id (was a single Optional[dict])
2. _ownership_analysis_cache keyed by source_id (same pattern)
3. Live duplicate analysis resolves source.clone_path so DuplicateCodeDetector
   scans the correct project instead of always the AutoBot repo
4. detect_config_duplicates_endpoint adds source_id param + clone_path resolution
5. indexing_tasks entries now store source_id; _get_active_indexing_task filters
   by it so Project B stats don't show Project A's indexing progress
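
The first two fixes follow the same pattern, sketched here for _duplicate_cache. The accessor names are hypothetical; only the cache name and the switch from a single Optional[dict] to a per-source_id mapping come from the commit message.

```python
from typing import Dict, Optional

# Before: a single Optional[dict] shared by every project.
# After: one entry per source_id, so projects cannot read each other's results.
_duplicate_cache: Dict[str, dict] = {}


def get_cached_duplicates(source_id: str) -> Optional[dict]:
    """Return the cached analysis for this project only, or None."""
    return _duplicate_cache.get(source_id)


def store_duplicates(source_id: str, result: dict) -> None:
    """Cache an analysis result under its own project key."""
    _duplicate_cache[source_id] = result
```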

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in chat Vue components (Phase 5b)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): replace bare REDIS_HOST with AUTOBOT_REDIS_HOST in chat_history (no-op — already in #3672)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(slm-frontend): replace hardcoded /autobot-api/ with getBackendUrl() (#3652) (no-op — already in #3662)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(slm-frontend): safe WebSocket URL construction when getBackendUrl() is absolute (#3673)

Add buildWsUrl() helper that strips the http(s):// scheme from getBackendUrl()
and replaces it with the correct ws(s):// scheme. Handles proxy mode (empty
base URL) by falling back to window.location.host. Fixes the broken URL
produced by prepending ws:// to an already-absolute URL.
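
The actual helper is TypeScript; the same logic is rendered below in Python as a hedged sketch. The parameter names are assumptions, as is choosing wss in proxy mode based on whether the page itself is served over HTTPS.

```python
def build_ws_url(backend_url, path, page_host, page_is_https):
    """Swap http(s):// for ws(s)://; fall back to the page host when no scheme is given."""
    if backend_url.startswith("https://"):
        return "wss://" + backend_url[len("https://"):].rstrip("/") + path
    if backend_url.startswith("http://"):
        return "ws://" + backend_url[len("http://"):].rstrip("/") + path
    # No scheme (empty base URL in proxy mode): derive scheme and host from the page.
    scheme = "wss" if page_is_https else "ws"
    return f"{scheme}://{page_host}{path}"
```

Prepending ws:// to "https://api.example.com" would yield "ws://https://api.example.com", which is the broken form the commit removes.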

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(infrastructure): standardise Redis file ownership to autobot:autobot (#3396)

- Add redis_owner/redis_group defaults (both "autobot") to redis role
- Add idempotent file tasks to create and chown /var/lib/redis,
  /etc/redis, /var/run/redis, /var/log/redis with recurse for data
  and config dirs; tagged 'redis,ownership' for selective runs
- Update systemd service override to run redis-stack-server as
  autobot:autobot instead of redis:redis so all written files
  inherit the correct ownership automatically

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): use getApiBase() instead of getBackendUrl() in notification composables (#3675)

Replace getBackendUrl() with getApiBase() in useNotificationConfig (lines 72, 94)
and remove the getBackendUrl() prefix from apiRequest in useWorkflowNotificationConfig
(line 80) so all notification API calls use the canonical /api path prefix.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): replace hardcoded /api/advanced/ with getApiBase() in BusinessIntelligenceView (#3674)

All 5 hardcoded /api/advanced/ calls were replaced with ${getApiBase()}/advanced/...
template literals in BusinessIntelligenceView.vue. Import added at line 320.
Fix was included in Phase 4 commit 2aca9bc08 — this commit closes the issue.

Closes #3674

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in knowledge Vue components (#3682) (#3703)

Replace all hardcoded '/api/' path prefixes with `${getApiBase()}/`
in 20 knowledge components. Add `import { getApiBase } from
'@/config/ssot-config'` to each file. Covers single-quoted strings
and template literals across apiClient and apiService call sites.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in security, visualizations, workflow, and research Vue components (#3684) (#3706)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in knowledge Vue components (Phase 5c) (#3629)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(analytics): scope quality/review/evolution/generation to project source_root (#3441)

- analytics_shared: add resolve_source_root_or_404() returning Path so callers
  can scope query results, not just validate the source exists
- analytics_quality: _get_problems_from_chromadb() and
  calculate_real_quality_metrics() accept source_root; problems whose
  file_path does not resolve under source_root are excluded; Redis cache key
  is namespaced per source_root so per-project caches are independent
- analytics_code_review: analyze_diff() resolves source_root and skips any
  file whose resolved path falls outside source_root
- analytics_evolution: _fetch_timeline_snapshots() and
  _fetch_trend_snapshots_sync() accept an evolution_prefix parameter;
  _build_evolution_prefixes() builds the evolution:{source_id}: namespace;
  timeline, patterns, and trends endpoints use the scoped prefix so each
  project reads/writes its own Redis keys; added
  _extract_pattern_types_from_prefix() to generalise the existing helper
- analytics_code_generation: CodeGenerationEngine.get_stats() accepts
  source_id and namespaces its Redis key as stats:{source_id}:{date} when
  provided; the /stats endpoint passes source_id through
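
Two of the scoping ideas above, path containment and key namespacing, can be sketched as follows. Both function names are hypothetical, and the un-scoped fallback key is an assumption; only the stats:{source_id}:{date} form comes from the commit message.

```python
from pathlib import Path


def is_under_source_root(file_path, source_root):
    """True when file_path resolves to a location inside source_root."""
    root = Path(source_root).resolve()
    candidate = Path(file_path)
    if not candidate.is_absolute():
        candidate = root / candidate
    # Path.is_relative_to requires Python 3.9 or later.
    return candidate.resolve().is_relative_to(root)


def stats_key(date, source_id=None):
    """Namespaced stats key; the fallback without source_id is an assumption."""
    return f"stats:{source_id}:{date}" if source_id else f"stats:{date}"
```

Resolving before comparing matters: a path such as "../other/x.py" starts inside the root textually but resolves outside it.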

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(frontend): correct useEvolution base URL from /analytics/evolution to /evolution (#3698)

* chore(frontend): adopt getApiBase() in analytics Vue components (#3681)

Replace all hardcoded '/api/' strings with getApiBase() across 12
analytics Vue components. Add import { getApiBase } from
'@/config/ssot-config' to each file and update all fetchWithAuth and
api service call sites.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): add explicit encoding='utf-8' to file I/O (#3391)

- Add encoding='utf-8' to open() call in event_manager.py
- Add media_type='application/json; charset=utf-8' to all JSONResponse
  calls in api/chat.py (11 sites)
- Add encoding='utf-8' to open() calls in code_analysis scripts and tools
  (analyze_env_vars, analyze_code_quality, analyze_architecture,
  analyze_frontend, generate_patches, code_quality_dashboard,
  logging_standardizer, setup.py)
- Skips binary-mode opens (rb/wb) and non-text opens (tarfile, Image)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(tests): co-locate single-module tests from autobot-backend/ root (#3364)

Move 5 unit tests that each test a single module into the directory of
the module they cover, following the co-location pattern established in
chat_workflow/ and other packages.

Moves:
- chat_intent_detector_test.py → chat_workflow/
- tool_discovery_test.py       → tools/
- encryption_service_test.py   → security/
- auth_rbac_test.py            → security/
- memory_package_test.py       → memory/

Tests in chat_workflow/ and tools/ confirmed passing after move.
Tests in security/ have a pre-existing collection error (structlog not
installed in local dev venv) that existed before this PR.
Tests in memory/ have a pre-existing collection error (aiosqlite not
installed in local dev venv) that existed before this PR.

E2E, integration, and cross-module tests remain at autobot-backend/ root.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(frontend): remove spurious /analytics prefix from code-review pattern preference URLs (#3699)

* fix(frontend): use wss:// on HTTPS for CodeQualityDashboard WebSocket connection (#3702)

* fix(frontend): remove dead loadReview() — GET /review/{id} has no backend storage (#3701)

* fix(ansible): AUTOBOT_CHROMADB_HOST fallback uses backend_ai_stack_host (#3541)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(slm): implement autobot-admin CLI with reset-password subcommand (#3691)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): add post-deploy smoke test to backend role (#3687)

Wait for uvicorn to accept TCP connections on port 8001 and verify
/api/health returns 200 after every backend role execution.  Adds
backend_health_check_port/timeout defaults so values are configurable.
Also adds equivalent checks to update-node.yml Play 3 when
autobot-backend is among the deployed roles.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): rate-limit AI Stack connection error logging (#3686)

Add _log_connection_error() helper with a 60-second suppression window: first
failure per window is emitted at WARNING, subsequent retries are demoted to
DEBUG so backend-error.log is not flooded when AI Stack is unavailable. Also
downgrade HTTP-error and unexpected-exception log calls from ERROR to WARNING
since these are expected transient conditions when the remote service is down.
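
A suppression window of this kind can be sketched with a monotonic clock. This is an illustration under assumptions, not the project's _log_connection_error(): the `now` parameter is added here only to make the behaviour testable.

```python
import logging
import time

logger = logging.getLogger(__name__)

SUPPRESSION_WINDOW_SECONDS = 60.0
_last_warning_at = None  # monotonic timestamp of the last WARNING, or None


def log_connection_error(message, now=None):
    """Log at WARNING once per window and at DEBUG otherwise; return the level used."""
    global _last_warning_at
    now = time.monotonic() if now is None else now
    if _last_warning_at is None or now - _last_warning_at >= SUPPRESSION_WINDOW_SECONDS:
        _last_warning_at = now
        level = logging.WARNING
    else:
        level = logging.DEBUG
    logger.log(level, message)
    return level
```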

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): idempotent autobot:autobot ownership on Redis dirs (#3396)

The command module with changed_when:true always reported "changed" and
would trigger restart redis-stack on every play run. Replaced with the
file module (recurse:yes) which only marks changed when ownership
actually differs, making the task fully idempotent. Also removed the
duplicate directory-creation task that preceded it.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(frontend): remove duplicate withSourceId() wrap in Redis health URL (#3714)

* fix(tests): update stale src.tool_discovery mock path (#3722)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(dev): add requirements-dev.txt with structlog and aiosqlite (#3723)

Creates requirements-dev.txt at repo root so local dev venvs include
structlog and aiosqlite, preventing ModuleNotFoundError during pytest
collection of security/ and memory/ test packages.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(frontend): wrap router-view with ErrorBoundary in App.vue (#3375)

Wire the existing ErrorBoundary.vue into App.vue by importing it,
registering it in the components map, and wrapping <router-view> so
any unhandled runtime error in a child view shows a user-friendly
fallback instead of a white screen.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(backend): persist code-review results in Redis + GET /review/{id} endpoint (#3716)

- Store each analyze result under code_review:result:{source}:{uuid} with TTL_7_DAYS
- Push summary entry to code_review:history:{source} list (capped at 100)
- Add GET /review/{review_id} endpoint for drill-down lookup
- GET /history now reads from Redis instead of returning no-data stub
- Frontend re-adds loadReview(id) and @click handler on history items
- Add reviewLoaded i18n key to all 11 locale files
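
The storage layout in the first two bullets can be sketched with in-memory stand-ins for Redis. The key patterns and the cap of 100 come from the commit message; the function name, the newest-first ordering, and the shape of a history entry are assumptions, and the TTL is omitted because plain dicts do not expire.

```python
import uuid

HISTORY_CAP = 100
results = {}   # stand-in for Redis string keys
history = {}   # stand-in for Redis lists


def save_review(source, result):
    """Store one review result and push a summary onto the capped history."""
    review_id = str(uuid.uuid4())
    results[f"code_review:result:{source}:{review_id}"] = result
    entries = history.setdefault(f"code_review:history:{source}", [])
    entries.insert(0, {"id": review_id, "summary": result.get("summary")})
    del entries[HISTORY_CAP:]   # keep only the newest 100 entries
    return review_id
```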

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(slm): Redis service management UI — start/stop/restart controls (#3381)

Adds RedisServicePanel.vue with start/stop/restart controls, live status
polling every 10 s (uptime, memory, client count), and a stop confirmation
dialog. Wires the panel into ServicesView.vue below the summary cards.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(dev): uncomment aiosqlite in autobot-backend/requirements.txt (#3723) (#3745)

Makes aiosqlite an explicit dependency rather than relying on transitive
inclusion from ../requirements.txt. Required for security/ and memory/
test collection under pytest.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(analytics): fix flake8 E501 line-length violation in analytics_evolution.py (#3724) (#3741)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): correct stale src.tool_discovery patch path in tool_discovery_test (#3722) (#3738)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): remove total timeout from LLM streaming — use connect-only timeout (#3732) (#3737)

Replace ClientTimeout(total=TIMEOUT_HTTP_DEFAULT) with ClientTimeout(total=None,
connect=TIMEOUT_HTTP_DEFAULT) so the 60s cap no longer kills long-running LLM
generation. Connection failures still time out fast via the connect param.

Also improve the user-facing error message for TimeoutError from the generic
"{type}: unexpected error" to "The model is taking too long to respond. Please
try again."
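
The effect of a connect-only timeout can be illustrated without aiohttp. The sketch below uses plain asyncio with a simulated connection and token stream; every name in it (`fake_connect`, `fake_token_stream`, `stream_completion`, `CONNECT_TIMEOUT_S`) is hypothetical and only stands in for the real client and `TIMEOUT_HTTP_DEFAULT`.

```python
import asyncio

CONNECT_TIMEOUT_S = 0.05  # hypothetical stand-in for TIMEOUT_HTTP_DEFAULT


async def fake_connect(delay):
    """Simulated connection setup that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "connected"


async def fake_token_stream(n_tokens, gap):
    """Simulated LLM stream yielding one token every `gap` seconds."""
    for i in range(n_tokens):
        await asyncio.sleep(gap)
        yield f"tok{i}"


async def stream_completion(connect_delay, n_tokens, gap):
    # Only the connection phase is bounded by a timeout.
    await asyncio.wait_for(fake_connect(connect_delay), timeout=CONNECT_TIMEOUT_S)
    # The streaming phase has no total cap, so long generations can finish.
    tokens = []
    async for token in fake_token_stream(n_tokens, gap):
        tokens.append(token)
    return tokens
```

A stream whose total duration exceeds `CONNECT_TIMEOUT_S` still completes, while a slow connection raises `asyncio.TimeoutError` quickly.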

Closes #3732

* chore(tests): extend get_test_backend_url() to remaining 14 backend test files (#3661) (#3727)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(slm-frontend): replace remaining 2 hardcoded /api/ with getSlmApiBase() (#3725) (#3733)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(browser): replace hardcoded :3000 with NetworkConstants.BROWSER_SERVICE_PORT in web_crawler.py fallback (#3728) (#3731)

* fix(browser): replace hardcoded :3000 with NetworkConstants.BROWSER_SERVICE_PORT in web_crawler.py fallback (#3728)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): inject AUTOBOT_BROWSER_SERVICE_HOST into backend.env from fleet vars (#3728)

backend.env.j2 never set AUTOBOT_BROWSER_SERVICE_HOST, so PlaywrightService
raised ValueError on every Ansible-provisioned deployment. BROWSER_VM_IP also
defaulted to an empty string, so health checks hit http://:3000/health.

- backend.env.j2: render AUTOBOT_BROWSER_SERVICE_HOST/AUTOBOT_BROWSER_HOST
  from browser_host (fleet wizard) or backend_browser_host (direct playbooks)
- defaults/main.yml: add backend_browser_host="" and backend_browser_port=3000
- deploy-backend-remote.yml / deploy-backend-local.yml: resolve browser host
  from groups['browser'][0] inventory, same pattern as redis/ai_stack hosts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in JS utility files (#3726) (#3730)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): evaluate get_test_backend_url() lazily in TakeoverTestClient (#3747) (#3756)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(frontend): adopt getApiBase() in remaining 7 SecretsApiClient.js paths (#3746) (#3757)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): handle https→wss protocol upgrade in simple_terminal.e2e_test.py

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(frontend): clear auto-save interval on ChatController destroy

* fix(frontend): guard loadReview against undefined reviewId (#3755) (#3759)

Pre-existing Redis history entries (created before PR #3735) have no id
field. Clicking them called loadReview(undefined), which sent
GET /review/undefined and produced a confusing 404 error toast.

Replace the silent early-return with showToast(reviewNotAvailable,
'warning') so the user sees a clear message instead. Added the
reviewNotAvailable i18n key to all 11 locale files.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): replace hardcoded 172.16.168.* IPs with named placeholders in .env.example

* fix(monitoring): replace hardcoded 172.16.168.25 with $browser_vm_ip variable in Grafana dashboard

* fix(config): canonicalize browser host env var to AUTOBOT_BROWSER_SERVICE_HOST

* fix(frontend): clear auto-save interval on ChatController destroy (#3749) (#3766)

* chore(frontend): adopt getApiBase() in ApiClientMonitor.js (#3752) (#3767)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat-workflow): route system prompt via Ollama system field; raise num_ctx to 8192 (#3761) (#3765)

* chore(frontend): adopt getApiBase() in AppConfig.js and ServiceDiscovery.js (#3751) (#3769)

Replace 5 hardcoded '/api/' path literals with getApiBase() calls in
AppConfig.js (lines 382, 475, 559) and ServiceDiscovery.js (lines 403, 429).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): allow any authenticated user to GET /review/{id} (#3754) (#3772)

Replace check_admin_permission with get_current_user on get_review_by_id
so regular users can drill into their own code-review history entries.
All other write/admin endpoints retain check_admin_permission.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3773)

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3519)

* fix: guard against None variables in _run_playbook merge

* feat(frontend): add KB vectorization status badge, action button, batch toolbar, progress modal (#3388) (#3774)

* perf(autoresearch): replace HumanReviewScorer polling loop with BLPOP (#3781)

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3519)

* fix: guard against None variables in _run_playbook merge

* perf(autoresearch): replace HumanReviewScorer polling loop with BLPOP (#3209)

* perf(backend): convert file I/O to aiofiles in chat_history_manager and simple_pty (#3783)

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3519)

* fix: guard against None variables in _run_playbook merge

* perf(backend): convert file I/O to aiofiles in chat_history_manager and simple_pty (#3392)

* chore(frontend): add i18n keys for KB vectorization UI components (#3785)

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3519)

* fix: guard against None variables in _run_playbook merge

* chore(frontend): add i18n keys for KB vectorization UI components (#3779)

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3519) (#3780)

* fix(ansible): inject SLM secrets into standalone deploy extra_vars (#3519)

* fix: guard against None variables in _run_playbook merge

* feat(memory): Redis-backed session-scoped working memory with TTL (#3775)

* feat(memory): Redis-backed session-scoped working memory with TTL (#3768)

- Add WorkingMemoryService with store/get/list/clear async methods
- Key pattern: autobot:session:{session_id}:memory:{key} in 'knowledge' DB
- Register singleton via UnifiedMemoryManager.working_memory property
- Add TTL_WORKING_MEMORY_DEFAULT = 3600 to ttl_constants.py
- Unit tests: 9 cases covering all methods + manager integration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): connection caching, error handling, race-free singleton (#3768)

- WorkingMemoryService: lazy-init _redis instance attribute via _get_redis()
  so the Redis client is created once per service instance, not per call
- All four public methods wrapped in try/except that logs a warning and re-raises
- UnifiedMemoryManager: _working_memory initialized eagerly in __init__ to
  eliminate the race condition in the lazy property
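
A rough sketch of the key pattern and the per-instance client caching described in these two commits. It is synchronous for brevity (the commits describe async methods), and `DictClient`, `WorkingMemorySketch`, `make_key`, and `client_factory` are hypothetical names used only for illustration.

```python
TTL_WORKING_MEMORY_DEFAULT = 3600  # seconds, as stated in the commit above


class DictClient:
    """Hypothetical in-memory client; the TTL is recorded, not enforced."""

    def __init__(self):
        self.data = {}

    def set(self, name, value, ex=None):
        self.data[name] = (value, ex)

    def get(self, name):
        entry = self.data.get(name)
        return entry[0] if entry else None


class WorkingMemorySketch:
    def __init__(self, client_factory):
        self._client_factory = client_factory
        self._redis = None  # created on first use, then cached on the instance

    def _get_redis(self):
        if self._redis is None:
            self._redis = self._client_factory()
        return self._redis

    @staticmethod
    def make_key(session_id, key):
        return f"autobot:session:{session_id}:memory:{key}"

    def store(self, session_id, key, value, ttl=TTL_WORKING_MEMORY_DEFAULT):
        self._get_redis().set(self.make_key(session_id, key), value, ex=ttl)

    def get(self, session_id, key):
        return self._get_redis().get(self.make_key(session_id, key))
```

Because the session ID is part of the key, two sessions can store the same logical key without colliding, and the factory runs once per service instance rather than once per call.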

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): context-adaptive memory compression for small-context models (#3776)

* feat(memory): context-adaptive memory compression for small models (#3770)

Add ContextCompressionService that summarises dropped history and
re-ranks KB results by score to fit within model token budgets.
Wire into ContextWindowManager via async_should_compress().
Add compression_threshold field to every entry in context_windows.yaml.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): compression guard logic, role:system mid-history, phi3 threshold, wire to llm_handler (#3770)

- Fix phi3 compression_threshold from 8192 to 4096 (matches context_window_tokens)
- Remove large-model early-return guard from should_compress; YAML thresholds are the sole policy
- Change summary role from 'system' to 'assistant' to avoid mid-history system messages
- Add model_thresholds constructor param to ContextCompressionService to avoid double YAML load
- Wire compress_kb_results into _prepare_llm_request_params after KB retrieval
- Update tests to reflect corrected behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat-workflow): remove system_prompt from _build_continuation_prompt to fix double-injection (#3784) (#3791)

* chore(chat-workflow): remove orphaned system_prompt param from _build_full_prompt (#3794) (#3801)

* feat(agents): enforce memory read/write via pipeline lifecycle hooks (#3777)

* feat(agents): _before_process/_after_process memory lifecycle hooks (#3771)

Add async _before_process / _after_process hooks to StandardizedAgent.process_request()
so agents can optionally read from and write to WorkingMemoryService (issue #3768)
without breaking any existing agent when the dependency is absent.

- WorkingMemoryService imported with try/except ImportError fallback to no-op stubs
- Default hook implementations are no-ops — all 24+ existing agents unchanged
- Hook exceptions are caught and logged at WARNING; never re-raised
- Hook timing logged at DEBUG level (before/after elapsed ms)
- SentimentAnalysisAgent demonstrates the override pattern
- 9 unit tests verify call order and failure isolation
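
The hook contract in the list above (no-op defaults, exceptions logged at WARNING and never re-raised, timing logged at DEBUG) might look roughly like this. `AgentSketch`, `_run_hook`, `_handle`, and `FailingHookAgent` are hypothetical names, not the actual StandardizedAgent API.

```python
import asyncio
import logging
import time

logger = logging.getLogger("agent_hooks_sketch")


class AgentSketch:
    async def _before_process(self, request):
        """Default hook: no-op, so existing agents are unaffected."""

    async def _after_process(self, request, response):
        """Default hook: no-op."""

    async def _handle(self, request):
        return {"echo": request["text"]}

    async def _run_hook(self, name, *args):
        start = time.perf_counter()
        try:
            await getattr(self, name)(*args)
        except Exception as exc:
            # Hook failures are logged and swallowed so the request still succeeds.
            logger.warning("%s failed: %s", name, exc)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.debug("%s took %.2f ms", name, elapsed_ms)

    async def process_request(self, request):
        await self._run_hook("_before_process", request)
        response = await self._handle(request)
        await self._run_hook("_after_process", request, response)
        return response


class FailingHookAgent(AgentSketch):
    """Override example whose pre-hook always fails."""

    def __init__(self):
        self.calls = []

    async def _before_process(self, request):
        self.calls.append("before")
        raise RuntimeError("memory backend unavailable")

    async def _handle(self, request):
        self.calls.append("handle")
        return await super()._handle(request)

    async def _after_process(self, request, response):
        self.calls.append("after")
```

Even with a failing pre-hook, `process_request` returns the normal response and the post-hook still runs, which is the failure isolation the unit tests are said to verify.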

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agents): correct WMS method names, remove private flag import, add pass to _after_process (#3771)

- Replace WorkingMemoryService.load/save() static calls with instance
  get()/store() via self._wm stored in __init__
- Remove fragile _working_memory_available import from standardized_agent;
  use local try/except with _WMS alias in sentiment_analysis_agent
- Add pass to empty _after_process base body in standardized_agent
- Document enriched context availability in process_request comment
- Fix 3 pre-existing E501 violations in standardized_agent (line wraps)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: correct WorkingMemoryService import path (autobot_backend. → memory.)

* fix(agents): correct memory import path in standardized_agent (#3771)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(ansible): move _SECRET_TO_ANSIBLE_VAR to shared location to avoid drift (#3778) (#3782)

Extract the duplicated secret-key -> Ansible-variable mapping from
setup_wizard.py and playbook_executor.py into a new shared module
services/ansible_secrets.py. Both callers now import from the single
source of truth, eliminating the risk of the two copies drifting apart.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): expand hardcoding prevention for file paths and model names (#3397) (#3799)

- Add PathConfig to ssot_config with AUTOBOT_BASE_DIR/AUTOBOT_PLUGINS_DIR/etc fields
- Replace hardcoded /opt/autobot/plugins/* in plugin_manager.py with config.path.plugins_path
- Replace hardcoded /opt/autobot/docs in mcp_manual_integration.py with AUTOBOT_BASE_DIR env var
- Replace hardcoded "llama3.2:latest" in rlm/types.py, agent_loop/think_tool.py,
  rlm/benchmark.py with ROUTING_MODEL from ssot_config
- Add check_hardcoded_paths() and check_hardcoded_model_names() to pre-commit hook
- Exclude constants definition files (path_constants.py etc.) from hook scanning

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ansible): add deploy-hybrid-docker.yml and docker Ansible role (#3424) (#3786)

* feat(ansible): add deploy-hybrid-docker.yml playbook and docker role (#3424)

Create the missing Ansible playbook and role that
SLMDeploymentOrchestrator.deploy_docker() targets. Without these files
every call to POST /api/slm/deployments/docker fails immediately with
FileNotFoundError from PlaybookExecutor.execute_playbook().

- autobot-slm-backend/ansible/deploy-hybrid-docker.yml: standalone
  playbook that runs the docker role; accepts target_host and
  docker_containers extra_vars passed by the orchestrator
- autobot-slm-backend/ansible/roles/docker/tasks/main.yml: installs
  Docker Engine (Debian + RedHat), starts the service, adds the service
  user to the docker group, pulls images and runs containers from the
  docker_containers list, then asserts all containers reach running state
- autobot-slm-backend/ansible/roles/docker/defaults/main.yml: defaults
  for all vars referenced in the role (docker_image, docker_containers,
  docker_default_restart_policy, docker_verify_timeout, etc.)
- autobot-slm-backend/ansible/roles/docker/handlers/main.yml: restart
  docker handler
- autobot-slm-backend/ansible/deploy.yml: add docker branch so the
  standard DeploymentService path also resolves the docker role when
  target_roles includes "docker"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route deploy.yml override from extra_data; add docker to DEFAULT_ROLES; fix arch default

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(knowledge): semantic duplicate guard on individual fact writes (#3788) (#3800)

Add FactsMixin._find_duplicate() that queries ChromaDB for near-duplicate
content before every store_fact() insert. Threshold (default 0.92) is
configurable via AUTOBOT_KB_DEDUP_THRESHOLD / config.cache.l2.kb_dedup_threshold.
Nine unit tests cover above-threshold skip, below-threshold write, exact
hash dedup, graceful ChromaDB error handling, and empty-KB no-false-positive.
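
The threshold check can be sketched in plain Python. This is only an illustration of the idea: the real guard queries ChromaDB, whereas `cosine_similarity` and `find_duplicate` below are hypothetical helpers that compare raw embedding vectors held in a dict.

```python
import math

DEFAULT_DEDUP_THRESHOLD = 0.92  # default stated in the commit message


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate(candidate, stored, threshold=DEFAULT_DEDUP_THRESHOLD):
    """Return the id of the most similar stored vector at or above threshold."""
    best_id, best_sim = None, threshold
    for fact_id, vector in stored.items():
        sim = cosine_similarity(candidate, vector)
        if sim >= best_sim:
            best_id, best_sim = fact_id, sim
    return best_id  # None means no near-duplicate, so the write proceeds
```

An empty store always returns `None`, which matches the empty-KB no-false-positive case listed among the tests.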

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(agents): per-agent cross-session diary — persistent agent journal in KB (#3792)

* feat(agents): per-agent cross-session diary in KB (#3789)

Add AgentDiaryService that stores timestamped journal entries as KB
facts (category AGENT_DIARY, source=agent_name) enabling cross-session
memory and retrospective semantic search per agent. Wire diary write
into SentimentAnalysisAgent.process_query() as a concrete usage example.
10 unit tests — all passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agents): diary read() uses metadata filter instead of semantic search (#3789)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): temporal fact validity — valid_from/valid_to on memory graph entities and relations (#3798)

* feat(memory): temporal fact validity — valid_from/valid_to on memory graph (#3790)

- Add valid_from/valid_to to entity metadata in _prepare_entity_metadata()
- Add valid_from/valid_to to both relation builder helpers
- Add EntityOperationsMixin.invalidate_entity() — marks entity expired without deleting
- Add RelationOperationsMixin.invalidate_relation() — marks a specific edge expired
- Add include_expired param to search_entities() and _fallback_search() (default False)
- Add QueryOperationsMixin.get_entities_as_of() — point-in-time entity query
- Legacy entities without valid_to are treated as currently valid (no migration needed)
- 21 new tests, all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): _is_entity_valid: future valid_to is still valid (#3790)

An entity with valid_to set to a future timestamp is still currently
valid. Only treat valid_to as expired when the timestamp is in the past.
Add datetime/timezone import; update test assertion to match.
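
A minimal sketch of the validity rule described here, assuming `valid_to` is stored as an ISO-8601 string with a UTC offset. The function name mirrors `_is_entity_valid`, but the body is an illustration rather than the project's code.

```python
from datetime import datetime, timezone
from typing import Optional


def is_entity_valid(metadata: dict, now: Optional[datetime] = None) -> bool:
    """True unless valid_to is set and already in the past."""
    now = now or datetime.now(timezone.utc)
    valid_to = metadata.get("valid_to")
    if valid_to is None:
        # Legacy entities without valid_to count as currently valid.
        return True
    # A future valid_to means the entity has not expired yet.
    return datetime.fromisoformat(valid_to) > now
```

Passing an explicit `now` makes the check deterministic in tests and doubles as a simple point-in-time query.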

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): essential story layer — always-loaded compact memory summary (#3793)

* feat(memory): essential story layer — always-loaded compact memory summary (#3787)

- Add EssentialStoryGenerator in memory/essential_story.py: fetches top
  facts by quality_score, respects per-model token budget, caches result
  in Redis knowledge DB with TTL_5_MINUTES, never raises from generate()
- Add essential_story_tokens to every model entry in context_windows.yaml
  (300 for ≤8192, 600 for ≤32768, 800 for >32768 context windows)
- Inject story into _prepare_llm_request_params in llm_handler.py after
  _get_system_prompt() so every LLM call carries persistent top-memories
- Add 18 unit tests covering generation, token budget enforcement,
  cache hit/miss, empty KB, and never-raise guarantee

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): cap essential story KB fetch at top 50 facts (#3787)

Sort all facts by quality_score descending, then slice to 50 before the
token-budget loop. Avoids processing the entire KB on every cache miss.
Added test_caps_at_50_facts_before_token_loop to verify the hard cap.
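
The sort, cap, and budget steps can be sketched as below. The whitespace token counter, the `build_story` name, the fact dictionary shape, and the choice to stop at the first fact that does not fit are all assumptions made for illustration.

```python
MAX_FACTS = 50  # hard cap applied before the token-budget loop


def build_story(facts, token_budget, count_tokens=None):
    """Pick the highest-quality facts that fit within token_budget."""
    # Assumption: a whitespace split approximates the real tokenizer.
    count_tokens = count_tokens or (lambda text: len(text.split()))
    ranked = sorted(facts, key=lambda f: f["quality_score"], reverse=True)
    ranked = ranked[:MAX_FACTS]  # never walk more than the top 50
    lines, used = [], 0
    for fact in ranked:
        cost = count_tokens(fact["content"])
        if used + cost > token_budget:
            break  # assumption: stop at the first fact that does not fit
        lines.append(fact["content"])
        used += cost
    return "\n".join(lines)
```

Slicing before the loop bounds the work per cache miss regardless of how large the knowledge base grows.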

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(knowledge): semantic duplicate guard on individual fact writes (#3788) (#3805)

Add FactsMixin._find_duplicate() that queries ChromaDB for near-duplicate
content before every store_fact() insert. Threshold (default 0.92) is
configurable via AUTOBOT_KB_DEDUP_THRESHOLD / config.cache.l2.kb_dedup_threshold.
Nine unit tests cover above-threshold skip, below-threshold write, exact
hash dedup, graceful ChromaDB error handling, and empty-KB no-false-positive.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): essential story layer — always-loaded compact memory summary (#3787) (#3804)

* feat(memory): essential story layer — always-loaded compact memory summary (#3787)

- Add EssentialStoryGenerator in memory/essential_story.py: fetches top
  facts by quality_score, respects per-model token budget, caches result
  in Redis knowledge DB with TTL_5_MINUTES, never raises from generate()
- Add essential_story_tokens to every model entry in context_windows.yaml
  (300 for ≤8192, 600 for ≤32768, 800 for >32768 context windows)
- Inject story into _prepare_llm_request_params in llm_handler.py after
  _get_system_prompt() so every LLM call carries persistent top-memories
- Add 18 unit tests covering generation, token budget enforcement,
  cache hit/miss, empty KB, and never-raise guarantee

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): cap essential story KB fetch at top 50 facts (#3787)

Sort all facts by quality_score descending, then slice to 50 before the
token-budget loop. Avoids processing the entire KB on every cache miss.
Added test_caps_at_50_facts_before_token_loop to verify the hard cap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): temporal fact validity — valid_from/valid_to on memory graph (#3790) (#3807)

* feat(memory): temporal fact validity — valid_from/valid_to on memory graph (#3790)

- Add valid_from/valid_to to entity metadata in _prepare_entity_metadata()
- Add valid_from/valid_to to both relation builder helpers
- Add EntityOperationsMixin.invalidate_entity() — marks entity expired without deleting
- Add RelationOperationsMixin.invalidate_relation() — marks a specific edge expired
- Add include_expired param to search_entities() and _fallback_search() (default False)
- Add QueryOperationsMixin.get_entities_as_of() — point-in-time entity query
- Legacy entities without valid_to are treated as currently valid (no migration needed)
- 21 new tests, all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): _is_entity_valid: future valid_to is still valid (#3790)

An entity with valid_to set to a future timestamp is still currently
valid. Only treat valid_to as expired when the timestamp is in the past.
Add datetime/timezone import; update test assertion to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* perf(backend): convert delete_session os.remove() to async (#3812)

* perf(backend): convert delete_session os.remove() to async (#3796)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: also await aiofiles.os.path.exists() in delete_session (review fix)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): wire WorkingMemory/EssentialStory/AgentDiary into UnifiedMemoryManager (#3822)

* feat(memory): wire WorkingMemoryService, EssentialStoryGenerator, AgentDiaryService into UnifiedMemoryManager (#3809)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(memory): make StandardizedAgent.memory_manager lazy-init

Avoid allocating UnifiedMemoryManager (SQLite + LRU cache) at every
agent construction — create it only on first property access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(llm): consolidate llm_providers/ with llm_interface_pkg adapters (#3185)

- Add OTel tracing (Issue #697) and circuit breaker to
  llm_providers/openai_provider.py — previously dropped vs the older
  llm_interface_pkg/providers/openai_provider.py implementation
- Document that llm_providers/ollama_provider.py delegates
  chat_completion to llm_interface_pkg/providers/ollama.py which
  carries OTel tracing and circuit breaker; add docstring clarification
- Rewrite llm_interface_pkg/adapters/anthropic_adapter.py to delegate
  execute() to llm_providers.AnthropicProvider — eliminates ~90 lines of
  duplicated message-split/API-call logic; preserves test_environment()
- Rewrite llm_interface_pkg/adapters/openai_adapter.py to delegate to
  llm_providers.OpenAIProvider; fixes broken provider.api_key attribute
  reference in test_environment()
- Rewrite llm_interface_pkg/adapters/ollama_adapter.py to delegate to
  llm_providers.OllamaProvider instead of llm_interface_pkg/providers/ollama

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): delete legacy chat_history_manager.py monolith (#3838)

Zero production importers — all call sites use `from chat_history import ChatHistoryManager`.
Superseded by chat_history/ mixin package since arch refactor #926.

* chore(backend): use config.path.docs_path in mcp_manual_integration (#3802) (#3817)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(autoresearch): deduplicate _NOTIFY_KEY format string (#3795) (#3818)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(ansible): move _SECRET_TO_ANSIBLE_VAR to shared location (#3778) (#3819)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): add TimeoutConfig to SSOT config (#3820)

* chore(backend): add TimeoutConfig to SSOT config (#3803)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): remove inline issue comments from TimeoutConfig callers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): wire additional TimeoutConfig fields — sandbox, llm_request, connection_pool (#3803)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(backend): consolidate terminal API implementations, remove compat aliases (#3383) (#3833)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* arch(knowledge): replace Redis adjacency list with queryable property graph (#3230) (#3844)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* arch(orchestration): unified graph model for DAG executor and LangGraph (#3228) (#3836)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(hardening): tasks 3-6 + perf(memory): fix O(n) get_all_facts scan (#3009, #3808) (#3846)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: post-merge polish — hoist diary fetch limit, document property_graph serialisation, explain FeatureConfig alias suffixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): add alias to AutoBotConfig.path to prevent PATH env var collision (#3851)

* fix(orchestration): guard _save_checkpoint so failures never fail a successful step (#3825) (#3852)

chore(mcp): extract _cached_fetch helper, remove duplicated cache-check pattern (#3826)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory-graph): REST endpoints for invalidate_entity and invalidate_relation (#3810) (#3854)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* bug(memory): validate compression_threshold <= context_window_tokens at load (#3811) (#3855)

bug(a2a): evict terminal tasks after TTL to prevent TaskManager OOM (#3823)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(hardening): Tool SDK Registry + ToolRegistry SDK dispatch + PermissionEnforcementExtension (#3009, tasks 7-9) (#3853)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(hardening): WebSocket auth, shared WorkflowMemory, register PermissionExtension in lifespan (#3009, tasks 10-12) (#3857)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(llm): migrate chat completion calls to vLLM optimised API (#3389) (#3835)

- Fix bug in chat_completion_optimized(): was passing LLMRequest object to
  chat_completion(messages: list) — now calls _execute_with_fallback() +
  _finalize_response() directly so the pre-built provider="vllm" request
  is honoured
- Add automatic _calculate_cache_hit_rate() call in chat_completion() when
  the resolved provider is vllm, surfacing prefix-cache hit rates in
  response.metadata["cache_hit_rate"] for all vLLM-routed calls
- Expand AGENT_TIER_MAP with 7 task-agent types (summarization, translation,
  sentiment-analysis, code-generation, audio-processing, image-analysis,
  data-analysis) so get_base_prompt_for_agent() resolves them to Tier 3 instead
  of emitting a warning and falling back silently
- Migrate process_query() in all 7 task agents to chat_completion_optimized(),
  passing session_id / user_name / user_role from the request context; callers
  that carry conversation history (chat_agent) or custom multi-turn prompts
  (knowledge/RAG agents) are intentionally left on chat_completion() to
  preserve context fidelity

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* bug(backend): fix ChatHistoryManager create_task race with explicit initialize() (#3797) (#3815)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(agent-loop): content-aware repetitive tool-call detection (#3255) (#3832)

* feat(agent-loop): content-aware repetitive tool-call detection (#3255)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agent-loop): correct off-by-one in repetition threshold check

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(config): config management enhancements — startup validation and sync API (#3398) (#3834)

* feat(config): config management enhancements — startup validation and sync API (#3398)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): replace blocking fdopen/json.dump with aiofiles in sync_config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ansible): add notify+flush_handlers to backend role — restart workers on autobot_shared reinstall (#3856)

* fix(chat): use ModelConfig.CHAT_NUM_CTX — was referencing wrong class ModelConstants

* refactor(chat): tombstone legacy conversation.py, remove dead ConversationManager from orchestrator (#3831) (#3865)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(voice): wire VoiceInterface into app.state, guard endpoints with 503 (#3848) (#3867)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(http): consolidate HTTP clients — TracedHttpClient wraps HTTPClientManager, extract sign_request (#3827) (#3870)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(autoresearch): wire real scorers + benchmark, add agent registration path (#3208) (#3876)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(knowledge): consolidate vector search under VectorSearchEngine (#3828) (#3879)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(a2a): guard AUTOBOT_A2A_CARD_TTL int() cast against invalid env var (#3824) (#3869)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): apply validate_path() to fix path-injection CodeQL alerts (#3164) (#3875)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(config): refactor sync_config to ≤65 lines per function (#3863) (#3888)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): delete confirmed dead root-level modules (#3847) (#3887)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(types): rename 3 competing TaskComplexity enums to distinct names (#3878)

* chore(types): rename 3 competing TaskComplexity enums to distinct names (#3845)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(types): update test imports after TaskComplexity rename (#3845)

Replace TaskComplexity with ModelCapabilityTier in
utils/model_optimizer_refactoring_test.py to match the rename in
utils/model_optimization/types.py — fixes ImportError at pytest collection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(backend): migrate singletons to AsyncInitializable lazy-init (#3885)

* refactor(backend): migrate singletons to AsyncInitializable lazy-init (#3390)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend): await async register_all_caches and get_cache_coordinator (#3390)

Three call sites in api/system.py and lifespan.py were missing await
after the functions were made async in the AsyncInitializable migration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): add lock to claude adapter singleton + reset _initialized on shutdown (#3390)

- Add _claude_adapter_lock to guard get_autobot_claude_adapter singleton
  creation against concurrent callers
- Reset self._initialized = False in shutdown() so ensure_initialized()
  re-runs after shutdown instead of fast-pathing into a None manager
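
Both fixes can be sketched together. This is a hedged illustration that assumes a synchronous accessor guarded by `threading.Lock`; the real lock type is not stated in the commit, and `AdapterSketch`, `get_adapter`, and `_manager` are hypothetical names.

```python
import threading


class AdapterSketch:
    instances_created = 0

    def __init__(self):
        AdapterSketch.instances_created += 1
        self._initialized = False
        self._manager = None

    def ensure_initialized(self):
        if self._initialized:
            return  # fast path
        self._manager = object()  # stand-in for the real manager
        self._initialized = True

    def shutdown(self):
        self._manager = None
        # Without this reset, ensure_initialized() would take the fast path
        # and leave _manager as None after a shutdown.
        self._initialized = False


_adapter = None
_adapter_lock = threading.Lock()


def get_adapter():
    """Double-checked locking so concurrent callers share one instance."""
    global _adapter
    if _adapter is None:
        with _adapter_lock:
            if _adapter is None:
                _adapter = AdapterSketch()
    return _adapter
```

The second `None` check inside the lock is what prevents two callers that both passed the first check from each constructing an adapter.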

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(agents): stream chain-of-thought events to frontend in real time (#3889)

* feat(agents): stream chain-of-thought events to frontend in real time (#3232)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(backend,frontend): CoT event session_id forwarding and activeSteps guard (#3232)

- mcp_dispatch: add session_id param to dispatch() and forward to emit calls
- cot_events: add done_callback to fire-and-forget create_task for exception logging
- useReasoningTrace: use Math.max(0, ...) to prevent activeSteps from going negative

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agents): coroutine leak, missing step_complete on deny, frozen session ID (#3232)

- cot_events.py: move publish() call inline into loop.create_task() so
  the coroutine is only created after confirming a loop exists — fixes
  RuntimeWarning: coroutine was never awaited
- graph.py: add emit_step_complete() before early return on denied-
  approval path in execute_tools — fixes frontend activeSteps stuck at 1
- useReasoningTrace.ts: accept MaybeRef<string|null|undefined> instead
  of plain string so session ID is read reactively at handler call time
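
A rough sketch of the cot_events.py pattern described in the first bullet, under stated assumptions: `publish`, `emit` and `_log_task_exception` are hypothetical names, and the real module may be structured differently. The point is that `publish(event)` is only called once a running loop has been confirmed, so no un-awaited coroutine object is ever created.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def publish(event: dict) -> None:
    """Hypothetical async publisher."""
    await asyncio.sleep(0)


def _log_task_exception(task: asyncio.Task) -> None:
    # Fire-and-forget tasks swallow exceptions unless someone inspects them.
    if not task.cancelled() and task.exception() is not None:
        logger.error("CoT publish failed", exc_info=task.exception())


def emit(event: dict) -> bool:
    """Schedule publish(event) without awaiting it.

    Returns False when no event loop is running.
    """
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        # publish() was never called, so there is no coroutine to leak.
        return False
    # The coroutine is created inline, only after the loop check succeeded.
    task = loop.create_task(publish(event))
    task.add_done_callback(_log_task_exception)
    return True
```

Creating the coroutine first and checking for a loop afterwards is the ordering that typically produces the "coroutine was never awaited" warning.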

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(knowledge): resolve async Redis client coroutine ping error (#3872) (#3890)

get_redis_client(async_client=True) returns a coroutine because it delegates
to the async RedisConnectionManager.get_async_client() method. Call sites in
11 files assigned the result to a variable without await, so they stored a
coroutine object instead of a live Redis client, causing AttributeError on
any subsequent Redis operation (e.g. .ping(), .get(), .set()).

Fixed by adding await at each call site:
- knowledge/base.py (primary issue — .ping() failure on init)
- autobot_memory_graph/property_graph.py (initialize())
- initialization/lifespan.py (_wire_npu_task_queue())
- knowledge/search_components/reranking.py (_fetch_staleness_scores())
- services/autoresearch/osint_engine.py (_get_redis())
- services/autoresearch/prompt_optimizer.py (_get_redis())
- services/autoresearch/routes.py (4 route handlers)
- services/autoresearch/scorers.py (_get_redis())
- services/documentation_watcher.py (_on_file_changed())
- services/workflow_sharing_service.py (4 methods)
- services/workflow_versioning.py (4 methods)

* fix(ansible): deploy permission_rules.yaml to infrastructure config path (#3873) (#3891)

- Add permission_rules.yaml to backend role files/
- Ensure /opt/autobot/config exists before deploying the file
- Deploy permission_rules.yaml to /opt/autobot/config/ with notify: restart backend
- Remove the clean task that wiped /opt/autobot/config; that directory is
  required by permission_matcher.py at runtime (three levels up from services/)
- Update clean.yml comment to reflect that /opt/autobot/config is now live

* docs(specs): add language switcher design spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(plans): add language switcher implementation plan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agent_loop): robust tool-call hash + repetition-halt guard (#3868 #3874 #3877)

Three related fixes to _compute_tool_call_hash and the halt mechanism:

- Issue #3868: wrap json.dumps() in try/except TypeError; use default=str as
  primary guard so non-serializable objects (custom classes, dataclasses) never
  raise — repr() fallback is only hit if default=str itself somehow fails.
- Issue #3874: non-dict args are now wrapped as {"__type__": ..., "__repr__": ...}
  instead of coerced to {}; distinct values (None, "", 42, "foo") now hash
  differently rather than all collapsing to the same empty-dict bucket.
- Issue #3877: add _halted_on_repetition flag (reset in _init_task_context);
  set it in _execute_tools when repetition fires; _should_continue() returns
  False immediately when flag is set, giving a belt-and-suspenders exit that
  works even if _should_iterate() misclassifies the halt error result.
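
A sketch of the hashing behavior described for #3868 and #3874, not the actual _compute_tool_call_hash. The function name, its signature and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import json
from typing import Any


def compute_tool_call_hash(tool_name: str, args: Any) -> str:
    """Illustrative stand-in; the real signature may differ."""
    if not isinstance(args, dict):
        # Wrap instead of coercing to {}, so None, "", 42 and "foo"
        # no longer collapse into the same bucket.
        args = {"__type__": type(args).__name__, "__repr__": repr(args)}
    try:
        # default=str stringifies values json cannot encode natively.
        payload = json.dumps(args, sort_keys=True, default=str)
    except TypeError:
        # Last resort, e.g. dict keys that json refuses to encode.
        payload = repr(args)
    return hashlib.sha256(f"{tool_name}:{payload}".encode("utf-8")).hexdigest()
```

With `sort_keys=True`, two dicts holding the same items in a different order hash identically, which is what a repetition detector needs.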

Adds test_loop_repetition.py with 14 focused tests covering all three issues.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(autoresearch): rename test_issue_3208.py to benchmark_test.py (#3883) (#3893)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(config): use explicit None checks in startup validation (#3880) (#3897)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(backend): delete dead type_definitions/ package (#3839) (#3892)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(agent_loop): don't add halted tool to executed list; surface error to LLM (#3859 #3862)

When the repetition-halt guard fires in _execute_tools(), the selected tool
never actually runs. Even so, _execute_iteration_phases unconditionally added
it to result.tools_executed and then fell through to _should_iterate(),
which might not classify the halt error correctly.

Fix:
- Check _halted_on_repetition immediately after _execute_tools() returns.
- On halt: set result.tools_executed = [] (tool never ran, #3859) and
  result.tool_results = tool_results (error is visible to the LLM, #3862)
  then return early with should_continue=False.
- Normal (non-halt) execution path is unchanged.
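
The early-return logic above could look roughly like this. `IterationResult` and `finish_iteration` are hypothetical simplifications; the real _execute_iteration_phases carries more state and reads the flag from the loop instance.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class IterationResult:
    """Hypothetical result container; the real type has more fields."""

    tools_executed: List[str] = field(default_factory=list)
    tool_results: List[Dict[str, Any]] = field(default_factory=list)
    should_continue: bool = True


def finish_iteration(
    selected_tool: str,
    tool_results: List[Dict[str, Any]],
    halted_on_repetition: bool,
) -> IterationResult:
    result = IterationResult()
    # The results are always recorded so the LLM can see the halt error.
    result.tool_results = tool_results
    if halted_on_repetition:
        # The tool never ran, so it must not appear as executed.
        result.tools_executed = []
        result.should_continue = False
        return result
    # Normal path: record the tool and let the loop decide what happens next.
    result.tools_executed = [selected_tool]
    return result
```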

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.…