feat: resolve issue #7 — full Python parity alignment (v0.4.0) by webup · Pull Request #10 · ob-labs/powermem-ts

webup · 2026-04-03T13:04:41Z

Summary

Comprehensive Python/TypeScript alignment covering all 13 sections of issue #7.

Config-driven Memory.create() with autoConfig/parseMemoryConfig
20+ LLM/Embedding providers via registry-based dynamic LangChain loading
Rerank factory with OpenAI-compat provider (Jina, Cohere, vLLM)
Dashboard server: auth, rate-limiting, Prometheus metrics, CORS, agent/user routes, OpenAPI/Swagger
CLI: --run-id, --metadata, --memory-type on all memory commands
Memory API: getStatistics, getUsers, optimize, exportMemories, importMemories, migrateToSubStore
Hybrid search: SQLite FTS5 + cosine vector scoring
BM25 sparse embedder with tokenizer and dot product
Multimodal add(): accepts string | MessageInput[] with vision/audio detection
PgVectorStore backend (requires npm install pg)
SubStorageRouter with migration state machine
Observability: TelemetryCollector + AuditLogger
IntelligenceManager wired into NativeProvider
Graph store wiring with relations in SearchResult
AsyncMemory alias, version bump to 0.4.0

Test plan

436 total tests, 0 failures (76 new unit tests added)
Type-check passes
Build succeeds (dist/index.d.ts 49KB)
Real-world verification: 69 live assertions against Ollama+SQLite
CLI end-to-end: add/search/list/delete with new options
Dashboard: auth/rate-limit/metrics/CRUD/agents/users/OpenAPI all verified

Resolves #7

Source files reorganized into module-based directories mirroring oceanbase/powermem/src/powermem/:
- src/core/: Memory facade, NativeProvider, HttpProvider, Inferrer
- src/storage/: VectorStore base, SQLiteStore, SeekDBStore
- src/integrations/: Embedder, provider factory
- src/intelligence/: Ebbinghaus decay
- src/prompts/: LLM prompt templates
- src/utils/: Cosine search, Snowflake IDs, env, platform

Test files reorganized into 4-layer structure matching Python:
- tests/unit/: Per-module unit tests
- tests/integration/: Full-stack with real SQLite, mock LLM
- tests/regression/: Scenario-based (multi-agent, edge cases, language)
- tests/e2e/: Real Ollama models

Deleted: src/server/ (legacy Python bridge)

No behavior changes. All 187 tests pass. Build unchanged.

Port Python powermem config system to TypeScript:
- configs.ts: Zod schemas for MemoryConfig, IntelligentMemoryConfig, TelemetryConfig, AuditConfig, AgentMemoryConfig, QueryRewriteConfig, provider configs (vectorStore, llm, embedder, reranker)
- config-loader.ts: loadConfigFromEnv(), autoConfig(), createConfig(), env var reading for all providers
- settings.ts: getDefaultEnvFile() .env resolution
- version.ts: VERSION constant

18 new tests in tests/unit/config-loader.test.ts covering:
- Config parsing with defaults
- Sub-config default application
- Explicit overrides
- Custom prompts
- validateConfig()
- loadConfigFromEnv() for all providers
- Intelligent memory env settings
- createConfig() with overrides

Total: 205 tests (17 files)

Port Python powermem/storage/ module:
- factory.ts: VectorStoreFactory with provider registry pattern, built-in sqlite and seekdb providers, dynamic import
- adapter.ts: StorageAdapter bridges VectorStore with Memory core, adds getStatistics(), getUniqueUsers(), higher-level CRUD
- config/{base,sqlite,seekdb}.ts: typed storage configs
- index.ts: barrel re-exports

17 new tests:
- factory.test.ts: provider listing, create sqlite, unsupported throws, custom provider registration
- adapter.test.ts: full CRUD through adapter, search, pagination, count with filters, statistics, unique users, reset

Total: 222 tests (19 files)

Port Python powermem/integrations/ module structure:
- embeddings/{base,factory,config,index}.ts: EmbeddingProvider interface, createEmbeddings() factory (OpenAI/Qwen/SiliconFlow/DeepSeek/Ollama)
- llm/{base,factory,config,index}.ts: LLMProvider interface, createLLM() factory (same providers + Anthropic)
- rerank/{base,config,index}.ts: RerankProvider interface
- index.ts: barrel with all re-exports

Split old provider-factory.ts into embeddings/factory + llm/factory. NativeProvider updated to import from new locations. Old factory.ts kept for backward compat.

222 tests pass (no new tests needed; existing provider-factory tests still exercise the factory logic through the old import path).

Port Python powermem/intelligence/ module:
- memory-optimizer.ts: exact dedup (MD5 hash grouping, keep oldest), semantic dedup (cosine similarity threshold), LLM compression (greedy clustering + summarization)
- importance-evaluator.ts: rule-based importance scoring (keywords, length, emotion, punctuation, metadata priority)
- manager.ts: IntelligenceManager orchestrator (processMetadata adds importance, processSearchResults applies Ebbinghaus decay)
- plugin.ts: IntelligencePlugin interface
- index.ts: barrel

17 new tests:
- memory-optimizer: exact dedup (3), semantic dedup (2), similarity (1)
- importance-evaluator: low/high/emotional/metadata/capped/empty (7)
- manager: disabled passthrough, enabled importance, decay (4)

Total: 239 tests (22 files)
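The exact-dedup and similarity steps named above can be sketched roughly as follows. This is a minimal illustration, not the code in memory-optimizer.ts: the `MemoryLike` shape and the function names `exactDedupSketch` and `cosineSimilaritySketch` are assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real record type in powermem-ts may differ.
interface MemoryLike {
  id: string;
  content: string;
  createdAt: number; // epoch milliseconds
}

// Group records by the MD5 hash of their content and keep the oldest per group.
function exactDedupSketch(records: MemoryLike[]): { kept: MemoryLike[]; removedIds: string[] } {
  const oldestByHash = new Map<string, MemoryLike>();
  const removedIds: string[] = [];
  for (const rec of records) {
    const hash = createHash("md5").update(rec.content).digest("hex");
    const current = oldestByHash.get(hash);
    if (current === undefined) {
      oldestByHash.set(hash, rec);
    } else if (rec.createdAt < current.createdAt) {
      // The new record is older: it replaces the one we had kept so far.
      removedIds.push(current.id);
      oldestByHash.set(hash, rec);
    } else {
      removedIds.push(rec.id);
    }
  }
  return { kept: Array.from(oldestByHash.values()), removedIds };
}

// Cosine similarity; returns 0 for empty, mismatched, or zero-norm vectors.
function cosineSimilaritySketch(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}
```

Semantic dedup would then compare embeddings pairwise with `cosineSimilaritySketch` and treat pairs above a configured threshold as duplicates; the threshold value itself is not stated in this PR text.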

Prompts module (port of Python powermem/prompts/):
- importance-evaluation.ts: IMPORTANCE_SYSTEM_PROMPT, evaluation prompt builder
- optimization.ts: MEMORY_COMPRESSION_PROMPT
- query-rewrite.ts: query expansion prompt (stub)
- user-profile.ts: profile extraction prompt (stub)
- templates.ts: formatTemplate utility
- index.ts: barrel

Utils expansion (port of Python powermem/utils/):
- filter-parser.ts: parseAdvancedFilters (time range, tags, type→category, importance→$gte)
- stats.ts: calculateStatsFromMemories (byType, avgImportance, topAccessed, growthTrend, ageDistribution)
- io.ts: exportToJson, importFromJson, exportToCsv

17 new tests:
- filter-parser: empty, time range, tags $in, type→category, importance $gte, combined, unknown fields (8)
- stats: empty, total, byType, default category, avg importance, access ranking, growth trend, age distribution, truncation (9)

Total: 256 tests (24 files)
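To illustrate the filter mappings listed for filter-parser.ts (time range, tags as `$in`, type→category, importance→`$gte`), here is a hedged sketch. The input field names (`startTime`, `endTime`), the output key `created_at`, and the pass-through of unknown fields are assumptions for the example; the real parseAdvancedFilters may use different names and rules.

```typescript
// Hypothetical input shape; only the mappings named in the commit message are modeled.
interface AdvancedFiltersSketch {
  startTime?: string;
  endTime?: string;
  tags?: string[];
  type?: string;
  importance?: number;
  [key: string]: unknown;
}

type StoreFilterSketch = Record<string, unknown>;

function parseAdvancedFiltersSketch(input: AdvancedFiltersSketch): StoreFilterSketch {
  const { startTime, endTime, tags, type, importance, ...rest } = input;
  // Assumption: fields the parser does not recognize are passed through unchanged.
  const out: StoreFilterSketch = { ...rest };

  // Time range becomes a $gte/$lte pair on an assumed created_at column.
  if (startTime !== undefined || endTime !== undefined) {
    const range: Record<string, string> = {};
    if (startTime !== undefined) range.$gte = startTime;
    if (endTime !== undefined) range.$lte = endTime;
    out.created_at = range;
  }
  // Tags match if the record carries any of the requested tags.
  if (tags !== undefined && tags.length > 0) out.tags = { $in: tags };
  // "type" is stored under the "category" key.
  if (type !== undefined) out.category = type;
  // A bare importance number is treated as a minimum.
  if (importance !== undefined) out.importance = { $gte: importance };
  return out;
}
```

For example, `parseAdvancedFiltersSketch({ type: "note", importance: 0.5 })` yields a filter with `category` set to `"note"` and `importance` set to `{ $gte: 0.5 }`.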

CLI (port of Python powermem/cli/):
- src/cli/main.ts: Commander.js entry point with global --env-file, --json, --verbose
- src/cli/commands/config.ts: pmem config show|validate|test (section filter, JSON output)
- src/cli/commands/memory.ts: pmem memory add|search|list|get|delete|delete-all (all with --user-id, --agent-id, --json support)
- package.json: "bin": {"pmem": "./dist/cli.js"}
- tsup.config.ts: dual entry (library + CLI with shebang banner)

Fixes:
- settings.ts: use import.meta.url instead of __dirname for ESM compat
- Bump version to 0.3.0

8 new CLI smoke tests (regression/cli.test.ts):
- --version, --help, config --help, memory --help
- config validate, config show --json, config show --section, config test

Phase A summary (6 commits):
- A.1: Config system (configs, config-loader, settings, version)
- A.2: Storage module (factory, adapter, configs)
- A.3: Integrations module (embeddings/llm/rerank base+factory)
- A.4: Intelligence module (optimizer, evaluator, manager, plugin)
- A.5: Prompts + Utils expansion (filter-parser, stats, io)
- A.6: Minimal CLI (config + memory commands)

Total: 264 tests (25 files), all passing.
Source: 50 files matching Python powermem directory layout.

Phase B complete (port of Python powermem/cli/):

Commands:
- pmem stats: memory statistics (by-type, age distribution, top accessed)
- pmem manage backup: export memories to JSON file
- pmem manage restore: import memories from JSON backup
- pmem manage cleanup: dedup (exact/semantic) with optimizer
- pmem shell: interactive REPL with tab completion, session defaults, add/search/get/list/delete/stats/set/show commands

CLI utilities:
- utils/output.ts: formatJson, truncate, formatMemoryTable, formatSearchTable, formatStats, print{Success,Error,Warning,Info}
- utils/envfile.ts: parseEnvLines, formatEnvValue, updateEnvFile, readEnvFile with backup support

17 new tests:
- cli.test.ts: +6 (stats/manage/shell help, backup/restore/cleanup options)
- cli-utils.test.ts: 11 (truncate, table formatting, stats format, env parsing, env value quoting, env file create/update/read)

Total: 281 tests (26 files)

Phase C.1: Agent module (port of Python powermem/agent/):
- agent.ts: AgentMemory unified interface (add/search/getAll/update/delete with scope and permission management)
- types.ts: 7 enums (MemoryType, MemoryScope, AccessPermission, PrivacyLevel, CollaborationType, CollaborationStatus, CollaborationLevel)
- abstract/: 6 strategy interfaces (scope, permission, collaboration, privacy, context, manager)
- components/: ScopeController (scope determination, memory scope management), PermissionController (grant/revoke/check with access logging)
- factories/: AgentFactory (creates scope + permission managers)

Phase C.2: User memory module (port of Python powermem/user_memory/):
- user-memory.ts: UserMemory (profile-aware add, search with query rewrite, profile management, deleteAll with profile cleanup)
- storage/user-profile.ts: UserProfile types + UserProfileStore interface
- storage/user-profile-sqlite.ts: SQLite-backed profile storage (CRUD, upsert, topic filtering, pagination)
- query-rewrite/rewriter.ts: QueryRewriter (LLM-based query expansion with user profile context)

Phase C.3: Graph store + prompts:
- storage/base.ts: GraphStoreBase interface (add, search, deleteAll, getAll, reset, statistics, uniqueUsers)
- prompts/graph/: graph extraction + update + deletion prompts

Exports: Updated src/index.ts with all new modules (agent, user-memory, intelligence, config, storage factory/adapter, integrations, utils)

39 new tests (5 test files):
- agent-memory.test.ts: init, add, search, getAll, delete, deleteAll, statistics, permissions, reset (9)
- scope-controller.test.ts: default scope, hint, config, update, stats (5)
- permission-controller.test.ts: default allow/deny, grant, revoke, getPermissions, history, custom defaults (7)
- user-profile-sqlite.test.ts: create, update, topics, nonexistent, list, pagination, mainTopic filter, delete, count (10)
- user-memory.test.ts: add, extractProfile, search, addProfile, profile null, deleteProfile, deleteAll (8)

Total: 320 tests (31 files), all passing.
Source: 63 files matching Python powermem directory layout.

Dashboard:
- src/dashboard/server.ts: Express server serving REST API + HTML dashboard (health, status, stats, memories CRUD, search endpoints)
- src/dashboard/public/index.html: Single-page dashboard with 3 pages (Overview with stat cards/charts, Memories list with pagination, Settings with system config), dark/light theme toggle

BDD test specification (tests/bdd/README.md):
- 30+ CLI scenarios across 6 features (version/help, config management, memory CRUD, statistics, backup/restore, interactive shell)
- 15+ dashboard UI scenarios across 5 features (overview page, navigation/theme, memories page, settings, error handling)

BDD test implementation:
- tests/bdd/cli.test.ts (19 tests): real CLI subprocess execution verifying version, help, config show/validate/test, stats/manage/memory help, delete-all confirmation, restore error handling
- tests/bdd/dashboard.test.ts (16 tests): headless browser via dev-browser verifying stat cards, system health panel, growth/age charts, hot memories table, theme toggle, page navigation, memories table with pagination, REST API (health/status/stats/list/create/search)

All 35 BDD tests pass. Dashboard verified with screenshots in light and dark themes.

15 new data correctness tests (tests/bdd/data-correctness.test.ts):

API write → API read round-trip:
- content, userId, metadata survive round-trip
- search returns correct memory with valid score (0-1)
- delete removes memory, no longer retrievable
- stats reflect accurate counts after writes

API write → Dashboard displays correctly:
- memory added via API appears in dashboard memories table
- stats cards show non-zero total after API writes
- growth trend chart shows today's date

User isolation:
- user A memories not visible in user B list
- search for user A returns only A's results
- stats for user A reflect only A's count

Data type fidelity:
- Chinese content survives round-trip
- emoji content survives round-trip
- special characters (newlines, tabs, quotes, HTML) survive
- 500-char content survives round-trip

Pagination:
- offset/limit returns correct pages with no ID overlap

Total BDD tests: 50 (19 CLI + 16 dashboard UI + 15 data correctness)

CI:
- New `test-seekdb` job on macOS (where native bindings are bundled)
- Runs `npm run test:seekdb` with 5-min timeout
- Separate from Linux test matrix (SeekDB requires platform-specific .so/.dylib)

New SeekDB E2E tests (tests/integration/seekdb-e2e.test.ts, 22 tests):
- Memory facade over SeekDB: add/get round-trip, search with scores, update re-embeds, delete, getAll pagination (no ID overlap), count, deleteAll, addBatch, reset
- User isolation: A/B data isolation in list/search, scoped deleteAll
- Data fidelity: Chinese, emoji, metadata, scope/category round-trip
- Stats: correct total, age distribution, growth trend with today
- Intelligent add: LLM fact extraction + storage over SeekDB
- VectorStoreFactory: creates SeekDBStore via factory
- NativeProvider: accepts injected SeekDBStore

Total SeekDB tests: 70 (40 unit + 8 integration + 22 e2e). All auto-skip when native bindings are unavailable. test:seekdb script updated to include the new test file.

macOS ARM64 (macos-14):
- Set DYLD_LIBRARY_PATH at job level so it propagates to vitest
- Add verification step that actually creates embedded DB + collection
- Confirm binding loads before running tests

Linux x64 (ubuntu-latest):
- Install libaio1 system dependency
- Download SeekDB bindings via on-demand downloader
- Extract libseekdb.so from zip (download.js misses it)
- Set LD_LIBRARY_PATH for native lib discovery
- continue-on-error since S3 download may be restricted

- SeekDBStore.create(): pass embeddingFunction:null in VectorIndexConfig to disable auto-vectorization (we pass pre-computed embeddings)
- CI macOS: fix verification step to use Schema with null embeddingFunction
- CI Linux: try libaio1t64 (Ubuntu 24.04) then fallback to libaio1
- Add @seekdb/default-embed as devDependency

…ata tests
- All seekdb test guards: don't call store.close() in availability check (SeekDB embedded C engine may SIGABRT on cleanup)
- metadata round-trip test: use flat metadata (no nested objects), since SeekDB embedded has JSON limitations with complex nested values
- unicode metadata test: use ASCII values (SeekDB C parser issue)
- seekdb-e2e guard: same close() fix

Previous CI run showed 46/48 passed on macOS ARM64; these 2 fixes should bring it to 48/48 and unlock the 22 e2e tests.

…rser

SeekDB's embedded C engine rejects JSON strings containing escaped quotes in metadata values. Solution: store user metadata as base64-encoded string (metadata_b64) instead of raw JSON (metadata_json).
- toSeekDBMetadata(): metadata_json → metadata_b64 (Buffer.from().toString('base64'))
- toRecord() + search(): decode metadata_b64 with fallback to metadata_json
- Restores full metadata test coverage: nested objects, arrays, unicode, emoji
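The encode/decode scheme described in this commit can be sketched as below. The field names `metadata_b64` and `metadata_json` come from the commit message; the helper names `encodeMetadataSketch` and `decodeMetadataSketch` and the empty-object default are assumptions, not the actual toSeekDBMetadata()/toRecord() code.

```typescript
import { Buffer } from "node:buffer";

type MetadataSketch = Record<string, unknown>;

// Serialize to JSON, then base64, so the stored value has no quotes or backslashes.
function encodeMetadataSketch(metadata: MetadataSketch): { metadata_b64: string } {
  const json = JSON.stringify(metadata);
  return { metadata_b64: Buffer.from(json, "utf8").toString("base64") };
}

// Prefer metadata_b64; fall back to metadata_json for rows written before the change.
function decodeMetadataSketch(stored: { metadata_b64?: string; metadata_json?: string }): MetadataSketch {
  if (stored.metadata_b64) {
    const json = Buffer.from(stored.metadata_b64, "base64").toString("utf8");
    return JSON.parse(json) as MetadataSketch;
  }
  if (stored.metadata_json) {
    return JSON.parse(stored.metadata_json) as MetadataSketch;
  }
  // Assumption: a row with neither field decodes to empty metadata.
  return {};
}
```

Because standard base64 output only contains letters, digits, `+`, `/`, and `=`, the stored string avoids the escaped-quote case that the embedded parser reportedly rejects, at the cost of roughly a third more storage per metadata value.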

…tial execution
- seekdb-e2e.test.ts: rewrite guard to match exact pattern from passing seekdb.test.ts (same tryCreateStore function, same params)
- test:seekdb script: force single-fork sequential execution to prevent concurrent SeekDB embedded engine initialization across test files

…he dir

Root cause: require('@seekdb/js-bindings/download.js') fails because package.json exports only exposes '.', not './download.js'. Fix: resolve filesystem path via require.resolve() then replace filename.

Also:
- Dynamic cache dir discovery (find ~/.seekdb -name seekdb.node) instead of hardcoded commit hash
- Verify both seekdb.node and libseekdb.so exist after download
- Remove continue-on-error since the fix should work
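The workaround described here (resolve the package's exported entry, then swap the filename) might look like the following sketch. The helper names are hypothetical, and it assumes download.js sits in the same directory as the package's main entry, which this PR text implies but does not show.

```typescript
import { createRequire } from "node:module";
import { dirname, join } from "node:path";

// Given the resolved path of a package's main entry, return a sibling file's path.
function siblingOfEntry(entryPath: string, filename: string): string {
  return join(dirname(entryPath), filename);
}

// Resolve through the public "." export (which is allowed), then point at
// download.js on disk, bypassing the package.json "exports" restriction.
// fromFile must be an absolute path or file URL of the calling module.
function resolveDownloaderPath(pkg: string, fromFile: string): string {
  const req = createRequire(fromFile);
  return siblingOfEntry(req.resolve(pkg), "download.js");
}
```

A caller could then load the script by absolute path, for example `createRequire(fromFile)(resolveDownloaderPath("@seekdb/js-bindings", fromFile))`. Loading a file by absolute path does not go through the `exports` map, which is why this sidesteps the original failure.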

README.md (complete rewrite):
- Pure TypeScript (no Python dependency) positioning
- Quick start with env vars, explicit LangChain, SeekDB, server modes
- CLI usage examples (all commands)
- Full API reference (Memory facade + configuration options)
- Architecture overview (10 modules, 89 source files)
- Test summary (504 tests, 7 CI jobs)
- Dependencies and peer deps

docs/architecture.md (complete rewrite):
- Module structure matching Python powermem layout
- Key flows (create, intelligent add, search)
- Storage backends (SQLite + SeekDB with details)
- CLI and Dashboard descriptions
- Test architecture (6 layers, 7 CI jobs, 8 testing perspectives)
- Python parity mapping table

CHANGELOG.md (v0.3.0 release notes):
- Directory restructure, all new modules, SeekDB improvements
- Test counts (504 total), CI jobs (7, all green)

tests/bdd/README.md (added data correctness scenarios):
- API round-trip, dashboard display, user isolation, data fidelity, pagination

Comprehensive Python/TypeScript alignment covering all 13 sections of issue ob-labs#7:

**Entry API & Config (Section 1)**
- Memory.create() now uses autoConfig()/parseMemoryConfig() for config-driven creation
- Added AsyncMemory alias for Python parity
- Version bump to 0.4.0

**Storage (Sections 2, 9)**
- PgVectorStore backend (requires `npm install pg`)
- SubStorageRouter with dict-filter/function-based routing
- SubStoreMigrationManager with state machine (pending→migrating→completed/failed)
- Migration with re-embedding and progress tracking
- Hybrid search: SQLite FTS5 + cosine vector scoring via hybridSearch()

**Graph Store (Section 3)**
- GraphStore wiring in NativeProvider (add/search with relations)
- SearchResult.relations field for graph results

**Embeddings & Multimodal (Section 4)**
- Registry-based dynamic provider loading (20+ providers)
- BM25SparseEmbedder with tokenizer and dot product similarity
- Multimodal add(): content accepts string | MessageInput[]
- Vision/audio detection utilities (hasVisionContent, hasAudioContent, extractImageUrls)

**LLM Integrations (Section 5)**
- Registry-based factory: azure, gemini, bedrock, cohere, mistral, together, fireworks, groq, vllm, lmstudio + auto-discovery for any @langchain/* package

**Rerank (Section 6)**
- createReranker() factory with OpenAI-compat provider (Jina, Cohere, vLLM)
- Config-driven creation via createRerankerFnFromConfig()

**Observability (Section 7)**
- TelemetryCollector: event tracking + flush
- AuditLogger: JSON-lines file logging with level filtering
- IntelligenceManager wired into NativeProvider for importance scoring + decay

**Memory API (Section 8)**
- getStatistics(), getUsers(), optimize(), exportMemories(), importMemories()
- migrateToSubStore() on Memory class

**CLI (Section 10)**
- --run-id, --metadata (JSON), --memory-type on memory add/search/list/delete-all
- runId added to GetAllParams and FilterParams

**Dashboard Server (Section 11)**
- Auth middleware (X-API-Key header + api_key query param)
- Rate limiting middleware (in-memory sliding window)
- Prometheus metrics (request counters, duration histograms, operation counters)
- CORS middleware
- Agent routes (4 endpoints)
- User profile routes (7 endpoints)
- Memory routes: export, import, batch update/delete, GET search, users
- OpenAPI 3.0 spec at /openapi.json + Swagger UI at /docs
- HttpProvider: X-API-Key support + extended API methods

**Tests: +76 new unit tests (436 total, 0 failures)**
- sparse-embedder.test.ts, messages.test.ts, observability.test.ts
- rerank-factory.test.ts, sub-storage.test.ts, hybrid-search.test.ts
- dashboard-middleware.test.ts, memory-extended-api.test.ts
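As an illustration of the "in-memory sliding window" rate limiting mentioned for the dashboard server, here is a minimal, framework-free sketch. The class name, the per-key timestamp list, and the `limit`/`windowMs` parameters are assumptions for the example; the actual middleware may track windows differently.

```typescript
// Each key (for example a client IP or API key) keeps the timestamps of its recent requests.
class SlidingWindowLimiterSketch {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if the key is under its limit;
  // returns false (without recording) if the window is already full.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

An Express-style middleware would call `allow()` with a client identifier and respond with HTTP 429 when it returns false. State lives in process memory, so limits reset on restart and are not shared across server instances.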

…io parsing, logging, OpenAPI
- CJK tokenizer: tokenizeCJK() for Chinese/Japanese/Korean character splitting, CHINESE_STOPWORDS set, works as custom tokenizer for BM25SparseEmbedder
- Vision LLM parsing: parseVisionMessages() pipes images through vision-capable LLM, inlines [Image description: ...] in embedded text. Wired into NativeProvider.resolveTextContent() automatically when LLM is available
- Audio ASR: parseAudioMessages() transcribes via configurable WHISPER_API_URL/ASR_API_URL endpoint, inlines [Transcript: ...]. Wired into NativeProvider
- Request logging middleware: JSON/text format, configurable level, request duration tracking. Wired into dashboard server
- OpenAPI spec: extracted to dedicated openapi.ts with full schema definitions (MemoryRecord, AddResult, SearchResult, etc.), request/response types, security schemes, and all 30+ endpoints documented
- Tests: +15 new assertions (CJK tokenizer, vision parsing, audio parsing)
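A character-splitting tokenizer of the kind described for tokenizeCJK() could be sketched as follows. This is not the PR's implementation: the stopword subset, the Unicode ranges chosen, and the handling of Latin words are all assumptions made for illustration.

```typescript
// Illustrative subset only; the contents of the real CHINESE_STOPWORDS set are not shown here.
const STOPWORDS_SKETCH = new Set<string>(["的", "了", "是"]);

// Assumed ranges: Hiragana/Katakana, CJK Extension A, CJK Unified Ideographs, Hangul syllables.
const CJK_CHAR = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/;
const WORD_CHAR = /[A-Za-z0-9]/;

// Emits each CJK character as its own token (minus stopwords) and groups
// runs of ASCII letters/digits into lowercased word tokens.
function tokenizeCJKSketch(text: string): string[] {
  const tokens: string[] = [];
  let word = "";
  const flush = (): void => {
    if (word.length > 0) {
      tokens.push(word.toLowerCase());
      word = "";
    }
  };
  for (const ch of text) {
    if (CJK_CHAR.test(ch)) {
      flush();
      if (!STOPWORDS_SKETCH.has(ch)) tokens.push(ch);
    } else if (WORD_CHAR.test(ch)) {
      word += ch;
    } else {
      flush(); // whitespace, punctuation, and anything else ends the current word
    }
  }
  flush();
  return tokens;
}
```

A function with this `(text: string) => string[]` shape is the sort of thing that could be passed as a custom tokenizer to a BM25-style sparse embedder, though the exact option name BM25SparseEmbedder expects is not given in this text.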

…tor:pg17
- 21 unit tests: CRUD, search (cosine similarity), filtering (userId/agentId/runId), pagination, sorting, access count, upsert, metadata round-trip, factory registration
- Tests auto-skip when Postgres is not available (describe.skipIf)
- CI: new "PgVector (PostgreSQL)" job using pgvector/pgvector:pg17 service container
- npm test excludes pgvector tests (no PG on standard CI nodes)
- npm run test:pgvector runs pgvector tests specifically

webup added 30 commits April 2, 2026 22:50

fix: resolve lint errors in agent module

f97884c

fix: add @types/express for CI type-check

74883b5

ci: use macos-14 (ARM64) for SeekDB tests — matches bundled darwin-arm64 bindings

92b73a3

ci: fix SeekDB — set DYLD_LIBRARY_PATH on macOS, add Linux x64 download job

a2f9d01

ci: remove seekdb verification step that crashes on client.close()

e8dcec2

fix: run seekdb test files sequentially to avoid concurrent init

a30d441

fix: guard afterAll close() calls in seekdb-e2e tests

159fdf3

fix: seekdb-e2e — single shared Memory instance (embedded engine is single-instance)

7ba321e

fix: symlink libaio.so.1t64 → libaio.so.1 for Ubuntu 24.04 compat

67cdb78

fix: resolve lint errors (no-unsafe-function-type, unused import)

1c2e31b

webup added 5 commits April 3, 2026 21:16

ci: add workflow_dispatch trigger for manual CI runs

a4cb474

fix: seekdb-e2e metadata assertion — use toMatchObject for importance enrichment

b3771bc

docs: update README and architecture for v0.4.0 Python parity

876fbd2

webup wants to merge 35 commits into ob-labs:main from webup:refactor/python-layout