feat: speed up with parallel call and support for vLLM deployment emb…#19

Open
wuxuedaifu wants to merge 1 commit intonikmcfly:mainfrom
wuxuedaifu:feat/parallel_version

Conversation


@wuxuedaifu wuxuedaifu commented Mar 24, 2026

PR: Parallel Execution & Local Inference Optimization (2×+ faster)

🚨 Existing Problems

  1. Synchronous Data Ingestion: Knowledge Graph text chunks were ingested sequentially, so ingestion time grew with every local LLM call and degraded badly on large inputs.
  2. I/O Thread Blocking: Profile generation blocked the main thread to write tracking logs (CSV/JSON) for every entity, creating O(N²) delays.
  3. Infinite Generation Deadlocks (vLLM): Using strict json_object endpoints without a max_tokens limit frequently caused local models to loop on whitespace indefinitely, permanently hanging execution threads.
  4. Inefficient ReACT Loops: The Report Agent strictly enforced min_tool_calls=3, forcing agents that already had sufficient context to make redundant tool calls.
  5. Hardcoded Scaling Limits: Core concurrency settings (thread counts, batch sizes) were hardcoded across backend functions, preventing per-deployment tuning.
  6. Inflexible Embeddings: The embedding service could not be pointed at local inference backends (Ollama / vLLM) without code changes.

🛠️ Key Improvements & Solutions

1. Parallel Knowledge Graph Pipeline (graph_builder.py, neo4j_storage.py)

  • Replaced the sequential chunk-ingestion loop inside add_text_batch with a ThreadPoolExecutor.
  • Concurrent graph writes remain safe against transactional Neo4j MERGE deadlocks via the existing TransientError exponential-backoff retry.
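The parallel ingestion pattern described above can be sketched as follows; `ingest_chunk` is a hypothetical stand-in for the per-chunk LLM extraction and Neo4j MERGE work, and the real `add_text_batch` in graph_builder.py will differ in its signature:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def add_text_batch(chunks, ingest_chunk, batch_size=10):
    """Ingest text chunks concurrently instead of one by one.

    batch_size mirrors the GRAPH_BUILD_BATCH_SIZE knob introduced by
    this PR; ingest_chunk is an illustrative per-chunk worker.
    """
    results = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        futures = [pool.submit(ingest_chunk, chunk) for chunk in chunks]
        for fut in as_completed(futures):
            results.append(fut.result())  # re-raises worker exceptions
    return results
```

Because the workers are I/O-bound (LLM calls, Neo4j round-trips), threads give near-linear speedup despite the GIL.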

2. High-Throughput Profile Generation (oasis_profile_generator.py)

  • Background I/O Pipelines: Replaced sequential main-thread file persistence with fire-and-forget threading.Thread daemon tasks.
  • Concurrent DB Queries: Split the sequential edge and node hybrid searches into two simultaneous DB queries, roughly halving query wait times.
  • Anti-Deadlocking Constraints: Capped completions with max_tokens=4000. Combined with the _fix_truncated_json parser, the system salvages truncated JSON output instead of hanging on runaway 128k-context generations.
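A minimal sketch of the salvage idea behind `_fix_truncated_json` (the PR's actual parser is not shown here, so this is an illustrative best-effort reimplementation): track unclosed braces and brackets outside of strings, drop any dangling fragment after the last complete token, and append the missing closers.

```python
import json

def fix_truncated_json(text: str) -> dict:
    """Best-effort repair of JSON cut off by a max_tokens limit."""
    stack = []          # closers still owed, innermost last
    in_string = False
    escape = False
    last_good = 0       # index just past the last complete token
    for i, ch in enumerate(text):
        if escape:
            escape = False
            continue
        if in_string and ch == "\\":
            escape = True
            continue
        if ch == '"':
            in_string = not in_string
            if not in_string:        # a string just closed cleanly
                last_good = i + 1
            continue
        if in_string:
            continue
        if ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]":
            if stack:
                stack.pop()
            last_good = i + 1
        elif ch.isdigit() or ch in "el":  # ends of numbers/true/false/null
            last_good = i + 1
    repaired = text[:last_good].rstrip().rstrip(",")
    repaired += "".join(reversed(stack))
    return json.loads(repaired)
```

An unterminated trailing string is discarded rather than guessed at, which matches the goal here: recover the complete fields instead of hanging.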

3. Streamlined Report Processing (report_agent.py)

  • Report subsections are now generated asynchronously in parallel, minimizing total report generation time.
  • Removed the min_tool_calls requirement, allowing agents with sufficient context to answer immediately instead of making redundant tool calls.
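The parallel-sections pattern can be sketched like this; `generate_section` is a hypothetical coroutine standing in for the per-section LLM call in report_agent.py, and the semaphore mirrors the REPORT_PARALLEL_SECTIONS knob:

```python
import asyncio

async def build_report(section_specs, generate_section, parallel=5):
    """Generate all report subsections concurrently, reassemble in order."""
    sem = asyncio.Semaphore(parallel)  # cap in-flight LLM calls

    async def bounded(spec):
        async with sem:
            return await generate_section(spec)

    # gather preserves input order regardless of completion order
    sections = await asyncio.gather(*(bounded(s) for s in section_specs))
    return "\n\n".join(sections)
```

With five sections and a parallelism of 5, wall-clock time drops to roughly the latency of the slowest single section.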

4. Dynamic Execution Metrics (config.py, .env)

  • Moved concurrency settings into .env so performance can be tuned per deployment:
    • GRAPH_BUILD_BATCH_SIZE=10
    • PROFILE_PARALLEL_COUNT=10
    • PROFILE_SEARCH_WORKERS=2
    • REPORT_PARALLEL_SECTIONS=5

5. Extended Local Deployment Support (embedding_service.py)

  • Abstracted embedding initialization behind configurable connection settings, so Ollama and vLLM endpoints can be targeted through unified configuration without code changes.
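One way this works in practice: both Ollama and vLLM expose OpenAI-compatible `/v1/embeddings` endpoints, so a single configurable base URL covers both. The sketch below only builds the request; `EMBEDDING_BASE_URL` and `EMBEDDING_MODEL` are assumed setting names, and the Ollama default port is used as an example fallback:

```python
import os

def build_embedding_request(texts, base_url=None, model=None):
    """Return (url, payload) for an OpenAI-compatible embeddings call."""
    base_url = (base_url
                or os.getenv("EMBEDDING_BASE_URL", "http://localhost:11434/v1")
                ).rstrip("/")
    model = model or os.getenv("EMBEDDING_MODEL", "nomic-embed-text")
    return f"{base_url}/embeddings", {"model": model, "input": texts}
```

Swapping between a local Ollama instance and a vLLM server then only requires changing `EMBEDDING_BASE_URL` in `.env`.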
