2dogsandanerd/README.md

Hi there, I'm Stefan 👋

"Most RAG projects don't fail because of the LLM. They fail because they treat PDF ingestion as a simple file upload. They hallucinate because they guess instead of verify."

I am an AI-Native Architect focused on the Ingestion Gap and Verifiable Truth. My mission is to replace "Digital Paper" (dead PDFs) with structured, semantic knowledge that allows Local AI to reason without hallucinations.


πŸ›οΈ Flagship Mission: PantheonRAG

The Consensus Engine for Mission-Critical Data.

Pantheon is not a chatbot. It is a scientific instrument designed to eliminate hallucinations through rigorous multi-agent debate and graph-based verification.

  • βš–οΈ Solomon Consensus Engine: Agents (Legal, OCR, Vision) must reach agreement before answering.
  • πŸ”¬ The Laboratory: A "Glasshouse" for radical transparency and auditability.
  • 🧬 Surgical HITL: Precision tools for expert intervention in the reasoning chain.
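
The "agreement before answering" rule can be illustrated with a minimal sketch. This is not Pantheon's actual implementation (the engine is not shown here); the `AgentVote` type, the `consensus` function, the `quorum` parameter, and the string normalization are all hypothetical names chosen for illustration. The idea is simply that the system abstains unless enough agents propose the same answer.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentVote:
    agent: str   # hypothetical agent label, e.g. "legal", "ocr", "vision"
    answer: str  # the answer this agent proposes


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block agreement."""
    return " ".join(text.lower().split())


def consensus(votes: list[AgentVote], quorum: float = 1.0) -> Optional[str]:
    """Return the agreed (normalized) answer, or None to abstain.

    quorum=1.0 demands unanimity; lower values accept a majority share.
    """
    if not votes:
        return None
    counts = Counter(normalize(v.answer) for v in votes)
    answer, support = counts.most_common(1)[0]
    return answer if support / len(votes) >= quorum else None
```

With the default `quorum=1.0`, a single dissenting agent makes `consensus` return `None`, which a caller could treat as "escalate to a human" rather than "guess".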

🚀 The Ecosystem

I build modular, production-ready kits to fix the "Garbage In" problem for high-compliance environments (Public Sector / Enterprise).

πŸ—οΈ Architecture & Platforms

  • RAG Enterprise Core The Blueprint for BSI-compliant, self-hosted RAG. Features: Ingestion Triage, GraphRAG, Semantic Caching, and Full Observability. Status: Architecture Preview / Closed Source Engine.

πŸ› οΈ Essential Tooling

  • Validated Table Extractor The proof that RAG can handle complex tables if you use Docling + Vision Validation. Status: Open Source Audit Tool.

  • Smart Ingest Kit Production-grade document ingestion pipeline using Docling v2. Solves: Layout Analysis, Table Reconstruction, Markdown Conversion.

🤖 Proven in Production

  • PantheonRAG-Mail: A fully autonomous, privacy-first AI email assistant running locally. The proof that my ingestion engine works in the wild. GDPR (DSGVO) / CCPA compliant.

🧠 The "Ingestion-First" Stack

I don't believe in "One Model Fits All". I believe in Triage and Tiers.

| Ingestion  | Intelligence | Memory   | Observability |
|------------|--------------|----------|---------------|
| Docling v2 | Qwen2-VL     | Neo4j    | LangGraph     |
| PyMuPDF    | Ollama       | ChromaDB | Sentry        |
| Marker     | DeepSeek     | Redis    | Grafana       |

🌱 Philosophy

  • Structure > Vectors: Embeddings are useless if the input table was ripped apart
  • Verification > Generation: Don't just generate text. Verify it against the source
  • Local > Cloud: Data sovereignty (GDPR/BSI) is not optional. I build for air-gapped reality
  • Logic > Magic: I prefer deterministic code for business rules over probabilistic LLM guessing
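
To make "Verification > Generation" concrete, here is a minimal sketch under stated assumptions: the generator is asked to return the verbatim quotes it relied on, and deterministic code then checks each quote against the source text. The function names `normalize_text` and `unsupported_quotes` are hypothetical, and a plain substring match stands in for whatever verification a real pipeline would use.

```python
def normalize_text(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks in the source do not cause false misses."""
    return " ".join(text.lower().split())


def unsupported_quotes(quotes: list[str], source: str) -> list[str]:
    """Return the quotes that do NOT appear in the source document.

    An empty result means every quoted span was found; anything else
    should be rejected or flagged for review instead of being shown as fact.
    """
    haystack = normalize_text(source)
    return [q for q in quotes if normalize_text(q) not in haystack]


source = "The total   amount is 1,200 EUR.\nPayment is due in 30 days."
quotes = ["total amount is 1,200 EUR", "due in 14 days"]
missing = unsupported_quotes(quotes, source)  # only the second quote is unsupported
```

The check is deliberately deterministic ("Logic > Magic"): no model is asked whether the quote is supported, the code simply looks it up.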

📫 Connect & Context

  • Reddit: u/ChapterEquivalent188 - Discussing the "PoC Trap" & Ingestion Realities.
  • Focus: Currently open for strategic dialogue regarding High-Compliance RAG Architectures (Public Sector / Industry).
  • Email: 2dogasandanerd - gmail.com (my agents told me you said hi)

Pinned

  1. ClawRag (Public)

    RAG system combining Docling document processing with ChromaDB vector storage to power openclaw

    Python · 140 stars · 28 forks

  2. Knowledge-Base-Self-Hosting-Kit (Public)

    A Docker-powered RAG system that understands the difference between code and prose. Ingest your codebase and documentation, then query them with full privacy and zero configuration.

    Python · 237 stars · 24 forks

  3. DAUT (Public archive)

    DAUT – Documentation Auto Updater - AI-powered documentation generator for your codebase. MCP-Connector

    Python · 6 stars · 1 fork