JetXu-LLM/README.md

Hi, I'm Jet Xu 👋

"The future of AI coding isn't just larger context windows—it's smarter context retrieval."

I am an architect solving the "Context Precision" problem in AI software engineering.

While others focus on stuffing more code into an LLM's context window, I focus on Repository Graph RAG: building the "GPS" for codebases. My goal is to let AI navigate complex, cross-module dependencies and understand architectural impact with surgical precision and minimal token usage.


1. The Proof of Concept: LlamaPReview

High-Precision Code Review via Contextual Retrieval

I built LlamaPReview to prove that less is more: by retrieving only the relevant dependency graph, we can outperform massive context windows.

  • The Metric: Achieved a 61% Signal-to-Noise Ratio (3x industry average) by filtering out irrelevant code noise.
  • The Evidence: Caught a critical transaction bug in Vanna.ai (20K stars) that required tracing logic across multiple hidden modules—something standard "diff-based" AI missed entirely.
  • The Product: A validated SaaS solution trusted by 4,000+ repositories.
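To make the metric above concrete, here is a minimal sketch of how a signal-to-noise ratio for review comments could be computed. This README does not define the metric, so the definition used here (the share of comments labelled actionable) is an assumption, and `ReviewComment`, `actionable`, and `signal_ratio` are hypothetical names rather than part of LlamaPReview.

```python
from dataclasses import dataclass


@dataclass
class ReviewComment:
    text: str
    actionable: bool  # hypothetical human label: did this comment point at a real issue?


def signal_ratio(comments):
    """Share of comments labelled actionable (assumed definition); 0.0 for an empty review."""
    if not comments:
        return 0.0
    return sum(c.actionable for c in comments) / len(comments)


# Made-up sample data, for illustration only.
sample = [
    ReviewComment("Transaction is never rolled back on failure", True),
    ReviewComment("Consider renaming this variable", False),
    ReviewComment("Missing null check before dereference", True),
]
print(f"signal ratio: {signal_ratio(sample):.0%}")
```

Under this assumed definition, a higher ratio means fewer comments a reviewer has to dismiss before reaching the ones that matter.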

👉 Visit Product Site | 📊 Read the Signal-to-Noise Analysis


2. The Foundation: llama-github

The Retrieval Infrastructure Layer

To build a graph, you first need high-fidelity data. I open-sourced the retrieval engine that powers my experiments.

  • Role: A production-grade library designed to fetch and structure GitHub data specifically for RAG pipelines.
  • Capability: Bridges the gap between raw Git objects and AI-ready context.
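
from llama_github import GithubRAG
# Create the client with your own GitHub token (placeholder value shown)
github_rag = GithubRAG(github_access_token="your_github_access_token")
# Efficiently retrieve cross-module context without cloning the entire repo
context = github_rag.retrieve_context("How does the payment service impact the user schema?")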

👉 View on GitHub


3. The Vision: Repository Graph RAG

Occupying the "Code Understanding" Ecological Niche

LlamaPReview was just the first application. My long-term strategy is to build the definitive Repository Knowledge Graph that serves as the backend for all autonomous coding agents.

  • The Problem: Flat text search (Standard RAG) loses the relationships between classes, methods, and data flows.
  • The Solution: A traversable graph that allows LLMs to "hop" through dependencies.
  • The Value:
    • Token Efficiency: Solves the problem with 5% of the tokens required by full-context approaches.
    • Impact Analysis: Instantly identifies how a change in Module A breaks Module Z without reading the files in between.
    • Scalability: The only viable path for AI to understand million-line monoliths.
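The "hop through dependencies" idea above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Repository Graph RAG implementation: the `SOURCES` dictionary is a hypothetical in-memory repository, only Python `import` statements are treated as edges, and the function names are invented for this example.

```python
import ast
from collections import defaultdict, deque

# Hypothetical in-memory "repository": module name -> source code.
SOURCES = {
    "schema": "USER_FIELDS = ['id', 'email']\n",
    "models": "import schema\n",
    "billing": "from models import User\n",
    "api": "import billing\nimport json\n",
    "reports": "import json\n",
}


def imported_modules(source):
    """Return the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def build_reverse_graph(sources):
    """Map each in-repo module to the set of modules that import it directly."""
    dependents = defaultdict(set)
    for module, source in sources.items():
        for dep in imported_modules(source):
            if dep in sources:  # ignore stdlib and third-party imports
                dependents[dep].add(module)
    return dependents


def impacted_by(changed, dependents):
    """Breadth-first walk over reverse edges: module -> hop distance from `changed`."""
    distance = {}
    queue = deque([(changed, 0)])
    while queue:
        module, hops = queue.popleft()
        for dependent in sorted(dependents.get(module, ())):
            if dependent not in distance and dependent != changed:
                distance[dependent] = hops + 1
                queue.append((dependent, hops + 1))
    return distance


graph = build_reverse_graph(SOURCES)
print(impacted_by("schema", graph))  # {'models': 1, 'billing': 2, 'api': 3}
```

Once the graph is built, the traversal touches only edges and never re-reads file contents. That is how a change in `schema` can be linked to `api` three hops away while `reports` is left out of the context entirely.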

📚 Strategic Insights

I document my research on defining the next generation of AI architecture.

  • Case Study: Catching the "Invisible" Bug
    Real-world evidence: How we found a critical logic error in a 20K-star repo that standard "diff-based" AI missed entirely.
  • The Signal-to-Noise Ratio in AI Code Review
    A new evaluation framework: Why simply increasing context window size often leads to lower-quality reviews.
  • (Coming Soon) The Inconsistency Problem
    Why the same AI tool works perfectly on Monday but fails on Tuesday: A deep dive into "Context Instability."
  • (Coming Soon) The End of Guesswork: Repository Graph RAG
    Moving beyond probabilistic search to deterministic, graph-based dependency analysis for 100% consistent context.

💻 Tech Stack

| Core Intelligence | Python, LangChain, Hugging Face |
| Graph & Data | ArangoDB, Neo4j, AWS |


📫 Let's Connect

I am building the infrastructure that will power the next decade of AI development tools.


Building the GPS for the world's code.

Pinned

  1. llamapreview-context-research (Public)

    Research artifacts exploring Search-based vs. Agentic RAG strategies for AI Code Review. A deep dive into solving the "Context Instability" problem in LLM-based software engineering, analyzing trad…

    Python 1

  2. llama-github (Public)

    Llama-github is an open-source Python library that empowers LLM Chatbots, AI Agents, and Auto-dev Solutions to conduct Agentic RAG from actively selected GitHub public projects. It Augments through…

    Python 314 21

  3. LlamaPReview-site (Public)

    Official website for the LlamaPReview GitHub App, an AI-powered GitHub code review assistant. Showcases the features, usage, and benefits of the app.

    HTML 7 3