Add Question-Level Caching to LangChain for RAG Pipelines #252

@bharatht19

Currently, RedisSemanticCache and other cache integrations in LangChain operate at the LLM layer: they cache on the rendered prompt (question + retrieved context).
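
For reference, the LLM-level caching described above is typically wired up as follows (a minimal sketch; the Redis URL, embedding model, and threshold are placeholder choices):

```python
from langchain.globals import set_llm_cache
from langchain_community.cache import RedisSemanticCache
from langchain_openai import OpenAIEmbeddings

# The cache key is the *rendered* prompt, so retrieved context is part of
# the key: the same question paired with slightly different retrieved
# chunks misses the cache, and retrieval always runs first.
set_llm_cache(
    RedisSemanticCache(
        redis_url="redis://localhost:6379",  # placeholder URL
        embedding=OpenAIEmbeddings(),
        score_threshold=0.2,
    )
)
```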

For many RAG use cases, it would be very useful to cache before retrieval, i.e., at the question level, so that repeated or semantically similar questions can bypass vector DB retrieval entirely.

Why it matters:
- Saves cost and latency (skip retrieval + LLM if cached).
- Supports both exact-match and semantic caching of questions.
- Fits common enterprise use cases where the same queries recur frequently.

What I’m asking:

Could LangChain provide a QuestionCache, or extend RedisSemanticCache (and similar backends) with an option to store and look up entries keyed only on the raw user question (optionally semantic), before retrieval?

This would complement the existing LLM-level cache and make caching more flexible in RAG pipelines.
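
To make the ask concrete, here is a rough sketch of what such a cache could look like. Nothing below is an existing LangChain API: `QuestionCache` is a hypothetical class, and `embeddings`, `retriever`, and `rag_chain` are placeholders standing in for objects from an existing pipeline.

```python
import numpy as np

class QuestionCache:
    """Caches final RAG answers keyed on the raw user question, pre-retrieval."""

    def __init__(self, embed_fn, score_threshold: float = 0.9):
        self.embed_fn = embed_fn            # e.g. OpenAIEmbeddings().embed_query
        self.score_threshold = score_threshold
        self._entries: list[tuple[np.ndarray, str]] = []  # (unit vector, answer)

    def _embed(self, question: str) -> np.ndarray:
        vec = np.asarray(self.embed_fn(question), dtype=float)
        return vec / np.linalg.norm(vec)

    def lookup(self, question: str) -> str | None:
        """Return the answer cached for the most similar question, if any."""
        if not self._entries:
            return None
        q = self._embed(question)
        scores = [float(vec @ q) for vec, _ in self._entries]  # cosine similarity
        best = int(np.argmax(scores))
        return self._entries[best][1] if scores[best] >= self.score_threshold else None

    def update(self, question: str, answer: str) -> None:
        self._entries.append((self._embed(question), answer))
```

Wrapped around a RAG pipeline, it would short-circuit both retrieval and generation on a hit:

```python
cache = QuestionCache(embed_fn=embeddings.embed_query, score_threshold=0.92)

def answer(question: str) -> str:
    if (hit := cache.lookup(question)) is not None:
        return hit                          # cache hit: skip retrieval and the LLM
    docs = retriever.invoke(question)       # vector DB is touched only on a miss
    result = rag_chain.invoke({"question": question, "context": docs})
    cache.update(question, result)
    return result
```

An exact-match mode could key on a hash of the normalized question string instead of an embedding, mirroring the exact/semantic split the LLM-level caches already offer.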
