RAG Content

RAG Content provides a shared codebase for generating vector databases. It serves as the core framework for Lightspeed-related projects (e.g., OpenShift Lightspeed and OpenStack Lightspeed) to generate their own vector databases that can be used for RAG.

Installing the Python Library

The lightspeed_rag_content library is not published on PyPI, but it can be installed from this repository:

Via uv

To install the library via uv:

  1. Install the project and its dependencies:

    uv sync
  2. Test that the library can be imported (expect lightspeed_rag_content in the output):

    uv run python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"

Via Container Image

The base container image can be built locally or pulled from a container registry.

Prebuilt Image

There are two prebuilt images: one with CPU support only (approximately 3.7 GB) and one with GPU (CUDA) support (approximately 12 GB).

  1. Pull the CPU variant:

    podman pull quay.io/lightspeed-core/rag-content-cpu:latest
  2. Pull the GPU variant:

    podman pull quay.io/lightspeed-core/rag-content-gpu:latest

Build image locally

To build the image locally, follow these steps:

  1. Install the requirements: make and podman.

  2. Generate the container image:

    podman build -t localhost/lightspeed-rag-content-cpu:latest .
  3. The lightspeed_rag_content library and its dependencies are installed in the image; verify the import (expect lightspeed_rag_content in the output):

    podman run localhost/lightspeed-rag-content-cpu:latest python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"

Generating the Vector Database

You can generate the vector database using one of the following vector stores:

  1. Llama-Index Faiss Vector Store
  2. Llama-Index Postgres (PGVector) Vector Store
  3. Llama-Stack Faiss Vector-IO
  4. Llama-Stack SQLite-vec Vector-IO
  5. Llama-Stack Postgres (PGVector) Vector Store

The Llama-Index approaches require you to download the embedding model locally. We also recommend downloading it for the Llama-Stack targets, even though they should work without a manually downloaded model.

All cases require you to prepare documentation in text format, which is chunked and mapped to embeddings generated with the model:

  1. Download the embedding model (sentence-transformers/all-mpnet-base-v2) from HuggingFace as follows:

    mkdir ./embeddings_model
    uv run python ./scripts/download_embeddings_model.py -l ./embeddings_model/ -r sentence-transformers/all-mpnet-base-v2
  2. Prepare dummy documentation:

    mkdir -p ./custom_docs/0.1
    echo "Vector Database is an efficient way how to provide information to LLM" > ./custom_docs/0.1/info.txt
  3. Prepare a custom script (./custom_processor.py) for populating the vector database. The example below shows what such a script might look like using the lightspeed_rag_content library; your own script will differ (an extended metadata-processor sketch follows the example):

    from lightspeed_rag_content.metadata_processor import MetadataProcessor
    from lightspeed_rag_content.document_processor import DocumentProcessor
    from lightspeed_rag_content import utils
    
    class CustomMetadataProcessor(MetadataProcessor):
    
        def __init__(self, url):
            self.url = url
    
        def url_function(self, file_path: str) -> str:
            # Return a URL for the file, so it can be referenced when used
            # in an answer
            return self.url
    
    if __name__ == "__main__":
        parser = utils.get_common_arg_parser()
        args = parser.parse_args()
    
        # Instantiate custom Metadata Processor
        metadata_processor = CustomMetadataProcessor("https://www.redhat.com")
    
        # Instantiate Document Processor
        document_processor = DocumentProcessor(
            chunk_size=args.chunk,
            chunk_overlap=args.overlap,
            model_name=args.model_name,
            embeddings_model_dir=args.model_dir,
            num_workers=args.workers,
            vector_store_type=args.vector_store_type,
        )
    
        # Load and embed the documents, this method can be called multiple times
        # for different sets of documents
        document_processor.process(args.folder, metadata=metadata_processor)
    
        # Save the new vector database to the output directory
        document_processor.save(args.index, args.output)
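
For real documentation sets, url_function typically derives a page URL from the file path instead of returning a fixed value. A minimal sketch of that pattern, assuming the versioned layout from step 2 (the docs.example.com base URL is hypothetical):

import os

from lightspeed_rag_content.metadata_processor import MetadataProcessor

class VersionedMetadataProcessor(MetadataProcessor):

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def url_function(self, file_path: str) -> str:
        # Map ./custom_docs/0.1/info.txt to https://docs.example.com/0.1/info
        rel_path = os.path.relpath(file_path, "custom_docs")
        return f"{self.base_url}/{os.path.splitext(rel_path)[0]}"

An instance created with VersionedMetadataProcessor("https://docs.example.com") drops into the script above in place of CustomMetadataProcessor.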

Faiss Vector Store

Generate the vector database using the script from the previous section (Generating the Vector Database):

uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md embeddings_model/ \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1

Once the command is done, you can find the vector database at ./vector_db, the embedding model at ./embeddings_model, and the Index ID set to custom_docs-0_1.
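
You can also query the generated store directly from llama-index. A minimal sketch, assuming the llama-index-vector-stores-faiss and llama-index-embeddings-huggingface packages are installed (API names may vary across llama-index releases):

from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.faiss import FaissVectorStore

# Use the same embedding model that generated the database
Settings.embed_model = HuggingFaceEmbedding(model_name="./embeddings_model")

persist_dir = "./vector_db/custom_docs/0.1"
vector_store = FaissVectorStore.from_persist_dir(persist_dir)
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, persist_dir=persist_dir
)
index = load_index_from_storage(storage_context)

# Retrieve the five closest chunks for a test query
for node in index.as_retriever(similarity_top_k=5).retrieve("What is a vector database?"):
    print(node.score, node.text[:80])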

Postgres (PGVector) Vector Store

To generate a vector database stored in Postgres (PGVector), run the following commands:

  1. Start Postgres with the pgvector extension by running:

    make start-postgres-debug

    The Postgres data folder is created at ./postgresql/data. This command also creates the ./output directory, in which the metadata is saved.

  2. Run:

    POSTGRES_USER=postgres \
    POSTGRES_PASSWORD=somesecret \
    POSTGRES_HOST=localhost \
    POSTGRES_PORT=15432 \
    POSTGRES_DATABASE=postgres \
    uv run python ./custom_processor.py \
     -o ./output \
     -f custom_docs/0.1/ \
     -md embeddings_model/ \
     -mn sentence-transformers/all-mpnet-base-v2 \
     -i custom_docs-0_1 \
     --vector-store-type postgres

    This generates embeddings in PostgreSQL that can be used for RAG, and writes metadata.json to ./output. The generated embeddings are stored in the data_table_name table, which you can verify from psql (a Python check follows the listing):

    $ podman exec -it pgvector bash
    $ psql -U postgres
    psql (16.4 (Debian 16.4-1.pgdg120+2))
    Type "help" for help.
    
    postgres=# \dt
                     List of relations
     Schema |          Name          | Type  |  Owner
    --------+------------------------+-------+----------
     public | data_table_name        | table | postgres
    (1 row)
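
You can run the same check from Python. A minimal sketch, assuming the psycopg2-binary package (not part of this project's dependencies) and the connection settings used above:

import os

import psycopg2  # assumption: install psycopg2-binary separately

conn = psycopg2.connect(
    user=os.environ.get("POSTGRES_USER", "postgres"),
    password=os.environ.get("POSTGRES_PASSWORD", "somesecret"),
    host=os.environ.get("POSTGRES_HOST", "localhost"),
    port=int(os.environ.get("POSTGRES_PORT", "15432")),
    dbname=os.environ.get("POSTGRES_DATABASE", "postgres"),
)
with conn, conn.cursor() as cur:
    # data_table_name is the table created by the processing step above
    cur.execute("SELECT count(*) FROM data_table_name")
    print("embedded chunks:", cur.fetchone()[0])
conn.close()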

Llama-Stack Vector Stores

Important: Embedding Model Path Portability

When using Llama-Stack vector stores (Faiss or SQLite-vec), the embedding model path specified via the -md (or --model-dir) parameter is written into the generated llama-stack.yaml configuration file as an absolute path. This path is also registered in the llama-stack kv_store database.

When llama-stack later consumes the vector database, it reads the embedding model location from the kv_store. Therefore, the embedding model must be available at the exact same path that was specified during database creation.

Recommendation:

  • Use absolute paths for the -md parameter to avoid ambiguity (e.g., -md /app/embeddings instead of -md embeddings_model).
  • Alternatively, set -md '' (empty string) and use only the -mn flag with a HuggingFace model ID (e.g., -md "" -mn sentence-transformers/all-mpnet-base-v2). Setting -md to empty forces the tool to use the HuggingFace model ID instead of checking for a local directory. This allows llama-stack to download the model from HuggingFace automatically, making the vector database fully portable without path dependencies.
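
For example, the Llama-Stack Faiss command from the next section becomes fully portable when invoked like this:

uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md "" \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1 \
  --vector-store-type=llamastack-faiss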

Llama-Stack Faiss

Important

When using the --auto-chunking flag, chunking happens within llama-stack using the OpenAI-compatible Files API. This makes vector stores significantly larger than with manual chunking, because the Files API stores a redundant copy of the embeddings. Manual chunking results in smaller database files.

The process is basically the same as for the Llama-Index Faiss Vector Store, but with the --vector-store-type parameter set; generate the vector database using the custom_processor.py script from the earlier section (Generating the Vector Database):

uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md embeddings_model/ \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1 \
  --vector-store-type=llamastack-faiss

Once the command is done, you can find the vector database (with the registry metadata embedded) at ./vector_db/custom_docs/0.1 under the name faiss_store.db, along with a barebones llama-stack configuration file named llama-stack.yaml. The file is provided for reference only and is not necessary for the final deployment.

The vector-io provider will be named custom-docs-0_1:

providers:
  vector_io:
    - provider_id: custom-docs-0_1
      provider_type: inline::faiss
      config:
        kvstore:
          type: sqlite
          namespace: null
          db_path: /home/<user>/rag-content/vector_db/custom_docs/0.1/faiss_store.db

Once the database exists, you can use the query_rag.py script to check some results:

uv run python scripts/query_rag.py \
  -p vector_db/custom_docs/0.1 \
  -x custom-docs-0_1 \
  -m embeddings_model \
  -k 5 \
  -q "how can I configure a cinder backend"

Llama-Stack SQLite-vec

The process is the same as for Llama-Stack Faiss, but with a different value for the --vector-store-type parameter; generate the vector database using the custom_processor.py script from the earlier section (Generating the Vector Database):

uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md embeddings_model/ \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1 \
  --vector-store-type=llamastack-sqlite-vec

Once the command is done, you can find the vector database at ./vector_db/custom_docs/0.1 under the name sqlitevec_store.db, along with a barebones llama-stack configuration file named llama-stack.yaml. The file is provided for reference only and is not necessary for the final deployment.

The vector-io provider will be named custom-docs-0_1:

providers:
  vector_io:
    - provider_id: custom-docs-0_1
      provider_type: inline::sqlite-vec
      config:
        db_path: /home/<user>/rag-content/vector_db/custom_docs/0.1/sqlitevec_store.db

Once the database exists, you can use the query_rag.py script to check some results:

uv run python scripts/query_rag.py \
  -p vector_db/custom_docs/0.1 \
  -x custom-docs-0_1 \
  -m embeddings_model \
  -k 5 \
  -q "how can I configure a cinder backend"

Llama-Stack Postgres (PGVector) Vector Store

To generate a vector database stored in Postgres (PGVector) for Llama-Stack, run the following commands:

  1. Start Postgres with the pgvector extension by running:

    make start-postgres-debug

    The Postgres data folder is created at ./postgresql/data. Note that this command also creates ./output, which is used by the Llama-Index version but not by the Llama-Stack version.

  2. Run:

    POSTGRES_USER=postgres \
    POSTGRES_PASSWORD=somesecret \
    POSTGRES_HOST=localhost \
    POSTGRES_PORT=15432 \
    POSTGRES_DATABASE=postgres \
    uv run python ./custom_processor.py \
     -o ./output \
     -f custom_docs/0.1/ \
     -md embeddings_model/ \
     -mn sentence-transformers/all-mpnet-base-v2 \
     -i custom_docs-0_1 \
     --vector-store-type llamastack-pgvector

    This generates embeddings in PostgreSQL that can be used for RAG.

  3. When you run query_rag.py to check some results, specify these environment variables for database access:

     POSTGRES_USER=postgres \
     POSTGRES_PASSWORD=somesecret \
     POSTGRES_HOST=localhost \
     POSTGRES_PORT=15432 \
     POSTGRES_DATABASE=postgres \
     uv run python scripts/query_rag.py \
     -p vector_db/custom_docs/0.1 \
     -x custom-docs-0_1 \
     -m embeddings_model \
     -k 5 \
     -q "how can I configure a cinder backend"

Update lockfiles

This repository uses the following lock file:

uv.lock

The lock file needs to be regenerated when dependency updates are available. Run the following commands to do so:

uv lock --upgrade
uv sync

Updating Dependencies for Hermetic Builds

Konflux builds run in hermetic mode (air-gapped from the internet), so all dependencies must be prefetched and locked. When you add or update dependencies, you need to regenerate the lock files.

When to Update Dependency Files

Update these files when you:

  • Add/remove/update Python packages in the project
  • Add/remove/update RPM packages in the Containerfile
  • Change the base image version

Updating Python Dependencies

Quick command:

make konflux-requirements

This compiles Python dependencies from pyproject.toml using uv, splits packages by their source index (PyPI vs Red Hat's internal registry), and generates hermetic requirements files with pinned versions and hashes for Konflux builds.

Files produced:

  • requirements.hashes.source.txt – PyPI packages with hashes
  • requirements.hashes.wheel.txt – Red Hat registry packages with hashes
  • requirements.hashes.wheel.pypi.txt – PyPI wheel packages with hashes
  • requirements-build.txt – Build-time dependencies for source packages

The script also updates the Tekton pipeline configurations (.tekton/lightspeed-stack-*.yaml) with the list of pre-built wheel packages.

Updating RPM Dependencies

Prerequisites: podman, make, and the rpm-lockfile-prototype tool.

Steps:

  1. List your RPM packages in rpms.in.yaml under the packages field

  2. If you changed the base image, extract its repo file:

# UBI images
podman run -it $BASE_IMAGE cat /etc/yum.repos.d/ubi.repo > ubi.repo
# RHEL images (the current base image)
podman run -it $BASE_IMAGE cat /etc/yum.repos.d/redhat.repo > redhat.repo

If the repo file contains too many entries, you can filter it and keep only the required repositories. To check which repositories are active, run:

dnf repolist

Replace the architecture string (the output of uname -m) with $basearch so that rpm-lockfile-prototype can substitute the requested architecture names:

sed -i "s/$(uname -m)/\$basearch/g" redhat.repo

  3. Generate the lock file:

make konflux-rpm-lock

This creates rpms.lock.yaml with pinned RPM versions.
