RAG Content provides a shared codebase for generating vector databases. It serves as the core framework for Lightspeed-related projects (e.g., OpenShift Lightspeed, OpenStack Lightspeed, etc.) to generate their own vector databases that can be used for RAG.
The `lightspeed_rag_content` library is not available via pip, but it is:
- included in the base container image, or
- installable via uv.

To install the library via uv:

- Run the command:

  ```shell
  uv sync
  ```

- Test that the library can be imported (expect `lightspeed_rag_content` in the output):

  ```shell
  uv run python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"
  ```
The base container image can be manually generated or pulled from a container registry.
There are two prebuilt images: one with CPU support only (approximately 3.7 GB) and one with GPU (CUDA) support (approximately 12 GB).
- Pull the CPU variant:

  ```shell
  podman pull quay.io/lightspeed-core/rag-content-cpu:latest
  ```

- Pull the GPU variant:

  ```shell
  podman pull quay.io/lightspeed-core/rag-content-gpu:latest
  ```
To build the image locally, follow these steps:
- Install the requirements: `make` and `podman`.

- Generate the container image:

  ```shell
  podman build -t localhost/lightspeed-rag-content-cpu:latest .
  ```

- The `lightspeed_rag_content` library and its dependencies are installed in the image. Verify the import (expect `lightspeed_rag_content` in the output):

  ```shell
  podman run localhost/lightspeed-rag-content-cpu:latest python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"
  ```
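With the image available, you can also run the vector database generation inside the container instead of installing the library locally. The following is only a sketch under assumptions: it mounts the current directory as the working directory and invokes a processing script such as the `custom_processor.py` example shown later in this document; the mount point and paths are illustrative, not part of the image.

```shell
# Hypothetical invocation: mount your working directory (docs, embedding model,
# processing script) into the image and run the script with the image's Python.
podman run --rm -v "$PWD":/workdir:Z -w /workdir \
    localhost/lightspeed-rag-content-cpu:latest \
    python ./custom_processor.py -o ./vector_db/custom_docs/0.1 -f ./custom_docs/0.1/ \
    -md embeddings_model/ -mn sentence-transformers/all-mpnet-base-v2 -i custom_docs-0_1
```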
You can generate the vector database using any of the following (the `--vector-store-type` values used for each target are summarized after this list):
- Llama-Index Faiss Vector Store
- Llama-Index Postgres (PGVector) Vector Store
- Llama-Stack Faiss Vector-IO
- Llama-Stack SQLite-vec Vector-IO
- Llama-Stack Postgres (PGVector) Vector Store
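For orientation, the sections below select these targets via the `--vector-store-type` flag of the processing script. This summary is inferred from the examples later in this document; the first example omits the flag and produces the Llama-Index Faiss store.

```shell
# Values of --vector-store-type as used in the examples below
./custom_processor.py ... --vector-store-type postgres               # Llama-Index PGVector
./custom_processor.py ... --vector-store-type llamastack-faiss       # Llama-Stack Faiss
./custom_processor.py ... --vector-store-type llamastack-sqlite-vec  # Llama-Stack SQLite-vec
./custom_processor.py ... --vector-store-type llamastack-pgvector    # Llama-Stack PGVector
```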
The Llama-Index approaches require you to download the embedding model. We also recommend downloading it for the Llama-Stack targets, even though they can work without a manually downloaded model.
All cases require you to prepare the documentation in text format; it will be chunked and mapped to embeddings generated with the model:
- Download the embedding model (sentence-transformers/all-mpnet-base-v2) from HuggingFace as follows:

  ```shell
  mkdir ./embeddings_model
  uv run python ./scripts/download_embeddings_model.py -l ./embeddings_model/ -r sentence-transformers/all-mpnet-base-v2
  ```
- Prepare dummy documentation:

  ```shell
  mkdir -p ./custom_docs/0.1
  echo "Vector Database is an efficient way how to provide information to LLM" > ./custom_docs/0.1/info.txt
  ```
- Prepare a custom script (`./custom_processor.py`) for populating the vector database. Below is an example of what such a script might look like using the `lightspeed_rag_content` library; in your case the script will differ:

  ```python
  from lightspeed_rag_content.metadata_processor import MetadataProcessor
  from lightspeed_rag_content.document_processor import DocumentProcessor
  from lightspeed_rag_content import utils


  class CustomMetadataProcessor(MetadataProcessor):

      def __init__(self, url):
          self.url = url

      def url_function(self, file_path: str) -> str:
          # Return a URL for the file, so it can be referenced when used
          # in an answer
          return self.url


  if __name__ == "__main__":
      parser = utils.get_common_arg_parser()
      args = parser.parse_args()

      # Instantiate custom Metadata Processor
      metadata_processor = CustomMetadataProcessor("https://www.redhat.com")

      # Instantiate Document Processor
      document_processor = DocumentProcessor(
          chunk_size=args.chunk,
          chunk_overlap=args.overlap,
          model_name=args.model_name,
          embeddings_model_dir=args.model_dir,
          num_workers=args.workers,
          vector_store_type=args.vector_store_type,
      )

      # Load and embed the documents, this method can be called multiple times
      # for different sets of documents
      document_processor.process(args.folder, metadata=metadata_processor)

      # Save the new vector database to the output directory
      document_processor.save(args.index, args.output)
  ```
Generate the documentation using the script from the previous section (Generating the Vector Database):

```shell
uv run ./custom_processor.py -o ./vector_db/custom_docs/0.1 -f ./custom_docs/0.1/ -md embeddings_model/ -mn sentence-transformers/all-mpnet-base-v2 -i custom_docs-0_1
```

Once the command is done, you can find the vector database at `./vector_db`, the embedding model at `./embeddings_model`, and the Index ID set to `custom-docs-0_1`.
To generate a vector database stored in Postgres (PGVector), run the following commands:
- Start Postgres with the pgvector extension by running:

  ```shell
  make start-postgres-debug
  ```
  The `data` folder of Postgres is created at `./postgresql/data`. This command also creates `./output`, the output directory in which the metadata is saved.
- Run:

  ```shell
  POSTGRES_USER=postgres \
  POSTGRES_PASSWORD=somesecret \
  POSTGRES_HOST=localhost \
  POSTGRES_PORT=15432 \
  POSTGRES_DATABASE=postgres \
  uv run python ./custom_processor.py \
      -o ./output \
      -f custom_docs/0.1/ \
      -md embeddings_model/ \
      -mn sentence-transformers/all-mpnet-base-v2 \
      -i custom_docs-0_1 \
      --vector-store-type postgres
  ```
  This generates the embeddings in PostgreSQL, where they can be used for RAG, and writes `metadata.json` to `./output`. The generated embeddings are stored in the `data_table_name` table:

  ```shell
  $ podman exec -it pgvector bash
  $ psql -U postgres
  psql (16.4 (Debian 16.4-1.pgdg120+2))
  Type "help" for help.

  postgres=# \dt
                 List of relations
   Schema |          Name          | Type  |  Owner
  --------+------------------------+-------+----------
   public | data_table_name        | table | postgres
  (1 row)
  ```
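As a quick sanity check, you can count the stored chunks directly. This is a minimal sketch assuming the defaults shown above: the `pgvector` container name and the `data_table_name` table.

```shell
# Count the rows (embedded chunks) written by the processor
podman exec -it pgvector psql -U postgres -c "SELECT count(*) FROM data_table_name;"
```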
When using Llama-Stack vector stores (Faiss or SQLite-vec), the embedding model path specified via the `-md` (or `--model-dir`) parameter is written into the generated `llama-stack.yaml` configuration file as an absolute path. This path is also registered in the llama-stack `kv_store` database.
When llama-stack later consumes the vector database, it reads the embedding model location from the `kv_store`. Therefore, the embedding model must be available at the exact same path that was specified during database creation.
Recommendation:
- Use absolute paths for the `-md` parameter to avoid ambiguity (e.g., `-md /app/embeddings` instead of `-md embeddings_model`).
- Alternatively, set `-md ''` (empty string) and use only the `-mn` flag with a HuggingFace model ID (e.g., `-md "" -mn sentence-transformers/all-mpnet-base-v2`). Setting `-md` to empty forces the tool to use the HuggingFace model ID instead of checking for a local directory. This allows llama-stack to download the model from HuggingFace automatically, making the vector database fully portable without path dependencies.
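For illustration, here is a sketch of the two recommended invocations, based on the Llama-Stack Faiss command shown later in this document; `...` stands for the remaining arguments such as `-o`, `-f`, and `-i`:

```shell
# Option 1: absolute model path, recorded in llama-stack.yaml and the kv_store
uv run ./custom_processor.py ... -md /app/embeddings -mn sentence-transformers/all-mpnet-base-v2 --vector-store-type=llamastack-faiss

# Option 2: no local model dir; llama-stack downloads the model from HuggingFace
uv run ./custom_processor.py ... -md "" -mn sentence-transformers/all-mpnet-base-v2 --vector-store-type=llamastack-faiss
```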
Important:
When using the `--auto-chunking` flag, chunking happens within llama-stack using the OpenAI-compatible Files API. This makes vector stores significantly larger than manual chunking because the Files API stores a redundant copy of the embeddings. Manual chunking results in smaller database files.
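A hedged sketch of how the flag would be added to a Llama-Stack invocation; `...` stands for the usual arguments shown in the examples below:

```shell
# Let llama-stack chunk the documents via its Files API instead of chunking locally
# (expect a noticeably larger vector store than with the default manual chunking)
uv run ./custom_processor.py ... --vector-store-type=llamastack-faiss --auto-chunking
```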
The process is essentially the same as for the Llama-Index Faiss Vector Store, except that you pass the `--vector-store-type` parameter. Generate the documentation using the `custom_processor.py` script from the earlier section (Generating the Vector Database):
```shell
uv run ./custom_processor.py \
    -o ./vector_db/custom_docs/0.1 \
    -f ./custom_docs/0.1/ \
    -md embeddings_model/ \
    -mn sentence-transformers/all-mpnet-base-v2 \
    -i custom_docs-0_1 \
    --vector-store-type=llamastack-faiss
```

Once the command is done, you can find the vector database (embedded with the registry metadata) at `./vector_db/custom_docs/0.1` under the name `faiss_store.db`, as well as a barebones llama-stack configuration file named `llama-stack.yaml`, provided for reference since it is not necessary for the final deployment.
The vector-io will be named `custom-docs-0_1`:

```yaml
providers:
  vector_io:
  - provider_id: custom-docs-0_1
    provider_type: inline::faiss
    config:
      kvstore:
        type: sqlite
        namespace: null
        db_path: /home/<user>/rag-content/vector_db/custom_docs/0.1/faiss_store.db
```

Once we have a database, we can use the script `query_rag.py` to check some results:
```shell
python scripts/query_rag.py \
    -p vector_db/custom_docs/0.1 \
    -x custom-docs-0_1 \
    -m embeddings_model \
    -k 5 \
    -q "how can I configure a cinder backend"
```

The process is the same as for the Llama-Stack Faiss store, except that you pass a different value to the `--vector-store-type` parameter. Generate the documentation using the `custom_processor.py` script from the earlier section (Generating the Vector Database):
```shell
uv run ./custom_processor.py \
    -o ./vector_db/custom_docs/0.1 \
    -f ./custom_docs/0.1/ \
    -md embeddings_model/ \
    -mn sentence-transformers/all-mpnet-base-v2 \
    -i custom_docs-0_1 \
    --vector-store-type=llamastack-sqlite-vec
```

Once the command is done, you can find the vector database at `./vector_db/custom_docs/0.1` under the name `sqlitevec_store.db`, as well as a barebones llama-stack configuration file named `llama-stack.yaml`, provided for reference since it is not necessary for the final deployment.
The vector-io will be named `custom-docs-0_1`:

```yaml
providers:
  vector_io:
  - provider_id: custom-docs-0_1
    provider_type: inline::sqlite-vec
    config:
      db_path: /home/<user>/rag-content/vector_db/custom_docs/0.1/sqlitevec_store.db
```

Once we have a database, we can use the script `query_rag.py` to check some results:
```shell
python scripts/query_rag.py \
    -p vector_db/custom_docs/0.1 \
    -x custom-docs-0_1 \
    -m embeddings_model \
    -k 5 \
    -q "how can I configure a cinder backend"
```

To generate a vector database stored in Postgres (PGVector) for Llama-Stack, run the following commands:
- Start Postgres with the pgvector extension by running:

  ```shell
  make start-postgres-debug
  ```
  The `data` folder of Postgres is created at `./postgresql/data`. Note that this command also creates `./output`, which is used by the Llama-Index version but not by the Llama-Stack version.
- Run:

  ```shell
  POSTGRES_USER=postgres \
  POSTGRES_PASSWORD=somesecret \
  POSTGRES_HOST=localhost \
  POSTGRES_PORT=15432 \
  POSTGRES_DATABASE=postgres \
  uv run python ./custom_processor.py \
      -o ./output \
      -f custom_docs/0.1/ \
      -md embeddings_model/ \
      -mn sentence-transformers/all-mpnet-base-v2 \
      -i custom_docs-0_1 \
      --vector-store-type llamastack-pgvector
  ```
  This generates the embeddings in PostgreSQL, where they can be used for RAG.
- When you run `query_rag.py` to check some results, specify these environment variables for database access:

  ```shell
  POSTGRES_USER=postgres \
  POSTGRES_PASSWORD=somesecret \
  POSTGRES_HOST=localhost \
  POSTGRES_PORT=15432 \
  POSTGRES_DATABASE=postgres \
  uv run python scripts/query_rag.py \
      -p vector_db/custom_docs/0.1 \
      -x custom-docs-0_1 \
      -m embeddings_model \
      -k 5 \
      -q "how can I configure a cinder backend"
  ```
The `uv.lock` lock file is used in this repository.

The lock file needs to be regenerated when new dependency updates are available. Use the following commands to do so:

```shell
uv lock --upgrade
uv sync
```
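After regenerating the lock file, you can re-run the import check from the installation section to confirm the environment still resolves:

```shell
uv run python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"
```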
Konflux builds run in hermetic mode (air-gapped from the internet), so all dependencies must be prefetched and locked. When you add or update dependencies, you need to regenerate the lock files.
Update these files when you:
- Add/remove/update Python packages in the project
- Add/remove/update RPM packages in the Containerfile
- Change the base image version
Quick command:

```shell
make konflux-requirements
```

This compiles Python dependencies from pyproject.toml using uv, splits packages by their source index (PyPI vs Red Hat's internal registry), and generates hermetic requirements files with pinned versions and hashes for Konflux builds.
Files produced:
- `requirements.hashes.source.txt` – PyPI packages with hashes
- `requirements.hashes.wheel.txt` – Red Hat registry packages with hashes
- `requirements.hashes.wheel.pypi.txt` – PyPI wheel packages with hashes
- `requirements-build.txt` – Build-time dependencies for source packages
The script also updates the Tekton pipeline configurations (`.tekton/lightspeed-stack-*.yaml`) with the list of pre-built wheel packages.
Prerequisites:
- Install rpm-lockfile-prototype
- Have an active RHEL subscription; get activation keys from the RH console
- Have `dnf` installed on your system
Steps:

- List your RPM packages in `rpms.in.yaml` under the `packages` field.
- If you changed the base image, extract its repo file:

  ```shell
  # UBI images
  podman run -it $BASE_IMAGE cat /etc/yum.repos.d/ubi.repo > ubi.repo

  # RHEL images, the current base image.
  podman run -it $BASE_IMAGE cat /etc/yum.repos.d/redhat.repo > redhat.repo
  ```

  If the repo file contains too many entries, you can filter them and keep only the required repositories. Use the following command to check the active repositories:

  ```shell
  dnf repolist
  ```

  Replace the architecture tag (`uname -m`) with `$basearch` so that rpm-lockfile-prototype can substitute the requested architecture names:

  ```shell
  sed -i "s/$(uname -m)/\$basearch/g" redhat.repo
  ```

- Generate the lock file:

  ```shell
  make konflux-rpm-lock
  ```

  This creates `rpms.lock.yaml` with pinned RPM versions.