This release completely refactors the directory structure of the repository for a more seamless and intuitive developer journey. It also adds support for deploying the latest accelerated embedding and reranking models across the cloud, data center, and workstation using NVIDIA NeMo Retriever NIM microservices.
Added
- End-to-end RAG examples enhancements
  - Single-command deployment for all the examples using Docker Compose.
  - All end-to-end RAG examples are now better encapsulated, with documentation, code, and deployment assets residing in a dedicated example-specific directory.
  - Segregated examples into basic and advanced RAG with dedicated READMEs.
  - Added reranker model support to the multi-turn RAG example.
  - Added a dedicated prompt configuration file for every example.
  - Removed Python dev packages from containers to enhance security.
  - Updated to the latest version of langchain-nvidia-ai-endpoints.
- Speech support using RAG Playground
  - Added support to access RIVA speech models from NVIDIA API Catalog.
  - Speech support in RAG Playground is opt-in.
- Documentation enhancements
  - Added more comprehensive how-to guides for end-to-end RAG examples.
  - Added example-specific architecture diagrams in each example directory.
- Added a new industry-specific top-level directory
- Added notebooks showcasing new use cases
  - Basic LangChain-based RAG pipeline using the latest NVIDIA API Catalog connectors.
  - Basic LlamaIndex-based RAG pipeline using the latest NVIDIA API Catalog connectors.
  - NeMo Guardrails with basic LangChain RAG.
  - NVIDIA NIM microservices using NeMo Guardrails based RAG.
  - Using NeMo Evaluator with Llama 3.1 8B Instruct.
  - Agentic RAG pipeline with NeMo Retriever and NIM for LLMs.
- Added new `community` (previously `experimental`) example
  - Create a simple web interface to interact with different selectable NIM endpoints. The interface provided by this project supports designing a system prompt to call the LLM.
Changed
- Major restructuring and reorganisation of the assets within the repository
  - Top level `experimental` directory has been renamed to `community`.
  - Top level `RetrievalAugmentedGeneration` directory has been renamed to just `RAG`.
  - The Docker Compose files inside the top level `deploy` directory have been migrated to example-specific directories under `RAG/examples`. The vector database and on-prem NIM microservices deployment files are under `RAG/examples/local_deploy`.
  - Top level `models` has been renamed to `finetuning`.
  - Top level `notebooks` directory has been moved under `RAG/notebooks` and has been organised by framework.
  - Top level `tools` directory has been migrated to `RAG/tools`.
  - Top level `integrations` directory has been moved into `RAG/src`.
  - `RetrievalAugmentedGeneration/common` now resides under `RAG/src/chain_server`.
  - `RetrievalAugmentedGeneration/frontend` now resides under `RAG/src/rag_playground/default`.
  - The `5 mins RAG No GPU` example under the top level `examples` directory is now under `community`.
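For scripts or links that still point at the old layout, the renames above can be sketched as a small prefix table. This is only an illustrative helper, not part of the repository: the names `PATH_MOVES` and `translate` are made up here, and it assumes files kept their relative paths below each moved directory, which these notes do not guarantee. The `deploy`, `notebooks`, and `integrations` moves are left out because their contents were redistributed or reorganised rather than renamed one-to-one.

```python
from pathlib import PurePosixPath

# Old location -> new location, transcribed from the "Changed" list.
# More specific prefixes come first so they win over the general rename.
# Assumption: files below each directory kept their relative paths.
PATH_MOVES = [
    ("RetrievalAugmentedGeneration/common", "RAG/src/chain_server"),
    ("RetrievalAugmentedGeneration/frontend", "RAG/src/rag_playground/default"),
    ("RetrievalAugmentedGeneration", "RAG"),
    ("experimental", "community"),
    ("models", "finetuning"),
    ("tools", "RAG/tools"),
]


def translate(old_path: str) -> str:
    """Map a pre-release repository path to its new location.

    Paths that match no known prefix are returned unchanged.
    """
    parts = PurePosixPath(old_path).parts
    for old, new in PATH_MOVES:
        old_parts = PurePosixPath(old).parts
        # Compare whole path components, so "modelsX" does not match "models".
        if parts[: len(old_parts)] == old_parts:
            return str(PurePosixPath(new, *parts[len(old_parts):]))
    return old_path


if __name__ == "__main__":
    # The file names below are hypothetical, chosen only for illustration.
    for path in ("experimental/demo/app.py",
                 "RetrievalAugmentedGeneration/common/utils.py"):
        print(path, "->", translate(path))
```

Matching on whole path components rather than raw string prefixes keeps unrelated directories that merely share a leading substring from being rewritten.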
Deprecated
- GitHub Pages based documentation is now replaced with Markdown-based documentation.
- Top level `examples` directory has been removed.
- The following notebooks were removed