A modular Python framework for building natural language web applications with vector database retrieval, LLM-based ranking, and multiple protocol support (HTTP, MCP, A2A).
## Features

- Modular Package Architecture: Install only what you need - core, network, data loading, and provider packages
- Vector Database Support: Azure AI Search, Qdrant, Elasticsearch, OpenSearch, PostgreSQL, Milvus, Snowflake, and more
- Multiple LLM Providers: OpenAI, Azure OpenAI, Anthropic, Google Gemini, Hugging Face, and more
- Multiple Embedding Providers: OpenAI, Azure OpenAI, Google Gemini, Ollama, Snowflake
- Flexible Authentication: API key and Azure Managed Identity support for Azure services
- Multiple Protocol Support:
  - HTTP: REST API with JSON and Server-Sent Events (SSE) streaming
  - MCP (Model Context Protocol): JSON-RPC 2.0 for AI model integration
  - A2A (Agent-to-Agent): JSON-RPC 2.0 for multi-agent communication
  - NLWeb Protocol v0.5: Standardized response format with metadata and content
## Installation

```bash
# Core packages
pip install nlweb-dataload   # Standalone data loading tools
pip install nlweb-core      # Core framework
pip install nlweb-network   # Network interfaces (HTTP/MCP/A2A)

# Provider packages (optional)
pip install nlweb-azure-vectordb  # Azure AI Search
pip install nlweb-azure-models    # Azure OpenAI

# Or install bundles
pip install nlweb-retrieval  # All vector database providers
pip install nlweb-models     # All LLM/embedding providers
```

Or install from source:

```bash
git clone https://github.com/nlweb-ai/NLWeb_Core.git
cd NLWeb_Core
pip install -e packages/dataload
pip install -e packages/core
pip install -e packages/network
```

## Configuration

Create `config.yaml`:
```yaml
# Vector database
retrieval:
  nlweb_azure:
    enabled: true
    api_key_env: AZURE_VECTOR_SEARCH_API_KEY
    api_endpoint_env: AZURE_VECTOR_SEARCH_ENDPOINT
    index_name: your-index-name
    db_type: azure_ai_search

# LLM provider
llm:
  azure_openai:
    llm_type: azure_openai
    api_key_env: AZURE_OPENAI_API_KEY
    endpoint_env: AZURE_OPENAI_ENDPOINT
    api_version: "2024-10-21"
    models:
      high: gpt-4o
      low: gpt-4o-mini

# Embedding provider
embedding:
  azure_openai:
    embedding_type: azure_openai
    api_key_env: AZURE_OPENAI_API_KEY
    endpoint_env: AZURE_OPENAI_ENDPOINT
    api_version: "2024-10-21"
    deployment_name: text-embedding-3-large

# Server
server:
  host: localhost
  port: 8080
  enable_cors: true
```

Set the environment variables referenced in the config:

```bash
export AZURE_VECTOR_SEARCH_API_KEY=your_key
export AZURE_VECTOR_SEARCH_ENDPOINT=https://your-search.search.windows.net
export AZURE_OPENAI_API_KEY=your_key
export AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com
```

## Running the Server

```bash
# Using the network package
python -m nlweb_network.server.main config.yaml
```

Or programmatically:

```python
from nlweb_core import init
from nlweb_network.server import main

init('config.yaml')
main()
```
## HTTP API

```bash
# Health check
curl http://localhost:8080/health

# Non-streaming query
curl "http://localhost:8080/ask?query=pasta+recipes&streaming=false&num_results=5"

# Streaming query (SSE)
curl "http://localhost:8080/ask?query=pasta+recipes&streaming=true"

# With site filter
curl "http://localhost:8080/ask?query=spicy+snacks&site=seriouseats"

# POST request
curl -X POST http://localhost:8080/ask \
  -H 'Content-Type: application/json' \
  -d '{"query": "best pizza recipes", "num_results": 10}'
```
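The same endpoints can be driven from Python. Below is a minimal sketch using only the standard library; the helper names (`build_ask_url`, `ask`, `parse_sse_lines`) are illustrative, not part of the framework, and the SSE handling assumes conventional `data: <json>` event lines rather than any NLWeb-specific framing:

```python
import json
import urllib.parse
import urllib.request

def build_ask_url(base, query, **params):
    """Build an /ask URL like the curl examples above."""
    qs = urllib.parse.urlencode({"query": query, **params})
    return f"{base}/ask?{qs}"

def ask(base, query, **params):
    """Non-streaming query: fetch and parse the JSON response."""
    url = build_ask_url(base, query, streaming="false", **params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def parse_sse_lines(lines):
    """Pull JSON payloads out of 'data: ...' lines of an SSE stream."""
    events = []
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):].strip()))
    return events

if __name__ == "__main__":
    # Mirrors: curl ".../ask?query=pasta+recipes&streaming=false&num_results=5"
    print(ask("http://localhost:8080", "pasta recipes", num_results=5))
```

For streaming, open the same URL with `streaming=true` and feed the response lines to `parse_sse_lines` as they arrive.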
## MCP (Model Context Protocol)

```bash
# List available tools
curl -X POST http://localhost:8080/mcp \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/list",
    "id": 1
  }'

# Call a tool
curl -X POST http://localhost:8080/mcp \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "ask",
      "arguments": {
        "query": "healthy breakfast recipes",
        "num_results": 5
      }
    },
    "id": 2
  }'
```
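Since both the MCP and A2A endpoints speak JSON-RPC 2.0 over HTTP POST, any HTTP client can drive them. A standard-library sketch; the helper names here are illustrative, not part of the framework:

```python
import json
import urllib.request

def jsonrpc_body(method, params=None, req_id=1):
    """Build a JSON-RPC 2.0 request body as a dict."""
    body = {"jsonrpc": "2.0", "method": method, "id": req_id}
    if params is not None:
        body["params"] = params
    return body

def jsonrpc_call(url, method, params=None, req_id=1):
    """POST a JSON-RPC request and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(jsonrpc_body(method, params, req_id)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Same request as the tools/call curl example above
    print(jsonrpc_call(
        "http://localhost:8080/mcp",
        "tools/call",
        {"name": "ask",
         "arguments": {"query": "healthy breakfast recipes", "num_results": 5}},
        req_id=2,
    ))
```

The same helper works against the `/a2a` endpoint with the `agent/card` and `message/send` methods.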
## A2A (Agent-to-Agent)

```bash
# Get agent card
curl -X POST http://localhost:8080/a2a \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "method": "agent/card",
    "id": 1
  }'

# Send message to agent
curl -X POST http://localhost:8080/a2a \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "method": "message/send",
    "params": {
      "message": {
        "role": "user",
        "parts": [{"kind": "text", "text": "vegan desserts"}]
      }
    },
    "id": 2
  }'
```

## Data Loading

Use nlweb-dataload to index content into your vector database:
```bash
# Load from JSON file
python -m nlweb_dataload.load_from_json \
  --json-file recipes.json \
  --config config.yaml

# Load from sitemap
python -m nlweb_dataload.load_from_sitemap \
  --sitemap-url https://example.com/sitemap.xml \
  --config config.yaml
```

See the dataload README for more details.
## Authentication

API key authentication (default):

```yaml
nlweb_azure:
  enabled: true
  api_key_env: AZURE_VECTOR_SEARCH_API_KEY
  api_endpoint_env: AZURE_VECTOR_SEARCH_ENDPOINT
  index_name: embeddings
  db_type: azure_ai_search
```

Azure Managed Identity:

```yaml
nlweb_azure:
  enabled: true
  api_endpoint_env: AZURE_VECTOR_SEARCH_ENDPOINT
  index_name: embeddings
  db_type: azure_ai_search
  auth_method: azure_ad  # Uses DefaultAzureCredential

azure_openai:
  llm_type: azure_openai
  endpoint_env: AZURE_OPENAI_ENDPOINT
  api_version: "2024-10-21"
  auth_method: azure_ad
  models:
    high: gpt-4o
    low: gpt-4o-mini
```

## Packages

- nlweb-dataload: Standalone data loading tools (no dependencies on other NLWeb packages)
- nlweb-core: Core framework and abstractions
- nlweb-network: HTTP/MCP/A2A server and interfaces
- nlweb-azure-vectordb: Azure AI Search provider
- nlweb-azure-models: Azure OpenAI LLM and embedding providers
- nlweb-retrieval: Bundle of all retrieval providers
- nlweb-models: Bundle of all LLM and embedding providers
## API Reference

HTTP endpoints:

- `GET/POST /ask` - Query endpoint
  - Parameters: `query` (required), `site`, `num_results`, `streaming`, `db`
- `GET /health` - Health check

MCP methods:

- `tools/list` - List available tools
- `tools/call` - Call a tool with arguments

A2A methods:

- `agent/card` - Get agent capabilities
- `message/send` - Send message to agent

Supported vector databases:
- Azure AI Search
- Qdrant (local or cloud)
- Elasticsearch
- OpenSearch
- PostgreSQL with pgvector
- Milvus
- Snowflake Cortex Search
- Cloudflare AutoRAG
- Bing Search API
Supported LLM providers:

- OpenAI
- Azure OpenAI
- Anthropic (Claude)
- Google Gemini
- Hugging Face
- Azure Llama
- Azure DeepSeek
- Snowflake Cortex
- Inception
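Other providers plug into the same configuration shape shown earlier. The fragment below is a hypothetical sketch: the block names and the `db_type`/`llm_type` values (`qdrant`, `openai`) are assumptions; check each provider package's documentation for the actual keys and values.

```yaml
retrieval:
  my_qdrant:
    enabled: true
    db_type: qdrant          # assumed value; verify against the provider package
    api_endpoint_env: QDRANT_URL
    index_name: embeddings

llm:
  openai:
    llm_type: openai         # assumed value
    api_key_env: OPENAI_API_KEY
    models:
      high: gpt-4o
      low: gpt-4o-mini
```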
## Testing

We provide comprehensive test scripts to verify end-to-end functionality:

```bash
# Test all protocols (HTTP, MCP, A2A) from PyPI packages
./scripts/test_packages.sh

# Test from TestPyPI
./scripts/test_from_testpypi.sh

# Test from production PyPI
./scripts/test_from_pypi.sh
```

See scripts/ for all available test scripts.
## Examples

Complete examples are available in the examples/ directory:
- azure_hello_world: Minimal Azure setup with API key auth
- azure_managed_identity: Using Azure Managed Identity
- multi_provider: Multiple LLM and vector database providers
## Development

```bash
# Install with dev dependencies
pip install -e "packages/core[dev]"

# Run tests
pytest

# Format and lint
black packages/
flake8 packages/
```

## Publishing

See scripts/setup_pypi.md for detailed instructions.

```bash
# Upload to TestPyPI
./scripts/upload_to_pypi.sh test

# Upload to production PyPI
./scripts/upload_to_pypi.sh prod
```

## License

MIT License - see LICENSE file for details.
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

For issues and questions, please open an issue on GitHub.