Basic RAG chat sample app with a ChatGPT-style UI.
- everything runs locally
  - Ollama llama3.1 as the local LLM
  - Azure AI Search Emulator for local search
- LiteLLM
  - as an adapter for various LLMs
- minimal sample code
  - backend: Python FastAPI
  - frontend: plain HTML (instead of a React stack)
  - two types: a simple response and a streamed response like ChatGPT
- Chainlit Python low-code UI
 
- start AzureSearchEmulator (see setup below)

```sh
cd AzureSearchEmulator
docker compose up -d
docker compose logs -f
```

- set up the Ollama local LLM (see setup below)
- start the dev server

```sh
poetry install
./start_devserver.sh
```

- Simple Chat: http://127.0.0.1:8000/static/index.html
- Stream Chat: http://127.0.0.1:8000/static/chat-stream.html
- Chainlit UI: http://127.0.0.1:8000/chainlit/
- FastAPI Doc: http://127.0.0.1:8000/docs
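
The stream endpoint above can also be consumed from Python. Below is a minimal sketch using only the standard library; it assumes the response body arrives line by line (the sample app's actual chunk framing may differ):

```python
# hypothetical client for the /chat-stream endpoint above;
# the line-by-line framing is an assumption, not this repo's contract
import json
import urllib.request

def stream_chat(user_input, url="http://127.0.0.1:8000/chat-stream"):
    req = urllib.request.Request(
        url,
        data=json.dumps({"input": user_input}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        for chunk in resp:  # yield raw lines as they arrive
            yield chunk.decode()

# usage (requires the dev server running):
#   for chunk in stream_chat("hello"):
#       print(chunk, end="")
```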
 
```sh
# open with VSCode
poetry shell
export REQUESTS_CA_BUNDLE=~/.aspnet/https/certificate.pem
code .
```

- local Docker server

```sh
docker build -t rag-chat-app .
docker run --rm --env-file=.env.docker -p 8000:8000 rag-chat-app
```

- try the endpoints with curl

```sh
curl -v -c cookies.txt -X POST "http://127.0.0.1:8000/chat-stream" \
  -H "Content-Type: application/json" -d '{"input":"hello"}'
curl -v -b cookies.txt -X GET \
  "http://127.0.0.1:8000/chat-history?session_id=521b158d-9daa-4a70-b419-1074cef0c768"
```

- system role message: sets the context and guides the model
- adding a placeholder for the assistant can be good practice
  - it is common to include a placeholder message with an empty content string,
    especially if you want to clearly indicate that the assistant's response is expected next
 
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the weather today?"},
    {"role": "assistant", "content": "It's sunny and warm."},
    {"role": "user", "content": "What about tomorrow?"},  # user's last input
    {"role": "assistant", "content": ""},  # placeholder for the next assistant response
]
```

- FastAPI: https://github.com/fastapi/fastapi
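
As an illustration of the placeholder convention, here is a small hypothetical helper (not part of this repo) that fills the empty assistant slot once the model's reply arrives:

```python
# illustrative helper, names are not this repo's API
def fill_placeholder(messages, assistant_reply):
    """Fill a trailing empty assistant placeholder with the actual reply."""
    last = messages[-1]
    if last["role"] == "assistant" and last["content"] == "":
        last["content"] = assistant_reply  # reuse the placeholder slot
    else:
        messages.append({"role": "assistant", "content": assistant_reply})
    return messages

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What about tomorrow?"},
    {"role": "assistant", "content": ""},  # placeholder
]
fill_placeholder(history, "Tomorrow looks cloudy.")
```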
 
```sh
pyenv install 3.12.5
pyenv local 3.12.5
poetry init -n
poetry config virtualenvs.in-project true --local
# notebook cells for dev convenience
poetry add -G dev ipykernel
# Notes:
# fastapi includes httpx (cf. poetry show fastapi)
# pydantic-settings includes python-dotenv (cf. poetry show pydantic-settings)
poetry add "fastapi[standard]" pydantic-settings
poetry add litellm azure-search-documents
```

- linter, formatter, type checker (via VSCode extensions)
  - linter: flake8
  - formatter: black, isort
  - type checker: Pyright (included in Pylance) instead of mypy
    - set `"python.analysis.typeCheckingMode": "basic"` in `.vscode/settings.json`
  - https://blog.yhiraki.com/nodes/type-checking-with-pyright/
 
 
- pytest

```sh
poetry add pytest pytest-cov pytest-asyncio pytest-mock --group dev
# to debug in case of ModuleNotFoundError (cf. pyproject.toml ini_options)
poetry run pytest ./tests --collect-only
# run tests
poetry run pytest -vv ./tests
# coverage report
poetry run pytest --cov=src --cov-report=html ./tests
open htmlcov/index.html
```

- Ollama: https://github.com/ollama/ollama
- used with LiteLLM-Ollama: https://docs.litellm.ai/docs/providers/ollama
 
 
```sh
brew install ollama
ollama run llama3.1  # or phi3 or whatever you prefer
# >>> Hello!
# Hello there! How can I help you today?
# >>> /bye
```

- drawdown.js: a markdown-to-HTML converter

```sh
curl -OL https://raw.githubusercontent.com/adamvleggett/drawdown/refs/heads/master/drawdown.js
```

- Azure Search Emulator: https://github.com/feature23/AzureSearchEmulator
 - dotnet-sdk: https://formulae.brew.sh/cask/dotnet-sdk
 - setup local https: https://qiita.com/j_kitayama_hoge000/items/26cd7a5ef0b2fac53fce
 
```sh
dotnet dev-certs https --check
# maybe you will need this
dotnet dev-certs https --trust
# create new certs (the path can be different)
dotnet dev-certs https -ep ~/.aspnet/https/aspnetapp.pfx -p password
```

- clone AzureSearchEmulator and edit docker-compose.yml
  - update ASPNETCORE_Kestrel__Certificates__Default__Path and volumes
 
```yaml
services:
  web:
    build: .
    ports:
      - 5080:80
      - 5081:443
    environment:
      - ASPNETCORE_URLS=https://+;http://+
      - ASPNETCORE_HTTPS_PORT=5081
      - ASPNETCORE_Kestrel__Certificates__Default__Password=password
      - ASPNETCORE_Kestrel__Certificates__Default__Path=/https/aspnetapp.pfx
    volumes:
      - indexes:/app/indexes
      - ~/.aspnet/https:/https:ro
volumes:
  indexes:
```

- start the server and try it with curl

```sh
docker compose up -d
docker compose logs -f
# you should see some JSON in the curl output
curl https://localhost:5081/
```

- convert the pfx to pem for Python (macOS users)
 
```sh
cd ~/.aspnet/https
openssl pkcs12 -in aspnetapp.pfx -out certificate.pem -nodes
# Python requires this environment variable
export REQUESTS_CA_BUNDLE=~/.aspnet/https/certificate.pem
```

- in case of trouble, you may want to debug with Postman or Insomnia
- (another option?): https://github.com/tomasloksa/azure-search-emulator
- Note: collection/ComplexField are not implemented in AzureSearchEmulator
- make sure the Jupyter extension is installed in your VSCode
- open src/notes/azure-ai-search-notes.py in VSCode and run it
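
One way to smoke-check the emulator from Python without azure-search-documents is to call its REST endpoint directly with the standard library. A sketch, assuming the endpoint and PEM path from the setup above (the api-version is a guess and may need adjusting for the emulator):

```python
# hypothetical smoke check against the local emulator's REST API;
# endpoint, api-version, and PEM path are assumptions from the setup above
import json
import os
import ssl
import urllib.request

def list_indexes(endpoint="https://localhost:5081", api_version="2020-06-30"):
    ctx = ssl.create_default_context()
    pem = os.path.expanduser("~/.aspnet/https/certificate.pem")
    if os.path.exists(pem):
        ctx.load_verify_locations(cafile=pem)  # trust the local dev cert
    url = f"{endpoint}/indexes?api-version={api_version}"
    with urllib.request.urlopen(url, context=ctx, timeout=5) as resp:
        return json.loads(resp.read())

# usage (requires the emulator running):
#   print(list_indexes())
```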
 
- Chainlit is one of the popular Python UI libraries
- a naive `poetry add chainlit` fails with "version solving failed"
  - FastAPI needs starlette >=0.37.2,<0.39.0 (poetry show fastapi)
  - Chainlit needs starlette >=0.37.2,<0.38.0
  - Chainlit needs fastapi >=0.110.1,<0.113 (poetry show chainlit)

```sh
# need to remove the current fastapi
# because chainlit (1.2.0) depends on fastapi (>=0.110.1,<0.113)
poetry remove fastapi
poetry show starlette
# Package starlette not found
# this would fail with "version solving failed":
# poetry add "fastapi[standard]" chainlit
# install with an explicit compatible version instead
poetry add "fastapi[standard]"@^0.112.0 chainlit
# this ensures all dependencies are resolved properly
poetry update
# verify the versions of fastapi and its dependencies
poetry show fastapi
# verify the versions of installed packages
poetry show
```

```sh
poetry show --outdated
poetry update
```