Intaste — Intelligent Assistive Search Technology

An open platform for intelligent, assistive, and human-centered search

Intaste is an open-source platform that combines enterprise search with intelligent assistance. Designed with a human-centered philosophy, it keeps users in control while providing AI-powered intent extraction and evidence-based guidance. Search results from Fess serve as transparent evidence, and the LLM is limited to assisting the search rather than replacing it. The UI is built with Next.js and the API with FastAPI.


1. Overview

  • Architecture: intaste-ui (Next.js) / intaste-api (FastAPI) / fess (search) / opensearch (backend for Fess) / ollama (LLM)
  • Principle: Intaste never accesses OpenSearch directly; all search requests go through the Fess REST/OpenAPI interface
  • Default Model: Ollama gpt-oss (configurable)
  • License: Apache License 2.0

2. Requirements

  • Docker 24+ / Docker Compose v2+
  • CPU x86_64 (arm64 also works, depending on Ollama model compatibility)
  • Memory: 6–8GB recommended (including OpenSearch/Fess/Ollama)

3. Quick Start (5 Minutes)

# 1) Clone repository
$ git clone https://github.com/codelibs/intaste.git
$ cd intaste

# 2) Setup environment variables
$ cp .env.example .env
$ sed -i.bak \
  -e "s/INTASTE_API_TOKEN=.*/INTASTE_API_TOKEN=$(openssl rand -hex 24)/" \
  -e "s/INTASTE_UID=.*/INTASTE_UID=$(id -u)/" \
  -e "s/INTASTE_GID=.*/INTASTE_GID=$(id -g)/" \
  .env

# 3) Initialize data directories (Linux only)
$ make init-dirs
# Note: macOS/Windows users can skip this step

# 4) Start services (build on first run)
$ docker compose up -d --build

# 5) Pull LLM model (first time only)
$ docker compose exec ollama ollama pull gpt-oss

# 6) Health check
$ curl -fsS http://localhost:3000 > /dev/null && echo 'UI OK'
$ curl -fsS http://localhost:8000/api/v1/health && echo 'API OK'

# 7) Access in browser
# http://localhost:3000

Note: Initial startup of OpenSearch/Fess may take several minutes. Compose uses depends_on + healthcheck to control startup order.
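
Because the first startup can take several minutes, it can help to poll the two health-check URLs from step 6 instead of retrying curl by hand. The following is a minimal sketch under stated assumptions, not part of Intaste: the helper names `http_ok` and `wait_until` are hypothetical, and the default timeout and interval are arbitrary.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and 4xx/5xx responses all count as "not up yet".
        return False


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 300.0,   # assumed budget, adjust to your machine
    interval_s: float = 5.0,    # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

For example, `wait_until(lambda: http_ok("http://localhost:8000/api/v1/health"))` returns True once the API answers, and the same call with `http://localhost:3000` covers the UI.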


4. Health Checks

Intaste provides multiple health check endpoints for monitoring and orchestration:

4.1 Basic Health Check

curl http://localhost:8000/api/v1/health
# Returns: {"status":"ok"}

4.2 Liveness Probe (Kubernetes)

curl http://localhost:8000/api/v1/health/live
# Returns: {"status":"ok"}
  • Use for Kubernetes livenessProbe
  • Checks if the process is alive
  • Does NOT check dependencies

4.3 Readiness Probe (Kubernetes)

curl http://localhost:8000/api/v1/health/ready
# Returns: {"status":"ready"} or {"status":"not_ready"}
  • Use for Kubernetes readinessProbe
  • Checks if service is ready to accept traffic
  • Verifies Fess and Ollama are healthy
  • Returns HTTP 503 if not ready
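
A client that consumes the readiness endpoint outside Kubernetes has to combine the HTTP status with the JSON body. The sketch below shows one cautious way to do that. The function name `is_ready` is hypothetical, and it assumes that a ready service answers with HTTP 200, which the text above implies but does not state.

```python
import json


def is_ready(http_status: int, body: str) -> bool:
    """Treat the service as ready only for HTTP 200 with {"status": "ready"}."""
    if http_status != 200:  # assumption: 200 is the "ready" status code
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A non-JSON body is treated as "not ready" rather than raising.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ready"
```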

4.4 Detailed Health Check

curl http://localhost:8000/api/v1/health/detailed | jq .

Example response:

{
  "status": "healthy",
  "timestamp": "2025-01-10T12:34:56.789Z",
  "version": "0.1.0",
  "dependencies": {
    "fess": {
      "status": "healthy",
      "response_time_ms": 45,
      "error": null
    },
    "ollama": {
      "status": "healthy",
      "response_time_ms": 123,
      "error": null
    }
  }
}

Status values:

  • healthy - All dependencies are healthy
  • degraded - Some dependencies are degraded, but the service is still operational
  • unhealthy - Critical dependencies are down

See intaste-api/kubernetes-example.yaml for Kubernetes deployment configuration.
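
For monitoring scripts, the detailed response can be reduced to the list of dependencies that are not healthy. This is a minimal sketch that assumes the response has exactly the shape of the example above; `failing_dependencies` is a hypothetical helper, not an Intaste API.

```python
from typing import Any, Dict


def failing_dependencies(report: Dict[str, Any]) -> Dict[str, str]:
    """Map each non-healthy dependency to its error text, or its status if no error is given."""
    failing: Dict[str, str] = {}
    for name, info in report.get("dependencies", {}).items():
        if info.get("status") != "healthy":
            failing[name] = info.get("error") or info.get("status", "unknown")
    return failing
```

An empty result means every listed dependency reported `healthy`; anything else names the dependency to inspect first.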


5. Testing the System

5.1 API Smoke Test

# Use INTASTE_API_TOKEN from .env for X-Intaste-Token header
TOKEN=$(grep '^INTASTE_API_TOKEN=' .env | cut -d= -f2-)

curl -sS -H "X-Intaste-Token: $TOKEN" \
     -H 'Content-Type: application/json' \
     -X POST http://localhost:8000/api/v1/assist/query \
     -d '{"query":"What is the latest security policy?"}' | jq .
  • The test succeeds if the response contains answer.text with [1][2]… style citations.
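
The success criterion can be checked mechanically by looking for bracketed citation numbers in the answer text. The sketch below assumes only that answer.text is a plain string containing markers such as [1] and [2]; `cited_indices` is a hypothetical helper.

```python
import re
from typing import List

# Matches citation markers such as [1] or [12].
CITATION_RE = re.compile(r"\[(\d+)\]")


def cited_indices(answer_text: str) -> List[int]:
    """Return the distinct citation numbers in order of first appearance."""
    seen: List[int] = []
    for match in CITATION_RE.finditer(answer_text):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen
```

A smoke test could then assert that `cited_indices(text)` is non-empty for a query known to have indexed documents.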

5.2 Model List / Selection

curl -sS -H "X-Intaste-Token: $TOKEN" http://localhost:8000/api/v1/models | jq .
# Example: {"default":"gpt-oss","available":["gpt-oss","mistral","llama3"]}

curl -sS -H "X-Intaste-Token: $TOKEN" \
     -H 'Content-Type: application/json' \
     -X POST http://localhost:8000/api/v1/models/select \
     -d '{"model":"mistral","scope":"session","session_id":"00000000-0000-0000-0000-000000000000"}'
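
The example request uses an all-zero session ID as a placeholder. A client would more likely generate one ID per session. The sketch below builds the same JSON body and assumes, based only on the shape of the example, that session_id is a UUID; `build_model_selection` is a hypothetical helper, and whether the API validates the ID is not stated here.

```python
import json
import uuid
from typing import Optional


def build_model_selection(model: str, session_id: Optional[str] = None) -> str:
    """Build the JSON body for POST /api/v1/models/select with session scope."""
    # Normalize a supplied ID (raises ValueError if it is not a UUID),
    # or generate a random one when none is given.
    sid = str(uuid.UUID(session_id)) if session_id else str(uuid.uuid4())
    return json.dumps({"model": model, "scope": "session", "session_id": sid})
```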

5.3 Streaming Responses (SSE)

Intaste streams LLM responses in real-time using Server-Sent Events (SSE).

Using the UI:

  • Streaming is enabled by default
  • Answer text appears incrementally as it's generated
  • Citations are displayed as soon as search completes

Testing the API:

# Server-Sent Events (SSE) endpoint
# -N disables curl's output buffering so events print as they arrive
curl -sS -N -H "X-Intaste-Token: $TOKEN" \
     -H 'Content-Type: application/json' \
     -X POST http://localhost:8000/api/v1/assist/query \
     -d '{"query":"What is the latest security policy?"}'

# Event stream format:
# event: start
# data: {"message":"Processing query..."}
#
# event: intent
# data: {"normalized_query":"...","filters":{...}}
#
# event: citations
# data: {"citations":[...]}
#
# event: chunk
# data: {"text":"Answer text..."}
#
# event: complete
# data: {"answer":{...},"session":{...},"timings":{...}}

Note: All queries use streaming by default. The unified /api/v1/assist/query endpoint supports SSE for real-time updates.
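
To consume the stream programmatically, a client has to split it into events and decode each data payload. The following is a minimal sketch of an SSE line parser for the event format shown above. It assumes every data payload is JSON and that concatenating the text fields of the chunk events yields the answer; `parse_sse` and `collect_answer` are hypothetical names, not part of Intaste.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, decoded_json_data) for each event in an SSE line stream."""
    event = "message"  # SSE default when no "event:" field is present
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, a single leading space after the colon is dropped.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Flush a final event that was not followed by a blank line.
        yield event, json.loads("\n".join(data_parts))


def collect_answer(events: Iterable[Tuple[str, Any]]) -> str:
    """Concatenate the text of all chunk events (assumed to be incremental fragments)."""
    return "".join(data["text"] for name, data in events if name == "chunk")
```

The parser takes any iterable of text lines, so it can be fed from a file, a list in a test, or the decoded lines of a streaming HTTP response.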


6. Development Mode (Hot Reload)

# Layer the dev compose file on top of the base file
$ docker compose -f compose.yaml -f compose.dev.yaml up -d --build

# Follow logs
$ docker compose logs -f intaste-api intaste-ui
  • intaste-api: uvicorn --reload
  • intaste-ui: npm run dev -- -p 3000

7. Configuration (.env)

| Variable | Default | Description |
|---|---|---|
| INTASTE_API_TOKEN | (none) | UI→API authentication key (required) |
| INTASTE_DEFAULT_MODEL | gpt-oss | Default Ollama model |
| INTASTE_SEARCH_PROVIDER | fess | Search provider (v0.1 supports fess only) |
| INTASTE_LLM_PROVIDER | ollama | LLM provider (v0.1 supports ollama only) |
| FESS_BASE_URL | http://intaste-fess:8080 | Internal URL for API to call Fess |
| OLLAMA_BASE_URL | http://intaste-ollama:11434 | Internal URL for API to call Ollama |
| INTASTE_LLM_WARMUP_ENABLED | true | Preload model on startup for faster first requests |
| INTASTE_UID | 1000 | Docker user ID for file permissions |
| INTASTE_GID | 1000 | Docker group ID for file permissions |
| NEXT_PUBLIC_API_BASE | /api/v1 | API base path from UI |
| REQ_TIMEOUT_MS | 15000 | Total request timeout budget (ms) |
| TZ | UTC | Timezone |

Security: Set INTASTE_API_TOKEN to a sufficiently long random value.
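
On systems without openssl or GNU-compatible sed, the token step from the Quick Start can be reproduced with the Python standard library. This is a minimal sketch: `new_api_token` and `set_env_value` are hypothetical helpers, and the 24-byte length simply mirrors `openssl rand -hex 24`.

```python
import re
import secrets


def new_api_token(n_bytes: int = 24) -> str:
    """Return a random hex token (2 hex characters per byte), like `openssl rand -hex 24`."""
    return secrets.token_hex(n_bytes)


def set_env_value(env_text: str, key: str, value: str) -> str:
    """Replace every `KEY=...` line in .env-style text, or append one if the key is missing."""
    pattern = re.compile(rf"^{re.escape(key)}=.*$", flags=re.MULTILINE)
    line = f"{key}={value}"
    if pattern.search(env_text):
        # A function replacement avoids backslash interpretation in `value`.
        return pattern.sub(lambda _match: line, env_text)
    separator = "" if env_text.endswith("\n") or not env_text else "\n"
    return f"{env_text}{separator}{line}\n"
```

Reading `.env`, passing its contents through `set_env_value(text, "INTASTE_API_TOKEN", new_api_token())`, and writing the result back has the same effect as the first sed expression in the Quick Start.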


8. Directory Structure

intaste/
├─ compose.yaml                # Production deployment
├─ compose.dev.yaml            # Development (hot reload)
├─ compose.gpu.yaml            # GPU support configuration
├─ compose.test.yaml           # Docker-based testing
├─ .env.example                # Environment variables sample
├─ Makefile                    # Common commands (up/down/logs/test)
├─ intaste-ui/                  # Next.js (App Router)
│   ├─ app/                    # Pages
│   ├─ src/                    # State/Components/Libs
│   │   ├─ components/         # UI components (answer/history/input/sidebar/common)
│   │   ├─ libs/               # Utilities (apiClient/streamingClient/sanitizer)
│   │   ├─ store/              # Zustand state management
│   │   └─ types/              # TypeScript type definitions
│   └─ Dockerfile
├─ intaste-api/                 # FastAPI
│   ├─ app/                    # Routers/Services
│   ├─ core/                   # LLM/Search provider abstractions
│   └─ Dockerfile
└─ docs/                       # Comprehensive design documentation

9. Using the UI

  1. Enter a natural language question in the input field at the top and press Enter
  2. View the brief answer with citation markers like [1][2]… in the center
  3. Check selected document snippets in the right panel
  4. Click "Open in Fess" to view the original document
  5. Click suggested follow-ups at the bottom for conversational drill-down

If no citations are found, the UI provides hints for refining the search.


10. Security Considerations

  • Only intaste-ui:3000 should be externally exposed. Keep intaste-api, fess, opensearch, and ollama on the internal network
  • UI→API authentication uses X-Intaste-Token header (no cookies)
  • UI CSP/CORS configured with minimal privileges (see Security Design Document v0.1)

11. Troubleshooting

| Symptom | Cause / Solution |
|---|---|
| Permission denied on data/ directory (Linux) | Container UID/GID mismatch with host. Run make init-dirs or manually: sudo chown -R 1000:1000 data/{opensearch,dictionary} |
| UI returns 404/timeout | API health check failed. Check docker compose ps and docker compose logs intaste-api |
| Search always returns 0 results | Fess index not built. Check Fess admin panel / Crawl configuration |
| LLM error 503 | ollama pull gpt-oss not executed / insufficient memory. Switch to lighter model |
| API 401 error | X-Intaste-Token not set or mismatch. Sync .env value with UI |
| Slow startup | OpenSearch/Fess initialization in progress. Wait until health status shows green/yellow |

12. Contributing

  1. Create an Issue with reproduction steps and expected behavior
  2. Fork → Create branch (feat/*, fix/*)
  3. Pass lint/unit tests and create PR
  4. Update design documents (docs/) for major changes

Code conventions (recommended):

  • API: ruff + black, UI: ESLint + Prettier
  • Commit messages: Conventional Commits (feat:, fix:, docs: …)

13. License

Apache License 2.0
Copyright (c) 2025 CodeLibs
  • Copyright notices in NOTICE
  • Dependent OSS licenses consolidated in THIRD-PARTY-NOTICES (future)

14. Testing

# API tests (local)
cd intaste-api
uv run pytest --cov

# UI unit tests (local)
cd intaste-ui
npm test

# E2E tests (local)
cd intaste-ui
npm run test:e2e

# Docker-based testing (isolated environment)
make test-docker                # Run all tests in Docker
make test-docker-api            # Run API tests in Docker
make test-docker-ui             # Run UI tests in Docker
make check-docker               # Run all checks (lint + test)

See TESTING.md for detailed documentation.

15. Documentation
