LLM Workbench

LLM Workbench is a web interface for large language models, built with React and MobX and styled with shadcn/ui. It brings chat, playground, and agent tools together in one place so you can work with free, open-source language models running on your local machine.

Getting Started

To get started, run one of the supported backends: a Hugging Face Text Generation Inference (TGI) endpoint or Ollama.

Hugging Face Text Generation Inference (TGI) Endpoint

docker run --gpus all --shm-size 1g -p 8080:80 -v $(pwd)/models:/data ghcr.io/huggingface/text-generation-inference:1.1.0 --trust-remote-code --model-id TheBloke/deepseek-coder-33B-instruct-AWQ --quantize awq
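
Once the container is up, the endpoint can be checked with a short script. The following is a minimal sketch, assuming the container is reachable on port 8080 as mapped by the command above. The helper names `buildTgiRequest` and `generate` are hypothetical and are not part of LLM Workbench; the `/generate` route and the `inputs` and `parameters.max_new_tokens` fields are those of the TGI REST API.

```typescript
// Hypothetical helpers for talking to a TGI endpoint; not part of LLM Workbench.
interface TgiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the POST request for TGI's /generate route.
function buildTgiRequest(baseUrl: string, prompt: string, maxNewTokens = 128): TgiRequest {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        inputs: prompt,
        parameters: { max_new_tokens: maxNewTokens },
      }),
    },
  };
}

// Sends the prompt and returns the `generated_text` field of the response.
async function generate(baseUrl: string, prompt: string): Promise<string> {
  const { url, init } = buildTgiRequest(baseUrl, prompt);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`TGI returned HTTP ${res.status}`);
  const data = (await res.json()) as { generated_text: string };
  return data.generated_text;
}
```

For example, `generate("http://localhost:8080", "def fib(n):")` should resolve to the model's completion once the model has finished loading.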

Ollama

OLLAMA_ORIGINS="https://knoopx.github.io" ollama serve

Alternatively, if Ollama runs as a systemd service, add this line to the [Service] section of /etc/systemd/system/ollama.service:

Environment=OLLAMA_ORIGINS="https://knoopx.github.io"

Restart Ollama using these commands:

systemctl daemon-reload
systemctl restart ollama
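
With `OLLAMA_ORIGINS` set, a page served from that origin can call the local Ollama API from the browser. The sketch below shows one way to consume Ollama's streaming `/api/generate` response, which arrives as newline-delimited JSON objects carrying a `response` fragment and a `done` flag. It assumes Ollama listens on its default port 11434. The function names are hypothetical, and this is not necessarily how LLM Workbench implements its adapter.

```typescript
// One streamed line from Ollama's /api/generate endpoint.
interface OllamaChunk {
  response?: string;
  done?: boolean;
}

// Parses every complete NDJSON line in `buffer` and returns the collected
// text, whether a final chunk was seen, and the unparsed remainder.
function consumeNdjson(buffer: string): { text: string; done: boolean; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // last element is an incomplete line (or "")
  let text = "";
  let done = false;
  for (const line of lines) {
    if (!line.trim()) continue;
    const chunk = JSON.parse(line) as OllamaChunk;
    text += chunk.response ?? "";
    done = done || chunk.done === true;
  }
  return { text, done, rest };
}

// Hypothetical helper: streams a completion and reports text as it arrives.
async function streamOllama(
  model: string,
  prompt: string,
  onText: (fragment: string) => void,
): Promise<void> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: JSON.stringify({ model, prompt, stream: true }),
  });
  if (!res.ok || !res.body) throw new Error(`Ollama returned HTTP ${res.status}`);
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const parsed = consumeNdjson(buffer);
    buffer = parsed.rest;
    if (parsed.text) onText(parsed.text);
    if (parsed.done) break;
  }
}
```

Keeping the parser separate from the network call makes it easy to handle chunks that split a JSON line in the middle: the incomplete tail is carried over in `rest` until the next read completes it.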

🎭 Features

💬 Chat Interface

  • Simple, clean interface: A user-friendly layout that makes it easy to interact with the model.
  • Output streaming: See the generated text in real time as the model produces it.
  • Regenerate/Continue/Undo/Clear: Use these buttons to control the generation process.
  • Markdown rendering: Model output is rendered as Markdown, so formatted text and code blocks display properly.
  • Generation canceling: Stop the generation process at any time by clicking the "Cancel" button.
  • Dark mode: Prefer working in the dark? Switch to dark mode for a more comfortable experience.
  • Attachments: Attach files to your chat messages (only PDF, DOCX, and plain-text files are supported).
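
Output streaming and generation canceling fit together naturally in a browser app. The sketch below illustrates the general pattern with the standard `AbortController` API. It is an illustration under stated assumptions, not the project's actual code: `fakeTokenStream` is a hypothetical stand-in for a model stream, and `collectUntilCancelled` is a hypothetical helper name.

```typescript
// Hypothetical stand-in for a model stream: yields tokens with a small delay.
async function* fakeTokenStream(tokens: string[]): AsyncGenerator<string> {
  for (const token of tokens) {
    await new Promise<void>((resolve) => setTimeout(resolve, 1));
    yield token;
  }
}

// Appends tokens as they arrive, reports the running text after each one,
// and stops consuming as soon as the signal has been aborted.
async function collectUntilCancelled(
  stream: AsyncIterable<string>,
  signal: AbortSignal,
  onUpdate: (text: string) => void,
): Promise<string> {
  let text = "";
  for await (const token of stream) {
    if (signal.aborted) break;
    text += token;
    onUpdate(text);
  }
  return text;
}
```

In a UI, a "Cancel" button would call `controller.abort()` on the `AbortController` whose `signal` was passed in. Passing the same signal to `fetch` also closes the underlying HTTP request.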

🛹 Playground

  • Copilot-like inline completion: Type your prompt and let the model suggest completions as you type.
  • Tab to accept: Press the Tab key to accept the suggested completion.
  • Ctrl+Enter to regenerate: Press Ctrl+Enter to regenerate the response from the same prompt.

🤖 Agents

  • Connection adapters: Connect to several backends, including Ollama and Hugging Face TGI (local or remote).
  • Complete generation control: Customize the agent's behavior with system prompts, conversation history, and chat prompt templates written with LiquidJS.

Future Ideas

  • Import/Export chats - Importing and exporting chat data for convenience.
  • Token Counter - A feature to count tokens in text.
  • Copy fenced block to clipboard - Copy the contents of a fenced code block to the clipboard.
  • Collapsible side panels - Side panels that can be expanded or collapsed for better organization.
  • window.ai integration

Code Interpreters:

Model management features:

RAG, embeddings and vector search:

Other potential pipelines to consider: