A curated list of battle-tested, production-proven open-source AI models, libraries, infrastructure, and developer tools. Only elite-tier projects make this list.
by Boring Dystopia Development
- 🧬 1. Core Frameworks & Libraries
- 🧠 2. Open Foundation Models
- ⚡ 3. Inference Engines & Serving
- 🤖 4. Agentic AI & Multi-Agent Systems
- 🔍 5. Retrieval-Augmented Generation (RAG) & Knowledge
- 🎨 6. Generative Media Tools
- 🛠️ 7. Training & Fine-tuning Ecosystem
- 📊 8. MLOps / LLMOps & Production
- 📈 9. Evaluation, Benchmarks & Datasets
- 🛡️ 10. AI Safety, Alignment & Interpretability
- 🧩 11. Specialized Domains
- 🖥️ 12. User Interfaces & Self-hosted Platforms
- 🧪 13. Developer Tools & Integrations
- 📚 14. Resources & Learning
## 🧬 1. Core Frameworks & Libraries

Core libraries and frameworks used to build, train, and run AI and machine learning systems.
- PyTorch
- Dynamic computation graphs, Pythonic API, dominant in research and production. The current standard for most frontier AI work.
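  A minimal sketch of the define-by-run style: the graph is recorded as ordinary Python executes, so autograd simply replays it backward.

  ```python
  import torch

  # Dynamic graph: operations are traced as they run, so plain
  # Python control flow works inside models.
  x = torch.randn(3, requires_grad=True)
  y = (x ** 2).sum()
  y.backward()  # reverse-mode autodiff populates x.grad

  # d/dx of sum(x^2) is 2x
  print(torch.allclose(x.grad, 2 * x))
  ```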
- TensorFlow
- End-to-end platform with excellent production deployment, TPU support, and large-scale serving tools.
- JAX + Flax
- High-performance numerical computing with composable transformations (JIT, vmap, grad). Rising favorite for research and scientific ML.
- NumPyro
- Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation. Bayesian modeling and inference at scale.
- Keras
- High-level, beginner-friendly API that now runs on multiple backends (TensorFlow, JAX, PyTorch). Perfect for rapid experimentation.
- tinygrad
- Minimalist deep learning framework with tiny code footprint. The "you like pytorch? you like micrograd? you love tinygrad!" philosophy - simple yet powerful.
- PyTorch Geometric
- Library for deep learning on irregular input data such as graphs, point clouds, and manifolds. Part of the PyTorch ecosystem.
- Burn
- Next-generation deep learning framework in Rust. Backend-agnostic with CPU, GPU, WebAssembly support.
- Candle (Hugging Face)
- Minimalist ML framework for Rust. PyTorch-like API with focus on performance and simplicity.
- linfa
- Comprehensive Rust ML toolkit with classical algorithms. scikit-learn equivalent for Rust with clustering, regression, and preprocessing.
- Flux.jl
- 100% pure-Julia ML stack with lightweight abstractions on top of native GPU and AD support. Elegant, hackable, and fully integrated with Julia's scientific computing ecosystem.
- Transformers (Hugging Face)
- The de facto standard library for pretrained models across text, vision, and audio. 1M+ models on the Hugging Face Hub: BERT, GPT, Llama, Qwen, and hundreds more architectures.
- sentence-transformers
- Classic library for sentence and image embeddings.
- tokenizers (Hugging Face)
- Fast state-of-the-art tokenizers for training and inference.
- Pandas
- The gold standard for data analysis and manipulation in Python.
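  A small sketch of the idiomatic split-apply-combine pattern (the values here are made up for illustration):

  ```python
  import pandas as pd

  # Per-group aggregation in one chained expression.
  df = pd.DataFrame({
      "model": ["llama", "llama", "qwen", "qwen"],
      "latency_ms": [120, 80, 95, 105],
  })
  summary = df.groupby("model")["latency_ms"].agg(["mean", "max"])
  print(summary)
  ```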
- Polars
- Blazing-fast DataFrame library (Rust backend) - modern alternative to pandas for large-scale workloads.
- cuDF
- GPU DataFrame library from RAPIDS. Accelerates pandas workflows on NVIDIA GPUs with zero code changes using cuDF.pandas accelerator mode.
- Modin
- Parallel pandas DataFrames. Scale pandas workflows by changing a single line of code - distributes data and computation automatically.
- Dask
- Parallel computing for big data - scales pandas/NumPy/scikit-learn to clusters.
- NumPy
- Fundamental array computing library that powers almost every AI stack.
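  Broadcasting is the rule that makes NumPy arrays composable without copies; a minimal sketch:

  ```python
  import numpy as np

  # A (3, 1) column and a (4,) row broadcast to a (3, 4) grid
  # without materializing intermediate copies.
  col = np.arange(3).reshape(3, 1)
  row = np.arange(4)
  grid = col * 10 + row
  print(grid.shape)
  ```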
- SciPy
- Scientific computing algorithms (optimization, linear algebra, statistics, signal processing).
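  A minimal optimization sketch with `scipy.optimize.minimize` on a toy quadratic (the objective is illustrative):

  ```python
  from scipy.optimize import minimize

  # BFGS needs only the objective; gradients are estimated
  # by finite differences when none are supplied.
  result = minimize(lambda x: (x[0] - 3.0) ** 2 + 1.0,
                    x0=[0.0], method="BFGS")
  print(result.x, result.fun)
  ```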
- NetworkX
- Creation, manipulation, and study of complex networks. The foundational graph analysis library for Python data science.
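  A minimal sketch of a weighted shortest-path query (toy graph for illustration):

  ```python
  import networkx as nx

  # Dijkstra picks the two-hop route because its total weight (2.0)
  # beats the direct edge (5.0).
  G = nx.Graph()
  G.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 5.0)])
  path = nx.shortest_path(G, "a", "c", weight="weight")
  print(path)
  ```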
- scikit-learn
- Industry-standard library for traditional machine learning (classification, regression, clustering, pipelines).
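  A minimal pipeline sketch showing why preprocessing and model belong in one estimator: the scaler is fit only on training data, so there is no leakage at predict time.

  ```python
  from sklearn.datasets import load_iris
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.pipeline import make_pipeline
  from sklearn.preprocessing import StandardScaler

  X, y = load_iris(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  # Scaler and classifier travel together through fit/predict.
  clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
  clf.fit(X_train, y_train)
  acc = clf.score(X_test, y_test)
  print(acc)
  ```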
- XGBoost
- Scalable, high-performance gradient boosting library. Still dominates Kaggle and tabular competitions.
- LightGBM
- Microsoft's ultra-fast gradient boosting framework, optimized for speed and memory.
- CatBoost
- Gradient boosting that handles categorical features natively with great out-of-the-box performance.
- sktime
- Unified framework for machine learning with time series. Scikit-learn compatible API for forecasting, classification, clustering, and anomaly detection.
- StatsForecast
- Lightning-fast statistical forecasting with ARIMA, ETS, CES, and Theta models. Optimized for high-performance time series workloads.
- Optuna
- Modern, define-by-run hyperparameter optimization with pruning and visualizations. Extremely popular in 2026.
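  A minimal define-by-run sketch on a toy quadratic: the search space is declared inline as the objective executes, not up front.

  ```python
  import optuna

  optuna.logging.set_verbosity(optuna.logging.WARNING)

  # The trial object samples hyperparameters on the fly.
  def objective(trial):
      x = trial.suggest_float("x", -10.0, 10.0)
      return (x - 2.0) ** 2

  study = optuna.create_study(direction="minimize")
  study.optimize(objective, n_trials=50)
  print(study.best_params)
  ```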
- AutoGluon
- AWS AutoML toolkit for tabular, image, text, and multimodal data - state-of-the-art with almost zero code.
- FLAML
- Microsoft's fast & lightweight AutoML focused on efficiency and low compute.
- AutoKeras
- Neural architecture search on top of Keras.
- Hugging Face Accelerate
- Simple API to make training scripts run on any hardware (multi-GPU, TPU, mixed precision) with minimal code changes.
- DeepSpeed
- Microsoft's deep learning optimization library for extreme-scale training (ZeRO, offloading, MoE).
- FlashAttention
- Fast exact attention kernels that reduce memory usage and accelerate transformer training and inference.
- xFormers
- Optimized transformer building blocks and attention operators for PyTorch.
- PyTorch Lightning
- High-level wrapper for PyTorch that removes boilerplate and adds best practices.
- ONNX Runtime
- High-performance inference and training for ONNX models across hardware.
- einops
- Flexible, powerful tensor operations for readable and reliable code. Supports PyTorch, JAX, TensorFlow, NumPy, MLX.
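  A minimal sketch of named-axis reshaping, e.g. flattening image patches into tokens:

  ```python
  import numpy as np
  from einops import rearrange

  # (batch, height, width, channels) -> (batch, height*width, channels),
  # with the axis names documenting the intent.
  images = np.zeros((2, 4, 4, 3))
  tokens = rearrange(images, "b h w c -> b (h w) c")
  print(tokens.shape)
  ```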
- safetensors
- Simple, safe way to store and distribute tensors. Fast, secure alternative to pickle for model serialization.
- torchmetrics
- Machine learning metrics for distributed, scalable PyTorch applications. 80+ metrics with built-in distributed synchronization.
- torchao
- PyTorch native quantization and sparsity for training and inference. Drop-in optimizations for production deployment.
- SHAP
- Game theoretic approach to explain the output of any machine learning model. Industry standard for model interpretability.
## 🧠 2. Open Foundation Models

Pretrained language, multimodal, speech, and video models with publicly available weights.
- Qwen3.6-Plus (Alibaba)
- Latest flagship series released April 2026 with 1M context window, agentic coding performance competitive with Claude 4.5 Opus, and enhanced multimodal capabilities.
- Gemma 4 (Google)
- Released April 2026 in four sizes (E2B, E4B, 26B MoE, 31B dense). First major update in a year, Apache 2.0 licensed and tuned for complex logic and agentic workflows.
- Kimi K2.5 (Moonshot AI)
- Frontier open-weight MoE model with 256K context, strong coding and reasoning performance, and native multimodal + tool-use support for agentic workflows.
- Phi-4 (Microsoft)
- Small but highly capable models optimized for reasoning, edge devices, and on-device inference. Includes Phi-4-reasoning variants with thinking capabilities.
- GLM-5 (Zhipu AI)
- Strong open model line with solid coding, reasoning, and agentic-task performance.
- OLMo 2 (Allen AI)
- Fully open-source LLMs (1B–32B) with complete transparency: models, data, training code, and logs. Designed by scientists, for scientists.
- Llama 4 (Meta)
- First native multimodal MoE open-source models (Scout: 10M context, Maverick: 400B+ params). Released April 2025 with enterprise-grade capabilities.
- DeepSeek-Coder-V2 / R1-Coder
- Best-in-class open coding model (236B MoE). Outperforms closed models on many code benchmarks.
- Qwen3-Coder-Next (Alibaba)
- Leading open coding model. Strong Pareto frontier for cost-effective agent deployment.
- Qwen3-VL (Alibaba)
- Latest flagship VLM with native 256K context (expandable to 1M), visual agent capabilities, 3D grounding, and superior multimodal reasoning. Major leap over Qwen2.5-VL.
- GLM-4.5V / GLM-4.1V-Thinking (Zhipu AI)
- Strong multimodal reasoning with scalable reinforcement learning. Compares favorably with Gemini-2.5-Flash on benchmarks.
- MiniCPM-V 2.6
- Handles images up to 1.8M pixels with top-tier OCR performance. Excellent for on-device deployment.
- Gemma 4 (Google)
- Multimodal model supporting vision-language input, optimized for efficiency, complex logic, and on-device use.
- Whisper (OpenAI → community forks)
- The gold-standard open speech-to-text model. Massive community fine-tunes available.
- OuteTTS / CosyVoice 2
- High-quality open TTS with natural prosody and multilingual support.
- Fish Speech / StyleTTS 2
- Zero-shot TTS with excellent voice cloning. Extremely popular in 2026.
- MusicGen / AudioCraft (Meta)
- Open music and audio generation models.
- VibeVoice (Microsoft)
- Open-source frontier voice AI with expressive, longform conversational speech synthesis. 7B parameter TTS with streaming support.
- Chatterbox (Resemble AI)
- State-of-the-art open TTS family with 350M parameter Turbo variant. Single-step generation with native paralinguistic tags for realistic dialogue.
- Dia (Nari Labs)
- 1.6B parameter TTS generating ultra-realistic dialogue in one pass with nonverbal communications (laughter, coughing). Emotion and tone control via audio conditioning.
- Step-Audio (StepFun)
- 130B-parameter production-ready audio LLM for intelligent speech interaction. Supports multilingual conversations (Chinese, English, Japanese), emotional tones, regional dialects (Cantonese, Sichuanese), adjustable speech rates, and prosodic styles including rap. Apache 2.0 licensed.
- Voxtral TTS (Mistral)
- 4B parameter state-of-the-art TTS with zero-shot voice cloning, 9-language support, and ~90ms time-to-first-audio for voice agents.
- CogVideoX (Zhipu AI / community)
- High-quality open text-to-video model (5B-12B).
- Mochi 1 (Genmo)
- 10B open video model with impressive motion and consistency.
## ⚡ 3. Inference Engines & Serving

Inference runtimes, serving systems, and optimization tools for running models locally or in production.
- llama.cpp
- Pure C/C++ inference engine with GGUF format support. The gold standard for CPU/GPU/Apple Silicon on-device running. Includes llama-server for OpenAI-compatible API.
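  llama-server speaks the OpenAI chat-completions wire format; a sketch of the request body (the port, model name, and prompt below are placeholder assumptions, not llama.cpp defaults):

  ```python
  import json

  # Request body for llama-server's OpenAI-compatible
  # POST /v1/chat/completions endpoint (server assumed at localhost:8080).
  payload = {
      "model": "gguf-model",  # placeholder name
      "messages": [{"role": "user", "content": "Say hello in one word."}],
      "temperature": 0.7,
      "max_tokens": 16,
  }
  body = json.dumps(payload).encode("utf-8")

  # Send with any HTTP client, e.g. urllib.request.Request(
  #     "http://localhost:8080/v1/chat/completions",
  #     data=body, headers={"Content-Type": "application/json"})
  print(len(body))
  ```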
- Ollama
- Dead-simple local LLM runner with a one-line install, model registry, and OpenAI-compatible API.
- MLX (Apple)
- High-performance array framework + LLM inference optimized for Apple Silicon.
- MLC-LLM
- Deployment engine that compiles and runs LLMs across browsers, mobile devices, and local hardware.
- WebLLM
- High-performance in-browser LLM inference engine. Runs models directly in the browser with WebGPU acceleration.
- llama-cpp-python
- Official Python bindings for llama.cpp.
- KoboldCpp
- User-friendly llama.cpp fork focused on role-playing and creative writing.
- llm-d
- Kubernetes-native distributed LLM inference framework. Donated to the CNCF by Red Hat, Google, and IBM. Intelligent scheduling, KV-cache optimization, and state-of-the-art performance across accelerators.
- LMDeploy
- Toolkit for compressing, deploying, and serving LLMs from OpenMMLab. 4-bit inference with 2.4x higher performance than FP16, distributed multi-model serving across machines.
- vLLM
- State-of-the-art serving engine with PagedAttention and continuous batching. Currently the fastest production-grade LLM server.
- SGLang
- Next-gen serving framework with RadixAttention. Powers xAI's production workloads at 100K+ GPUs scale.
- TensorRT-LLM
- NVIDIA's official high-performance inference backend.
- Aphrodite Engine
- vLLM fork optimized for role-play and creative writing.
- Triton Inference Server
- NVIDIA's production-grade open-source inference serving software. Supports multiple frameworks (TensorRT, PyTorch, ONNX) with optimized cloud and edge deployment.
- mistral.rs
- Fast, flexible Rust-native LLM inference engine built on Candle. Supports text, vision, audio, image generation, and embeddings with hardware-aware auto-tuning.
- KTransformers
- Flexible framework for heterogeneous CPU-GPU LLM inference and fine-tuning. Enables running large MoE models by offloading experts to CPU with BF16/FP8 precision support.
- llamafile
- Mozilla's single-file distributable LLM solution. Bundle model weights, inference engine, and runtime into one portable executable that runs on six OSes without installation.
- Xinference
- Unified, production-ready inference API for LLMs, speech, and multimodal models. Drop-in GPT replacement with single-line code changes. Supports thousands of models with auto-batching and distributed inference.
- LightLLM
- Pure Python-based LLM inference and serving framework with lightweight design, easy extensibility, and high-speed performance. Integrates optimizations from FasterTransformer, TGI, vLLM, and SGLang.
- TabbyAPI
- FastAPI-based API server for ExLlamaV2/V3 backends. OpenAI-compatible API with support for model loading/unloading, embeddings, speculative decoding, multi-LoRA, and streaming.
- GGUF (part of llama.cpp)
- Modern quantized format that powers most local inference.
- bitsandbytes
- 8-bit and 4-bit optimizers + quantization.
- ExLlamaV2
- Highly optimized CUDA kernels for 4-bit/8-bit inference.
- Optimum
- Hardware-specific acceleration and quantization.
## 🤖 4. Agentic AI & Multi-Agent Systems

Frameworks and platforms for building agent-based systems and multi-agent workflows.
- LangGraph
- Stateful, controllable agent orchestration.
- CrewAI
- Role-based agent framework.
- AutoGen (AG2)
- Flexible multi-agent conversation framework.
- DSPy
- Framework for programming language model pipelines with modules, optimizers, and evaluation loops.
- Semantic Kernel
- SDK for building and orchestrating AI agents and workflows across multiple programming languages.
- smolagents
- Lightweight agent framework centered on tool use and code-executing workflows.
- LangChain
- Foundational library for agents, chains, and memory.
- Hermes Agent (NousResearch)
- The agent that grows with you. Autonomous server-side agent with persistent memory that learns and improves over time.
- Agno
- Build, run, and manage agentic software at scale. High-performance framework for multi-agent systems with memory, knowledge, and tools.
- Upsonic
- Agent framework for fintech and banking with built-in MCP support, guardrails, and tool server architecture.
- VoltAgent
- TypeScript-first AI agent engineering platform with memory, RAG, workflows, MCP integration, and voice support.
- MetaGPT
- Simulates an entire "AI software company".
- CAMEL
- One of the first multi-agent frameworks for building scalable agent systems. Apache 2.0 licensed with extensive tooling for agent communication and task automation.
- Swarms
- Bleeding-edge enterprise multi-agent orchestration.
- Llama-Agents
- Async-first multi-agent system.
- Mastra
- TypeScript-first agent framework with built-in RAG, workflows, tool integrations, observability and observational memory.
- Deer-Flow (ByteDance)
- Open-source long-horizon SuperAgent harness that researches, codes, and creates. Handles tasks from minutes to hours with sandboxes, memories, tools, skills, subagents, and message gateway.
- OpenAI Agents SDK
- Production-ready lightweight framework for multi-agent workflows. The evolution of Swarm with enhanced orchestration capabilities and enterprise-grade features.
- AgentScope
- Alibaba's production-ready multi-agent framework with 23K+ stars. Features built-in MCP and A2A support, message hub for flexible orchestration, and AgentScope Runtime for production deployment.
- Microsoft Agent Framework
- Microsoft's official framework combining AutoGen's agent abstractions with Semantic Kernel's enterprise features. Supports Python and .NET with graph-based workflows.
- Agency Swarm
- Reliable multi-agent orchestration framework built on top of the OpenAI Assistants API with organizational structure modeling.
- OpenHands (ex-OpenDevin)
- Full-featured open-source AI software engineer.
- Goose
- Extensible on-machine AI agent for development tasks.
- OpenCode
- Terminal-native autonomous coding agent.
- Aider
- Command-line pair-programming agent.
- Pi (badlogic)
- Terminal coding agent with hash-anchored edits, LSP integration, subagents, MCP support, and package ecosystem.
- Mistral-Vibe (Mistral)
- Minimal CLI coding agent by Mistral. Lightweight, fast, and designed for local development workflows.
- Nanocoder (Nano-Collective)
- Beautiful local-first coding agent running in your terminal. Built for privacy and control with support for multiple AI providers via OpenRouter.
- Gemini CLI (Google)
- Open-source AI agent that brings Gemini's power directly into your terminal. Supports code generation, shell execution, and file editing with full Apache 2.0 licensing.
- Langflow
- Visual low-code platform for agentic workflows.
- Dify
- Production-ready agentic workflow platform.
- OWL (camel-ai/owl)
- Advanced multi-agent collaboration system.
- AI-Scientist-v2 (SakanaAI)
- Workshop-level automated scientific discovery via agentic tree search. Generates novel research ideas, runs experiments, and writes papers.
- PraisonAI
- 24/7 AI employee team for automating complex challenges. Low-code multi-agent framework with handoffs, guardrails, memory, RAG, and 100+ LLM providers.
- Agent-S (Simular AI)
- Open agentic framework that uses computers like a human. SOTA on OSWorld benchmark (72.6%) for GUI automation and computer control.
- Letta (ex-MemGPT)
- Platform for building stateful agents with advanced memory that learn and self-improve over time.
- Mem0
- Universal memory layer for AI agents. Persistent, multi-session memory across models and environments.
- Hindsight
- State-of-the-art long-term memory for AI agents by Vectorize. Fully self-hosted, MIT-licensed, with integrations for LangChain, CrewAI, LlamaIndex, Vercel AI SDK, and more.
## 🔍 5. Retrieval-Augmented Generation (RAG) & Knowledge

Retrieval systems, vector databases, embedding models, and related tooling for RAG pipelines.
- Chroma
- Most popular open-source embedding database.
- Qdrant
- High-performance vector search engine in Rust.
- Weaviate
- GraphQL-native vector search engine.
- Milvus
- Scalable cloud-native vector database.
- Faiss
- Similarity search and clustering library for dense vectors with CPU and GPU implementations.
- LanceDB
- Serverless vector DB optimized for multimodal data.
- Vespa
- AI + Data platform with hybrid search (vector + keyword) and real-time indexing at scale. Battle-tested serving billions of queries daily.
- pgvector
- PostgreSQL extension for vector similarity search.
- BGE (FlagEmbedding)
- BAAI's best-in-class embedding family.
- E5 (Microsoft)
- High-performance text embeddings for retrieval.
- MTEB
- Massive Text Embedding Benchmark covering 1000+ languages and diverse tasks. The industry standard for evaluating and comparing embedding models.
- LlamaIndex
- Full-featured RAG pipeline with advanced indexing.
- Haystack
- End-to-end NLP and RAG framework.
- RAGFlow
- Deep-document-understanding RAG engine.
- GraphRAG (Microsoft)
- Knowledge-graph-based RAG.
- Docling
- Document processing toolkit for turning PDFs and other files into structured data for GenAI workflows.
- Unstructured
- Best-in-class document preprocessing.
- MinerU
- High-accuracy document parsing for LLM and RAG workflows. Converts PDFs, Word, PPTs, and images into structured Markdown/JSON with VLM+OCR dual engine.
- ColPali / ColQwen
- Vision-language models for document retrieval.
- LightRAG
- Graph-based RAG with dual-level retrieval system. Simple and fast with comprehensive knowledge discovery (EMNLP 2025).
- RAG-Anything
- All-in-One Multimodal RAG system for seamless processing of text, images, tables, and equations. Built on LightRAG.
- txtai
- All-in-one AI framework for semantic search, LLM orchestration and language model workflows. Embeddings database with customizable pipelines.
- Infinity
- High-throughput, low-latency serving engine for text-embeddings, reranking, CLIP, and ColPali. OpenAI-compatible API.
- Crawl4AI
- LLM-friendly web crawler that turns websites into clean Markdown for RAG and agentic workflows.
- Lightpanda
- Machine-first headless browser in Zig; rendering-free and ultra-lightweight for AI agent browsing.
- Paperless-AI
- Automated document analyzer for Paperless-ngx with RAG-powered semantic search across your document archive.
- Firecrawl
- Web Data API for AI - search, scrape, and interact with the web at scale. Clean markdown/JSON output with proxy rotation and JS-blocking handled automatically.
## 🎨 6. Generative Media Tools

Open-source models and applications for image, video, audio, and 3D generation and editing.
- ComfyUI
- Node-based visual workflow editor for Stable Diffusion, FLUX, etc.
- Stable Diffusion WebUI Forge - Neo
- Actively maintained Forge-based Stable Diffusion web UI with the familiar extension-driven workflow.
- Fooocus
- Midjourney-style UI with beautiful out-of-the-box results.
- Diffusers
- PyTorch library for diffusion pipelines spanning image, video, and audio generation.
- InvokeAI
- Full-featured creative studio.
- PowerPaint (OpenMMLab)
- Versatile image inpainting model supporting text-guided inpainting, object removal, and outpainting (ECCV 2024).
- Wan2.2 (Alibaba)
- Leading open Mixture-of-Experts text-to-video model.
- HunyuanVideo (Tencent)
- 13B-parameter systematic video generation framework. Leading quality among open models.
- SkyReels V2/V3 (Skywork)
- First open-source infinite-length film generative model using AutoRegressive Diffusion-Forcing.
- Mochi 1 (Genmo)
- 10B-parameter open video model.
- LTX-Video (Lightricks)
- Fast native 4K video generation.
- Stable Video Diffusion (Stability AI)
- Official image-to-video and text-to-video implementation within Stability AI's generative models repository.
- Helios (PKU-YuanGroup)
- Efficient long-video generation framework with 24GB VRAM support for up to 10,000 frames (5+ minutes) and 1280×768 resolution. Apache 2.0 licensed.
- AudioCraft / MusicGen (Meta)
- Controllable text-to-music and audio models.
- ACE-Step 1.5
- Local-first music generation model with broad hardware support across Mac, AMD, Intel, and CUDA devices.
- Fish Speech
- Zero-shot TTS and voice cloning.
- CosyVoice 2
- Natural multilingual TTS with emotional control.
- OuteTTS
- High-quality open TTS.
- Amphion
- Comprehensive toolkit for Audio, Music, and Speech Generation (9.7K stars).
- Hunyuan3D-2 (Tencent)
- State-of-the-art open image-to-3D and text-to-3D.
- Trellis (Microsoft)
- Structured 3D latents for high-quality generation.
- gsplat (3D Gaussian Splatting tools)
- High-performance 3D Gaussian Splatting library.
- LichtFeld-Studio
- Native application for training, editing, and exporting 3D Gaussian Splatting scenes with MCMC optimization and timelapse generation. GPL-3.0 licensed.
## 🛠️ 7. Training & Fine-tuning Ecosystem

Tools for model training, fine-tuning, synthetic data generation, and distributed training.
- LLaMA-Factory
- One-stop unified framework for SFT, DPO, ORPO, KTO with web UI.
- Axolotl
- YAML-driven full pipeline for SFT, DPO, GRPO.
- ms-swift
- Unified training framework for 600+ LLMs and 300+ MLLMs with CPT/SFT/DPO/GRPO (AAAI 2025).
- Unsloth
- 2× faster, 70% less memory fine-tuning.
- LitGPT
- Clean from-scratch implementations of 20+ LLMs.
- LLM Foundry
- Databricks' training framework for composable LLM training with StreamingDataset and Composer.
- torchtune
- PyTorch-native library for post-training, fine-tuning, and experimentation with LLMs.
- TRL (Transformers Reinforcement Learning)
- Official library for RLHF, SFT, DPO, ORPO.
- verl
- Volcano Engine Reinforcement Learning for LLMs with PPO, GRPO, REINFORCE++, DAPO (EuroSys 2025).
- NeMo-RL
- Scalable toolkit for efficient model reinforcement with DTensor and Megatron backends.
- PEFT (Parameter-Efficient Fine-Tuning)
- Official library with LoRA, QLoRA, DoRA, etc.
- Liger Kernel
- Ultra-fast custom kernels for training speedup.
- MergeKit
- Advanced model merging tools.
- distilabel
- End-to-end pipeline for synthetic instruction data.
- Data-Juicer
- High-performance data processing for LLM training.
- Argilla
- Open-source data labeling + synthetic data platform.
- SDV (Synthetic Data Vault)
- High-fidelity tabular and relational synthetic data.
- DeepSpeed
- Extreme-scale training optimizations.
- Colossal-AI
- Unified system for 100B+ models.
- Megatron-LM
- Distributed training framework and reference codebase for large transformer models at scale.
- Composer
- MosaicML's PyTorch library for scalable, efficient neural network training with algorithmic speedups.
- Ray Train
- Scalable distributed training.
## 📊 8. MLOps / LLMOps & Production

Tooling for tracking, deploying, monitoring, and operating AI systems in production.
- MLflow
- End-to-end open platform for the ML/LLM lifecycle.
- DVC (Data Version Control)
- Git-like versioning for data and models.
- ClearML
- Open-source platform for experiment tracking, orchestration, data management, and model serving.
- Weights & Biases Weave
- Open-source tracing and experiment tracking.
- Feast
- Open source feature store for ML. Manages offline/online feature storage with point-in-time correctness to prevent data leakage. Apache 2.0 licensed.
- BentoML
- Unified framework to build, ship, and scale AI apps.
- Ray Serve
- Scalable model serving library.
- ZenML
- Pipeline and orchestration framework for taking ML and LLM systems from development to production.
- Kubeflow
- Kubernetes-native ML/LLM platform.
- KServe
- Kubernetes-based model serving.
- Metaflow
- Netflix's ML platform for building and managing real-world AI systems. Powers thousands of projects at Netflix, Amazon, and DoorDash. Apache 2.0 licensed.
- Flyte
- Kubernetes-native workflow orchestration platform for AI/ML pipelines. Dynamic, resilient orchestration with strong type safety and reproducibility. Used by Lyft, Spotify, and Gojek. Apache 2.0 licensed.
- Langfuse
- #1 open-source LLM observability platform.
- Phoenix (Arize)
- AI observability & evaluation platform.
- Evidently
- ML & LLM monitoring framework.
- Opik (Comet)
- Production-ready LLM evaluation platform.
- LiteLLM
- AI Gateway to call 100+ LLM APIs in OpenAI format with unified cost tracking, guardrails, load balancing, and logging.
- OpenLIT
- OpenTelemetry-native LLM observability platform with GPU monitoring, evaluations, prompt management, and guardrails.
- OpenLLMetry (Traceloop)
- Open-source observability for GenAI/LLM applications based on OpenTelemetry with 25+ integration backends.
- Agenta
- Open-source LLMOps platform combining prompt playground, prompt management, LLM evaluation, and observability.
- Helicone
- Open-source LLM observability with request logging, caching, rate limiting, and cost analytics.
- Giskard
- Open-source evaluation and testing library for LLM agents. Red teaming, vulnerability scanning, RAG evaluation, and safety testing with modular architecture. Apache 2.0 licensed.
- Portkey Gateway
- Blazing fast AI Gateway to route 200+ LLMs with unified API. Integrated guardrails, load balancing, fallbacks, and cost tracking. MIT licensed.
- NVIDIA NeMo Guardrails
- Programmable guardrails toolkit for LLM-based conversational systems. Uses Colang to define dialog flows with input/output rails, jailbreak detection, fact-checking, and hallucination detection. Apache 2.0 licensed.
- Guardrails AI
- Python framework for adding input/output guardrails to LLM applications. Detects and mitigates risks like PII leakage, toxic language, competitor mentions, with 50+ validators in Guardrails Hub. Apache 2.0 licensed.
- LLM Guard
- Comprehensive security toolkit for LLM interactions with input/output scanners for prompt injection, PII anonymization, toxic content, secrets detection, and adversarial attack prevention. MIT licensed.
- LlamaGuard (Meta)
- Open safety classifier models.
- Garak
- LLM vulnerability scanner.
- Promptfoo
- LLM testing and red-teaming framework.
## 📈 9. Evaluation, Benchmarks & Datasets

Benchmarks, evaluation frameworks, datasets, and supporting tools for model assessment.
- lm-evaluation-harness (EleutherAI)
- De-facto standard for generative model evaluation.
- HELM (Stanford)
- Holistic Evaluation of Language Models.
- SWE-bench
- Evaluates LLMs on real-world GitHub issues drawn from popular Python repositories.
- GAIA
- Real-world multi-step agentic benchmark.
- OpenCompass
- Evaluation platform for benchmarking language and multimodal models across large benchmark suites.
- MLPerf Inference
- Industry-standard ML inference benchmarks with reference implementations for AI accelerators.
- SWE-rebench (Nebius)
- Continuously updated benchmark with 21,000+ real-world SWE tasks for evaluating agentic LLMs. Decontaminated, mined from GitHub.
- AgentBench (THUDM)
- Comprehensive benchmark to evaluate LLMs as agents across 8 diverse environments including household, web shopping, OS interaction, and database tasks. ICLR 2024. Apache 2.0 licensed.
- DeepEval
- The "Pytest for LLMs".
- Inspect AI
- Framework for large language model evaluations from the UK AI Security Institute.
- RAGAs
- End-to-end RAG evaluation framework.
- Lighteval
- Evaluation toolkit for LLMs across multiple backends with reusable tasks, metrics, and result tracking.
- Hugging Face Evaluate
- Standardized evaluation metrics.
- OpenAI Evals
- Framework for evaluating LLMs and LLM systems with an open-source registry of 100+ community-contributed benchmarks. MIT licensed.
- Hugging Face Datasets
- Largest open repository of datasets.
- FineWeb / FineWeb-2 (Hugging Face)
- Curated 15T+ token web dataset for pre-training.
- OSWorld
- Multimodal agent benchmark dataset.
## 🛡️ 10. AI Safety, Alignment & Interpretability

Tools for alignment, interpretability, safety evaluation, and adversarial testing.
- Inspect AI
- Framework for large language model evaluations from the UK AI Security Institute. Systematic capability and safety assessments with built-in scaffolding for multi-turn dialog, tool use, and adversarial testing. MIT licensed.
- DeepEval
- LLM evaluation framework with built-in safety metrics including hallucination detection, bias detection, toxicity evaluation, and prompt alignment checking. Apache 2.0 licensed.
- Safe-RLHF
- Safe reinforcement learning from human feedback.
- Alignment Handbook
- Complete recipes for full-stack alignment.
- OpenRLHF
- High-performance distributed RLHF framework.
- TransformerLens
- Gold-standard for mechanistic interpretability.
- SAELens
- Sparse autoencoders for interpretable features.
- Captum
- PyTorch's official interpretability library.
- SHAP
- Game theoretic approach to explain the output of any machine learning model. Industry standard for model interpretability.
- XAI
- eXplainability toolbox for machine learning with bias evaluation and production monitoring tools.
- AI Fairness 360
- Comprehensive toolkit for detecting, understanding, and mitigating unwanted algorithmic bias in datasets and ML models.
- Garak
- Automated LLM vulnerability scanner.
- Promptfoo
- Systematic prompt testing and red-teaming.
- LLM Guard
- Input/output scanner for LLMs.
- Adversarial Robustness Toolbox
- Python library for machine learning security (evasion, poisoning, extraction, inference attacks).
- DeepTeam
- Framework to red team LLMs and LLM systems.
## 🧩 11. Specialized Domains
- Boltz
- Open-source biomolecular interaction prediction models. Boltz-1 was the first fully open source model to approach AlphaFold3 accuracy; Boltz-2 adds binding affinity prediction for drug discovery. MIT licensed.
- OpenFold
- Trainable PyTorch reproduction of AlphaFold2. Complete open-source pipeline for protein structure prediction with competitive accuracy to the original. Apache 2.0 licensed.
- MONAI
- Medical Open Network for AI. End-to-end framework for healthcare imaging with state-of-the-art, production-ready training workflows. Apache 2.0 licensed.
- Unity ML-Agents
- Toolkit for training intelligent agents in games and simulations using deep reinforcement learning. Enables NPC behavior control, automated testing, and game design evaluation. Apache 2.0 licensed.
- OpenSpiel
- Collection of environments and algorithms for research in general reinforcement learning and search/planning in games from Google DeepMind. Apache 2.0 licensed.
- OpenBB
- Financial data platform for analysts, quants and AI agents. Open-source investment research infrastructure with extensive data integrations. AGPL-3.0 licensed.
- FinGPT
- Open-source financial large language models. Democratizing financial AI with data-centric training pipeline and multiple model releases for trading, analysis, and robo-advising. MIT licensed.
- FinRL
- Financial reinforcement learning framework for quantitative trading. Deep RL library for stock trading, portfolio allocation, and market execution with pre-built environments and benchmarks. MIT licensed.
- OpenCV
- World's most widely used computer vision library.
- Ultralytics YOLO
- State-of-the-art real-time object detection.
- Detectron2
- High-performance object detection library.
- SAM 2
- Promptable image and video segmentation model with released checkpoints and training code.
- Kornia
- Differentiable computer vision library.
- MediaPipe
- Cross-platform multimodal pipelines.
- Stable-Baselines3
- Production-ready RL algorithms.
- Isaac Lab
- GPU-accelerated robot learning framework.
- MuJoCo
- General-purpose physics simulator for robotics, biomechanics, and ML research. High-fidelity contact dynamics with native Python and C++ bindings. Apache 2.0 licensed.
- Gymnasium (formerly OpenAI Gym)
- Standard RL environment API.
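That standard API is simple enough to sketch in plain Python. The hypothetical toy environment below follows the Gymnasium signatures, `reset() -> (obs, info)` and `step(action) -> (obs, reward, terminated, truncated, info)`; real environments also declare `action_space` and `observation_space`, omitted here for brevity.

```python
import random

class CoinFlipEnv:
    """Toy env following the Gymnasium API convention: the agent is
    rewarded for echoing back the observation it was just shown."""
    def __init__(self, max_steps=10, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.obs = self.rng.randint(0, 1)
        return self.obs, {}

    def step(self, action):
        reward = 1.0 if action == self.obs else 0.0
        self.t += 1
        self.obs = self.rng.randint(0, 1)
        terminated = False                # this task has no natural end state
        truncated = self.t >= self.max_steps
        return self.obs, reward, terminated, truncated, {}

env = CoinFlipEnv()
obs, info = env.reset()
total, done = 0.0, False
while not done:
    # trivial policy: echo the current observation back as the action
    obs, reward, terminated, truncated, info = env.step(obs)
    total += reward
    done = terminated or truncated
```

Because any environment exposing this interface plugs into Stable-Baselines3 and the rest of the ecosystem, the convention is what makes RL tooling interoperable.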
- Time Series Library (TSLib)
- Comprehensive benchmark for time-series models.
- Chronos (Amazon)
- Pretrained foundation models for time-series forecasting.
- Darts
- Easy-to-use time-series forecasting library.
- AutoTS
- Automated time series forecasting with broad model selection, ensembling, anomaly detection, and holiday effects. Designed for production deployment with minimal setup.
- TensorFlow Lite (now LiteRT)
- Lightweight runtime for on-device ML inference.
- ONNX Runtime
- Cross-platform high-performance inference.
- ExecuTorch
- PyTorch runtime and toolchain for deploying AI models on mobile, embedded, and edge devices.
- OpenVINO
- Intel's toolkit for edge deployment.
- MicroTVM (Apache TVM)
- Compiler stack for microcontrollers.
- OpenContracts
- Self-hosted document annotation platform for legal AI. Semantic search, contract analysis, version control, and MCP integration for building legal knowledge bases. AGPL-3.0 licensed.
- OpenClaw
- Local-first personal AI assistant with multi-channel integrations and full agentic task execution.
- Open WebUI
- Most popular self-hosted ChatGPT-style interface.
- text-generation-webui
- Web UI for running local LLMs with multiple backends, extensions, and model formats.
- LobeChat
- Sleek modern chat UI.
- LibreChat
- Feature-packed multi-LLM interface.
- HuggingChat (self-hosted)
- Official open-source codebase (chat-ui) powering Hugging Face's HuggingChat.
- Khoj
- Self-hostable personal AI assistant for search, chat, automation, and workflows over local and web data.
- Newelle
- GNOME/Linux desktop virtual assistant with integrated file editor, global hotkeys, and profile manager.
- NextChat
- Light and fast AI assistant supporting Web, iOS, macOS, Android, Linux, and Windows. One-click deploy with multi-model support. MIT licensed.
- big-AGI
- AI suite for power users with multi-model "Beam" chats, AI personas, voice, text-to-image, code execution, and PDF import. MIT licensed.
- Leon
- Your open-source personal assistant. Built around tools, context, memory, and agentic execution. Self-hosted, privacy-focused, and extensible. MIT licensed.
- AnythingLLM
- All-in-one RAG + agents platform.
- Dify
- Complete AI application platform with visual builder.
- Langflow
- Visual low-code platform for LangChain flows.
- Flowise
- Drag-and-drop LLM app builder.
- LocalAI
- Open-source AI engine running LLMs, vision, voice, image, and video models on any hardware. Self-hosted OpenAI-compatible API. MIT licensed.
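Because LocalAI exposes an OpenAI-compatible API, a standard chat-completions request works against it unchanged. The sketch below builds (but does not send) such a request with only the standard library; the base URL and model name are assumptions for a typical local deployment.

```python
import json
import urllib.request

# Assumed local endpoint -- adjust host, port, and model for your deployment.
BASE_URL = "http://localhost:8080/v1"

payload = {
    "model": "llama-3.2-1b-instruct",   # whichever model LocalAI has loaded
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello in five words."},
    ],
    "temperature": 0.2,
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return the familiar
# {"choices": [{"message": {...}}]} response shape.
```

The same compatibility means existing OpenAI client SDKs work by simply pointing their base URL at the LocalAI server.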
- Onyx
- Full-featured AI platform with Chat, RAG, Agents, and Actions. 40+ document connectors and support for every major LLM. MIT licensed (Community Edition).
- AI Chatbot Framework
- Open-source, self-hosted DIY chatbot building platform with visual conversation builder and NLU capabilities. MIT licensed.
- Jan
- Local-first AI app framework.
- SillyTavern
- Highly customizable role-playing frontend.
- Chatbox
- Powerful desktop AI client for ChatGPT, Claude, and other LLMs. Cross-platform with modern UI. GPLv3 licensed (Community Edition).
- Maid
- Free and open-source Android app for interfacing with llama.cpp models locally and remote APIs (Anthropic, DeepSeek, Mistral, Ollama, OpenAI). MIT licensed.
- Pipecat
- Open-source framework for voice and multimodal conversational AI. Build real-time voice agents with support for speech-to-text, LLMs, text-to-speech, and live video. BSD-2-Clause licensed.
- Agent Chat UI
- Web app for interacting with any LangGraph agent (Python & TypeScript) via a chat interface. Stream messages, handle interruptions, and view agent state. MIT licensed.
- Continue
- Open-source AI coding autopilot for VS Code & JetBrains; the most installed open-source AI coding extension.
- Tabby
- Self-hosted AI coding assistant.
- Cline
- Open-source IDE coding agent that can edit files, run commands, and use tools with user approval.
- Open Interpreter
- Lets LLMs run code locally.
- Roo Code
- Open-source editor-based coding agent with multiple modes and tool integrations.
- Aider
- Terminal-based AI pair programmer.
- llama.vim
- Local LLM-powered code completion plugin for Vim/Neovim using llama.cpp. Fast, privacy-first, no API key needed.
- CodeCompanion.nvim
- AI-powered coding assistant for Neovim. Inline code generation, chat, actions, and tool use with support for multiple LLM providers.
- Jupyter AI
- Chat and code generation inside notebooks.
- Assistant UI
- React/TypeScript library for building production-grade AI chat interfaces. Drop-in components for streaming messages, tool calls, and multi-modal inputs.
- Promptfoo
- Systematic LLM testing framework.
- DeepEval
- LLM unit-testing framework.
- Garak
- LLM vulnerability scanner.
- Phoenix (Arize)
- AI observability for development.
- Papers with Code - Definitive database linking papers to open code and datasets.
- Hugging Face Papers - Daily-updated feed of the latest arXiv papers with open weights.
- Open LLM Leaderboard (Hugging Face) - Real-time ranking of open models.
- Hugging Face Discussions - Largest open AI forum.
- r/LocalLLaMA - Go-to subreddit for local/open-source LLM topics.
- Hugging Face Course - Free hands-on courses using only open models.
- Fast.ai - Legendary practical deep learning course.
- LangChain Academy - Free courses on agents and RAG.
- TensorFlow Tutorials - Official guides for beginners to advanced users.
- Hugging Face Transformers Notebooks - Run Transformers, Datasets, and more in Colab.
Contributions are highly welcome! Please read CONTRIBUTING.md for guidelines (quality standards, formatting, license requirements, etc.).
- Only OSI-approved licenses
- Projects must be actively maintained (commits in last 6 months)
- High-quality, well-documented, real adoption
This list itself is licensed under CC0 1.0 Universal. Feel free to use it for any purpose.
Made with ❤️ for the open-source AI community. Star the repo if you find it useful - it helps more people discover the best open tools!