
Releases: opea-project/GenAIStudio

Generative AI Studio v1.5 Release Notes

22 Dec 08:09
7370588


OPEA Release Notes v1.5

We are excited to announce the release of OPEA version 1.5, which includes significant contributions from the open-source community.

More information about how to get started with OPEA v1.5 can be found on the Getting Started page. All project source code is maintained in the opea-project organization. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.


What's New in OPEA v1.5

This release includes new features, optimizations, and user-focused updates.

GenAI Examples

  • Browser-use Agent: a new use case to empower anyone to automate repetitive web tasks. It controls your web browser to perform tasks like visiting websites and extracting data. (GenAIExamples#2312)
  • Arbitration Post Hearing Assistant Application: a new use case designed to process and summarize post-hearing transcripts or arbitration-related documents. (GenAIExamples#2309)
  • Polylingua Translation Service: a new use case providing translation across multiple languages. (GenAIComps#2298)
  • OpenAI-Compatible Endpoint Support: ChatQnA now supports OpenAI API-compatible endpoints, as shown in the sketch below. (GenAIComps#2091)
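
For illustration, here is a minimal sketch of querying a ChatQnA deployment through an OpenAI API-compatible endpoint using the official openai Python client. The base URL, port, and model id are assumptions for illustration only; substitute the values from your own deployment.

```python
# Minimal sketch: chat completion against a ChatQnA deployment exposing an
# OpenAI API-compatible endpoint. URL, port, and model id are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8888/v1",   # assumed megaservice endpoint
    api_key="not-needed-for-local",        # many local endpoints ignore the key
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model id
    messages=[{"role": "user", "content": "What is OPEA?"}],
)
print(response.choices[0].message.content)
```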

GenAI Microservices

  • Text2Query: a specialized, independent service designed to translate natural language queries into structured query languages; see the example after this list. (GenAIComps#1931)
  • Arbitration Post-Hearing: a new microservice for arbitration post-hearing processing with LLM-based entity extraction. (GenAIComps#1938)
  • FunASR/Paraformer: Added a FunASR toolkit-based backend to the ASR microservice to support Paraformer, a non-autoregressive end-to-end speech recognition model. (GenAIComps#1914)
  • LLM Scaler: Boosted LLM/LVM performance on Intel® Arc™ GPUs via llm-scaler-vllm v0.10.0-b4. (GenAIComps#1914)
  • openEuler OS Support: Enabled openEuler OS support for OPEA components. (GenAIComps#1813, GenAIComps#1875, GenAIComps#1879, GenAIComps#1913)
  • MCP Compliance: Enabled MCP servers for selected OPEA components. (GenAIComps#1849, GenAIComps#1855)
  • OPEA Store: Enhanced data access using OPEA Store for ChatHistory, FeedbackManagement, and PromptRegistry. (GenAIComps#1916)
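
To illustrate the Text2Query flow, the sketch below posts a natural-language question to a Text2Query-style endpoint. The route, port, and payload fields are hypothetical assumptions, not the microservice's documented contract; consult the component README for the actual API.

```python
# Illustrative client for a Text2Query-style microservice over HTTP.
# Route, port, and payload fields below are hypothetical.
import requests

payload = {
    "input_text": "List the five customers with the highest total orders",
    "db_type": "postgres",  # hypothetical field selecting the target dialect
}
resp = requests.post("http://localhost:9097/v1/text2query", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())  # expected to contain the generated structured query
```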

Productization

Validated Hardware

  • Intel® Gaudi® AI Accelerators (2nd)
  • Intel® Xeon® Scalable processor (3rd)

Validated Software

  • Docker version 28.5.1
  • Docker Compose version v2.40.3
  • Intel® Gaudi® software and drivers v1.22.1
  • TEI v1.7
  • TGI v2.4.0 (Xeon), v2.3.1 (Gaudi)
  • Ubuntu 22.04
  • vLLM v0.10.1 (Xeon), opea/vllm-gaudi:1.22.0 (Gaudi)

Full Changelogs

Contributors

This release would not have been possible without the contributions of the following organizations and individuals.

Contributing Organizations

  • Bud: Polylingua Translation Service, Components as MCP Servers.
  • Intel: Development and improvements to GenAI examples, components, infrastructure, evaluation, and studio.
  • openEuler: openEuler OS support.
  • Zensar: Arbitration Post Hearing Assistant.

Individual Contributors

For a comprehensive list of individual contributors, please refer to the Full Changelogs section.

Generative AI Studio v1.4 Release Notes

25 Aug 00:32
be11083


OPEA Release Notes v1.4

We are excited to announce the release of OPEA version 1.4, which includes significant contributions from the open-source community. This release addresses over 330 pull requests.

More information about how to get started with OPEA v1.4 can be found on the Getting Started page. All project source code is maintained in the opea-project organization. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.


What's New in OPEA v1.4

This release includes new features, optimizations, and user-focused updates.

Advanced Agent Capabilities

Components as MCP Servers

OPEA components can now serve as Model Context Protocol (MCP) servers, allowing external MCP-compatible frameworks and applications to integrate with OPEA seamlessly. (GenAIComps#1652)
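
As a minimal sketch of what this enables, the snippet below shows an external MCP client listing the tools exposed by an OPEA component, assuming the component serves MCP over SSE at a hypothetical local URL (requires the mcp Python SDK).

```python
# Hedged sketch: discover tools on an OPEA component running as an MCP server.
# The SSE URL is a placeholder; check the component's docs for the real one.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
```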

KubeAI Operator for OPEA

The KubeAI Operator now features an improved autoscaler, monitoring support, optimized resource placement via NRI plugins, and expanded support for new models on Gaudi. (GenAIInfra#967, GenAIInfra#1052, GenAIInfra#1054, GenAIInfra#1089, GenAIInfra#1113, GenAIInfra#1144, GenAIInfra#1150)

New GenAI Capabilities

  • Fine-Tuning of Reasoning Models: This feature is compatible with the dataset format used in FreedomIntelligence/medical-o1-reasoning-SFT, enabling you to customize models with your own data. (GenAIComps#1839)
  • HybridRAG: Combined GraphRAG (knowledge graph-based retrieval) and VectorRAG (vector database retrieval) for enhanced accuracy and contextual relevance. (GenAIExamples#1968)
  • LLM Router: LLM Router decides which downstream LLM serving endpoint is best suited for an incoming prompt. (GenAIComps#1716)
  • OPEA Store: Redis and MongoDB have been integrated into OPEA Store. (GenAIComps#1816, GenAIComps#1818)
  • Guardrails: Added Input/Output Guardrails to enforce content safety and prevent the creation of inappropriate outputs. (GenAIComps#1798)
  • Language Detection: A microservice that ensures the pipeline's response matches the query's language. (GenAIComps#1774)
  • Prompt Template: A microservice that dynamically generates system and user prompts based on structured inputs and document context (see the sketch after this list). (GenAIComps#1826)
  • Air-gapped Environment Support: Some OPEA microservices can now be deployed in an air-gapped Docker environment. (GenAIComps#1480)
  • Remote Inference Endpoints Support: Added support for remote inference endpoints for OPEA examples. (GenAIExamples#1973)
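
To make the prompt-template idea concrete, here is a purely illustrative sketch; the template strings and field names are hypothetical, not OPEA's actual ones.

```python
# Illustrative prompt assembly from structured inputs and retrieved context.
SYSTEM_TEMPLATE = "You are a helpful assistant. Answer using only the context."
USER_TEMPLATE = "### Context:\n{context}\n\n### Question:\n{question}"

def build_prompts(question: str, documents: list[str]) -> dict[str, str]:
    """Assemble system/user prompts from a question and document chunks."""
    context = "\n---\n".join(documents)
    return {
        "system": SYSTEM_TEMPLATE,
        "user": USER_TEMPLATE.format(context=context, question=question),
    }

prompts = build_prompts(
    "What does the contract say about termination?",
    ["Chunk 1 of a retrieved document...", "Chunk 2..."],
)
print(prompts["user"])
```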

Better User Experience

  • One-click Deployment: You can now deploy 8 OPEA examples with one click. ChatQnA can also be deployed in an air-gapped Docker environment. (GenAIExamples#1727)
  • GenAIStudio: Added support for drag-and-drop creation of documentation summarization and code generation applications. (GenAIStudio#61)
  • Documentation Refinement: Refined READMEs for key examples and components to help readers easily locate documentation tailored to deployment, customization, and hardware. (GenAIExamples#1673, GenAIComps#1398)

Newly Supported Models

OPEA introduces support for the following models in this release.

Model | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi | vLLM-ROCm | OVMS | Optimum-Habana | PredictionGuard | SGLANG-CPU
meta-llama/Llama-4-Scout-17B-16E-Instruct | - | - | - | - | - | - | - | ✓
meta-llama/Llama-4-Maverick-17B-128E-Instruct | - | - | - | - | - | - | - | ✓

(✓: supported; -: not validated; x: unsupported)

Newly Supported Hardware

Newly Supported OS

Updated Dependencies

Dependency | Hardware | Scope | Version | Version in OPEA v1.3 | Comments
huggingface/text-embeddings-inference | all | all supported examples | cpu-1.7 | cpu-1.6 |
vllm | Xeon | all supported examples except EdgeCraftRAG | v0.10.0 | v0.8.3 |

Changes to Default Behavior

  • CodeTrans: The default model changed from mistralai/Mistral-7B-Instruct-v0.3 to Qwen/Qwen2.5-Coder-7B-Instruct on Xeon and Gaudi.

Validated Hardware

  • Intel® Gaudi® AI Accelerators (2nd)
  • Intel® Xeon® Scalable processor (3rd)
  • Intel® Arc™ Graphics GPU (A770)
  • AMD® EPYC™ processors (4th, 5th)

Validated Software

  • Docker version 28.3.3
  • Docker Compose version v2.39.1
  • Intel® Gaudi® software and drivers v1.21
  • Kubernetes v1.32.7
  • TEI v1.7
  • TGI v2.4.0 (Xeon, EPYC), v2.3.1 (Gaudi), v2.4.1 (ROCm)
  • Torch v2.5.1
  • Ubuntu 22.04
  • vLLM v0.10.0 (Xeon, EPYC), v0.6.6.post1+Gaudi-1.20.0 (Gaudi)

Known Issues

Full Changelogs

Contributors

This release would not have been possible without the contributions of the following organizations and individuals.

Contributing Organizations

  • AMD: AMD EPYC support.
  • Bud: Components as MCP Servers.
  • Intel: Development and improvements to GenAI examples, components, infrastructure, evaluation, and studio.
  • MariaDB: Added ChatQnA docker-compose example on Intel Xeon using Mari...

Generative AI Studio v1.3 Release Notes

14 May 05:32


OPEA Release Notes v1.3

We are excited to announce the release of OPEA version 1.3, which includes significant contributions from the open-source community. This release addresses over 520 pull requests.

More information about how to get started with OPEA v1.3 can be found on the Getting Started page. All project source code is maintained in the opea-project organization. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.


What's New in OPEA v1.3

This release introduces exciting capabilities, optimizations, and user-centric enhancements:

Advanced Agent Capabilities

  • Multi-Turn Conversation: Enhanced the OPEA agent framework for dynamic, context-aware dialogues. (GenAIComps#1248)
  • Finance Agent Example: A financial agent example for automating financial data aggregation and leveraging LLMs to generate insights, forecasts, and strategic recommendations. (GenAIExamples#1539)

Performance and Scalability

  • vLLM Enhancement: Integrated vLLM as the default LLM serving backend for key GenAI examples across Intel® Xeon® processors, Intel® Gaudi® accelerators, and AMD® GPUs. (GenAIExamples#1436)
  • KubeAI Operator for OPEA (Alpha release): Simplified OPEA inference operations in cloud environments and enabled optimal out-of-the-box performance for specific models and hardware using profiles. (GenAIInfra#945)

Ecosystem Integrations

  • Haystack Integration: Enabled OPEA as a backend of Haystack. (Haystack-OPEA#1)
  • Cloud Readiness: Expanded automated Terraform deployment for ChatQnA to include support for Azure, and enabled CodeGen deployments on AWS and GCP. (GenAIExamples#1731)

New GenAI Capabilities

  • OPEA Store: Delivered a unified data store access API and a robust integration layer that streamlines adding new data stores; ArangoDB has been integrated. (GenAIComps#1493)
  • CodeGen using RAG and Agent: Leveraged RAG and code agent to provide an additional layer of intelligence and adaptability for CodeGen example. (GenAIExamples#1757)
  • Enhanced Multimodality: Added support for additional audio file types (.mp3) and supported spoken audio captions with image ingestion. (GenAIExamples#1549)
  • Struct to Graph: Supported transforming structured data to graphs using Neo4j graph database. (GenAIComps#1502)
  • Text to Graph: Supported creating graphs from text by extracting graph triplets. (GenAIComps#1357, GenAIComps#1472)
  • Text to Cypher: Supported generating and executing Cypher queries from natural language for graph database retrieval (see the sketch after this list). (GenAIComps#1319)
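
A hedged sketch of the text-to-Cypher flow: generate a Cypher query from natural language, then execute it against Neo4j. The connection details are placeholders and the LLM call is stubbed out; OPEA's actual microservice wraps this flow behind an HTTP API.

```python
# Illustrative text-to-Cypher: stubbed LLM generation + Neo4j execution.
from neo4j import GraphDatabase

def text_to_cypher(question: str) -> str:
    # Placeholder for an LLM call that translates the question into Cypher.
    return "MATCH (p:Person)-[:WORKS_AT]->(c:Company) RETURN p.name, c.name LIMIT 5"

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(text_to_cypher("Who works where?")):
        print(record.data())
driver.close()
```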

Enhanced Evaluation

  • Enhanced Long-Context Model Evaluation: Supported evaluating long-context models on Intel® Gaudi® with vLLM. (HELMET#20)
  • TAG-Bench for SQL Agents: Integrated TAG-Bench to evaluate complex SQL query generation. (GenAIEval#230)
  • DocSum Support: GenAIEval now supports evaluating the performance of DocSum. (GenAIEval#252)
  • Toxicity Detection Evaluation: Introduced a workflow to evaluate the capability of detecting toxic language based on LLMs. (GenAIEval#241)
  • Model Card: Added a model card generator for generating reports containing model performance and fairness metrics. (GenAIEval#236)

Observability

  • OpenTelemetry Tracing: Leveraged OpenTelemetry to enable tracing for ChatQnA and AgentQnA along with TGI and TEI. (GenAIExamples#1542)
  • Application dashboards: Added Helm-installed application E2E performance dashboards. (GenAIInfra#800)
  • E2E (end-to-end) metric improvements: E2E metrics are now summed across applications that use multiple megaservice instances, with added tests and fixes for the E2E metrics. (GenAIComps#1301, GenAIComps#1343)

Better User Experience

  • GenAIStudio: Supported drag-and-drop creation of agentic applications. (GenAIStudio#50)
  • Documentation Refinement: Refined READMEs for key examples to help readers easily locate documentation tailored to deployment, customization, and hardware. (GenAIExamples#1741)
  • Optimized Dockerfiles: Simplified application Dockerfiles for faster image builds. (GenAIExamples#1585)

Exploration

  • SQFT: Supported low-precision sparse parameter-efficient fine-tuning on LLMs. (GenAIResearch#1)

Newly Supported Models

OPEA introduces support for the following models in this release.

Per-backend validation status (✓/-/x across TGI-Gaudi, vLLM-CPU, vLLM-Gaudi, vLLM-ROCm, OVMS, Optimum-Habana, and PredictionGuard) varies by model; the newly supported models are:

  • deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  • deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  • deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  • deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
  • deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  • deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
  • deepseek-ai/Deepseek-v3
  • Hermes-3-Llama-3.1-8B
  • ibm-granite/granite-3.2-8b-instruct
  • Phi-4-mini
  • Phi-4-multimodal-instruct
  • mistralai/Mistral-Small-24B-Instruct-2501
  • mistralai/Mistral-Large-Instruct-2411

Newly Supported Hardware

Other Notable Changes


GenAIExamples
  • Functionalities
    • [AgentQnA] Added web search tool support and simplified the run instructions. (#1656) (e8f2313)
    • [ChatQnA] Added support for latest deepseek models on Gaudi (#1491) (9adf7a6)
    • [EdgeCraftRAG] A sleek new UI based on Vue and Ant Design for enhanced user experience, supporting concurrent multi-requests on vLLM, JSON pipeline configuration, and API-based prompt modification. (#1665) (5a50ae0)
    • [EdgeCraftRAG] Supported multi-card deployment of Intel ARC GPU for vllm inference ([#1729](https://github.com/opea-project/Gen...

Generative AI Studio v1.2 Release Notes

27 Jan 02:23


OPEA Release Notes v1.2

We are excited to announce the release of OPEA version 1.2, which includes significant contributions from the open-source community. This release addresses over 320 pull requests.

More information about how to get started with OPEA v1.2 can be found on the Getting Started page. All project source code is maintained in the repository. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.

What's New in OPEA v1.2

This release focuses on code refactoring for GenAIComps, an epic effort aimed at reducing redundancy, addressing technical debt, and enhancing overall maintainability and code quality. As a result, OPEA users can expect a more robust and reliable OPEA with clearer guidance and improved documentation.

OPEA v1.2 also introduces more scenarios with general availability, including:

  • LlamaIndex and LangChain Integration: Enabled OPEA as a backend. LlamaIndex integration currently supports ChatQnA only.
  • Model Context Protocol (MCP) Support: Experimental support for MCP in the Retriever.
  • Cloud Service Providers (CSP) Support: Supported automated Terraform deployment using Intel® Optimized Cloud Modules for Terraform, available for major cloud platforms, including Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
  • Enhanced Security: Istio Mutual TLS (mTLS) and OIDC (OpenID Connect) based authentication with APISIX.
  • Enhancements for GenAI Evaluation: Specialized evaluation benchmarks tailored for Chinese language models, focusing on their performance and accuracy within Chinese datasets.
  • Helm Charts Deployment: Added support for the Text2Image and SearchQnA examples and their microservices.

Highlights

Code Refactoring for GenAIComps

This is an epic task in v1.2. We refactored the entire GenAIComps codebase. This comprehensive effort focused on reducing redundancy, addressing accumulated technical debt, and enhancing the overall maintainability and code quality. The refactoring not only streamlined the architecture but also laid a stronger foundation for future scalability and development.

At the architecture level, OPEA introduces OpeaComponentRegistry and OpeaComponentLoader. The OpeaComponentRegistry manages the lifecycle of component classes, including their registration and deregistration, while the OpeaComponentLoader instantiates components based on the classes in the registry and executes them as needed. Unlike previous implementations, this approach ensures that the lifecycle of a component class is transparent to the user, and components are instantiated only when actively used. This design enhances efficiency, clarity, and flexibility in the system.
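
A simplified sketch of this registry/loader pattern follows; the class names mirror the concepts but the code is illustrative, not OPEA's actual implementation.

```python
# Illustrative registry/loader: classes are registered up front, but a
# component is instantiated only when the loader actually invokes it.
class ComponentRegistry:
    """Tracks component classes (registration/deregistration) without instantiating them."""
    _classes: dict[str, type] = {}

    @classmethod
    def register(cls, name: str):
        def decorator(component_cls: type) -> type:
            cls._classes[name] = component_cls
            return component_cls
        return decorator

    @classmethod
    def deregister(cls, name: str) -> None:
        cls._classes.pop(name, None)

class ComponentLoader:
    """Instantiates registered components lazily, only when used."""
    def __init__(self, registry: type[ComponentRegistry]) -> None:
        self._registry = registry

    def invoke(self, name: str, *args, **kwargs):
        component = self._registry._classes[name]()  # instantiated at use time
        return component.run(*args, **kwargs)

@ComponentRegistry.register("embedding_tei")
class TeiEmbedding:
    def run(self, text: str) -> str:
        return f"embedding for {text!r}"

print(ComponentLoader(ComponentRegistry).invoke("embedding_tei", "hello"))
```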

At the component level, each OPEA component is structured into two layers: the service wrapper and the service provider (named integrations in the code). The service wrapper, which is optional, acts as a protocol hub and manages service access, while the service provider delivers the actual functionality. This architecture allows components to be seamlessly integrated or removed without requiring code changes, enabling a modular and adaptable system. All existing components have been ported to the new architecture.

Additionally, we reduced code redundancy, merged overlapping modules, and implemented adjustments to align with the new architectural changes.

Note

We suggest that users and contributors review the documentation to understand the impact of the code refactoring.

Supporting Cloud Service Providers

OPEA offers automated Terraform deployment using Intel® Optimized Cloud Modules for Terraform, available for major cloud platforms, including AWS, GCP, and Azure. To explore this option, check out the Terraform deployment guide.

Additionally, OPEA supports manual deployment on virtual servers across AWS, GCP, IBM Cloud, Azure, and Oracle Cloud Infrastructure (OCI). For detailed instructions, refer to the manual deployment guide.

Enhanced GenAI Components

  • vLLM support for embeddings and rerankings: Integrated vLLM as a serving framework to enhance the performance and scalability of embedding and reranking models.
  • Agent Microservice:
    • SQL agent strategy: Takes a user question, optional hints, and history (when available), and reasons step by step to solve the problem by interacting with a SQL database. OPEA currently has two types of SQL agents: sql_agent_llama for use with open-source LLMs and sql_agent for use with OpenAI models.
    • Enabled user-customized tool subsets: Added support for user-defined subsets of tools for the ChatCompletion API and Assistant APIs.
    • Enabled persistence: Introduced Redis to persist Agent configurations and historical messages for Agent recovery and multi-turn conversations.
  • Long-context Summarization: Supported multiple modes: auto, stuff, truncate, map_reduce, and refine (see the map_reduce sketch after this list).
  • Standalone Microservice Deployment: Enabled the deployment of OPEA components as independent services, allowing for greater flexibility, scalability, and modularity in various application scenarios.
  • PDF Inputs Support: Supported PDF inputs for dataprep, embeddings, LVMs, and retrievers.
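
As a concrete illustration of the map_reduce mode, the sketch below summarizes chunks independently ("map") and then summarizes the concatenated summaries ("reduce"); summarize is a stand-in for a call to your LLM serving endpoint.

```python
# Illustrative map_reduce summarization over a long document.
def summarize(text: str) -> str:
    # Placeholder for an LLM call (e.g., via an OpenAI-compatible endpoint).
    return text[:200]

def map_reduce_summary(document: str, chunk_size: int = 4000) -> str:
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    partial = [summarize(chunk) for chunk in chunks]  # map step
    return summarize("\n".join(partial))              # reduce step

print(map_reduce_summary("A very long document... " * 1000))
```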

New GenAI Components

  • Bedrock: OPEA LLM now supports Amazon Bedrock as the backend of the text generation microservice (see the sketch after this list). Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
  • OpenSearch Vector Database: OPEA vectorstores now support AWS OpenSearch. OpenSearch is an open-source, enterprise-grade search and observability suite that brings order to unstructured data at scale.
  • Elasticsearch Vector Database: OPEA vectorstores now support the Elasticsearch vector database, an open-source offering that provides an efficient way to create, store, and search vector embeddings.
  • Guardrail Hallucination Detection: Added the capability to detect hallucination, which spans a wide range of issues that can impact the reliability, trustworthiness, and utility of AI-generated content.
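
As a minimal sketch, the snippet below generates text through Bedrock using boto3's Converse API; the region and model id are examples only, and the model must be enabled in your AWS account.

```python
# Illustrative Bedrock text generation via the boto3 Converse API.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # example region
response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model id
    messages=[{"role": "user", "content": [{"text": "Summarize OPEA in one sentence."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```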

Enhanced GenAI Examples

Enhanced GenAIStudio

In this release, GenAI Studio enables Keycloak for multi-user management, supports sandbox environments for multi-workflow execution, and adds Grafana-based visualization dashboards with built-in Prometheus performance metrics for model evaluation and functional node performance.

Newly Supported Models

  • bge-base-zh-v1.5
  • Falcon2-40B/11B
  • Falcon3

Newly Supported Hardware


Generative AI Studio v1.1 Release Notes

26 Nov 00:47
4042d18


OPEA Release Notes v1.1

We are pleased to announce the release of OPEA version 1.1, which includes significant contributions from the open-source community. This release addresses over 470 pull requests.

More information about how to get started with OPEA v1.1 can be found on the Getting Started page. All project source code is maintained in the repository. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.

What's New in OPEA v1.1

This release introduces more scenarios with general availability, including:

Highlights

New GenAI Examples

  • AvatarChatbot: a chatbot fronted by a virtual "avatar", able to run on either Intel Gaudi 2 AI Accelerators or Intel Xeon Scalable processors.
  • DBQnA: seamlessly translates natural language queries into SQL and delivers real-time database results.
  • EdgeCraftRAG: a customizable and tunable RAG example for edge solutions on Intel® Arc™ GPUs.
  • GraphRAG: a Graph RAG-based approach to summarization.
  • Text2Image: an application that generates images based on text prompts.
  • WorkflowExecAgent: a workflow executor example that handles data/AI workflow operations via LangChain agents executing custom-defined, workflow-based tools.

Enhanced GenAI Examples

New GenAI Components

Enhanced GenAI Components

GenAIStudio

GenAI Studio, a new project of OPEA, streamlines the creation of enterprise Generative AI applications by providing an alternative UI-based process to create end-to-end solutions. It supports GenAI application definition, evaluation, performance benchmarking, and deployment. The GenAI Studio empowers developers to effortlessly build, test, and optimize their LLM solutions, and to create a deployment package. Its intuitive no-code/low-code interface accelerates innovation, enabling rapid development and deployment of cutting-edge AI applications.

Enhanced Observability

Observability offers real-time insights into component performance and system resource utilization. We enhanced this capability by monitoring key system metrics, including CPU, host memory, storage, network, and accelerators (such as Intel Gaudi), as well as tracking OPEA application scaling.

Helm Charts Support

OPEA examples and microservices support Helm Charts as the packaging format on Kubernetes (k8s). The newly supported examples include AgentQnA, AudioQnA, FaqGen, and VisualQnA. The newly supported microservices include chathistory, mongodb, prompt, and Milvus for data-prep and retriever. Helm Charts now have an option to expose Prometheus metrics from the applications.

Long-context Benchmark Support

We added the following two benchmark kits in response to the community's requirements for long-context language models.

  • HELMET: a comprehensive benchmark for long-context language models covering seven diverse categories of tasks. The datasets are application-centric and are designed to evaluate models at different lengths and levels of complexity.
  • LongBench: a benchmark tool for bilingual, multitask, and comprehensive assessment of long context understanding capabilities of large language models.

Newly Supported Models

  • llama-3.2 (1B/3B/11B/90B)
  • glm-4-9b-chat
  • Qwen2/2.5 (7B/32B/72B)

Newly Supported Hardware

Notable Changes

GenAIExamples
  • Functionalities
    • New GenAI Examples
      • [AvatarChatbot] Initiate "AvatarChatbot" (audio) example (cfffb4c, 960805a)
      • [DBQnA] Adding DBQnA example in GenAIExamples (c0643b7, 6b9a27d)
      • [EdgeCraftRag] Add EdgeCraftRag as a GenAIExample (c9088eb, 7949045, 096a37a)
      • [GraphRAG] Add GraphRAG example (a65640b)
      • [Text2Image] Add example for text2image (085d859)
      • [WorkflowExecAgent] Add Workflow Executor Example (bf5c391)
    • Enhanced GenAI Examples