Release Notes: Intel® AI for Enterprise Inference on IBM Cloud – Version 1.1.0
What's New
Intel® AI for Enterprise Inference with IBM Cloud Deployable Architecture
We're excited to announce the release of Intel AI for Enterprise Inference as an IBM Cloud deployable architecture. This solution automates the deployment of OpenAI-compatible AI inference endpoints powered by Intel® Gaudi® 3 AI accelerators and Intel® Xeon® processors.
System Requirements
| Component | Requirement |
|---|---|
| Operating System | Ubuntu 22.04 |
| Hardware | 3rd-6th Gen Intel® Xeon® Scalable processors |
| AI Accelerators | Intel® Gaudi® 2 & 3 AI Accelerators |
| Gaudi Firmware | v1.21.0 |
| Storage | Minimum 30GB (varies by model) |
Key Features
- Automated Infrastructure Deployment: Kubernetes setup, model serving, and authentication configured end to end
- Intel® Gaudi® 3 AI Accelerator Support: Optimized for high-performance AI workloads
- OpenAI-Compatible API Endpoints: Seamless integration with existing applications
- Two Deployment Patterns: Flexible options for different infrastructure requirements
Deployment Patterns
Quickstart Pattern
- Deploys onto existing infrastructure
- Documentation: Quickstart Deployment Guide
Standard Pattern
- Provisions the complete infrastructure automatically
- Documentation: Standard Deployment Guide
Supported Models
| Model Name | Cards Required | Storage | Model ID |
|---|---|---|---|
| meta-llama/Llama-3.1-8B-Instruct | 1 | 20GB | 1 |
| meta-llama/Llama-3.3-70B-Instruct | 4 | 150GB | 12 |
| meta-llama/Llama-3.1-405B-Instruct | 8 | 900GB | 11 |
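Once a deployment is running, any of the models above can be called through the OpenAI-compatible chat completions API. The sketch below builds a request payload using only the Python standard library; the base URL and bearer token are placeholders for the values issued by your own deployment, not fixed endpoints of this architecture.

```python
import json

# Placeholders -- substitute the endpoint URL and access token
# from your own deployment.
BASE_URL = "https://your-deployment.example.com/v1"
API_TOKEN = "your-access-token"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "Hello!")
print(json.dumps(payload))
# POST this JSON to f"{BASE_URL}/chat/completions" with an
# "Authorization: Bearer {API_TOKEN}" header to receive a completion.
```

Because the endpoint follows the OpenAI API schema, existing OpenAI client libraries can also be pointed at the deployment's base URL without code changes.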
Prerequisites
Required for All Deployments
- IBM Cloud API Key
- IBM Cloud SSH Key
- Hugging Face Token
Additional Requirements for Production Deployment
- Custom domain name
- TLS certificate
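Before launching a deployment, it can help to confirm that the required credentials are available in the environment. The sketch below is a minimal pre-flight check; the environment variable names (`IBMCLOUD_API_KEY`, `IBMCLOUD_SSH_KEY_NAME`, `HF_TOKEN`) are assumptions for illustration, not names mandated by the deployable architecture.

```python
import os

# Assumed variable names -- adjust to whatever your tooling expects.
REQUIRED_VARS = ("IBMCLOUD_API_KEY", "IBMCLOUD_SSH_KEY_NAME", "HF_TOKEN")

def missing_prereqs(env, required=REQUIRED_VARS):
    """Return the names of required credentials absent from `env`."""
    return [name for name in required if not env.get(name)]

# Check the current process environment before kicking off a deployment.
missing = missing_prereqs(os.environ)
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All required credentials are set.")
```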
Documentation
- Sizing Guide - Choose optimal configuration for your workloads
- Quickstart Deployment - Deploy with existing infrastructure
- Standard Deployment - Complete infrastructure automation
- Accessing Deployed Models - API usage instructions
- Complete Documentation
Quick Links
- IBM Documentation: IBM Quick Start Guide
- Project Repository: Enterprise-Inference
- License: Apache License 2.0
Support
For technical support and documentation, please refer to the Enterprise-Inference GitHub repository or consult the comprehensive documentation guides listed above.
Thank you for using Intel® AI for Enterprise Inference!