Release Notes: Intel® AI for Enterprise Inference on IBM Cloud – Version 1.1.0

@AhmedSeemalK released this 05 Aug 08:01 · 9 commits to main since this release · commit cdaeb99

What's New

Intel® AI for Enterprise Inference with IBM Cloud Deployable Architecture

We're excited to announce the release of Intel AI for Enterprise Inference as an IBM Cloud deployable architecture. This solution automates the deployment of OpenAI-compatible AI inference endpoints powered by Intel® Gaudi® 3 AI accelerators and Intel® Xeon® processors.

System Requirements

| Component | Requirement |
| --- | --- |
| Operating System | Ubuntu 22.04 |
| Hardware | 3rd–6th Gen Intel® Xeon® Scalable processors |
| AI Accelerators | Intel® Gaudi® 2 & 3 AI Accelerators |
| Gaudi Firmware | v1.21.0 |
| Storage | Minimum 30 GB (varies by model) |

Key Features

  • Automated Infrastructure Deployment: Complete Kubernetes cluster setup, model serving, and authentication
  • Intel® Gaudi® 3 AI Accelerator Support: Optimized for high-performance AI workloads
  • OpenAI-Compatible API Endpoints: Seamless integration with existing applications
  • Two Deployment Patterns: Flexible options for different infrastructure requirements

Deployment Patterns

Quickstart Pattern

Standard Pattern

Supported Models

| Model Name | Cards Required | Storage | Model ID |
| --- | --- | --- | --- |
| meta-llama/Llama-3.1-8B-Instruct | 1 | 20GB | 1 |
| meta-llama/Llama-3.3-70B-Instruct | 4 | 150GB | 12 |
| meta-llama/Llama-3.1-405B-Instruct | 8 | 900GB | 11 |
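
Once deployed, any model above can be called through the OpenAI-compatible API. Below is a minimal sketch using only the Python standard library; the endpoint URL and token are placeholders for your own deployment, not values from this release:

```python
# Hypothetical sketch: build an OpenAI-style chat completion request for the
# inference endpoint. ENDPOINT and the token are placeholders you must replace
# with your deployment's URL and credentials.
import json
import urllib.request

ENDPOINT = "https://your-deployment.example.com/v1/chat/completions"  # placeholder

def build_chat_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Construct a POST request in the OpenAI chat-completions format."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "Hello!", "YOUR_TOKEN")
# urllib.request.urlopen(req) would send this to a live deployment.
```

Because the endpoint is OpenAI-compatible, existing OpenAI client libraries should also work by pointing their base URL at the deployment.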

Prerequisites

Required for All Deployments

  • IBM Cloud API Key
  • IBM Cloud SSH Key
  • Hugging Face Token
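
These credentials are typically supplied via environment variables. A hedged sketch; the variable names below are illustrative, not mandated by the deployable architecture:

```shell
# Placeholder values: substitute your own credentials.
export IBMCLOUD_API_KEY="your-ibm-cloud-api-key"   # IBM Cloud API Key
export HF_TOKEN="your-hugging-face-token"          # Hugging Face Token

# The SSH key must already be registered in your IBM Cloud account.
# Authenticating with the IBM Cloud CLI would then look like:
# ibmcloud login --apikey "$IBMCLOUD_API_KEY"
```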

Additional Requirements for Production Deployment

  • Custom domain name
  • TLS certificate

Documentation

Quick Links

Support

For technical support and documentation, please refer to the Enterprise-Inference GitHub repository or consult the comprehensive documentation guides listed above.


Thank you for using Intel® AI for Enterprise Inference!