Release Notes: Intel® AI for Enterprise Inference on IBM Cloud – Version 1.1.0
What's New
Intel® AI for Enterprise Inference with IBM Cloud Deployable Architecture
We're excited to announce the release of Intel AI for Enterprise Inference as an IBM Cloud deployable architecture. This solution automates the deployment of OpenAI-compatible AI inference endpoints powered by Intel® Gaudi® 3 AI accelerators and Intel® Xeon® processors.
System Requirements
| Component | Requirement |
|---|---|
| Operating System | Ubuntu 22.04 |
| Hardware | 3rd-6th Gen Intel® Xeon® Scalable processors |
| AI Accelerators | Intel® Gaudi® 2 & 3 AI Accelerators |
| Gaudi Firmware | v1.21.0 |
| Storage | Minimum 30GB (varies by model) |
Key Features
- Automated Infrastructure Deployment: Kubernetes setup, model serving, and authentication configured end to end
- Intel® Gaudi® 3 AI Accelerator Support: Optimized for high-performance AI workloads
- OpenAI-Compatible API Endpoints: Seamless integration with existing applications
- Two Deployment Patterns: Flexible options for different infrastructure requirements
Deployment Patterns
Quickstart Pattern
- Deploys onto existing infrastructure
- Documentation: Quickstart Deployment Guide
Standard Pattern
- Provisions the complete infrastructure automatically
- Documentation: Standard Deployment Guide
Supported Models
| Model Name | Cards Required | Storage | Model ID |
|---|---|---|---|
| meta-llama/Llama-3.1-8B-Instruct | 1 | 20GB | 1 |
| meta-llama/Llama-3.3-70B-Instruct | 4 | 150GB | 12 |
| meta-llama/Llama-3.1-405B-Instruct | 8 | 900GB | 11 |
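Once a deployment is running, any of the models above can be called through the OpenAI-compatible chat completions API. The sketch below builds a request payload using only the Python standard library; the base URL and bearer token are placeholders for the values issued by your own deployment, not fixed endpoints of this architecture.

```python
import json

# Placeholders -- substitute the endpoint URL and access token
# from your own deployment.
BASE_URL = "https://your-deployment.example.com/v1"
API_TOKEN = "your-access-token"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "Hello!")
print(json.dumps(payload))
# POST this JSON to f"{BASE_URL}/chat/completions" with an
# "Authorization: Bearer {API_TOKEN}" header to receive a completion.
```

Because the endpoint follows the OpenAI API schema, existing OpenAI client libraries can also be pointed at the deployment's base URL without code changes.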
Prerequisites
Required for All Deployments
- IBM Cloud API Key
- IBM Cloud SSH Key
- Hugging Face Token
Additional Requirements for Production Deployment
- Custom domain name
- TLS certificate
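Before launching a deployment, it can help to confirm that the required credentials are available in the environment. The sketch below is a minimal pre-flight check; the environment variable names (`IBMCLOUD_API_KEY`, `IBMCLOUD_SSH_KEY_NAME`, `HF_TOKEN`) are assumptions for illustration, not names mandated by the deployable architecture.

```python
import os

# Assumed variable names -- adjust to whatever your tooling expects.
REQUIRED_VARS = ("IBMCLOUD_API_KEY", "IBMCLOUD_SSH_KEY_NAME", "HF_TOKEN")

def missing_prereqs(env, required=REQUIRED_VARS):
    """Return the names of required credentials absent from `env`."""
    return [name for name in required if not env.get(name)]

# Check the current process environment before kicking off a deployment.
missing = missing_prereqs(os.environ)
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All required credentials are set.")
```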
Documentation
- Sizing Guide - Choose optimal configuration for your workloads
- Quickstart Deployment - Deploy with existing infrastructure
- Standard Deployment - Complete infrastructure automation
- Accessing Deployed Models - API usage instructions
- Complete Documentation
Quick Links
- IBM Documentation: IBM Quick Start Guide
- Project Repository: Enterprise-Inference
- License: Apache License 2.0
Support
For technical support and documentation, please refer to the Enterprise-Inference GitHub repository or consult the comprehensive documentation guides listed above.
Thank you for using Intel® AI for Enterprise Inference!