Copilot AI commented Jul 30, 2025

This PR introduces an alternate implementation of the MNIST neural network training problem that leverages Intel's OneDNN (oneAPI Deep Neural Network Library) for optimized CPU performance.

Overview

The new implementation provides the same OptimizationProblem interface as the existing Candle-based MNIST implementation but uses Intel's OneDNN library for highly optimized matrix operations and neural network primitives.

Key Features

Performance Optimizations

  • Optimized GEMM operations: Uses OneDNN's highly tuned general matrix multiplication routines
  • Hardware-aware activation functions: Leverages CPU-specific optimizations for ReLU, Tanh, and Logistic functions
  • Memory layout optimization: OneDNN automatically selects optimal memory formats for the target CPU
  • Architecture awareness: Automatically detects and utilizes CPU features like AVX, AVX2, and AVX-512
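The architecture-awareness bullet can be illustrated with a small sketch. This is hypothetical dispatch code, not part of the PR: it mimics how OneDNN picks a kernel at runtime from detected x86 SIMD features, using Rust's standard `std::arch::is_x86_feature_detected!` macro with a portable scalar fallback.

```rust
// Hypothetical sketch (not from mnist_onednn.rs): runtime dispatch over
// detected CPU features, mirroring how OneDNN selects optimized kernels.
fn select_gemm_kernel() -> &'static str {
    #[cfg(target_arch = "x86_64")]
    {
        // Checks run in order from widest to narrowest SIMD width.
        if std::arch::is_x86_feature_detected!("avx512f") {
            return "avx512";
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if std::arch::is_x86_feature_detected!("avx") {
            return "avx";
        }
    }
    "scalar" // portable fallback when no SIMD path applies
}

fn main() {
    println!("selected kernel: {}", select_gemm_kernel());
}
```

OneDNN performs this detection internally, so user code never branches on CPU features itself; the sketch only shows the idea behind the bullet above.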

Implementation Details

  • Same Interface: Implements the OptimizationProblem trait, making it a drop-in replacement for benchmarking
  • Feature Gated: Conditionally compiled with the onednn feature flag to avoid requiring OneDNN installation
  • Thread Safe: Uses interior mutability patterns for safe concurrent access
  • Configurable: Supports multiple activation functions and network architectures
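Two of the patterns above, feature gating and thread-safe interior mutability, can be sketched together. This is an illustrative stand-in, not the actual `mnist_onednn.rs` source; the `Network` struct and placeholder loss are invented for the example.

```rust
use std::sync::Mutex;

// Compiled only when the `onednn` feature is enabled, so builds without
// OneDNN installed still succeed.
#[cfg(feature = "onednn")]
mod onednn_backend {
    // Real OneDNN bindings would live behind this feature gate.
}

struct Network {
    // Interior mutability: evaluate() takes &self yet can reuse this
    // scratch buffer, and the Mutex keeps concurrent callers safe.
    scratch: Mutex<Vec<f64>>,
}

impl Network {
    fn evaluate(&self, point: &[f64]) -> f64 {
        let mut buf = self.scratch.lock().unwrap();
        buf.clear();
        buf.extend_from_slice(point);
        buf.iter().map(|x| x * x).sum() // placeholder quadratic loss
    }
}

fn main() {
    let net = Network { scratch: Mutex::new(Vec::new()) };
    println!("loss = {}", net.evaluate(&[1.0, 2.0])); // prints "loss = 5"
}
```

The `Mutex` is one common choice here; a real implementation might instead use `RefCell` if concurrent access were not required.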

Usage Example

```rust
use qqn_optimizer::{MnistOneDnnNeuralNetwork, benchmarks::mnist_onednn::ActivationType};
use rand::{rngs::StdRng, SeedableRng};

// Inside a function that returns Result, so the `?` operator applies.
let mut rng = StdRng::seed_from_u64(42);
let network = MnistOneDnnNeuralNetwork::create(
    Some(1000),                    // 1000 samples
    &[64, 32],                     // Hidden layers: 64 and 32 neurons
    Some(64),                      // Batch size
    &mut rng,
    Some(ActivationType::ReLU),    // ReLU activation
)?;

// Use with any optimizer
let loss = network.evaluate_f64(&network.initial_point())?;
```

Benchmarking Support

The implementation includes comprehensive benchmarking tools:

```sh
# Compare OneDNN vs Candle performance (requires OneDNN installation)
cargo run --example benchmark_comparison --features onednn --release

# Run OneDNN-specific examples
cargo run --example onednn_mnist --features onednn
```

Installation

OneDNN must be installed separately. The PR includes:

  • Automated installation script (install_onednn.py) for Ubuntu/Debian systems
  • Comprehensive documentation in docs/onednn_mnist.md
  • Installation instructions for multiple platforms

Files Added

  • src/benchmarks/mnist_onednn.rs - Core OneDNN implementation
  • docs/onednn_mnist.md - Comprehensive documentation and usage guide
  • examples/onednn_mnist.rs - Basic usage example
  • examples/benchmark_comparison.rs - Performance comparison tool
  • install_onednn.py - Automated OneDNN installation script

Compatibility

  • Backward Compatible: Existing code continues to work unchanged
  • Optional Dependency: OneDNN is only required when using the onednn feature
  • Cross-Platform: Works on any system where OneDNN can be installed
  • API Consistent: Maintains the same interface patterns as existing implementations

This implementation enables researchers and practitioners to leverage Intel's highly optimized neural network primitives while maintaining full compatibility with the existing QQN optimization framework.



Copilot AI changed the title [WIP] Create an alternate version of the mnist problem suite that leverages intel's onednn Add OneDNN-based MNIST neural network implementation for optimized performance Jul 30, 2025
Copilot AI requested a review from acharneski July 30, 2025 22:59
@acharneski acharneski marked this pull request as ready for review July 30, 2025 23:40
@acharneski acharneski merged commit 91666b5 into master Jul 30, 2025
12 of 16 checks passed
@acharneski acharneski deleted the copilot/fix-d3736f03-bf86-4ac3-9968-cfb797e43c6f branch July 30, 2025 23:41