# README.md

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)

Edge Workloads and Benchmarks are performance-optimized pipelines that utilize the [GStreamer multimedia framework](https://gstreamer.freedesktop.org) and the [Deep Learning Streamer (DL Streamer)](https://github.com/open-edge-platform/edge-ai-libraries/tree/main/libraries/dl-streamer) for validating media and edge AI video analytics. The pipelines measure end-to-end throughput in frames per second (fps), pipeline stream density in fps, package power in watts, and workload efficiency in fps per package watt.
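The efficiency metric is a simple ratio of the other two, so it can be sanity-checked by hand. A minimal sketch (the FPS and wattage values below are hypothetical placeholders, not measured results):

```shell
# Workload efficiency = end-to-end throughput / package power.
# Hypothetical sample values; real figures come from the benchmark results.
FPS=300
WATTS=25
awk -v fps="$FPS" -v watts="$WATTS" \
  'BEGIN { printf "Efficiency: %.1f fps per package watt\n", fps / watts }'
# → Efficiency: 12.0 fps per package watt
```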

### Pipeline Architecture

HEVC 1080p Video Decode (GPU Hardware-Accelerated) → Object Detection (GPU or NPU) → Object Tracking → 1-2x Object Classification (GPU or NPU)

## Pipeline Configurations
| Config | Video | Detection | Classification |
|--------|-------|-----------|----------------|
| medium | apple.h265 (1 obj/frame) | YOLOv5m (640x640) INT8 | ResNet‑50 + MobileNet‑V2 (224x224) INT8 |
| heavy | bears.h265 (2 obj/frame) | YOLOv11m (640x640) INT8 | ResNet‑50 + MobileNet‑V2 (224x224) INT8 |

Pipeline configurations include single-device pipelines (GPU or NPU only), pipelines with multiple devices (GPU Detect and NPU Classify), and multiple single-device pipelines running concurrently in separate processes (GPU only and NPU only, concurrently).
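The concurrent mode can be sketched in plain shell: each single-device pipeline runs as its own OS process, and the parent waits for both to finish. The `run_pipeline` function below is a placeholder for a real pipeline launch, not part of this repository:

```shell
# Placeholder standing in for a real single-device pipeline launch.
run_pipeline() { echo "[$1] pipeline finished"; }

# Launch GPU-only and NPU-only pipelines as separate background processes.
run_pipeline "GPU-only" &
run_pipeline "NPU-only" &
wait  # block until both background pipelines exit
```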

## Prerequisites

### System Requirements

A GPU with Video Acceleration API (VA-API) media support is required for this workload.
- Validated on Ubuntu 24.04.3 LTS with kernel version 6.16 and above
- Docker installed and the user added to the `docker` group
- Integrated GPU
- NPU (optional)


### Software Requirements

- Docker software version 20.10 and above ([installation guide](https://docs.docker.com/engine/install/ubuntu/))
- Python programming language version 3.8 and above with virtual environment (venv) support
- Network connectivity for model or media download

### Storage Space Requirements

Breakdown of space required:
- Models: 230 MB
- COCO dataset: 950 MB
- CIFAR-100 dataset: 162 MB
- Videos: 1.9 GB
- Virtual Environment: 7.7 GB

Total space required: 10.9 GB

Optional ImageNet dataset download: 6.5 GB (manual download required; see [model-conversion](model-conversion/README.md)).
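Before running setup, it may help to confirm the free space against the totals above. A sketch using GNU `df` (adjust the path and threshold for your layout; use 18 GB if you plan to add ImageNet):

```shell
# Required space from the breakdown above (10.9 GB rounded up to 11 GB).
REQUIRED_GB=11

# Free space on the filesystem holding the current directory, in GB.
AVAIL_KB=$(df --output=avail -k . | tail -n 1)
AVAIL_GB=$((AVAIL_KB / 1024 / 1024))

if [ "$AVAIL_GB" -ge "$REQUIRED_GB" ]; then
  echo "OK: ${AVAIL_GB} GB free (need ${REQUIRED_GB} GB)"
else
  echo "Insufficient space: ${AVAIL_GB} GB free, need ${REQUIRED_GB} GB" >&2
fi
```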

### Display Pipeline

Run the following commands to allow X server connection in the Docker container, so that the display pipeline sample can access the host's display:

```bash
xhost local:root
setfacl -m user:1000:r ~/.Xauthority
```

## Get Started

The Makefile automates the entire workflow. Use `make help` to display the following:

```bash
# Three-step setup
make display     # Visualized pipeline demo (params: CONFIG={light,medium,heavy})

# Cleanup
make clean # Remove all results
make clean-all # Remove all generated collateral (models, media, results, and venv)
```

### Examples

- Prereqs: `make prereqs INCLUDE_GPU=True INCLUDE_NPU=True`
- Display: `make display CONFIG=light DETECT=GPU CLASSIFY=NPU DURATION=60`
- Benchmarks: `sudo make benchmarks CORES=ecore DURATION=60`

### Makefile Variables

- `IMAGENET_ROOT` - Path to the pre-downloaded ImageNet dataset for accuracy validation on ResNet and MobileNet networks (see [model-conversion](model-conversion/README.md))
- `INCLUDE_GPU` - Install GPU drivers during setup
- `INCLUDE_NPU` - Install NPU drivers during setup (requires reboot)
- `DURATION` - Duration to run pipeline in seconds
- `CONFIG=light|medium|heavy` - Pipeline configuration, tiered by compute complexity
- `DETECT/CLASSIFY=CPU|GPU|NPU` - Inference device assignment
- `CORES=pcore|ecore|lpecore` - CPU core pinning based on core type
- `PORT` - HTTP server port for dashboard (default: 8000)

### Manual Setup (Alternative)

If you prefer step-by-step control:

Step 1. Prerequisites:

```bash
cd setup/
./install_prerequisites.sh
# Optional: --reinstall-gpu-driver=yes and/or --reinstall-npu-driver=yes
```

Step 2. Models:

```bash
cd ../model-conversion/
./convert_models.sh
# Optional: -i "$HOME/datasets/imagenet-packages" for ImageNet quantization
```

Step 3. Media:

```bash
cd ../media-downloader/
./download_and_encode.sh
```

Step 4. Run benchmark:

```bash
cd ..
./benchmark_edge_pipelines.sh \
  -d GPU -c NPU -i 60   # example flags; see the full list below
```
* `-c` Classification device: `CPU` | `GPU` | `GPU.<idx>` | `NPU` (default: CPU)
* `-i` Duration in seconds (default: 120)
* `-t` CPU core type for pinning, e.g., `"ecore"` (optional)
* `--concurrent` Enable concurrent GPU and NPU execution mode (optional)

**Note:** Intel recommends the GPU or NPU for AI inference workloads.

Step 5. Display Results:

```bash
# Generate and view dashboard
python3 html/generate_report.py
cd html && python3 -m http.server 8000  # Access at http://localhost:8000
```

### Output

Results are saved to the `results/` folder, organized by execution mode:

* `*.log` – Full GStreamer pipeline output (stdout or stderr)
* `*.csv` – Performance metrics (FPS, stream density, and configuration)
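The CSVs are easy to post-process with standard tools. Since the exact schema is not documented here, the sketch below assumes a hypothetical `config,streams,fps` column layout — check the header row of your generated files before adapting it:

```shell
# Hypothetical results file; real CSVs live under results/ and may use a
# different column order or naming.
cat > /tmp/sample_results.csv <<'EOF'
config,streams,fps
light,8,240.5
light,9,236.1
EOF

# Print the row with the highest per-stream FPS (total fps / stream count).
awk -F, 'NR > 1 && $3 / $2 > best { best = $3 / $2; row = $0 }
         END { print "Best per-stream FPS row:", row }' /tmp/sample_results.csv
# → Best per-stream FPS row: light,8,240.5
```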

## Get Help or Contribute

If you want to participate in the GitHub community for Edge Workloads and Benchmarks, you can
contribute code, propose a design, download and try out a release, open an issue,
benchmark application performance, and participate in
[Discussions](https://github.com/open-edge-platform/edge-workloads-and-benchmarks/discussions).

To learn more, check out the following resources:

- [Open an issue](https://github.com/open-edge-platform/edge-workloads-and-benchmarks/issues)
- [Submit a pull request](https://github.com/open-edge-platform/edge-workloads-and-benchmarks/pulls)
- [Read the Contribution Guide](https://github.com/open-edge-platform/edge-microvisor-toolkit/blob/3.0/docs/developer-guide/emt-contribution.md)
- [Report a security vulnerability](https://github.com/open-edge-platform/edge-workloads-and-benchmarks/blob/main/SECURITY.md)

Before submitting a new report, check the existing issues to see if a similar one has been filed.

## Notices

The **Edge Workloads and Benchmarks** project is licensed under the [Apache 2.0](./LICENSE) license.

---
\* Other names and brands may be claimed as the property of others.
---

# media-downloader/README.md

Downloads and prepares video files for Edge Workloads and Benchmarks pipelines.

## Overview

1. Downloads two 4K test videos from Pexels.
- `apple.mp4` - Single object per frame
- `bears.mp4` - Two objects per frame

2. Transcodes to H.265 format (1080p30, 2Mbps, no B-frames).
- Uses video acceleration API (VA-API) hardware acceleration via Docker software.
- Requires `/dev/dri` GPU or VA-API device access.

3. Loops each video 100 times for long-duration benchmarks.

4. Saves looped videos to pipeline directories:
- `pipelines/light/video/bears.h265`
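The 100x loop step relies on the fact that raw Annex-B H.265 elementary streams can be concatenated byte-for-byte. A minimal sketch, with a tiny placeholder file standing in for a real transcoded video:

```shell
# Placeholder bytes standing in for a real transcoded .h265 stream.
printf 'NAL' > /tmp/clip.h265

# Concatenate the stream 100 times (the loop step described above).
: > /tmp/clip_looped.h265
for i in $(seq 1 100); do
  cat /tmp/clip.h265 >> /tmp/clip_looped.h265
done

wc -c < /tmp/clip_looped.h265  # 100x the source size
```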

## Requirements

- Docker software with Deep Learning Streamer (DL Streamer) container (`intel/dlstreamer:latest`).
- GPU with VA-API support (integrated or discrete GPU).
---
# Model Conversion and Quantization

Download, convert, and quantize AI models for Deep Learning Streamer (DL Streamer) pipelines.

### Model Collateral Matrix

| Model Name | Task | Dimensions | Dataset | Source Model |
|--------------|----------------|----------------|----------|--------------|
| Yolo-v11n | Detection | 640x640 (INT8) | COCO | [source](https://docs.ultralytics.com/models/yolo11/) |
| Mobilenet-V2 | Classification | 224x224 (INT8) | ImageNet | [source](https://pytorch.org/hub/pytorch_vision_mobilenet_v2/) |

### Usage

```bash
./convert_models.sh
```

## ImageNet Accuracy Check for Classification Networks (Optional)

The CIFAR dataset is used as a proxy dataset for classification network quantization. For classification accuracy validation, the ImageNet dataset is required.

### ImageNet Dataset Setup

```
pipelines/
└── classification/
    ├── resnet-v1-50-tf/INT8/
    └── mobilenet-v2-1.0-224-tf/INT8/
```
---
# Compute Driver Installation Scripts

Intel has validated these GPU and NPU compute drivers for Edge Workloads and Benchmarks performance reproducibility.

## Usage


Downloaded packages are saved locally for offline reinstallation.

## Integration with the Main Prerequisite Script

Driver installation is **optional** by default. The main `install_prerequisites.sh` script does not automatically install drivers to maintain system stability.

Use these dedicated scripts or the `install_prerequisites.sh` script with the `--reinstall-gpu-driver=yes` / `--reinstall-npu-driver=yes` flags to install the compute drivers.