feat: Add support for DGX Spark (GB10) and Unified Memory Architecture NVIDIA GPUs

## Problem / Background

NVIDIA has released DGX Spark, a desktop AI system based on the GB10 Grace Blackwell chip. This system uses **Unified Memory Architecture (UMA)** where CPU and GPU share the same physical memory, which is fundamentally different from traditional discrete GPUs with dedicated VRAM.

The current NVIDIA GPU monitoring in `all-smi` assumes discrete GPUs with:
- Dedicated GPU memory (VRAM) separate from system RAM
- `device.memory_info()` returning GPU-specific memory metrics
- Clear distinction between `used_memory` and `total_memory` for the GPU

On UMA systems like DGX Spark:
- CPU and GPU share the same physical memory pool
- The discrete-GPU notion of dedicated VRAM does not map directly onto a shared pool
- NVML may report memory differently or require different API calls
- Attribution of memory usage between CPU and GPU workloads may be ambiguous

### Affected Products
- **NVIDIA DGX Spark** (GB10 Grace Blackwell)
- **Future Grace-based products** with unified memory
- Similar architectures that NVIDIA may release

## Proposed Solution

### Phase 1: Investigation
1. **Research NVML behavior on UMA systems**
   - Determine how `nvmlDeviceGetMemoryInfo()` behaves on GB10
   - Check if new NVML APIs exist for unified memory reporting
   - Investigate `nvmlDeviceGetMemoryInfo_v2()` and related functions

2. **Identify detection mechanism**
   - Determine how to detect whether a GPU uses UMA or discrete memory
   - Check device properties, architecture flags, or memory type indicators

3. **Review existing implementations**
   - Reference the `nvidia_jetson.rs` implementation, which already handles integrated GPUs with shared memory
   - Consider patterns from the Apple Silicon support, which also uses unified memory
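
As a starting point for the detection step, here is a minimal sketch of a name-based heuristic. It assumes the device name string reported by the driver contains a recognizable product marker. The function name `is_likely_uma_by_name` and the marker list are hypothetical; only "GB10" is taken from this issue, and real device names still need to be confirmed on hardware during the investigation.

```rust
/// Hypothetical list of product-name substrings that suggest unified memory.
/// Only "GB10" comes from this issue; extend it once real names are confirmed.
const UMA_NAME_MARKERS: &[&str] = &["GB10"];

/// Returns true when the device name contains a known UMA marker.
/// The comparison is case-insensitive. This is a heuristic, not an NVML query.
pub fn is_likely_uma_by_name(device_name: &str) -> bool {
    let upper = device_name.to_ascii_uppercase();
    UMA_NAME_MARKERS.iter().any(|marker| upper.contains(marker))
}
```

A name match alone is fragile, so it would probably be combined with whatever architecture or memory indicators the NVML research turns up.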

### Phase 2: Implementation
1. **Add UMA detection logic**
   - Detect Grace Blackwell and similar UMA architectures
   - Add appropriate flags/metadata to distinguish UMA devices

2. **Implement appropriate memory reporting**
   - Handle shared memory pool reporting
   - Consider adding new fields like `shared_memory` or `unified_memory_total`
   - Ensure `used_memory` and `total_memory` remain meaningful

3. **Update device details**
   - Add "Memory Type: Unified" or similar indicator
   - Report relevant UMA-specific metrics if available

4. **Handle edge cases**
   - Graceful fallback if NVML doesn't support certain queries
   - Consistent behavior across different driver versions
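
The reporting and fallback steps above could be sketched as a small pure function that prefers NVML values and falls back to system memory. All names here (`MemorySource`, `MemoryReport`, `resolve_memory`) are hypothetical and not part of the existing `all-smi` code. Treating a zero total as "no usable data" is an assumption for illustration, not confirmed NVML behavior on GB10.

```rust
/// Where the reported numbers came from (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorySource {
    Nvml,
    SystemFallback,
}

/// Memory figures reported for a device, in bytes (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MemoryReport {
    pub used_bytes: u64,
    pub total_bytes: u64,
    pub source: MemorySource,
}

/// `nvml` is `Some((used, total))` when the NVML query succeeded, else `None`.
/// `system` is `(used, total)` for system RAM, used as the fallback.
pub fn resolve_memory(nvml: Option<(u64, u64)>, system: (u64, u64)) -> MemoryReport {
    match nvml {
        // Assumption: a zero total means NVML returned no usable data.
        Some((used, total)) if total > 0 => MemoryReport {
            used_bytes: used.min(total), // clamp so used never exceeds total
            total_bytes: total,
            source: MemorySource::Nvml,
        },
        _ => {
            let (used, total) = system;
            MemoryReport {
                used_bytes: used.min(total),
                total_bytes: total,
                source: MemorySource::SystemFallback,
            }
        }
    }
}
```

Keeping the source in the result lets the UI or logs show when the numbers describe the shared pool rather than GPU-specific memory.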

## Acceptance Criteria

- [ ] Document NVML behavior on DGX Spark / GB10 systems
- [ ] Implement detection for UMA-based NVIDIA GPUs
- [ ] Memory metrics are reported accurately and meaningfully for UMA systems
- [ ] Device details include memory architecture type (Discrete/Unified)
- [ ] No regression in existing discrete GPU support
- [ ] Unit tests cover UMA detection and reporting logic
- [ ] Documentation updated with UMA-specific notes

## Technical Considerations

### NVML API Research Areas
- `nvmlDeviceGetMemoryInfo()` vs `nvmlDeviceGetMemoryInfo_v2()`
- `nvmlDeviceGetArchitecture()` - check for Blackwell/Grace identification
- `nvmlDeviceGetBrand()` - may indicate DGX Spark
- Memory bus type and width queries

### Architecture Reference
Current relevant implementations:
- `/src/device/readers/nvidia.rs` - Standard NVIDIA GPU reader using NVML
- `/src/device/readers/nvidia_jetson.rs` - Jetson reader handling integrated GPU with shared memory (uses tegrastats fallback)

### Potential New Fields in `GpuInfo`
```rust
// Consider adding to device detail or as new fields
memory_type: Option<String>,  // "Discrete", "Unified", "Shared"
unified_memory_total: Option<u64>,  // Total unified memory pool
```
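
One alternative to a free-form `String` is a small enum, so that the three values cannot drift or be misspelled. This is only a sketch of that design option: the `MemoryType` enum and the `memory_type_detail` helper are hypothetical, and the "Memory Type" label follows the indicator suggested under Phase 2.

```rust
use std::fmt;

/// Hypothetical typed alternative to `memory_type: Option<String>`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryType {
    Discrete,
    Unified,
    Shared,
}

impl fmt::Display for MemoryType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Labels match the strings proposed in the field comment above.
        let label = match self {
            MemoryType::Discrete => "Discrete",
            MemoryType::Unified => "Unified",
            MemoryType::Shared => "Shared",
        };
        f.write_str(label)
    }
}

/// Builds a key/value pair suitable for a device-detail listing.
pub fn memory_type_detail(memory_type: MemoryType) -> (String, String) {
    ("Memory Type".to_string(), memory_type.to_string())
}
```

If the field has to stay a string for serialization compatibility, the enum can still be used internally and converted with `to_string()` at the boundary.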

### Graceful Degradation
If NVML on UMA systems doesn't provide expected metrics:
1. Fall back to system memory reporting (similar to Jetson approach)
2. Use `/proc/meminfo` or similar for unified memory systems
3. Log warnings for unsupported queries
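
For the `/proc/meminfo` fallback, a minimal parsing sketch might look like the following. It assumes the standard Linux format, where `MemTotal` and `MemAvailable` are reported in kB (units of 1024 bytes), and it computes "used" as total minus available, which is one possible definition. The function name `parse_meminfo` is hypothetical.

```rust
/// Parses `/proc/meminfo`-style text and returns `(total_bytes, used_bytes)`.
/// Returns `None` if `MemTotal` or `MemAvailable` is missing or malformed.
pub fn parse_meminfo(contents: &str) -> Option<(u64, u64)> {
    let mut total_kib: Option<u64> = None;
    let mut available_kib: Option<u64> = None;

    for line in contents.lines() {
        // Each line looks like: "MemTotal:       16384256 kB"
        let mut parts = line.split_whitespace();
        match parts.next() {
            Some("MemTotal:") => {
                total_kib = parts.next().and_then(|v| v.parse().ok());
            }
            Some("MemAvailable:") => {
                available_kib = parts.next().and_then(|v| v.parse().ok());
            }
            _ => {}
        }
    }

    // Convert to bytes, guarding against overflow.
    let total = total_kib?.checked_mul(1024)?;
    let available = available_kib?.checked_mul(1024)?;
    // Assumption: "used" is defined as total minus available.
    Some((total, total.saturating_sub(available)))
}
```

In practice the input would come from something like `std::fs::read_to_string("/proc/meminfo")`, with a logged warning when the function returns `None`.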

## Additional Context

### Related Implementations
- **NVIDIA Jetson** (`nvidia_jetson.rs`): Uses tegrastats and system memory fallback for integrated GPU
- **Apple Silicon** (`apple.rs`): Unified memory architecture with shared CPU/GPU memory pool

### References
- [NVIDIA DGX Spark Announcement](https://www.nvidia.com/en-us/autonomous-machines/dgx-spark/)
- [NVIDIA Grace Blackwell Architecture](https://www.nvidia.com/en-us/data-center/grace-blackwell/)
- [NVML Documentation](https://docs.nvidia.com/deploy/nvml-api/)

### Hardware Access
- Testing requires access to actual DGX Spark hardware or equivalent GB10 system
- Consider adding mock device templates for testing without hardware
