fix: use with_global_system() in Jetson and Tenstorrent readers to prevent FD leak #120

@inureyes

Summary

PR #119 fixed a file descriptor leak in the NVIDIA reader by caching the sysinfo::System instance. However, the same per-call System::new() anti-pattern still exists in the NVIDIA Jetson and Tenstorrent readers, causing identical FD leaks in API mode (the long-running metrics loop).

Problem

Both readers construct a fresh System via System::new() on every get_process_info() call. In API mode, this call runs once per polling interval, indefinitely, and each cycle leaks /proc file descriptors.

NVIDIA Jetson (src/device/readers/nvidia_jetson.rs)

  • Line 176: System::new() in get_process_info()
  • Line 261: System::new() in get_gpu_processes() helper

Tenstorrent (src/device/readers/tenstorrent.rs)

  • Line 201: System::new() in get_process_info()

Solution

Replace per-call System::new() with with_global_system() from src/utils/system.rs, which is the standard pattern already used by:

  • AMD reader (amd.rs:583-585)
  • Apple Silicon reader (apple_silicon_native.rs:388)
  • Local collector (local_collector.rs:304, 433)

This reuses a single global Mutex<System> instance (GLOBAL_SYSTEM) instead of allocating new ones.
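The shape of that pattern can be sketched as follows. This is a minimal, std-only illustration and not the code in src/utils/system.rs: the `System` type here is a simplified stand-in for `sysinfo::System`, and the `refresh()` method, the `CONSTRUCTED` counter, and the closure-based signature of `with_global_system()` are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, OnceLock};

/// Simplified stand-in for `sysinfo::System` (hypothetical, for illustration only).
pub struct System {
    refreshes: usize,
}

/// Counts how many `System` values have been constructed in this process.
pub static CONSTRUCTED: AtomicUsize = AtomicUsize::new(0);

impl System {
    pub fn new() -> Self {
        CONSTRUCTED.fetch_add(1, Ordering::SeqCst);
        System { refreshes: 0 }
    }

    /// Placeholder for a process refresh; returns the number of refreshes so far.
    pub fn refresh(&mut self) -> usize {
        self.refreshes += 1;
        self.refreshes
    }
}

/// Single shared instance, created lazily on first use.
static GLOBAL_SYSTEM: OnceLock<Mutex<System>> = OnceLock::new();

/// Runs `f` with exclusive access to the shared `System`.
/// Assumed signature; the real helper may differ.
pub fn with_global_system<F, R>(f: F) -> R
where
    F: FnOnce(&mut System) -> R,
{
    let mutex = GLOBAL_SYSTEM.get_or_init(|| Mutex::new(System::new()));
    // Recover the guard even if a previous holder panicked.
    let mut guard = mutex.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    f(&mut *guard)
}

/// Reader-side usage: no `System::new()` per call.
pub fn get_process_info() -> usize {
    with_global_system(|system| system.refresh())
}
```

Calling `get_process_info()` repeatedly in this sketch constructs the stand-in `System` only once, which is the property the readers need in the API-mode loop.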

Additionally consider

  • NVIDIA reader (nvidia.rs): PR #119 (fix: prevent file descriptor leak in API mode) used a struct-level Mutex<System> field. Consider migrating this to with_global_system() as well for consistency.
  • Furiosa reader (furiosa.rs:231): list_devices() is called per-cycle in RS mode. Investigate whether this creates persistent file handles in the furiosa-smi-rs library.
