Uses perf_event to grab the instruction and cycle count for target process #914

Draft: wants to merge 4 commits into base: main
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -6,9 +6,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

## [0.22.0-rc2]
### Added
- Incorporate perf-event telemetry on Linux for CPU data.

## [0.22.0-rc1]
### Fixed
- Target observer was not exposed through CLI.
- Target observer was not exposed through CLI, e.g. `--target-container my-container-name`.

## [0.22.0-rc0]
### Changed
62 changes: 61 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion lading/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "lading"
version = "0.22.0-rc1"
version = "0.22.0-rc2"
authors = [
"Brian L. Troutwine <brian.troutwine@datadoghq.com>",
"George Hahn <george.hahn@datadoghq.com>",
@@ -57,6 +57,7 @@ zstd = "0.13.1"
cgroups-rs = "0.3"
procfs = { version = "0.15", default-features = false, features = [] }
async-pidfd = "0.1"
perf-event2 = "0.7.4"

[dev-dependencies]
proptest = "1.4"
76 changes: 74 additions & 2 deletions lading/src/observer/linux.rs
@@ -1,11 +1,13 @@
use perf_event::events::Hardware;
use perf_event::{Builder, Counter, ReadFormat};
use std::{collections::VecDeque, io, path::Path, sync::atomic::Ordering};

use cgroups_rs::cgroup::Cgroup;
use metrics::gauge;
use nix::errno::Errno;
use procfs::process::Process;
use rustc_hash::{FxHashMap, FxHashSet};
use tracing::warn;
use tracing::{info, warn};

use crate::observer::memory::{Regions, Rollup};

@@ -55,6 +57,8 @@ pub(crate) struct Sampler {
previous_samples: FxHashMap<(i32, String, String), Sample>,
previous_totals: Sample,
previous_gauges: Vec<Gauge>,
perf_counter_cycles: Option<Counter>,
perf_counter_instructions: Option<Counter>,
}

struct Gauge(metrics::Gauge);
@@ -78,10 +82,45 @@ impl Gauge {
}
}

/// Attempts to initialize the `perf_event` cycle and instruction counters.
/// If either fails, `(None, None)` is returned and the error is logged.
fn init_perf_counters(parent_pid: i32) -> (Option<Counter>, Option<Counter>) {
let mut cycles = match Builder::new(Hardware::CPU_CYCLES)
.observe_pid(parent_pid)
.read_format(ReadFormat::GROUP)
.build()
{
Ok(counter) => counter,
Err(e) => {
warn!("Failed to create CPU_CYCLES counter: {}", e);
return (None, None);
}
};
let instructions = match Builder::new(Hardware::INSTRUCTIONS)
.observe_pid(parent_pid)
.read_format(ReadFormat::GROUP)
.build_with_group(&mut cycles)
{
Ok(counter) => counter,
Err(e) => {
warn!("Failed to create INSTRUCTIONS counter: {}", e);
return (None, None);
}
};

(Some(cycles), Some(instructions))
}
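
Review note on `init_perf_counters`: the two counters are always both present or both absent, so the pair could be modelled as a single `Option` of a tuple rather than two separate `Option`s. The following is a minimal sketch of that all-or-nothing pattern using only the standard library. `both_or_none` is a hypothetical helper, not part of this PR or of the `perf-event2` crate, and it logs with `eprintln!` where the PR uses `tracing::warn!`.

```rust
use std::fmt::Display;

/// Hypothetical helper: run two fallible setup steps, where the second step
/// needs mutable access to the result of the first (as `build_with_group`
/// needs `&mut cycles`). Returns both values, or `None` if either step fails.
fn both_or_none<A, B, E: Display>(
    first: Result<A, E>,
    second: impl FnOnce(&mut A) -> Result<B, E>,
) -> Option<(A, B)> {
    let mut a = match first {
        Ok(a) => a,
        Err(e) => {
            // Stand-in for `warn!` in the PR.
            eprintln!("first setup step failed: {e}");
            return None;
        }
    };
    match second(&mut a) {
        Ok(b) => Some((a, b)),
        Err(e) => {
            eprintln!("second setup step failed: {e}");
            None
        }
    }
}
```

With a shape like this, `Sampler` could hold one `Option<(Counter, Counter)>` field, and the `if let (Some(..), Some(..))` match in `sample` would reduce to a single `if let Some((cycles, instructions))`. Whether that reads better here is a judgment call.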

impl Sampler {
pub(crate) fn new(parent_pid: u32) -> Result<Self, Error> {
let parent = Process::new(parent_pid.try_into().expect("PID coercion failed"))?;
info!("Constructing observer for {parent_pid}");
let parent_pid: i32 = parent_pid.try_into().expect("PID coercion failed");
let parent = Process::new(parent_pid)?;

// TODO: Once the observer is cgroup-based, swap out `observe_pid` for `observe_cgroup`.
let (cycles, instructions) = init_perf_counters(parent_pid);

info!("init success for observer for {parent_pid}");
Ok(Self {
parent,
num_cores: num_cpus::get(), // Cores, logical on Linux, obeying cgroup limits if present
@@ -90,6 +129,8 @@ impl Sampler {
previous_samples: FxHashMap::default(),
previous_totals: Sample::default(),
previous_gauges: Vec::default(),
perf_counter_cycles: cycles,
perf_counter_instructions: instructions,
})
}

@@ -101,6 +142,14 @@ impl Sampler {
clippy::cast_possible_wrap
)]
pub(crate) async fn sample(&mut self) -> Result<(), Error> {
// Disable the perf counters before making observations. The thinking is
// that some of this work (e.g., requesting smaps) may be done "on behalf"
// of the target process. However, the counts should exclude kernel work
// by default, so this may not be necessary.
if let Some(cycles) = &mut self.perf_counter_cycles {
cycles.disable_group()?;
}

let mut joinset = tokio::task::JoinSet::new();
// Key for this map is (pid, basename/exe, cmdline)
let mut samples: FxHashMap<(i32, String, String), Sample> = FxHashMap::default();
@@ -375,6 +424,29 @@ impl Sampler {
}
}

if let (Some(cycles), Some(instructions)) = (
&mut self.perf_counter_cycles,
&self.perf_counter_instructions,
) {
let data = cycles.read_group()?;
let cpu_cycles = data[cycles];
let instructions = data[instructions];

let cpi = if instructions > 0 {
cpu_cycles as f64 / instructions as f64
} else {
0.0
};

gauge!("cpi").set(cpi);
gauge!("cpu_cycles").set(cpu_cycles as f64);
gauge!("instructions").set(instructions as f64);

// The data has been read, so reset and re-enable the group to collect the next interval.
cycles.reset_group()?;
cycles.enable_group()?;
};
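
Review note on the CPI calculation above: the zero-instruction guard can be pulled into a small pure function, which makes it testable without perf access. This is a minimal sketch; `cycles_per_instruction` is a hypothetical name and does not exist in this PR.

```rust
/// Cycles per instruction from two raw counter readings.
/// Returns 0.0 when no instructions were counted, mirroring the guard in
/// the PR, so the gauge never receives NaN or infinity.
fn cycles_per_instruction(cycles: u64, instructions: u64) -> f64 {
    if instructions > 0 {
        // u64 -> f64 can lose precision above 2^53, as it does in the PR.
        cycles as f64 / instructions as f64
    } else {
        0.0
    }
}
```

For example, 200 cycles over 100 instructions gives a CPI of 2.0, and any reading with zero instructions gives 0.0.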

gauge!("num_processes").set(total_processes as f64);
RSS_BYTES.store(total_rss, Ordering::Relaxed); // stored for the purposes of throttling
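
Review note on the reset-and-enable step in the perf block: resetting the group after every read yields per-interval counts by zeroing the counters. An alternative, shown here only as a sketch and not what this PR does, is to leave the counters running and difference successive cumulative readings. `IntervalCounter` is a hypothetical type that uses only the standard library.

```rust
/// Hypothetical helper that turns a cumulative counter reading into a
/// per-interval delta by remembering the previous reading.
#[derive(Default)]
struct IntervalCounter {
    previous: u64,
}

impl IntervalCounter {
    /// Returns how much the counter advanced since the last call.
    /// `saturating_sub` yields 0 instead of underflowing if the counter
    /// was reset externally and the new reading is smaller.
    fn delta(&mut self, current: u64) -> u64 {
        let advanced = current.saturating_sub(self.previous);
        self.previous = current;
        advanced
    }
}
```

The trade-off is one extra `u64` of state per counter in exchange for not writing to the counter on every sample. The first call reports the full cumulative value, so a caller may want to discard it.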


Unchanged files with check annotations Beta

# Update the rust version in-sync with the version in rust-toolchain.toml
FROM docker.io/rust:1.79.0-bullseye as builder

Check warning on line 2 in Dockerfile

GitHub Actions / container

The 'as' keyword should match the case of the 'from' keyword

FromAsCasing: 'as' and 'FROM' keywords' casing do not match More info: https://docs.docker.com/go/dockerfile/rule/from-as-casing/
RUN apt-get update && apt-get install -y \
protobuf-compiler \