Addition of Collectors and Sample Abstraction Request #290
SNAFU/Benchmark-wrapper is meant to run benchmarks, not to orchestrate outside tools.
I think if we want pbench collection in a K8s deployment, we orchestrate the pbench agents via Ripsaw/Benchmark-Operator.
Most of this functionality could be implemented in the Operator.
Should the PBench agents simply upload the data to the pbench-server or the pbench-controller after the collection is completed?
I discussed this with Robert and the CNV team. Ripsaw/Benchmark-Operator already has a task to collect system metrics and send them to ES: https://github.com/cloud-bulldozer/benchmark-operator/blob/master/roles/system-metrics/tasks/main.yml. The role here could be used for the PBench collection work.
@learnitall Any additional thoughts?
Ultimately I'm in agreement with @jtaleric on this. The benchmark-wrapper shouldn't be responsible for integrating with other in-process tools. I'm not against maybe supporting a different output format/target, but the data collection shouldn't be tied at the process level. I think it makes sense that any "collector" would be done in a sidecar, similar to how Jaeger works: https://github.com/jaegertracing/jaeger-kubernetes. I'm not even sure I want
I completely agree that there should be one universal implementation for a "sample" in SNAFU, and I think your example makes perfect sense (minus wrapping it with a collector 😄)
Hi! I'm using this space to write about my current progress, updates, and plans. The biggest piece right now is the addition of "collectors" to the benchmark wrapper:
COLLECTORS
- Introduces the ability to run data collection tools alongside benchmark runs
  - For example, running pbench data collection alongside a fio or uperf benchmark
- Structure:
  - `--collector` and `--collector-config` options added to `run_snafu` (as well as `--upload`)
  - An abstract `Collector` class (a sketch follows this list) with the following methods:
    - `set_config_vars(config)`: Parses through the config file and sets any specified vars/defaults; makes sure everything that needs to be in the file is there.
    - `startup()`: Performs env setup and initialization of necessary processes, and begins any persistent data collection processes (for continuous data collection over the course of the benchmark).
    - `start_sample()`: Starts a data collection sample (for sample-based data collection tools, synced with benchmark samples). Returns the sample archive dir.
    - `stop_sample()`: Stops a data collection sample.
    - `shutdown()`: Stops persistent data collection and other processes, performs cleanup and any desired post-processing.
    - `upload()`: Uploads a collector's archives using a specified procedure to a desired location.
  - `collector_factory`, `_load_collectors`, and `__init__` files, as well as changes to `registry`, have also been made
  - In `run_snafu`, if a collector has been specified with a config, it will create/initialize the desired collector class and run its startup tasks. Then, before each benchmark sample, it will start a data collection sample, which is stopped once the benchmark sample ends. After the benchmark is complete, the collector shutdown/post-process tasks are run.
  - Omitting the `--upload` option while still including collector + config runs collection without uploading; uploading is driven by the collector's `upload()` method and the config file.
  - New collectors live in a `collector_name` dir under the main `collector` dir, with a `collector_name.py` file that creates a `Collector_name` class based on the abstract collector.
- Adding the `prom_extract` tool as another collector
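To make the abstraction concrete, here is a minimal sketch of what the abstract collector described above could look like. Only the method names come from the list above; the `abc`-based structure, constructor, and docstrings are my assumptions, not the actual implementation.

```python
# Hypothetical sketch of the abstract Collector described in this issue.
# Method names come from the issue; everything else is an assumption.
from abc import ABC, abstractmethod


class Collector(ABC):
    """Base class for tools that collect data alongside a benchmark run."""

    def __init__(self, config):
        self.config = config
        self.set_config_vars(config)  # validate config up front

    @abstractmethod
    def set_config_vars(self, config):
        """Parse the config file, apply defaults, and check required keys."""

    @abstractmethod
    def startup(self):
        """Set up the env and start any persistent collection processes."""

    @abstractmethod
    def start_sample(self):
        """Begin one data collection sample; return the sample archive dir."""

    @abstractmethod
    def stop_sample(self):
        """Stop the current data collection sample."""

    @abstractmethod
    def shutdown(self):
        """Stop persistent collection, clean up, and post-process."""

    @abstractmethod
    def upload(self):
        """Upload this collector's archives to the configured destination."""
```

A pbench collector would then subclass this, and an invocation might look something like `run_snafu --collector pbench --collector-config <config> --upload`, using the flags listed above (exact CLI syntax is an assumption).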
PBENCH

ISSUES/TO-DO
SAMPLE ISSUE
Currently there is no universal `sample`; samples are defined within the benchmark, with each benchmark having its own for-loop iterating through the number of samples and yielding results.

Proposal: instead of a `collect()` method in the benchmarks where a for-loop is run over the number of samples with each iteration yielding one sample result, simply have a `collect_sample()` method that collects one sample of the benchmark (essentially the same code minus the for-loop), then have that called in the new sample loop in `run_snafu` at each iteration instead.

Current:

```
for data in benchmark.run():
    (current indexing stuff)
```

...and then inside the `.run()` there would be a for-loop:

```
for sample in range(samples):
    ...(code to collect sample)
```

Proposed:

```
for sample in range(samples):
    collector.start_sample()
    data = benchmark.collect_sample()  # now just has (code to collect sample), no for-loop over samples
    collector.stop_sample()
    (current indexing stuff)
```
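For illustration, a fuller version of the proposed loop might look like the sketch below. The `run_benchmark` wrapper, `index_results` callback, and the `if collector` guards are hypothetical glue; `collect_sample()` and the `startup()`/`start_sample()`/`stop_sample()`/`shutdown()`/`upload()` calls are the methods proposed above.

```python
# Hypothetical sketch of the proposed sample loop in run_snafu; function and
# parameter names here are placeholders, not the actual snafu API.
def run_benchmark(benchmark, samples, index_results, collector=None, upload=False):
    if collector:
        collector.startup()  # begin persistent collection for the whole run
    for sample in range(samples):
        if collector:
            collector.start_sample()  # sync collection with this benchmark sample
        # One sample of the benchmark: the same code as today's run(),
        # minus the for-loop over samples.
        data = benchmark.collect_sample()
        if collector:
            collector.stop_sample()
        index_results(data)  # (current indexing stuff)
    if collector:
        collector.shutdown()  # stop persistent collection, post-process
        if upload:  # only when --upload was passed
            collector.upload()
```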