
Create dashboards comparing various telemetry protocols and engines #2855

@JakeDern

Description


Pre-filing checklist

  • I searched existing issues and didn't find a duplicate

Component(s)

Documentation

Objective

I'm opening this parent issue to track the work needed to get a comparison dashboard up and running. The goal is to make incremental progress on the site mechanics while holding off on publishing any data until we're confident that we've created fair and accurate comparisons.

See the scope section for more details.

Rationale

The telemetry landscape is complex: there are many ways to do similar things in the ecosystem, using many different engines and protocols that each shine under different use cases. To make sure users have all the information they need to pick the best option for their situation, we need to create these comparisons.

Scope

Work will broadly span these areas:

1. Creating benchmarking dashboard and framework features

This is all of the tooling for the site as well as the site itself. This includes, among other things:

  • A script for rendering and running benchmark test suites (sketched below)
  • Scripts for updating/building the site
  • Base site code

The site is intended to be hosted on GitHub Pages alongside our current site, and in the long term we can integrate the two more deeply.
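
To make the first bullet above concrete, here is a minimal sketch of what the suite rendering/running script could look like. Everything in it is an assumption rather than a decision: the suites/ directory layout, the JSON template format, and the `orchestrator` CLI are placeholders.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of the suite runner; paths and the `orchestrator`
CLI are placeholders, not the final design."""
import json
import subprocess
import sys
from pathlib import Path

SUITES_DIR = Path("suites")      # assumed layout: one JSON template per test suite
RESULTS_DIR = Path("results")


def render_suite(template_path: Path, params: dict) -> dict:
    """Fill a suite template with concrete parameters (engine, protocol, load)."""
    suite = json.loads(template_path.read_text())
    suite["parameters"] = {**suite.get("parameters", {}), **params}
    return suite


def run_suite(suite: dict, out_dir: Path) -> bool:
    """Invoke the (placeholder) orchestrator and capture its output for review."""
    out_dir.mkdir(parents=True, exist_ok=True)
    rendered = out_dir / "suite.json"
    rendered.write_text(json.dumps(suite, indent=2))
    result = subprocess.run(
        ["orchestrator", "run", str(rendered)],  # placeholder command name
        capture_output=True,
        text=True,
    )
    (out_dir / "stdout.log").write_text(result.stdout)
    (out_dir / "stderr.log").write_text(result.stderr)
    return result.returncode == 0


if __name__ == "__main__":
    failures = 0
    for template in sorted(SUITES_DIR.glob("*.json")):
        suite = render_suite(template, {"duration_seconds": 120})
        ok = run_suite(suite, RESULTS_DIR / template.stem)
        print(f"{template.stem}: {'ok' if ok else 'FAILED'}")
        failures += 0 if ok else 1
    sys.exit(1 if failures else 0)
```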

2. Defining test suites and scenarios for comparison

These are all the definitions for what kinds of experiments to run and which experiments are valid to compare, including:

  • Orchestrator config and step templates
  • Engine configuration templates
  • Comparison metadata used to generate the site, label graphs, etc. (a rough sketch of its shape follows below)

The intent is for each test suite and comparison to have its own PR so that we can get good visibility and review the templates thoroughly for gotchas or mistakes that can lead to unfair evaluations.
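
As a strawman for the comparison metadata mentioned above, something along these lines could drive site generation and graph labeling. The field names and example values are hypothetical and only meant to show the kind of information each comparison PR would need to carry.

```python
from dataclasses import dataclass, field


@dataclass
class Comparison:
    """Hypothetical shape of one comparison's metadata; field names are illustrative."""
    id: str                          # stable slug used in site URLs
    title: str                       # human-readable heading for the dashboard page
    engines: list[str]               # engines under test
    protocols: list[str]             # protocols exercised in this comparison
    suites: list[str]                # test-suite templates whose results are comparable
    graph_labels: dict[str, str] = field(default_factory=dict)  # axis/series labels
    notes: str = ""                  # caveats that must appear next to the graphs


example = Comparison(
    id="example-comparison",
    title="Example: two engines ingesting the same workload",
    engines=["engine-a", "engine-b"],
    protocols=["protocol-a"],
    suites=["example-logs-suite"],
    graph_labels={"x": "Throughput (events/s)", "y": "CPU (cores)"},
    notes="Placeholder values only; real comparisons land one PR at a time.",
)
```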

3. Running experiments and publishing data

This is the yet-to-be-decided methodology for how we run the benchmarks, including:

  • How often and which benchmarks do we run?
  • How do we automate this process?
  • What machine do we run them on?

Keep in mind that the number of individual tests is already over 200, that each takes 2-3 minutes to run, and that the bar is very high: every run has to finish cleanly, with no errors, no noise from the surrounding environment, and no anomalies in metrics scraping.
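
For a rough sense of the time budget those numbers imply, here is a quick back-of-the-envelope calculation (the rerun allowance is an assumption):

```python
# Rough run budget based on the figures above; the retry factor is an assumption.
test_count = 200          # "already over 200", so treat this as a lower bound
minutes_per_test = 2.5    # midpoint of the quoted 2-3 minutes
retry_overhead = 1.15     # assumed allowance for rerunning noisy or failed tests

total_minutes = test_count * minutes_per_test * retry_overhead
print(f"~{total_minutes / 60:.1f} hours per full pass")  # ~9.6 hours
```

In other words, a full pass takes most of a day on a single machine, which is worth keeping in mind when deciding how often and where to run.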

Acceptance Criteria

To be defined in more detail soon.

Dependencies or Blockers

No response

Additional Context

No response
