[rush] duplicated cobuild telemetry leading to data skew #4737

Open
aramissennyeydd opened this issue May 24, 2024 · 0 comments · May be fixed by #4755


Summary

When using cobuilds, there is no easy way to determine, inside the flushTelemetry hook, whether a given project was built on the current agent or restored from a cobuild that ran on another agent. The hook receives the full list of operations with their timing: startTimestampMs, endTimestampMs, and nonCachedDurationMs. I opened #4680 because the start and end timestamps aren't very useful for cobuild cache hits. That work still leaves the data skew in place, though. As a custom telemetry integrator, I want each operation counted exactly once per cobuild, reported only by the agent that actually performed the build. No other agent should report on that operation.

Details

Current Skew

Main Agent

"@company/my-package (test)": {
  "startTimestampMs": 31468.20680500008,
  "endTimestampMs": 40377.58414799906,
  "nonCachedDurationMs": 8649.619353000075,
  "result": "SUCCESS",
  "dependencies": [
    "@company/my-package (build)"
  ]
},

Build Cache Restore Agent

"@company/my-package (test)": {
  "startTimestampMs": 220830.3333630003,
  "endTimestampMs": 220875.03935700096,
  "nonCachedDurationMs": 8649.619353000075,
  "result": "SUCCESS",
  "dependencies": [
    "@company/my-package (build)"
  ]
},

For the primary agent, endTimestampMs - startTimestampMs (about 8909 ms) doesn't match nonCachedDurationMs (about 8650 ms). The build cache restore agent spent only about 45 ms on the operation, yet it reports the same nonCachedDurationMs as the primary agent, so there is no reliable way to determine whether the package was built on this machine or on another one. We currently report both entries and have been accidentally introducing data skew into our metric collection.
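To make the skew concrete, here is a minimal sketch that runs the arithmetic on the two samples above. The field names come from the telemetry samples; the `OperationTiming` interface and the helper functions are illustrative names introduced here, not part of Rush.

```typescript
// Field names mirror the samples in this issue; the interface itself is illustrative.
interface OperationTiming {
  startTimestampMs: number;
  endTimestampMs: number;
  nonCachedDurationMs: number;
}

// Entry reported by the agent that actually ran the test operation.
const mainAgent: OperationTiming = {
  startTimestampMs: 31468.20680500008,
  endTimestampMs: 40377.58414799906,
  nonCachedDurationMs: 8649.619353000075
};

// Entry reported by the agent that only restored the result from the build cache.
const cacheRestoreAgent: OperationTiming = {
  startTimestampMs: 220830.3333630003,
  endTimestampMs: 220875.03935700096,
  nonCachedDurationMs: 8649.619353000075
};

// Wall-clock time the agent spent on the operation.
function wallClockMs(op: OperationTiming): number {
  return op.endTimestampMs - op.startTimestampMs;
}

// What a naive integrator computes when it sums every reported entry.
function totalNonCachedMs(reports: OperationTiming[]): number {
  return reports.reduce((sum, op) => sum + op.nonCachedDurationMs, 0);
}

const mainWallClockMs = wallClockMs(mainAgent); // roughly 8909 ms
const restoreWallClockMs = wallClockMs(cacheRestoreAgent); // roughly 45 ms
const naiveTotalMs = totalNonCachedMs([mainAgent, cacheRestoreAgent]); // roughly 17299 ms
```

The naive total is twice the time that was actually spent building, because both agents report the same nonCachedDurationMs for a single build.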

Recommendation

Possibly in conjunction with #4680, adding a new wasCobuiltOnThisAgent property to telemetry operation events would allow integrators to differentiate between primary and cache restore agents. I'd also recommend deprecating nonCachedDurationMs in telemetry and relying only on the start and end timestamps plus wasCobuiltOnThisAgent, since it's confusing to have multiple possible sources of truth for timing. It may be useful to capture build cache restore time elsewhere; for our use case, though, it's not yet important to track.
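As a rough sketch of how an integrator might consume the proposed property, assuming it were added as a boolean on each operation entry: `wasCobuiltOnThisAgent` does not exist in Rush today, and the `ProposedOperationResult` type and `selectReportableDurations` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical shape: the timing, result, and dependencies fields mirror the
// samples in this issue; wasCobuiltOnThisAgent is the PROPOSED property and is
// not part of Rush today.
interface ProposedOperationResult {
  startTimestampMs: number;
  endTimestampMs: number;
  result: string;
  dependencies: string[];
  wasCobuiltOnThisAgent: boolean;
}

// Keep only the operations this agent built itself, keyed by operation name,
// and report wall-clock duration as the single source of truth for timing.
function selectReportableDurations(
  operations: Record<string, ProposedOperationResult>
): Record<string, number> {
  const durations: Record<string, number> = {};
  for (const name of Object.keys(operations)) {
    const op = operations[name];
    if (!op.wasCobuiltOnThisAgent) {
      continue; // restored from another agent's build: skip to avoid double counting
    }
    durations[name] = op.endTimestampMs - op.startTimestampMs;
  }
  return durations;
}

// Usage with the two samples from this issue, flagged as the proposal suggests.
const mainAgentReport = selectReportableDurations({
  "@company/my-package (test)": {
    startTimestampMs: 31468.20680500008,
    endTimestampMs: 40377.58414799906,
    result: "SUCCESS",
    dependencies: ["@company/my-package (build)"],
    wasCobuiltOnThisAgent: true
  }
});

const restoreAgentReport = selectReportableDurations({
  "@company/my-package (test)": {
    startTimestampMs: 220830.3333630003,
    endTimestampMs: 220875.03935700096,
    result: "SUCCESS",
    dependencies: ["@company/my-package (build)"],
    wasCobuiltOnThisAgent: false
  }
});
```

With this filter, only the main agent reports the operation, and the cache restore agent contributes nothing for it. How the flag should behave for operations that are not part of a cobuild at all is left open here.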

Standard questions

Please answer these questions to help us investigate your issue more quickly:

Question Answer
Would you consider contributing a PR?
aramissennyeydd changed the title from "[rush] Cobuild Telemetry is Duplicated" to "[rush] duplicated cobuild telemetry leading to data skew" on May 24, 2024
Projects
Status: Low priority