Summary
When using cobuilds, there is no easy way to determine during the flushTelemetry hook whether a given project was cobuilt. You get a full list of operations and their execution times, both as start + end time and as nonCachedDurationMs. I opened #4680 because start + end time aren't very useful for cobuild cache hits. Even with that work, the data is still skewed. As a custom telemetry integrator, I'd like to make sure that each operation is counted only once per cobuild, by the agent that actually performed the build. No other agent should report on that operation.
Details
For the primary agent, endTimestampMs - startTimestampMs doesn't match nonCachedDurationMs. On the build cache restore agent, there's no reliable way to determine whether the package was built on this machine or on another machine. We currently report both and have been accidentally introducing data skew into our metric collection.
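To make the mismatch concrete, here is a minimal sketch of the check an integrator could run over the operations received in the hook. The `OperationTiming` interface is a hypothetical local type that mirrors only the three fields named above, not Rush's full telemetry type, and `findTimingMismatches` and `toleranceMs` are illustrative names.

```typescript
// Hypothetical local shape: only the three timing fields named in this issue.
interface OperationTiming {
  startTimestampMs?: number;
  endTimestampMs?: number;
  nonCachedDurationMs?: number;
}

interface TimingMismatch {
  operationName: string;
  wallClockMs: number;
  nonCachedDurationMs: number;
}

// Returns every operation whose wall-clock duration (end - start) differs
// from its reported nonCachedDurationMs by more than toleranceMs.
function findTimingMismatches(
  operations: Record<string, OperationTiming>,
  toleranceMs: number = 0
): TimingMismatch[] {
  const mismatches: TimingMismatch[] = [];
  for (const operationName of Object.keys(operations)) {
    const op = operations[operationName];
    // Skip operations that do not carry all three fields.
    if (
      op.startTimestampMs === undefined ||
      op.endTimestampMs === undefined ||
      op.nonCachedDurationMs === undefined
    ) {
      continue;
    }
    const wallClockMs = op.endTimestampMs - op.startTimestampMs;
    if (Math.abs(wallClockMs - op.nonCachedDurationMs) > toleranceMs) {
      mismatches.push({
        operationName,
        wallClockMs,
        nonCachedDurationMs: op.nonCachedDurationMs
      });
    }
  }
  return mismatches;
}
```

A check like this only surfaces that the two timing sources disagree; it cannot tell which agent did the build, which is the gap described above.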
Recommendation
Possibly in conjunction with #4680, adding a new wasCobuiltOnThisAgent property to telemetry operation events would allow integrators to differentiate between primary and cache restore agents. I'd also recommend deprecating nonCachedDurationMs in telemetry and moving to just start and end time + wasCobuiltOnThisAgent, as it's confusing to have multiple possible sources of truth for timing. It may be useful to capture build cache restore time elsewhere; for our use case, though, it's not yet important to track.
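As a rough sketch of how an integrator might consume the proposed property: `wasCobuiltOnThisAgent` is only the suggestion made in this issue and does not exist in Rush today, and the sketch assumes it would be `true` on the agent that actually ran the build. The `ProposedOperationResult` interface and `collectBuildDurations` function are hypothetical names for illustration.

```typescript
// Hypothetical shape: start/end timestamps plus the property proposed in this issue.
interface ProposedOperationResult {
  startTimestampMs: number;
  endTimestampMs: number;
  // Proposed, not an existing Rush field. Assumed true only on the agent that ran the build.
  wasCobuiltOnThisAgent: boolean;
}

// Keeps only the operations this agent built itself and derives each
// duration from start/end time, so there is a single source of truth.
function collectBuildDurations(
  operations: Record<string, ProposedOperationResult>
): Record<string, number> {
  const durations: Record<string, number> = {};
  for (const operationName of Object.keys(operations)) {
    const op = operations[operationName];
    if (!op.wasCobuiltOnThisAgent) {
      // Another agent built this operation and is expected to report it.
      continue;
    }
    durations[operationName] = op.endTimestampMs - op.startTimestampMs;
  }
  return durations;
}
```

With a filter like this, each agent would report only the operations it built, so an operation shared across a cobuild would be counted once.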
Standard questions
Please answer these questions to help us investigate your issue more quickly:
Question
Answer
Would you consider contributing a PR?
aramissennyeydd changed the title from "[rush] Cobuild Telemetry is Duplicated" to "[rush] duplicated cobuild telemetry leading to data skew" on May 24, 2024