Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add lots of debug logging to investigate flakiness #421

Conversation

cosmicexplorer
Copy link

I'm not sure if this change is something Scoot would want to do differently.

Problem

When running a local cluster with go run ./binaries/setup-cloud-scoot/main.go --strategy local.local to run pants against, we uncovered some flakiness, which was resolved in pants with pantsbuild/pants#7422. We investigated that by pairing timestamped logs from pants (which are being done in pantsbuild/pants#7536) with timestamped logs from scoot when uploading files (from this change).

Explain the context and why you're making that change. What is
the problem you're trying to solve? In some cases there is not a
problem and this can be thought of being the motivation for your change.

Solution

  • Add some info to the debug logs in the grpc cas write method, including:
    • timestamps
    • a randomly assigned request id

Describe the modifications you've done.

Result

CAS writes via Scoot and any flakiness around them are easier to debug!

What will change as a result of your pull request? Note that sometimes
this section is unnecessary because it is self-explanatory based on
the solution.

@codecov-io
Copy link

codecov-io commented Apr 15, 2019

Codecov Report

Merging #421 into master will decrease coverage by 0.01%.
The diff coverage is 22.22%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #421      +/-   ##
==========================================
- Coverage   56.58%   56.56%   -0.02%     
==========================================
  Files         111      111              
  Lines        8280     8281       +1     
==========================================
- Hits         4685     4684       -1     
- Misses       3142     3144       +2     
  Partials      453      453
Impacted Files Coverage Δ
bazel/cas/service.go 62.44% <22.22%> (+0.08%) ⬆️
runner/runners/queue.go 78.4% <0%> (-0.8%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 764741f...7622f45. Read the comment docs.

bazel/cas/service.go Outdated Show resolved Hide resolved
bazel/cas/service.go Outdated Show resolved Hide resolved
bazel/cas/service.go Outdated Show resolved Hide resolved
ryancouto
ryancouto previously approved these changes Apr 16, 2019
"net"
"sync"
"time"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you keep this newline? we like to keep our imports separated by stdlib / 3rdparty / scoot stuff

"net"
"sync"
"time"

"github.com/golang/protobuf/proto"
log "github.com/sirupsen/logrus"
remoteexecution "github.com/twitter/scoot/bazel/remoteexecution"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to above comment, as long as you're in here, could you move this down to the scoot stuff below? thanks!

@dgassaway
Copy link
Contributor

Don't logs already have timestamps in them from logrus, or do these not appear when running locally? If that's the case can we fix that at the logging layer instead of explicitly logging timestamps only in a single function of a single module?

@dgassaway dgassaway dismissed ryancouto’s stale review June 8, 2019 20:20

unsetting approved state

@dgassaway
Copy link
Contributor

Closing to cleanup PRs, open & update per comments if you'd like to revisit

@dgassaway dgassaway closed this Jul 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants