Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add determinism #145

Draft
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

jacobweinstock
Copy link
Member

Description

Add deterministic behavior for calls against a single host. By queuing requests per host we now provide deterministic behavior of calls.

Why is this needed

Fixes: #130

How Has This Been Tested?

How are existing users impacted? What migration steps/scripts do we need?

Checklist:

I have:

  • updated the documentation and/or roadmap (if required)
  • added unit or e2e tests
  • provided instructions on how to upgrade

By queuing requests per host we now provide
deterministic behavior of calls.

Signed-off-by: Jacob Weinstock <[email protected]>
Version was old.

Signed-off-by: Jacob Weinstock <[email protected]>
Fixes issue where the task is not found.
This was happening because I was creating
the record in the DB in a goroutine and
mocking the action to immediately fail.
This caused the worker to complete and close
before any record could be created.

Signed-off-by: Jacob Weinstock <[email protected]>
@codecov
Copy link

codecov bot commented Aug 25, 2023

Codecov Report

Merging #145 (67adac7) into main (67fadb3) will decrease coverage by 1.95%.
The diff coverage is 72.36%.

❗ Current head 67adac7 differs from pull request most recent head 73de1b2. Consider uploading reports for the commit 73de1b2 to get more accurate results

@@            Coverage Diff             @@
##             main     #145      +/-   ##
==========================================
- Coverage   81.33%   79.38%   -1.95%     
==========================================
  Files           9       11       +2     
  Lines         466      587     +121     
==========================================
+ Hits          379      466      +87     
- Misses         74      104      +30     
- Partials       13       17       +4     
Files Changed Coverage Δ
grpc/taskrunner/manager.go 60.97% <60.97%> (ø)
grpc/taskrunner/taskrunner.go 67.46% <71.42%> (+4.30%) ⬆️
grpc/taskrunner/run.go 74.41% <74.41%> (ø)
grpc/server.go 83.58% <75.00%> (-4.95%) ⬇️
grpc/rpc/bmc.go 92.00% <100.00%> (ø)
grpc/rpc/machine.go 92.30% <100.00%> (ø)

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
The goroutine (in worker) dequeuing tasks was a
blocking call. This meant that after there were
no tasks in the queue, this dequeue call would sit
waiting. This meant that when the rest of the worker
exited this goroutine did not and was orphaned. Workers
are per host id so depending on unique hosts that
requested tasks this would grown and stay at that number.
The resolution was to make the dequeue call non-blocking.

Signed-off-by: Jacob Weinstock <[email protected]>
Fix tests, updating for linting issues, etc

Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
Plumb these through to the cli flags.

Also fixed a channel close so that when workers
are done the goruntimes drops back down to original
levels.

Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
The ingestor was single threaded.
After adding metrics and watching
the queues and processing happen,
running the ingestor process with a
max goroutine cap yielded better performance.
The per ID queues filled up faster and
allowed processing of BMC interactions
faster.

Signed-off-by: Jacob Weinstock <[email protected]>
Signed-off-by: Jacob Weinstock <[email protected]>
This keeps the goroutines from growning
uncontrollably.

Signed-off-by: Jacob Weinstock <[email protected]>
This made processing significantly faster.

Signed-off-by: Jacob Weinstock <[email protected]>
Update the metric server to handle
slow loris attacks.

Signed-off-by: Jacob Weinstock <[email protected]>
It was too big and made memory
consumption way too high. Also,
the typical/general use-case will be
high number of hosts and low number of
tasks that are requested close together.

Signed-off-by: Jacob Weinstock <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-deterministic behavior in serialized contexts
1 participant