Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race in internals/overlord/servstate:S.startTestServices(c) #264

Closed
barrettj12 opened this issue Jul 28, 2023 · 2 comments · Fixed by #266
Closed

Data race in internals/overlord/servstate:S.startTestServices(c) #264

barrettj12 opened this issue Jul 28, 2023 · 2 comments · Fixed by #266

Comments

@barrettj12
Copy link
Contributor

This is a utility function called in many of the servstate tests.

func (s *S) startTestServices(c *C) {

It starts two services

  • test1: /bin/sh -c "echo test1 | tee -a .../log.txt; sleep 10"
  • test2: /bin/sh -c "echo test2 | tee -a .../log.txt; sleep 10"

and then asserts the contents of log.txt are

test1
test2

However, there is a data race, since we may read log.txt before the test2 service has had a chance to write to it.

manager_test.go:203:
    c.Assert(string(data), Matches, "(?s)"+expected)
... value string = "test1\n"
... regex string = "" +
...     "(?s).*test1\n" +
...     ".*test2\n"

To reproduce

Create a new test in internals/overlord/servstate/manager_test.go:

func (s *S) TestStartTestServices(c *C) {
	s.startTestServices(c)
}

Install the stress utility (useful for finding sporadic test failures):

go install golang.org/x/tools/cmd/stress@latest

Run

stress go test ./internals/overlord/servstate -check.f TestStartTestServices -count=1

and start observing failures.

@flotter
Copy link
Contributor

flotter commented Jul 28, 2023

I will add this to #266 - its not included right now, but easy to add.

flotter added a commit to flotter/pebble that referenced this issue Aug 1, 2023
The startTestServices() test helper uses a special entry in the
service command under test to write the service standard output
also to a log file that can be inspected.

This mechanism suffers from a race condition (as highlighted in
canonical#264) because when the
content of the log file is loaded, the service may not yet have
completed writing to the log.

Since standard output is also verified separately through a
different mechanism, the following changes are made:

- Enhance the global "done check" (previous called "done file") to
  check completion per service. This effectively adds the
  capability previously provided by the log assert mechanism.

- Use the existing "done check" mechanism to wait until the service
  command-line is complete up to the point of the check.

- Only now verify the stdout buffer content as checked previously.

- Remove the log mechanism all together.
@flotter
Copy link
Contributor

flotter commented Aug 1, 2023

#266 Should now fix this.

@jnsgruk jnsgruk linked a pull request Aug 1, 2023 that will close this issue
jnsgruk pushed a commit that referenced this issue Aug 1, 2023
The startTestServices() test helper uses a special entry in the
service command under test to write the service standard output
also to a log file that can be inspected.

This mechanism suffers from a race condition (as highlighted in
#264) because when the
content of the log file is loaded, the service may not yet have
completed writing to the log.

Since standard output is also verified separately through a
different mechanism, the following changes are made:

- Enhance the global "done check" (previous called "done file") to
  check completion per service. This effectively adds the
  capability previously provided by the log assert mechanism.

- Use the existing "done check" mechanism to wait until the service
  command-line is complete up to the point of the check.

- Only now verify the stdout buffer content as checked previously.

- Remove the log mechanism all together.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants