Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sporadic test failures in BPD tests #3309

Open
sampsyo opened this issue Jun 11, 2019 · 3 comments
Open

Sporadic test failures in BPD tests #3309

sampsyo opened this issue Jun 11, 2019 · 3 comments
Labels
testing Relates to the testing/CI infrastructure

Comments

@sampsyo
Copy link
Member

sampsyo commented Jun 11, 2019

Travis seems to fail surprisingly often in the BPD tests with this error:

[...]
  File "/home/travis/build/beetbox/beets/beets/util/bluelet.py", line 404, in __init__
    self.sock.bind((host, port))
OSError: [Errno 98] Address already in use

Here's an example job, for instance.

The problem is that these tests, since #3190, use a socket to set up communication between a server and a client. But if two of these run at the same time, then the server runs into an error when trying to listen on that socket. I tried to mitigate this in a hacky way in 99778d9 by randomizing the port over 10,000 options, but failures of this kind are still happening pretty often on Travis.

A potential solution would be to catch this error and re-randomize a few times before giving up.

@sampsyo sampsyo added the testing Relates to the testing/CI infrastructure label Jun 11, 2019
@arcresu
Copy link
Member

arcresu commented Jun 11, 2019

It seems really weird that there would still be collisions with the port randomised like that. It's not like any of the tests lasts for very long either, and the server should be killed reliably in the finally clause. Maybe there's something deterministic about the RNG in the CI environment? I'll take a look but I agree that catching the error is the most reliable way to deal with this flakiness.

@zsinskri
Copy link
Contributor

I have not yet looked at this thoroughly, so do not believe anything I say!

According to this StackOverflow question we can just use port 0 to let the OS choose an available port. The assigned port can then be read as sock.getsockname()[1].

To get that port number from the server process to the main process in order to create the clients, we could use a multiprocessing.Value. Though to write to that Value after binding the socket and before waiting on it we probably have to mock something.

@sampsyo
Copy link
Member Author

sampsyo commented Jul 15, 2019

Wow, that’s very cool! I didn’t know about that Unix feature. It’s at least worth giving it a try!

zsinskri added a commit to zsinskri/beets that referenced this issue Jul 15, 2019
The bpd test bind a socket in order to test the protocol implementation. When
running concurrently this often resulted in an attempt to bind an already
occupied port.

By using the port number `0` we instead let the OS choose a free port. We then
have to extract it from the socket (which is handled by `bluelet`) via
`mock.patch`ing.
zsinskri added a commit to zsinskri/beets that referenced this issue Jul 17, 2019
…tbox#3330

Add a changelog entry asking plugin developers to report any further occurrences
of this failure.
sampsyo added a commit that referenced this issue Jul 19, 2019
fix "Sporadic test failures in BPD tests #3309"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Relates to the testing/CI infrastructure
Projects
None yet
Development

No branches or pull requests

3 participants