-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Enable race detector #15932
base: main
Are you sure you want to change the base?
test: Enable race detector #15932
Conversation
This pull request does not have a backport label. Could you fix it @ericywl? 🙏
|
40c948f
to
a6f9618
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the part that fixes flaky TestServerElasticsearchOutput related to the race detector? If no, I'd prefer it to be a separate PR to fix #13606 , as we might want to backport it to make our lives easier.
Not exactly, I can extract that PR out. |
b7461ac
to
147ecd9
Compare
internal/beater/server_test.go
Outdated
"elasticsearch.events.count": 5, | ||
"elasticsearch.events.queued": 5, | ||
// Disable this flaky test to be fixed later | ||
// "elasticsearch.bulk_requests.available": 9, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please create a follow up task to fix this issue and link from here?
cc @raultorrecilla for scheduling work to look into why this is flaky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericywl paste it here once created please!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup apparently its a long standing issue: #13606
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericywl can you revert this change in this PR? The flaky test shouldn't block this PR. (Even if the bugfix isn't merged and it unfortunately fails, rerunning the CI should be good enough)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
This pull request is now in conflicts. Could you fix it @ericywl? 🙏
|
@@ -55,16 +55,13 @@ jobs: | |||
with: | |||
go-version-file: go.mod | |||
cache: true | |||
- run: go test -v -race ./... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing CGO_ENABLED: "0"
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race detector unfortunately requires CGO to be enabled: https://go.dev/doc/articles/race_detector.
The race detector requires cgo to be enabled, and on non-Darwin systems requires an installed C compiler. The race detector supports linux/amd64, linux/ppc64le, linux/arm64, linux/s390x, freebsd/amd64, netbsd/amd64, darwin/amd64, darwin/arm64, and windows/amd64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we explicitly set it to 0
to make sure we do not include any C based library. Has this change been discussed previously?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, @kruskall I based this off your previous PR to enable the race detector. Not sure if that's something that has been discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed with @kruskall that this is ok. systemtests
run without CGO, so we are covered on that side.
I would add a comment to document this for future reference
@@ -55,16 +55,13 @@ jobs: | |||
with: | |||
go-version-file: go.mod | |||
cache: true | |||
- run: go test -v -race ./... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: can you enable the race detector for the fips tests too ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let me update it.
@@ -20,6 +20,7 @@ services: | |||
interval: 1s | |||
environment: | |||
- "ES_JAVA_OPTS=-Xms1g -Xmx1g" | |||
- _JAVA_OPTIONS # Workaround on OSX for https://bugs.openjdk.org/browse/JDK-8345296 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Is the key _JAVA_OPTIONS
without value good enough to workaround your issue? I don't see any mention of _JAVA_OPTIONS
in the bug report.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mainly so that I can run the system tests using _JAVA_OPTIONS="-XX:UseSVE=0" make system-test
without having to update the docker compose.
internal/beater/otlp/http.go
Outdated
@@ -57,8 +61,9 @@ func NewHTTPHandlers(logger *zap.Logger, processor modelpb.BatchProcessor, semap | |||
|
|||
// TODO we should add an otel counter metric directly in the | |||
// apm-data consumer, then we could get rid of the callback. | |||
metricRegistrationMu.Lock() // Temporary locking to fix data race issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is this a bugfix? If so, I prefer a separate PR, and if it is user-facing, it needs a changelog as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has data race because the NewHTTPHandlers
can be called by different goroutines concurrently in tests. I believe it's not user-facing since its only used internally in beater.
This pull request is now in conflicts. Could you fix it @ericywl? 🙏
|
System test failed previously due to EDIT: There seems to be another flaky system test, I will extract it into a new PR instead: #15992. |
Motivation/summary
Enable race detector for tests, continuation of #13827.
How to test these changes
Run unit test, system test and CI.