Fix CI stability issues #321

Open
Bouncheck opened this issue Aug 6, 2024 · 3 comments

Comments

@Bouncheck
Collaborator

Creating this issue to track ongoing driver CI problems. Fixing all of them will likely take more than a single PR. I'll keep adding to the list as I encounter more.

3.x issues

"Unit Tests" job is broken

Currently it fails on the "Copy test results" step:

  shopt -s globstar
  mkdir unit
  cp --parents ./**/target/*-reports/*.xml unit/
  shell: /usr/bin/bash -e {0}
  env:
    JAVA_HOME: /opt/hostedtoolcache/Java_Adopt_jdk/8.0.422-5/x64
cp: failed to get attributes of './**': No such file or directory
Error: Process completed with exit code 1.

The likely cause is that cp finds no generated XML test reports.
It turns out that the "Run unit tests" step, which completes successfully, runs 0 unit tests according to the console report. The change in behavior was likely introduced by the surefire version update. It seems that we are now using the wrong provider, which skips the TestNG tests.
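
Independently of the provider fix, the copy step could report the real problem instead of the confusing cp error. Below is a minimal sketch, assuming GNU cp and bash with globstar as in the current step. The function name `copy_reports` is hypothetical and not part of the existing workflow.

```shell
#!/usr/bin/env bash
# Sketch of a more defensive "Copy test results" step.
# nullglob makes an unmatched pattern expand to nothing instead of being
# passed to cp literally (which is what produces the './**' error above).
shopt -s globstar nullglob

# copy_reports DEST: copy every XML report found under ./**/target/*-reports/
# into DEST, preserving relative paths. Returns 1 if no report exists.
copy_reports() {
  local dest="$1"
  local reports=(./**/target/*-reports/*.xml)
  if (( ${#reports[@]} == 0 )); then
    echo "No XML test reports found; did the test step run 0 tests?" >&2
    return 1
  fi
  mkdir -p "$dest"
  cp --parents "${reports[@]}" "$dest"/
}
```

Calling `copy_reports unit` in place of the current mkdir and cp lines would still fail the job when no tests ran, but with a message that points at the actual cause.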

Flaky tests

to be updated

4.x issues

General ccm flakiness

Sometimes cluster creation fails, which seems to cause multiple tests to fail and leaves the test environment polluted.
Recent example: https://github.com/scylladb/java-driver/actions/runs/10251929884/job/28361130638

[INFO] Results:
[INFO] 
Error:  Errors: 
Error:    HeapCompressionIT » Runtime The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, release:6.1.0-rc1, --scylla, --config-dir=/tmp/ccm8589068201510993611]' failed to execute
Error:    NettyResourceLeakDetectionIT » Runtime The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, release:6.1.0-rc1, --scylla, --config-dir=/tmp/ccm9090955162389851387]' failed to execute
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_basic_udt_change_result_set » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_basic_udt_change_variable_defs » Runtime The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, release:6.1.0-rc1, --scylla, --config-dir=/tmp/ccm8052825817352386024]' failed to execute
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_collection_udt_change_result_set » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_collection_udt_change_variable_defs » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_nested_udt_change_result_set » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_nested_udt_change_variable_defs » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_tuple_udt_change_result_set » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    PreparedStatementCachingIT.should_invalidate_cache_entry_on_tuple_udt_change_variable_defs » IllegalState Attempting to use a Ccm rule while another is in use.  This is disallowed
Error:    DefaultSslEngineFactoryPropertyBasedWithClientAuthIT » Runtime The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, release:6.1.0-rc1, --scylla, --config-dir=/tmp/ccm4961261789669327613]' failed to execute

Flaky tests

DefaultLoadBalancingPolicyIT
Error:  com.datastax.oss.driver.core.loadbalancing.DefaultLoadBalancingPolicyIT.should_use_round_robin_on_local_dc_when_not_enough_routing_information  Time elapsed: 0.071 s  <<< FAILURE!
org.junit.ComparisonFailure: expected:<...e(endPoint=/127.0.0.[4:9042, hostId=0c42d9cf-4675-47ab-97f6-2821e9048a2d, hashCode=55b5054a])> but was:<...e(endPoint=/127.0.0.[1:9042, hostId=87ae7ce3-5146-481a-b3d6-de4ae59b6317, hashCode=2e82e568])>
Error:  com.datastax.oss.driver.core.loadbalancing.DefaultLoadBalancingPolicyIT.should_apply_node_filter  Time elapsed: 0.001 s  <<< FAILURE!
org.junit.ComparisonFailure: expected:<[4]> but was:<[3]>

Suspected cause: one of the nodes goes down, which creates a mismatch with the expected values (the endpoint order is shuffled and the node count is lower).

@Lorak-mmk

Lorak-mmk commented Aug 7, 2024

HeapCompressionIT » Runtime The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, release:6.1.0-rc1, --scylla, --config-dir=/tmp/ccm8589068201510993611]' failed to execute

Do we have some output from this command? If not, can some logging be added so that we get output?

By output I mean Scylla logs / ccm logs.

@roydahan
Collaborator

roydahan commented Aug 7, 2024

@Bouncheck let's start by not picking RC builds for use in CI.
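
One way this could look, as a rough sketch only: how the CI actually selects the Scylla version is not shown in this issue, so the function name `filter_stable` and the assumption that candidates arrive as one version string per line are both hypothetical. The `-rc` suffix pattern is taken from the `6.1.0-rc1` version in the log above.

```shell
#!/usr/bin/env bash
# filter_stable: read version strings on stdin (one per line) and drop
# release candidates, i.e. anything ending in "-rc" plus optional digits.
# "|| true" keeps the function from failing when every line is filtered out.
filter_stable() {
  grep -Ev -- '-rc[0-9]*$' || true
}
```

For example, `printf '6.0.2\n6.1.0-rc1\n' | filter_stable` prints only `6.0.2`.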

@Bouncheck
Collaborator Author

HeapCompressionIT » Runtime The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, release:6.1.0-rc1, --scylla, --config-dir=/tmp/ccm8589068201510993611]' failed to execute

Do we have some output from this command? If not, can some logging be added so that we get output?

By output I mean Scylla logs / ccm logs.

The 4.x CI deletes the logs. We can modify the existing test rules or add another one to be used in combination with them. #191 was supposed to help with that at one point, but I'm not sure whether its current state is a correct implementation.
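
To illustrate the kind of output capture being asked for, here is a minimal shell-level sketch. In 4.x the ccm command is launched from the Java test rules, so the real change would most likely live there. The wrapper name `run_logged` is hypothetical.

```shell
#!/usr/bin/env bash
# run_logged LOGFILE CMD [ARGS...]: run CMD, save its combined stdout and
# stderr to LOGFILE, and print the log to stderr only when CMD fails.
# The exit code of CMD is passed through unchanged.
run_logged() {
  local log="$1"
  shift
  local status=0
  "$@" >"$log" 2>&1 || status=$?
  if (( status != 0 )); then
    echo "Command failed with exit code $status: $*" >&2
    cat "$log" >&2
  fi
  return "$status"
}
```

Wrapping the failing `ccm create` invocation this way would keep its output in a file that a later CI step could upload, rather than losing it when the test environment is cleaned up.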

3 participants