-
Notifications
You must be signed in to change notification settings - Fork 1k
zombienet test with timeout #9168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -13,6 +15,20 @@ use zombienet_sdk::{ | |||
NetworkConfigBuilder, | |||
}; | |||
|
|||
fn timeout_1min<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When running the test without having statement store enabled, it ran forever.
For me both are same, have a timeout in the test, or running the test itself with a timeout I will let @lrubasze decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds more like the node should return an error when the statement store is not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem I observed was in a case when node spawned by zombienet-sdk
does not start (I was using outdated node without statement store support, node was terminating).
Currently the test would hang-up waiting for client.
I'm not sure if this is really needed to run each step within timeout. Maybe it is just fine to check if node is up and running?
Recently we have added to the zombienet-sdk
a helper method wait_until_is_up()
for:
Maybe we can use it here?
Regarding (1) timeout in the test vs (2) test in a timeout.
- option is preferred, since:
- timeout for 2. is set to 1h - no need to occupy runner for ~1h if we might easily check the test fails sooner
- timeout for 2. is set only in CI - locally people run tests without timeout
- with 1. one knows more exactly where the test fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the RPC call not finding out that there is no node running? This sounds like this is the problem. A timeout here would try to fix some symptom and not the root cause. If the node is not reachable, the RPC call should fail and ultimately the entire test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so we only need to wrap the rpc()
call into a timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that should do the job.
Also, zombienet-sdk
allows to wait until a node or a whole network is up.
- Network - https://docs.rs/zombienet-sdk/latest/zombienet_sdk/struct.NetworkNode.html#method.wait_until_is_up
- Node - https://docs.rs/zombienet-sdk/latest/zombienet_sdk/struct.Network.html#method.wait_until_is_up
Maybe we could just assert above before proceeding with the test steps - that should do the job too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is better to use wait_until_is_up for the whole network and then call the rpc
without wrap it in the timeout fn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, was trying to suggest that gently 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pepoviola you suggest this #9640 ? we can merge it in.
cumulus/zombienet/zombienet-sdk/tests/zombie_ci/statement_store.rs
Outdated
Show resolved
Hide resolved
refactor suggestion from #9168 (comment)
All GitHub workflows were cancelled due to failure one of the required jobs. |
@gui1117 please fix the warnings. |
I added timeout for async operation in the statement store zombienet test.
@lrubasze