gui1117
Contributor

@gui1117 gui1117 commented Jul 10, 2025

I added a timeout for the async operations in the statement store zombienet test.

@lrubasze

@gui1117 gui1117 added the R0-no-crate-publish-required The change does not require any crates to be re-published. label Jul 10, 2025
@@ -13,6 +15,20 @@ use zombienet_sdk::{
NetworkConfigBuilder,
};

fn timeout_1min<T>(
Member

Why do we need a timeout?

Contributor Author

When I ran the test without the statement store enabled, it ran forever.

To me both options are equivalent: having a timeout inside the test, or running the test itself under a timeout. I will let @lrubasze decide.

Member

Sounds more like the node should return an error when the statement store is not enabled?

Contributor

The problem I observed was a case where the node spawned by zombienet-sdk did not start (I was using an outdated node without statement store support, and the node was terminating).
Currently the test hangs waiting for the client.
I'm not sure it is really necessary to run each step within a timeout. Maybe it is enough to check that the node is up and running?

Recently we have added to the zombienet-sdk a helper method wait_until_is_up() for:

Maybe we can use it here?

Regarding (1) a timeout in the test vs. (2) running the test under a timeout:

Option 1 is preferred, since:
  • the timeout for option 2 is set to 1h; there is no need to occupy a runner for ~1h when we can easily detect the failure sooner
  • the timeout for option 2 is set only in CI; locally, people run tests without a timeout
  • with option 1, one knows more precisely where the test fails.
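
To illustrate the last point, here is a minimal std-only sketch of a per-step timeout that names the step that stalled. It is an analogue, not the helper added in this PR: the test in question is async, so an async timer would be the natural fit there, and `run_step` and `StepTimedOut` are hypothetical names introduced only for this example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Error returned when a named step does not produce a result in time.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
struct StepTimedOut {
    step: &'static str,
    limit: Duration,
}

/// Runs `work` on a helper thread and waits at most `limit` for its result.
/// Hypothetical helper; this is not the `timeout_1min` function from the diff.
fn run_step<T, F>(step: &'static str, limit: Duration, work: F) -> Result<T, StepTimedOut>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    // The worker is detached: after a timeout it keeps running until `work` returns.
    thread::spawn(move || {
        // The send fails only if the caller already gave up; ignore that case.
        let _ = tx.send(work());
    });
    // In this sketch a panic inside `work` is reported the same way as a timeout.
    rx.recv_timeout(limit).map_err(|_| StepTimedOut { step, limit })
}
```

Calling `run_step("submit_statement", Duration::from_millis(50), ...)` with a closure that blocks longer than the limit returns an error whose `step` field is `"submit_statement"`, so a failing run points at the stalled step instead of at the whole test.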

Member

Why is the RPC call not finding out that there is no node running? This sounds like this is the problem. A timeout here would try to fix some symptom and not the root cause. If the node is not reachable, the RPC call should fail and ultimately the entire test.

Member

Yeah, so we only need to wrap the rpc() call in a timeout?

Contributor

Yes, that should do the job.
Also, zombienet-sdk allows waiting until a node or the whole network is up.

Contributor

I think it is better to use wait_until_is_up for the whole network and then call the RPC without wrapping it in the timeout fn.
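
The "wait for readiness first, then call directly" idea can be sketched with a plain polling loop. This is only an illustration of the pattern under stated assumptions: `wait_until_up` and its parameters are hypothetical and do not reproduce zombienet-sdk's `wait_until_is_up`, whose signature is not shown in this thread.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `is_up` until it returns true or `limit` has elapsed.
/// Hypothetical stand-in for a readiness check; not zombienet-sdk's implementation.
fn wait_until_up<F>(mut is_up: F, limit: Duration, poll_every: Duration) -> Result<Duration, String>
where
    F: FnMut() -> bool,
{
    let start = Instant::now();
    loop {
        // Check first so an already-running node returns without sleeping.
        if is_up() {
            return Ok(start.elapsed());
        }
        // Give up once the deadline has passed, with a message that says why.
        if start.elapsed() >= limit {
            return Err(format!("node not up after {:?}", limit));
        }
        thread::sleep(poll_every);
    }
}
```

Once this returns `Ok`, the later calls can run without their own timeout wrapper; if it returns `Err`, the test fails early with a message that points at node startup rather than hanging on the client.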

Contributor

Yeah, was trying to suggest that gently 😂

Contributor Author

@pepoviola are you suggesting this: #9640? We can merge it in.

@bkchr bkchr requested a review from lrubasze September 3, 2025 08:35
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/17543719957
Failed job name: test-linux-stable-no-try-runtime

@bkchr
Member

bkchr commented Sep 8, 2025

@gui1117 please fix the warnings.

@gui1117 gui1117 enabled auto-merge September 8, 2025 09:34
@gui1117 gui1117 added this pull request to the merge queue Sep 8, 2025
Merged via the queue into master with commit 644f14f Sep 8, 2025
258 of 262 checks passed
@gui1117 gui1117 deleted the gui-timeout branch September 8, 2025 11:25
4 participants