Replies: 2 comments 1 reply
-
I can't find evidence of this test failing. On Windows with Python 3.12 it also passes:
(venv_win3.12) quickemu@QUICKEM-V6TKH1L C:\Users\Quickemu\Libp2p\py-libp2p>pytest -v tests/core/kad_dht/test_kad_dht.py::test_provide_and_find_providers
==================================================================== test session starts ====================================================================
platform win32 -- Python 3.12.10, pytest-8.4.2, pluggy-1.6.0 -- C:\Users\Quickemu\Libp2p\py-libp2p\venv_win3.12\Scripts\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\Quickemu\Libp2p\py-libp2p
configfile: pyproject.toml
plugins: anyio-1.4.0, Faker-37.8.0, timeout-2.4.0, trio-0.8.0, xdist-3.8.0
collected 1 item
tests/core/kad_dht/test_kad_dht.py::test_provide_and_find_providers PASSED [100%]
===================================================================== warnings summary ======================================================================
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488: DeprecationWarning: Type google._upb._message.MessageMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488: DeprecationWarning: Type google._upb._message.ScalarMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================================== slowest 50 durations ====================================================================
3.25s call tests/core/kad_dht/test_kad_dht.py::test_provide_and_find_providers
0.04s setup tests/core/kad_dht/test_kad_dht.py::test_provide_and_find_providers
0.00s teardown tests/core/kad_dht/test_kad_dht.py::test_provide_and_find_providers
=============================================================== 1 passed, 2 warnings in 3.62s ===============================================================
-
✅ RESOLVED: DHT Performance Issue Fixed
I've implemented a fix for the DHT performance regression reported in this discussion. It keeps the security improvements from PR #892 and makes the bind address configurable for cases where performance matters.
🔧 Solution Implemented
Environment variable configuration:
import os
import multiaddr
# Configurable bind address with secure default
DEFAULT_BIND_ADDRESS = os.getenv("LIBP2P_BIND", "127.0.0.1")
LISTEN_MADDR = multiaddr.Multiaddr(f"/ip4/{DEFAULT_BIND_ADDRESS}/tcp/0")
📊 Performance Results
🚀 CI/CD Usage
As suggested in the discussion, CI/CD can now use:
LIBP2P_BIND=0.0.0.0 pytest tests/
Or in GitHub Actions:
env:
  LIBP2P_BIND: '0.0.0.0'
✅ Key Benefits
🧪 Verification
You can test the fix:
# Default (secure)
python -c "from libp2p.tools.constants import DEFAULT_BIND_ADDRESS; print(f'Default: {DEFAULT_BIND_ADDRESS}')"
# Performance test
LIBP2P_BIND=0.0.0.0 pytest tests/core/kad_dht/test_kad_dht.py::test_provide_and_find_providers --durations=1
📁 Implementation Details
Files added/modified:
The solution follows the "secure by default, fast when needed" principle discussed in this thread. CI/CD pipelines can opt into the 0.0.0.0 binding for speed, while production deployments stay secure by default.
Thanks to @seetadev for creating this discussion and addressing the issue.
Branch:
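The LIBP2P_BIND override above passes whatever the environment contains straight into the multiaddr. A minimal sketch of how the value could be validated first, using only the standard library. The names resolve_bind_address and listen_maddr_string are hypothetical and not part of py-libp2p, and falling back to loopback on a malformed value is an assumed policy, not something this thread settled on:

```python
import ipaddress
import os

DEFAULT_BIND = "127.0.0.1"  # secure default, matching the snippet in this comment


def resolve_bind_address(environ=None):
    """Return the IPv4 bind address from LIBP2P_BIND, or the loopback default.

    Hypothetical helper: py-libp2p may handle this differently.
    """
    env = os.environ if environ is None else environ
    raw = env.get("LIBP2P_BIND", "").strip()
    if not raw:
        return DEFAULT_BIND
    try:
        addr = ipaddress.IPv4Address(raw)
    except ipaddress.AddressValueError:
        # Assumed policy: ignore malformed values instead of crashing at import time.
        return DEFAULT_BIND
    return str(addr)


def listen_maddr_string(environ=None, port=0):
    """Build the listen multiaddr as a plain string (no multiaddr dependency)."""
    return f"/ip4/{resolve_bind_address(environ)}/tcp/{port}"
```

Passing a dict instead of reading os.environ directly keeps the helper easy to exercise in tests without mutating the process environment.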
-
@yashksaini-coder, @acul71, @pacrob and @sumanjeet0012: This is in reference to PR #892, which we merged yesterday.
✅ What was fixed in this PR
We replaced all 0.0.0.0 binds with 127.0.0.1, which restricts services to loopback (local only). This is the correct security move.
🚨 What's failing
From the logs, only one test fails: test_provide_and_find_providers.
All the other slow tests are just long-running (timeouts, heartbeats, churn) but not failing.
🔎 Likely cause
The kad_dht test failure ties directly to your change:
DHT nodes need to connect to each other.
If all nodes are only listening on 127.0.0.1, then:
If anything still assumes 0.0.0.0 (all interfaces) for peer discovery, then restricting to 127.0.0.1 may break multiaddr resolution.
Peers may still advertise 0.0.0.0, which is not dialable, leading to connection failures.
That would explain why everything else passed (local pubsub, QUIC, relay, etc.), but DHT provider/discovery (which depends on multiaddr correctness) broke.
🛠️ How to fix
You likely need a fallback strategy instead of hardcoding 127.0.0.1 everywhere. Here are some options:
Allow config override
Keep 127.0.0.1 as default (secure).
But let tests explicitly use 0.0.0.0 (or the actual interface IP) when simulating a multi-node DHT.
Example in libp2p/utils/address_validation.py:
Then in CI tests:
Fix DHT test assumptions
Check whether test_provide_and_find_providers is constructing peers with 0.0.0.0.
If so, use 127.0.0.1 explicitly.
Improve multiaddr resolution
When replacing 0.0.0.0 with 127.0.0.1, ensure the peer multiaddrs still look like:
and not
because the latter is undialable.
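To illustrate the dialability point, here is a rough sketch that checks whether an /ip4 multiaddr string uses the unspecified address and rewrites it to loopback if so. It works on plain strings with the standard library only. The function names is_dialable and to_dialable are hypothetical, the sample addresses are made up for illustration, and py-libp2p may already have its own helpers for this:

```python
import ipaddress


def is_dialable(maddr: str) -> bool:
    """Return True if an /ip4/... multiaddr string has a concrete (non-wildcard) IP."""
    parts = maddr.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "ip4":
        return False  # this sketch only understands /ip4 addresses
    try:
        ip = ipaddress.IPv4Address(parts[1])
    except ipaddress.AddressValueError:
        return False
    # 0.0.0.0 is a bind-time wildcard, not an address a peer can dial.
    return not ip.is_unspecified


def to_dialable(maddr: str, fallback: str = "127.0.0.1") -> str:
    """Replace a 0.0.0.0 host with a fallback IP; leave other addresses untouched."""
    parts = maddr.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "ip4" and parts[1] == "0.0.0.0":
        parts[1] = fallback
    return "/" + "/".join(parts)


if __name__ == "__main__":
    for addr in ("/ip4/0.0.0.0/tcp/4001", "/ip4/127.0.0.1/tcp/4001"):
        print(addr, is_dialable(addr), to_dialable(addr))
```

A check like this could run in the test setup before peers exchange addresses, so that a wildcard address fails loudly instead of surfacing later as a dial timeout.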
🐢 Why CI is slow
Several tests (quic, gossipsub, dummyaccount_demo) are not failing, just slow.
Switching from 0.0.0.0 to 127.0.0.1 might cause more connection retries (dial attempts failing before falling back). That inflates runtime.
You can:
Run pytest -n auto --maxfail=1 -q to fail fast.
@acul71: Luca, I'd like your thoughts on the timeouts we should add for different modules.