@Frando
Member

@Frando Frando commented Dec 8, 2025

Description

This adds two metrics:

  • remote_holepunch_attempt counts the number of remotes for which we initiated holepunching.
  • remote_holepunch_success counts the number of remotes for which we initiated holepunching and it succeeded.

Each is incremented at most once during the lifetime of a RemoteStateActor. We increment _attempt the first time holepunching is initiated. We increment _success once if a new IP path is opened whose remote address is part of the last holepunching round.
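
A minimal sketch of the "at most once per remote" rule described above. It is not the code from this PR: `Metrics` and `RemoteState` are hypothetical stand-ins, plain atomics replace the real `Counter` type, and the `has_attempted` flag and `last_round` field are assumptions made for illustration.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for the real metrics struct; plain atomics replace `Counter`.
#[derive(Default)]
pub struct Metrics {
    pub remote_holepunch_attempt: AtomicU64,
    pub remote_holepunch_success: AtomicU64,
}

/// Hypothetical per-remote state, loosely modelled on `RemoteStateActor`.
#[derive(Default)]
pub struct RemoteState {
    /// Candidate addresses of the last holepunching round, if there was one.
    last_round: Option<HashSet<SocketAddr>>,
    has_attempted: bool,
    has_holepunched: bool,
}

impl RemoteState {
    /// Records a holepunching round; the attempt is counted only the first time.
    pub fn on_holepunch_started(&mut self, candidates: HashSet<SocketAddr>, m: &Metrics) {
        if !self.has_attempted {
            self.has_attempted = true;
            m.remote_holepunch_attempt.fetch_add(1, Ordering::Relaxed);
        }
        self.last_round = Some(candidates);
    }

    /// Counts a success once, when a newly opened IP path matches a
    /// candidate address of the last holepunching round.
    pub fn on_ip_path_opened(&mut self, addr: SocketAddr, m: &Metrics) {
        if self.has_holepunched {
            return;
        }
        if let Some(round) = &self.last_round {
            if round.contains(&addr) {
                self.has_holepunched = true;
                m.remote_holepunch_success.fetch_add(1, Ordering::Relaxed);
            }
        }
    }
}
```

In this sketch a path that opens before any holepunching round is never counted as a success, and repeated rounds or repeated path openings leave both counters at 1.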

I'm not entirely sure if this logic is fully sound, but it's the best I could come up with so far.

Fixes #3695

Breaking Changes

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.
    • List all breaking changes in the above "Breaking Changes" section.
    • Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are:

@Frando Frando force-pushed the Frando/mp-metrics-hp branch from eff1c4b to acbc048 Compare December 8, 2025 13:35
@github-actions

github-actions bot commented Dec 8, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3748/docs/iroh/

Last updated: 2025-12-10T12:24:53Z

@n0bot n0bot bot added this to iroh Dec 8, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Dec 8, 2025
@Frando Frando force-pushed the Frando/mp-metrics-hp branch from ca85a9f to f6c5e49 Compare December 9, 2025 11:33
@Frando Frando marked this pull request as ready for review December 9, 2025 13:57
@Frando Frando changed the title draft: holepunch metrics feat: holepunch metrics Dec 9, 2025
@Frando Frando changed the title feat: holepunch metrics feat: basic holepunch metrics Dec 9, 2025
Base automatically changed from Frando/mp-metrics-basics to feat-multipath December 9, 2025 21:38
@Frando Frando force-pushed the Frando/mp-metrics-hp branch from 0a4ba70 to fed03ef Compare December 10, 2025 12:18
@Frando Frando requested a review from flub December 10, 2025 12:23
@flub
Contributor

flub commented Dec 10, 2025

Help, I find this very hard to think about. What do we really want to measure here? I don't have a good answer to this so find commenting on the PR hard. @Arqu maybe has input on what we really need to have?

Things that are easy to measure:

  • number of times we start holepunching
  • number of times we explicitly open a path without holepunching (e.g. because it was holepunched in another connection).
  • number of times a path was opened for any reason (e.g. on the server side or because of holepunching or because of explicit open_path).
  • number of times we abandon a path.
  • number of times a path is abandoned.

Things that are hard(er) to measure:

  • number of times a path is opened from NAT traversal (essentially what this PR currently shows, though you would also have to be sure it wasn't from an explicit open_path call, which is not easy right now)
  • number of successful holepunches
  • number of failed holepunches

My inclination is to concentrate on the easy ones, but do you learn anything useful from those? You can't do any maths to tell you how good your holepunching is I think, so what's the point even? This goes back to, what are we trying to measure?

/// The number of NAT traversal attempts initiated.
pub nat_traversal: Counter,
/// The number of remote endpoints for which NAT traversal was initiated.
pub remote_holepunch_attempts: Counter,

The "remote_" prefix makes me think this was initiated by the remote.

Essentially the metrics you added are:

  • remotes connected to ("remotes_connection_established"?)
  • remotes holepunched ("remotes_connection_direct"?)

IIUC. This could be useful I guess. Not sure, it's also a bit weird (see separate comment).

&& hp.remote_candidates.contains(ip_addr)
{
self.has_holepunched = true;
self.metrics.remote_holepunch_success.inc();

One issue with this impl is that this metric would only be incremented on the client, and not the server. But that's a bit counter-intuitive from the user point of view.

@flub
Contributor

flub commented Dec 10, 2025

So I think the original intention was that you could have two metrics like:

  • number of connections that were on relay and had holepunch attempts
  • number of connections that moved from relay to direct after holepunching

And now you can do maths on them because you can compute the number of connections that did not holepunch and have a holepunching rate. But these metrics fall under the "hard" metrics 😞 I'm not sure how to do them.
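
As a rough sketch of the maths meant here, assuming the two counters could be read as plain integers (the function names are hypothetical and not part of iroh):

```rust
/// Fraction of connections with a holepunch attempt that moved to a direct path.
/// Returns `None` when no attempts were recorded, to avoid dividing by zero.
pub fn holepunch_rate(attempts: u64, successes: u64) -> Option<f64> {
    if attempts == 0 {
        None
    } else {
        Some(successes as f64 / attempts as f64)
    }
}

/// Number of connections that attempted holepunching but stayed on the relay.
/// Saturates at zero in case the counters were sampled at slightly different times.
pub fn holepunch_failures(attempts: u64, successes: u64) -> u64 {
    attempts.saturating_sub(successes)
}
```

With 4 attempts and 3 successes this gives a rate of 0.75 and 1 failed holepunch. The result is only meaningful if both counters cover the same set of connections, which is exactly the hard part.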
