fix(transport/ecn): start after handshake and path validation#2560
fix(transport/ecn): start after handshake and path validation#2560mxinden merged 9 commits intomozilla:mainfrom
Conversation
Failed Interop TestsQUIC Interop Runner, client vs. server, differences relative to 0bd3933. neqo-latest as client
neqo-latest as server
All resultsSucceeded Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
Unsupported Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
|
During ECN path testing, in case `TEST_COUNT_INITIAL_PHASE` ECN probes get lost, ECN is disabled. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L21-L25 Though only if `*probes_sent == *probes_lost`. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L241-L246 Say that a Neqo client has send 2 ECN marked datagrams, i.e. probes, each of them eventually declared lost. Next the client sends two more datagrams, each again marked. When the first of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `3`, thus ECN is not disabled. When the second of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `4`. Given that `probes_lost` is not `== TEST_COUNT_INITIAL_PHASE`, ECN is once more not disabled. This patch removes the requirement of `*probes_sent == *probes_lost`, thus always disabling ECN after `TEST_COUNT_INITIAL_PHASE` losses. Note that as of now, this patch isn't relevant yet, as path probing is done at connection establishment time and thus `MAX_PATH_PROBES` already disables ECN marking beforehand. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/path.rs#L32-L36 That said, this will become relevant once we start ECN validation only after the handshake (mozilla#2560).
Alternative to mozilla#2588. **Problem** (Same as mozilla#2588.) During ECN path testing, in case `TEST_COUNT_INITIAL_PHASE` ECN probes get lost, ECN is disabled. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L21-L25 Though only if `*probes_sent == *probes_lost`. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L241-L246 Say that a Neqo client has send 2 ECN marked packets, i.e. probes, each of them eventually declared lost. Next the client sends two more packets, each again marked. When the first of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `3`, thus ECN is not disabled. When the second of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `4`. Given that `probes_lost` is not `== TEST_COUNT_INITIAL_PHASE`, ECN is once more not disabled. **Solution** (Alternative to mozilla#2588.) With this patch `neqo_transport::ecn` keeps track of ACKed marked packets. Instead of requiring `*probes_sent == *probes_lost`, only disable ECN if no marked packet has been acked already. Note that as of now, this patch isn't relevant yet, as path probing is done at connection establishment time and thus `MAX_PATH_PROBES` already disables ECN marking beforehand. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/path.rs#L32-L36 That said, this will become relevant once we start ECN validation only after the handshake (mozilla#2560).
Alternative to mozilla#2588. **Problem** (Same as mozilla#2588.) During ECN path testing, in case `TEST_COUNT_INITIAL_PHASE` ECN probes get lost, ECN is disabled. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L21-L25 Though only if `*probes_sent == *probes_lost`. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L241-L246 Say that a Neqo client has send 2 ECN marked packets, i.e. probes, each of them eventually declared lost. Next the client sends two more packets, each again marked. When the first of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `3`, thus ECN is not disabled. When the second of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `4`. Given that `probes_lost` is not `== TEST_COUNT_INITIAL_PHASE`, ECN is once more not disabled. **Solution** (Alternative to mozilla#2588.) With this patch `neqo_transport::ecn` keeps track of ACKed marked packets. Instead of requiring `*probes_sent == *probes_lost`, only disable ECN if no marked packet has been acked already. Note that as of now, this patch isn't relevant yet, as path probing is done at connection establishment time and thus `MAX_PATH_PROBES` already disables ECN marking beforehand. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/path.rs#L32-L36 That said, this will become relevant once we start ECN validation only after the handshake (mozilla#2560).
Alternative to #2588. **Problem** (Same as #2588.) During ECN path testing, in case `TEST_COUNT_INITIAL_PHASE` ECN probes get lost, ECN is disabled. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L21-L25 Though only if `*probes_sent == *probes_lost`. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/ecn.rs#L241-L246 Say that a Neqo client has send 2 ECN marked packets, i.e. probes, each of them eventually declared lost. Next the client sends two more packets, each again marked. When the first of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `3`, thus ECN is not disabled. When the second of the two is declared lost, `probes_sent` is `4` and `probes_lost` is `4`. Given that `probes_lost` is not `== TEST_COUNT_INITIAL_PHASE`, ECN is once more not disabled. **Solution** (Alternative to #2588.) With this patch `neqo_transport::ecn` keeps track of ACKed marked packets. Instead of requiring `*probes_sent == *probes_lost`, only disable ECN if no marked packet has been acked already. Note that as of now, this patch isn't relevant yet, as path probing is done at connection establishment time and thus `MAX_PATH_PROBES` already disables ECN marking beforehand. https://github.com/mozilla/neqo/blob/a62c14068ad603f878f13a354f9ce0eddf97231b/neqo-transport/src/path.rs#L32-L36 That said, this will become relevant once we start ECN validation only after the handshake (#2560).
3b19c7f to
c7bd756
Compare
Benchmark resultsPerformance differences relative to 3909abe. 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold. time: [653.17 ms 654.62 ms 656.13 ms]
thrpt: [152.41 MiB/s 152.76 MiB/s 153.10 MiB/s]
change:
time: [−0.6433% −0.3870% −0.1110%] (p = 0.00 < 0.05)
thrpt: [+0.1111% +0.3885% +0.6475%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected. time: [305.14 ms 306.57 ms 307.97 ms]
thrpt: [32.471 Kelem/s 32.619 Kelem/s 32.771 Kelem/s]
change:
time: [−0.3048% +0.3810% +1.0538%] (p = 0.27 > 0.05)
thrpt: [−1.0428% −0.3796% +0.3058%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [26.913 ms 27.053 ms 27.213 ms]
thrpt: [36.748 elem/s 36.965 elem/s 37.156 elem/s]
change:
time: [−0.4503% +0.2610% +1.0196%] (p = 0.47 > 0.05)
thrpt: [−1.0093% −0.2603% +0.4523%]
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: Change within noise threshold. time: [660.83 ms 662.31 ms 663.80 ms]
thrpt: [150.65 MiB/s 150.99 MiB/s 151.32 MiB/s]
change:
time: [−0.6393% −0.3896% −0.1148%] (p = 0.00 < 0.05)
thrpt: [+0.1149% +0.3911% +0.6435%]
decode 4096 bytes, mask ff: No change in performance detected. time: [11.818 µs 11.864 µs 11.914 µs]
change: [−0.0598% +0.2597% +0.6180%] (p = 0.13 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected. time: [3.0218 ms 3.0312 ms 3.0423 ms]
change: [−0.4463% +0.0386% +0.5230%] (p = 0.89 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [19.922 µs 19.964 µs 20.014 µs]
change: [−0.5146% −0.0800% +0.4248%] (p = 0.75 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected. time: [5.0444 ms 5.0556 ms 5.0685 ms]
change: [−0.4599% −0.0952% +0.2405%] (p = 0.60 > 0.05)
decode 4096 bytes, mask 3f: No change in performance detected. time: [8.2596 µs 8.2896 µs 8.3259 µs]
change: [−0.3200% +0.0738% +0.5247%] (p = 0.75 > 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.5849 ms 1.5905 ms 1.5973 ms]
change: [−0.6282% −0.1014% +0.4344%] (p = 0.70 > 0.05)
1000 streams of 1 bytes/multistream: 💔 Performance has regressed. time: [50.923 ns 51.040 ns 51.164 ns]
change: [+55.287% +56.773% +58.245%] (p = 0.00 < 0.05)
1000 streams of 1000 bytes/multistream: 💔 Performance has regressed. time: [51.241 ns 51.379 ns 51.525 ns]
change: [+61.845% +63.311% +64.761%] (p = 0.00 < 0.05)
: No change in performance detected.
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [106.10 ns 106.51 ns 106.93 ns]
change: [−0.4317% +1.0524% +3.5603%] (p = 0.54 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [105.25 ns 105.48 ns 105.82 ns]
change: [−0.4960% +0.0515% +0.6013%] (p = 0.86 > 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [88.708 ns 92.298 ns 100.45 ns]
change: [−0.5587% +2.6888% +8.3111%] (p = 0.37 > 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [111.03 ms 111.10 ms 111.17 ms]
change: [+0.9502% +1.0311% +1.1131%] (p = 0.00 < 0.05)
sent::Packets::take_ranges: No change in performance detected. time: [8.3071 µs 8.5386 µs 8.7568 µs]
change: [−0.1886% +6.0934% +16.006%] (p = 0.15 > 0.05)
transfer/pacing-false/varying-seeds: 💚 Performance has improved. time: [34.603 ms 34.664 ms 34.724 ms]
change: [−3.5790% −3.3105% −3.0545%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold. time: [36.047 ms 36.139 ms 36.231 ms]
change: [−1.8788% −1.5171% −1.1519%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold. time: [35.061 ms 35.111 ms 35.161 ms]
change: [−2.4981% −2.2964% −2.1088%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed: Change within noise threshold. time: [36.492 ms 36.556 ms 36.621 ms]
change: [−2.4463% −2.2170% −1.9969%] (p = 0.00 < 0.05)
Client/server transfer resultsPerformance differences relative to 3909abe. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Download data for |
This doesn't quite work, because we never receive an ACK for the ECN probes in the Initial packet number space before dropping it, so the probe and ACK counts never align. I think mozilla#2560 is the better approach here, i.e., do ECN once we're in established state. Fixes mozilla#2741
|
| Branch | ecn-after-handshake |
| Testbed | t-linux64-ms-280 |
🚨 1 Alert
| Benchmark | Measure Units | View | Benchmark Result (Result Δ%) | Upper Boundary (Limit %) |
|---|---|---|---|---|
| coalesce_acked_from_zero 1000+1 entries | Latency nanoseconds (ns) | 📈 plot 🚷 threshold 🚨 alert (🔔) | 92.30 ns(+3.32%)Baseline: 89.33 ns | 91.78 ns (100.56%) |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %) |
|---|---|---|---|
| 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 📈 view plot 🚷 view threshold | 662,310,000.00 ns(-0.33%)Baseline: 664,484,459.46 ns | 731,705,113.47 ns (90.52%) |
| 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 📈 view plot 🚷 view threshold | 654,620,000.00 ns(+3.58%)Baseline: 631,997,027.03 ns | 832,542,305.02 ns (78.63%) |
| 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 📈 view plot 🚷 view threshold | 27,053,000.00 ns(-0.50%)Baseline: 27,189,675.68 ns | 27,661,578.21 ns (97.80%) |
| 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 📈 view plot 🚷 view threshold | 306,570,000.00 ns(+0.57%)Baseline: 304,834,189.19 ns | 316,066,323.62 ns (97.00%) |
| 1000 streams of 1 bytes/multistream | 📈 view plot 🚷 view threshold | 51.04 ns(+35.41%)Baseline: 37.69 ns | 54.95 ns (92.89%) |
| 1000 streams of 1000 bytes/multistream | 📈 view plot 🚷 view threshold | 51.38 ns(+37.96%)Baseline: 37.24 ns | 54.62 ns (94.07%) |
| RxStreamOrderer::inbound_frame() | 📈 view plot 🚷 view threshold | 111,100,000.00 ns(+0.55%)Baseline: 110,495,959.46 ns | 114,561,444.98 ns (96.98%) |
| coalesce_acked_from_zero 1+1 entries | 📈 view plot 🚷 view threshold | 88.56 ns(-0.15%)Baseline: 88.69 ns | 89.30 ns (99.17%) |
| coalesce_acked_from_zero 10+1 entries | 📈 view plot 🚷 view threshold | 105.48 ns(-0.43%)Baseline: 105.93 ns | 106.88 ns (98.69%) |
| coalesce_acked_from_zero 1000+1 entries | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 92.30 ns(+3.32%)Baseline: 89.33 ns | 91.78 ns (100.56%) |
| coalesce_acked_from_zero 3+1 entries | 📈 view plot 🚷 view threshold | 106.51 ns(-0.02%)Baseline: 106.53 ns | 107.40 ns (99.17%) |
| decode 1048576 bytes, mask 3f | 📈 view plot 🚷 view threshold | 1,590,500.00 ns(-1.60%)Baseline: 1,616,313.51 ns | 1,764,082.52 ns (90.16%) |
| decode 1048576 bytes, mask 7f | 📈 view plot 🚷 view threshold | 5,055,600.00 ns(-0.16%)Baseline: 5,063,533.78 ns | 5,090,572.93 ns (99.31%) |
| decode 1048576 bytes, mask ff | 📈 view plot 🚷 view threshold | 3,031,200.00 ns(-0.14%)Baseline: 3,035,586.49 ns | 3,067,625.58 ns (98.81%) |
| decode 4096 bytes, mask 3f | 📈 view plot 🚷 view threshold | 8,289.60 ns(+4.60%)Baseline: 7,924.70 ns | 10,163.54 ns (81.56%) |
| decode 4096 bytes, mask 7f | 📈 view plot 🚷 view threshold | 19,964.00 ns(+0.22%)Baseline: 19,919.30 ns | 20,405.80 ns (97.83%) |
| decode 4096 bytes, mask ff | 📈 view plot 🚷 view threshold | 11,864.00 ns(+0.40%)Baseline: 11,816.19 ns | 11,975.56 ns (99.07%) |
| sent::Packets::take_ranges | 📈 view plot 🚷 view threshold | 8,538.60 ns(+1.16%)Baseline: 8,440.78 ns | 8,591.80 ns (99.38%) |
| transfer/pacing-false/same-seed | 📈 view plot 🚷 view threshold | 35,111,000.00 ns(+0.53%)Baseline: 34,926,486.49 ns | 36,611,004.30 ns (95.90%) |
| transfer/pacing-false/varying-seeds | 📈 view plot 🚷 view threshold | 34,664,000.00 ns(-1.15%)Baseline: 35,066,513.51 ns | 36,805,390.21 ns (94.18%) |
| transfer/pacing-true/same-seed | 📈 view plot 🚷 view threshold | 36,556,000.00 ns(+0.06%)Baseline: 36,534,000.00 ns | 38,140,269.54 ns (95.85%) |
| transfer/pacing-true/varying-seeds | 📈 view plot 🚷 view threshold | 36,139,000.00 ns(+0.65%)Baseline: 35,906,418.92 ns | 37,523,651.23 ns (96.31%) |
|
| Branch | ecn-after-handshake |
| Testbed | t-linux64-ms-280 |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result milliseconds (ms) (Result Δ%) | Upper Boundary milliseconds (ms) (Limit %) |
|---|---|---|---|
| s2n vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 310.42 ms(-0.45%)Baseline: 311.83 ms | 327.11 ms (94.90%) |
|
@mxinden I tried to "help" and I think I messed up the rebase... |
0dd1ad8 to
09109a4
Compare
|
No worries. Rebased on to latest |
Client/server transfer resultsPerformance differences relative to 531cb0b. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Download data for |
Benchmark resultsPerformance differences relative to 531cb0b. 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold. time: [199.38 ms 199.81 ms 200.29 ms]
thrpt: [499.27 MiB/s 500.47 MiB/s 501.55 MiB/s]
change:
time: [+0.4936% +0.7774% +1.0872%] (p = 0.00 < 0.05)
thrpt: [−1.0755% −0.7714% −0.4912%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected. time: [303.36 ms 304.87 ms 306.37 ms]
thrpt: [32.640 Kelem/s 32.801 Kelem/s 32.964 Kelem/s]
change:
time: [−0.5614% +0.1751% +0.8849%] (p = 0.63 > 0.05)
thrpt: [−0.8771% −0.1748% +0.5646%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [27.455 ms 27.546 ms 27.660 ms]
thrpt: [36.153 B/s 36.303 B/s 36.424 B/s]
change:
time: [−0.1785% +0.2806% +0.7581%] (p = 0.26 > 0.05)
thrpt: [−0.7524% −0.2799% +0.1788%]
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💚 Performance has improved. time: [614.57 ms 619.13 ms 622.98 ms]
thrpt: [160.52 MiB/s 161.52 MiB/s 162.72 MiB/s]
change:
time: [−4.4209% −3.3279% −2.2884%] (p = 0.00 < 0.05)
thrpt: [+2.3420% +3.4424% +4.6254%]
decode 4096 bytes, mask ff: No change in performance detected. time: [11.813 µs 11.845 µs 11.885 µs]
change: [−0.1456% +0.3796% +1.1232%] (p = 0.27 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected. time: [3.0206 ms 3.0301 ms 3.0412 ms]
change: [−0.7263% −0.1769% +0.3167%] (p = 0.53 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [20.003 µs 20.051 µs 20.106 µs]
change: [−0.0550% +0.4063% +1.0006%] (p = 0.13 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected. time: [5.0432 ms 5.0585 ms 5.0769 ms]
change: [−0.2586% +0.1203% +0.5741%] (p = 0.58 > 0.05)
decode 4096 bytes, mask 3f: No change in performance detected. time: [8.2880 µs 8.3257 µs 8.3692 µs]
change: [−0.4558% +0.3830% +1.1278%] (p = 0.37 > 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.5852 ms 1.5908 ms 1.5976 ms]
change: [−0.7821% −0.1839% +0.4184%] (p = 0.53 > 0.05)
1000 streams of 1 bytes/multistream: No change in performance detected. time: [27.109 ns 27.307 ns 27.516 ns]
change: [−34.524% −14.331% +1.7113%] (p = 0.60 > 0.05)
:Criterion.rs ERROR: Error in Gnuplot: line 0: Can't plot with an empty x range! : No change in performance detected.
coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [88.160 ns 88.528 ns 88.893 ns]
change: [−0.3781% +0.1738% +0.7427%] (p = 0.55 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [105.48 ns 105.83 ns 106.22 ns]
change: [−0.6046% −0.1401% +0.3511%] (p = 0.57 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [104.84 ns 105.19 ns 105.63 ns]
change: [−0.3236% +0.2731% +0.9357%] (p = 0.40 > 0.05)
coalesce_acked_from_zero 1000+1 entries: Change within noise threshold. time: [88.548 ns 88.715 ns 88.921 ns]
change: [−1.5290% −0.7327% −0.0641%] (p = 0.05 < 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [108.44 ms 108.57 ms 108.74 ms]
change: [−1.2441% −1.0937% −0.9305%] (p = 0.00 < 0.05)
sent::Packets::take_ranges: No change in performance detected. time: [8.0682 µs 8.3085 µs 8.5428 µs]
change: [−1.9447% +4.0692% +14.590%] (p = 0.40 > 0.05)
transfer/pacing-false/varying-seeds: Change within noise threshold. time: [37.289 ms 37.367 ms 37.446 ms]
change: [+0.8769% +1.2026% +1.5124%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold. time: [38.069 ms 38.191 ms 38.323 ms]
change: [+0.5182% +0.9683% +1.4091%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold. time: [36.904 ms 36.988 ms 37.087 ms]
change: [+0.5750% +0.8624% +1.1783%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed: Change within noise threshold. time: [38.733 ms 38.816 ms 38.901 ms]
change: [+0.1120% +0.4055% +0.6994%] (p = 0.01 < 0.05)
Download data for |
09109a4 to
9b67e49
Compare
|
|
9b67e49 to
3f3ab35
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2560 +/- ##
==========================================
- Coverage 94.92% 94.89% -0.04%
==========================================
Files 115 115
Lines 34281 34286 +5
Branches 34281 34286 +5
==========================================
- Hits 32541 32534 -7
- Misses 1733 1745 +12
Partials 7 7
|
There was a problem hiding this comment.
Pull Request Overview
This PR delays ECN marking and path validation until the QUIC handshake is confirmed and secondary paths are validated.
- Introduce a
NotStarteddefault state for ECN path validation and trigger testing only viaInfo::start. - Refactor
lost_ecnandstart_ecnAPIs inPaths,Path, andInfo, removing the packet‐type parameter. - Update connection logic to call
start_ecnand handle ECN loss tokens, and revise tests to unwrap datagrams and adjust expectations for ECN behavior.
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/ecn.rs | Add NotStarted default, implement Info::start, update ECN validation state transitions and lost_ecn. |
| src/path.rs | Call start_ecn after baseline setup; unify lost_ecn signature. |
| src/connection/mod.rs | Invoke start_ecn on connection flows; update ECN token handling. |
| src/tracking.rs | Minor formatting change in RecvdPackets builder. |
| tests/**/*.rs | Adapt tests to new ECN start logic, unwrap datagrams, and adjust counters. |
Comments suppressed due to low confidence (3)
neqo-transport/src/tracking.rs:466
- [nitpick] An extra blank line was added after
builder.encode_varint; consider removing it to keep formatting consistent with existing code style.
let mut iter = ranges.iter();
neqo-transport/src/ecn.rs:247
- The
lost_ecnmethod now counts all lost ECN-marked packets rather than just Initial probes. To preserve the original validation logic, reintroduce a filter forpacket::Type::Initialor explicitly document that all packet types should count toward initial probe loss.
pub(crate) fn lost_ecn(&mut self, stats: &mut Stats) {
neqo-transport/src/connection/mod.rs:3150
start_ecnis called unconditionally on each connection send path, which may start ECN marking before the handshake is confirmed. Guard this call so it only runs when the connection state isConfirmed.
self.paths.start_ecn(&mut self.stats.borrow_mut());
Signed-off-by: Max Inden <mail@max-inden.de>
Bug 1918070 introduced the network.http.http3.ecn pref, enabled on Firefox Nightly. When true, Firefox would (a) mark and (b) report ECN. Bug 1948739 enabled the pref on Firefox Beta. Previously, with the pref set to true, we saw around 2% of ECN blackholes, i.e. network paths that drop all ECN marked datagrams. Such a blackhole leads to significant connection establishment delay. Bug 1961340 enabled the ECN reporting side only, leaving the marking side on Beta and Nightly only. Since then we moved the ECN path validation to after the handshake (mozilla/neqo#2560) thus reducing the impact of blackholes and we improved the metric itself in Bug 1950831. With the above in mind, we can now enable QUIC ECN marking as well. Differential Revision: https://phabricator.services.mozilla.com/D265557
Only start ECN marking and thus ECN-Path validation when:
PATH_RESPONSE).Fixes #2490 and #2741