Integrity requirements
Description
I've investigated high memory consumption of an iOS Network Extension (NE) for an app based on xray-core and found an interesting quirk specific to HTTP/2.
The overall stack pins a significant amount of RAM, 0.5+ MiB per connection, to maintain idle(?) connections.
The largest heap user (inuse_space per pprof) was the http2.(*clientStream).writeRequestBody function:

or (another crash of the same kind)
These goroutines were blocked on io.(*pipe).Read(). OOM was happening as soon as there were ≈25 of them. I suspect that's the Pipe() from splithttp.Dial() (Xray-core/transport/internet/splithttp/dialer.go, line 441 in fdb9b61):
reader, writer := io.Pipe()
The writeRequestBody() comes from the http2 client in x/net/http2/transport.go. It allocates some memory for a scratch buffer to read from the Pipe:
https://github.com/golang/net/blob/8ecbaa95fea823c19fa74c5c3b53e0bccd473828/http2/transport.go#L1506-L1515
writeRequestBody() does a reasonable thing. The API author probably assumed that the body reader is "local" (e.g. a file), so the x/net/http2 code was not expected to block for a long time waiting for the Pipe to be written to.
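The pattern described above (allocate a scratch buffer, then block in Read on a pipe that nobody writes to) can be reproduced in isolation. This is a minimal sketch and not the x/net/http2 code: blockedThenRead and its parameters are hypothetical names, and the 512 KiB size is taken from the figures in this report.

```go
package main

import (
	"fmt"
	"io"
	"time"
)

// blockedThenRead starts a reader goroutine that allocates a scratch buffer
// of scratchLen bytes and then blocks in Read on an io.Pipe. It reports
// whether the reader was still blocked after waitMillis milliseconds with no
// writer activity, and how many bytes the reader got once data was written.
func blockedThenRead(scratchLen, waitMillis int) (stillBlocked bool, n int) {
	reader, writer := io.Pipe()
	defer writer.Close()

	done := make(chan int, 1)
	go func() {
		buf := make([]byte, scratchLen) // stays reachable while Read blocks
		got, _ := reader.Read(buf)
		done <- got
	}()

	select {
	case n = <-done:
		// Not expected: nothing has been written yet.
		return false, n
	case <-time.After(time.Duration(waitMillis) * time.Millisecond):
		stillBlocked = true
	}

	// Only now does the "upload" produce data; Read returns and buf can be freed.
	writer.Write([]byte("hello"))
	return stillBlocked, <-done
}

func main() {
	blocked, n := blockedThenRead(512<<10, 100)
	fmt.Printf("blocked while idle: %v, bytes read afterwards: %d\n", blocked, n)
}
```

While the reader goroutine sits in Read, its buffer cannot be collected, which matches the heap profile shape described in this report: memory attributed to the function that allocated the buffer, goroutines parked in io.(*pipe).Read().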
Content-Length is not known in advance in xHTTP, so frameScratchBufferLen() defaults to min(512 KiB, SETTINGS_MAX_FRAME_SIZE):
https://github.com/golang/net/blob/8ecbaa95fea823c19fa74c5c3b53e0bccd473828/http2/transport.go#L1451
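For readers who do not want to follow the link, the sizing rule can be restated roughly as follows. This is a simplified paraphrase with hypothetical names (scratchLen, maxScratch), not a copy of the library code; the linked source is authoritative. A contentLength of -1 stands for "unknown", which is the xHTTP case.

```go
package main

import "fmt"

// maxScratch is the 512 KiB upper bound mentioned in the text.
const maxScratch = 512 << 10

// scratchLen returns the scratch buffer size for a request body:
// the peer's max frame size, capped at 512 KiB, and shrunk further
// when a small Content-Length is known in advance.
func scratchLen(maxFrameSize int, contentLength int64) int {
	n := int64(maxFrameSize)
	if n > maxScratch {
		n = maxScratch
	}
	// With a known, small body, one byte more than the declared length is enough.
	if contentLength != -1 && contentLength+1 < n {
		n = contentLength + 1
	}
	if n < 1 {
		return 1
	}
	return int(n)
}

func main() {
	fmt.Println(scratchLen(1<<20, -1))  // unknown length, 1 MiB frames
	fmt.Println(scratchLen(16<<10, -1)) // unknown length, 16 KiB frames
	fmt.Println(scratchLen(1<<20, 100)) // known 100-byte body
}
```

With an unknown length the only thing that can shrink the buffer is the frame size advertised by the peer, which is why the server-side setting matters for client memory.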
The default SETTINGS_MAX_FRAME_SIZE seems to be 1 MiB in Go x/net/http2. It's http2.defaultMaxReadFrameSize, and the server sends {SettingMaxFrameSize, conf.MaxReadFrameSize} as part of its SETTINGS frame to avoid the default 16 KiB limit:
https://github.com/golang/net/blob/8ecbaa95fea823c19fa74c5c3b53e0bccd473828/http2/http2.go#L85
So, if I understand the case correctly, each idle connection pins at least 512 KiB of buffers on the client side just to maintain the possibility of uploading data via a possibly-idle HTTP/2 stream.
#4749 expressed concerns about MaxReadFrameSize tuning in high-bandwidth scenarios. I've done two tests with an iPhone 12 running xray-core with frameScratchBufferLen reduced from 512 KiB to 16 KiB. Both tests used landline connections, with Wi-Fi 6 running on top of Gigabit Ethernet:
- link soft-capped at 100 Mbit/s: the iPhone was still able to saturate it just fine, Ookla's Speedtest reported 110 Mbit/s (upload)
- link soft-capped at 800 Mbit/s: Ookla's Speedtest reported 215...230 Mbit/s (upload)
The 512-to-16 KiB patch changed the heap profile:
The thing that changed the most is the number of goroutines waiting in writeRequestBody right before OOM: it went from ≈25 to ≈200. Yet this number looks strange on its own, and it's another thing I'm going to investigate.
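A back-of-the-envelope calculation helps put the two goroutine counts side by side. It assumes exactly one scratch buffer per waiting goroutine and counts nothing else (no TLS, no HTTP/2 flow-control windows, no runtime overhead), so it only bounds this one contribution. scratchTotalKiB is a hypothetical helper.

```go
package main

import "fmt"

// scratchTotalKiB returns the memory held by scratch buffers alone,
// assuming one buffer of scratchKiB per waiting stream.
func scratchTotalKiB(streams, scratchKiB int) int {
	return streams * scratchKiB
}

func main() {
	cases := []struct{ streams, scratchKiB int }{
		{25, 512}, // unpatched: about 25 goroutines right before OOM
		{200, 16}, // patched: about 200 goroutines right before OOM
	}
	for _, c := range cases {
		total := scratchTotalKiB(c.streams, c.scratchKiB)
		fmt.Printf("%d streams x %d KiB = %d KiB (%.3f MiB)\n",
			c.streams, c.scratchKiB, total, float64(total)/1024)
	}
}
```

Under these assumptions the patched build holds about a quarter of the scratch memory while serving roughly eight times as many waiting streams.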

The takeaways are the following:
- xray-core might provide an option to tune MaxReadFrameSize for the h2 listener and, maybe, other options of http2.Server/http2.Transport. It will not help when h2 is terminated at a CDN, but will help self-hosters with REALITY
- iOS app maintainers may want to patch frameScratchBufferLen() in the Go http2 library to reduce per-connection overhead
- the Go http2 library might provide a setting to cap frameScratchBufferLen and/or provide a better platform-specific default
- mux seems to have the potential to help with such setups
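To illustrate the first takeaway, here is a sketch of a Go HTTP/2 server advertising a smaller maximum frame size. It uses the standard library's http.HTTP2Config (Go 1.24 or newer) only to stay self-contained; code built directly on x/net/http2 would set the MaxReadFrameSize field of http2.Server instead. The newServer helper, the listen address and the 16 KiB value are assumptions for illustration, and where xray-core would expose such an option is not something this report settles.

```go
package main

import (
	"fmt"
	"net/http"
)

// newServer builds an HTTP server whose HTTP/2 SETTINGS advertise
// maxReadFrameSize as the largest frame it is willing to read.
// Valid values are between 16 KiB and 16 MiB; zero means "use the default".
func newServer(addr string, maxReadFrameSize int) *http.Server {
	return &http.Server{
		Addr: addr,
		HTTP2: &http.HTTP2Config{
			MaxReadFrameSize: maxReadFrameSize,
		},
	}
}

func main() {
	// Hypothetical address; TLS setup and handlers are omitted in this sketch.
	srv := newServer("127.0.0.1:8443", 16<<10)
	fmt.Printf("advertised max frame size: %d bytes\n", srv.HTTP2.MaxReadFrameSize)
}
```

Because the client sizes its scratch buffer from the peer's advertised frame size, lowering this value on the server is the one knob in this list that needs no client patch.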
Reproduction Method
My test-case was specific:
- a feed refresh in the VK app triggered a small connection spike
- it was enough to hit the 50 MiB limit after two or three refreshes
- I polled /debug/pprof/heap?gc=1 every two seconds and investigated the profile
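The polling step could look roughly like the sketch below. It assumes the process exposes net/http/pprof handlers over HTTP at 127.0.0.1:6060; that address, the file naming and the snapshot count are all assumptions, since the report does not say how the endpoint was reached on the device.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"time"
)

// heapURL builds the heap profile URL; gc=1 asks for a GC run before sampling.
func heapURL(addr string) string {
	return "http://" + addr + "/debug/pprof/heap?gc=1"
}

// fetch downloads one profile snapshot.
func fetch(client *http.Client, url string) ([]byte, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	url := heapURL("127.0.0.1:6060") // assumed pprof listen address
	client := &http.Client{Timeout: 5 * time.Second}

	for i := 0; i < 3; i++ { // a real run would loop until the process dies
		data, err := fetch(client, url)
		if err != nil {
			log.Printf("snapshot %d: %v", i, err)
			return
		}
		name := fmt.Sprintf("heap-%03d.pb.gz", i)
		if err := os.WriteFile(name, data, 0o644); err != nil {
			log.Printf("write %s: %v", name, err)
			return
		}
		time.Sleep(2 * time.Second)
	}
}
```

The last snapshot saved before the extension is killed can then be inspected with go tool pprof, using the inuse_space sample index mentioned in the description.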
Client config
xray-core version: v26.6.1, go1.26.3
Details
{
  "inbounds": [
    {
      "settings": {
        "udp": true
      },
      "port": NNN,
      "listen": "127.0.0.1",
      "protocol": "socks"
    }
  ],
  "outbounds": [
    {
      "streamSettings": {
        "xhttpSettings": {
          "path": "/"
        },
        "security": "reality",
        "realitySettings": {
          "publicKey": "...........................................",
          "serverName": "example.net",
          "spiderX": "/",
          "shortId": "",
          "fingerprint": "..."
        },
        "network": "xhttp"
      },
      "protocol": "vless",
      "settings": {
        "vnext": [
          {
            "address": "a.b.c.d",
            "users": [
              {
                "id": "ffffffff-ffff-ffff-ffff-ffffffffffff",
                "encryption": "none"
              }
            ],
            "port": 443
          }
        ]
      }
    }
  ]
}
Server config
n/a
Client log
n/a
Server log
n/a