-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix packet forwarding between vz and socket_vmnet #2680
base: master
Are you sure you want to change the base?
Conversation
@balajiv113 can you review? |
Example iperf3 run - case 12 lima vms:
Running iperf3 on the host, testing server vm. Tested on M1 Pro
Retries logs on client:
Retry logs on server: |
Example iperf3 run - case 22 lima vms:
Tested on M1 Pro
Retries logs on client: Retries logs on server:
|
Example iperf3 run - case 3lima vm running Tested on M1 Pro
Retries logs on server vm: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea behind this change is good.
I have a few comments though.
} | ||
|
||
var _ net.Conn = (*QEMUPacketConn)(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing that this line is required. This was to ensure that QEMUPacketConn implements the net.Conn interface.
Even if we want to handle only reads and write for now, maintaining the compatibility with the net.Conn interface is useful for easy replacement replacement of the implementation in the future.
If it is not a lot of work to retain the deleted methods, please keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to verify this since it always implement the interface now, since we embed a net.Conn
. Previously we kept a net.Conn instance so the check was useful.
} | ||
|
||
func (v *QEMUPacketConn) SetWriteDeadline(t time.Time) error { | ||
return v.unixConn.SetWriteDeadline(t) | ||
// Write write a packet retrying on ENOBUFS errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Write write/Write/g
var retries uint64 | ||
for { | ||
n, err := c.Conn.Write(b) | ||
if err != nil && errors.Is(err, syscall.ENOBUFS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to handle the case where n > 0 and err is syscall.ENOBUFS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not possible with SOCK_DGRAM socket. You either send a complete datagram, or cannot send anything.
Maybe retry on n == 0 && errors.Is(err, syscall.ENOBUFS)
, and let the call fail otherwise?
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue to send corrupted packet to vz from the point of the failure. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously would break the protocol and continue to send corrupted packet from the point of the failure. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail of vz when using socket_vmnet. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
The test starts a fake vmnet server listening on a unix stream socket and echoing received packet. Then it sends packets to to the vz dgram socekt and verify the received packets. The test is little bit complicated, but it covers the happy path in 10 milliseconds. We can extend this later to a benchmark for packet forwarding. Signed-off-by: Nir Soffer <[email protected]>
79c83f9
to
e8e80c8
Compare
Example test output
|
@@ -47,7 +47,7 @@ require ( | |||
google.golang.org/protobuf v1.34.2 | |||
gopkg.in/op/go-logging.v1 v1.0.0-20160211212156-b2cb9fa56473 | |||
gotest.tools/v3 v3.5.1 | |||
inet.af/tcpproxy v0.0.0-20221017015627-91f861402626 // replaced to github.com/inetaf/tcpproxy (see the bottom of this go.mod file) | |||
inet.af/tcpproxy v0.0.0-20221017015627-91f861402626 // indirect; replaced to github.com/inetaf/tcpproxy (see the bottom of this go.mod file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be moved to the indirect section below
// QEMUPacketConn converts raw network packet to a QEMU supported network packet. | ||
type QEMUPacketConn struct { | ||
unixConn net.Conn | ||
func forwardPackets(qemuConn *qemuPacketConn, vzConn *packetConn) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need our custom forwardPackets ??
I would still prefer to use inetproxy itself. Unless it doesn't work even after wrapping
We could simply wrap the packetConn with the fileconn. This way retry on ENOBUFS are present.
https://github.com/lima-vm/lima/blob/master/pkg/vz/network_darwin.go#L49
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1].
Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain.
Fixes:
Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries. Similar solution is used in gvisor-tap-vsock[2].
Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue to send corrupted packet to vz from the point of the failure.
Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously would break the protocol and continue to send corrupted packet from the point of the failure.
Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed.
Simplification:
Visibility:
[1] lima-vm/socket_vmnet#39
[2] containers/gvisor-tap-vsock#370