Networking
Seastar supports two networking stacks, both accessible through the same future-based API:
- the posix stack (default) is implemented using the familiar C socket API, with an `epoll` back-end providing events.
- the native stack, selected via the `--network-stack=native` command-line option, enables a TCP/IP stack provided by seastar itself. This stack is several times faster than the posix stack. (A minimal example of choosing the stack follows this list.)
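The stack is chosen at startup rather than in application code, so the same program can run on either stack. Below is a minimal sketch, assuming seastar's public `app_template` API (header paths may differ between versions); the runtime consumes its own options, including `--network-stack`, before the application function runs.

```cpp
#include <seastar/core/app-template.hh>
#include <seastar/core/future.hh>
#include <iostream>

int main(int argc, char** argv) {
    seastar::app_template app;
    // app.run() parses seastar's own options (including --network-stack),
    // starts one reactor per core, and then invokes the supplied function.
    return app.run(argc, argv, [] {
        std::cout << "up and running\n";
        return seastar::make_ready_future<>();
    });
}
```

Running the binary as-is uses the default posix stack; running it with `--network-stack=native` switches to seastar's own TCP/IP stack without changing the application code.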
Seastar provides a shared-nothing architecture: each core is responsible for its own data structures, so no locking is required. However, this breaks down as soon as we interact with the rest of the world. The host (operating system) stack is written for a shared-memory architecture, and as such every operation involves locks, or at least atomic operations. Even when there is no lock contention, the result is cache-line ping-pongs, slow atomic read-modify-write operations, and serialization. In addition, the socket APIs force a data copy when crossing the kernel/user boundary.
In contrast, the native seastar stack is built for the seastar architecture:
- each connection is local to a core, so no locking is required and no cache-line ping-pongs occur.
- each core is responsible for a shard of the TCP tuple space; seastar uses hardware multiqueue capabilities so that the network interface card (NIC) DMAs each packet to the core that handles the flow it belongs to.
- seastar provides copy-less APIs; you can access receive buffers as they are DMAed into memory, or incorporate blobs stored in memory into outgoing TCP streams (see the sketch after this list).
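As a rough illustration of the copy-less receive path, the sketch below assumes seastar's public socket types (`connected_socket`, `input_stream`, `temporary_buffer`); exact header paths vary between versions. `read()` hands back a `temporary_buffer` that refers to the received bytes where they already sit in memory, so the application can inspect them without an intermediate copy.

```cpp
#include <seastar/core/do_with.hh>
#include <seastar/core/future.hh>
#include <seastar/core/iostream.hh>
#include <seastar/core/temporary_buffer.hh>
#include <seastar/net/api.hh>
#include <iostream>

// Read one chunk from an already-established connection on this core.
seastar::future<> read_once(seastar::connected_socket& sock) {
    return seastar::do_with(sock.input(),
        [] (seastar::input_stream<char>& in) {
            return in.read().then([] (seastar::temporary_buffer<char> buf) {
                // buf references the received data directly; an empty buffer
                // means the peer closed the connection.
                std::cout << "received " << buf.size() << " bytes\n";
            });
        });
}
```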
To transfer packets, seastar provides a driver subsystem with several different options:
- DPDK: the Data Plane Development Kit (DPDK) is a cross-platform framework for high-performance networking. This is the preferred option, as it supports 10GbE and 40GbE NICs and provides full performance. When using DPDK, the host (Linux) driver is unbound from the NIC and seastar takes over, driving the card directly.
- virtio: used when running in a KVM guest without SRIOV device assignment.
- Xen: used when running in a Xen guest without SRIOV device assignment.
- vhost: mostly a debugging driver used by networking stack developers.
The networking stack code is in the `net` directory.