
EtherateMT Design Overview

James Bensley edited this page Jan 6, 2018 · 4 revisions

I think it goes without saying (so why am I saying it?) that a single-threaded, scalar approach to sending, receiving, and processing data isn't going to stress test the networking throughput and latency of a host or network device to its full potential in today's world of multi-core CPUs, multi-channel RAM, and multi-queue NICs. EtherateMT uses threads rather than multiple processes communicating via MPI (Message Passing Interface) because threads give higher memory throughput for on-node (NUMA) communications, and pthread_create() is much more lightweight than fork(). Additionally, the EtherateMT author is lazy and pedantic at the same time, so the pthread library is used instead of Boost (as well as various other idiosyncrasies throughout the code).

In general a recent kernel version (> 4.1.x) is advised when using EtherateMT, and the most recent NIC driver and firmware versions are also recommended.

  • Security flaw CVE-2017-14497 existed in the tpacket_rcv() function in af_packet.c from Kernel version 4.6 onwards and was fixed in Kernel 4.13.
  • AF_PACKET support for TX_RING in TPACKET_V3 was added in Kernel 4.11 (previously v3 only supported RX_RING).
  • A bug existed in the non-blocking MSG_DONTWAIT flag for AF_PACKET sockets which meant it wasn't fully non-blocking until it was fixed in Kernel 4.1.
  • PACKET_QDISC_BYPASS was not introduced until Kernel 3.14.

Use the -c flag to spin up as many threads as you have cores to fill up all your NIC queues. If you're (a total bastard!) lucky enough to have a 100G NIC with 1024 queues and you've got 1024 cores, it might be worth checking that you can spin up 1024 threads and open 1024 file descriptors before trying:

# Show soft user limit for processes/threads
$ ulimit -u

# Show hard limit
$ ulimit -Hu

# Increase soft limit to the hard limit, then verify
$ ulimit -u NNN
$ ulimit -u

# Show soft and hard limits for open file descriptors
$ ulimit -n
$ ulimit -Hn