The following is a setup guide for tuning AMD EPYC CPU-based servers to achieve maximum performance from Mellanox NICs.
- Performance Tuning for Mellanox Adapters
- Understanding NUMA Node for Performance Benchmarks
- What is IRQ Affinity?
Verifying System Configuration
Prior to CPU tuning, we must inspect the NUMA node configuration and verify that our server is actually running an AMD CPU. The following is an excerpt of lscpu output from the tested server:
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Thread(s) per core: 1
Core(s) per socket: 32
NUMA node(s): 4
Vendor ID: AuthenticAMD
Model name: AMD EPYC 7551 32-Core Processor
CPU MHz: 1996.203
NUMA node0 CPU(s): 0,4,8,12,16,20,24,28
NUMA node1 CPU(s): 1,5,9,13,17,21,25,29
NUMA node2 CPU(s): 2,6,10,14,18,22,26,30
NUMA node3 CPU(s): 3,7,11,15,19,23,27,31
In the above output we can observe that the tested server is running an AMD CPU model "EPYC 7551 32-Core Processor" with 4 octa-core NUMA nodes.
Since simultaneous multithreading (Hyper-Threading) is disabled, as shown by "Thread(s) per core: 1", each physical core exposes a single logical CPU, so only 32 CPUs are available in total.
To find the Mellanox NIC's local NUMA node, refer to the following How-To: Understanding NUMA Node for Performance Benchmarks.
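The same check can be scripted. The sketch below assumes the listing above is lscpu output and parses text in that format from standard input. The helper names lscpu_field and numa_node_cpus are hypothetical, chosen for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: extract fields from lscpu-style "Key: value" text.
# lscpu_field and numa_node_cpus are hypothetical helper names.

# Print the value of the field named $1 from lscpu-style text on stdin.
# Leading whitespace is stripped because newer lscpu versions indent some keys.
lscpu_field() {
    awk -F: -v key="$1" '
        { k = $1; sub(/^[ \t]+/, "", k) }
        k == key { v = $2; sub(/^[ \t]+/, "", v); print v; exit }'
}

# Print the CPU list of NUMA node number $1 from lscpu-style text on stdin.
numa_node_cpus() {
    lscpu_field "NUMA node$1 CPU(s)"
}

# Example against the live system, only if lscpu is installed.
if command -v lscpu >/dev/null 2>&1; then
    echo "Vendor: $(lscpu | lscpu_field 'Vendor ID')"
    echo "Node 2 CPUs: $(lscpu | numa_node_cpus 2)"
fi
```

On an AMD server the vendor field is expected to read AuthenticAMD, as in the listing above.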
In this example, the Mellanox NIC is local to NUMA node #2, so the tuning targets that node. To read the NIC's local NUMA node (here, for interface eth20), run:
# cat /sys/class/net/eth20/device/numa_node
For the performance tuning process, we will utilize local CPU cores 2,6,10,14,18,22,26,30.
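As a minimal sketch, the lookup of the local node and its cores can be combined using the standard Linux sysfs files. The helper names nic_numa_node and nic_local_cpus are hypothetical, and the optional sysfs-root argument is an assumption added only to make the helpers easy to test.

```shell
#!/usr/bin/env bash
# Sketch: find a NIC's local NUMA node and that node's CPU list via sysfs.
# nic_numa_node and nic_local_cpus are hypothetical helper names.

# Print the NUMA node of interface $1 (-1 means no affinity is reported).
nic_numa_node() {
    local iface=$1 sysfs=${2:-/sys}
    cat "$sysfs/class/net/$iface/device/numa_node"
}

# Print the kernel's CPU list for the NUMA node local to interface $1.
# The kernel may print ranges (for example 0-7) instead of single CPUs.
nic_local_cpus() {
    local iface=$1 sysfs=${2:-/sys} node
    node=$(nic_numa_node "$iface" "$sysfs") || return 1
    if [ "$node" -lt 0 ]; then
        echo "no NUMA affinity reported for $iface" >&2
        return 1
    fi
    cat "$sysfs/devices/system/node/node$node/cpulist"
}

# Example: eth20 is the interface name used in this guide.
if [ -e /sys/class/net/eth20/device/numa_node ]; then
    nic_local_cpus eth20
fi
```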
To maximize the NIC's bandwidth, interrupt processing must be handled by the local CPUs only. This localizes processing and memory usage, and reduces traffic over the inter-node interconnect (Infinity Fabric on AMD EPYC, the counterpart of QPI on Intel platforms).
See What is IRQ Affinity? for more information.
To bind the NIC's interrupt events to the local cores, run:
# service irqbalance stop
# set_irq_affinity_cpulist.sh 2,6,10,14,18,22,26,30 eth20
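To illustrate what such a binding amounts to, the sketch below spreads a list of IRQ numbers round-robin over a CPU list and prints the resulting plan. It is not the source of set_irq_affinity_cpulist.sh; plan_irq_affinity is a hypothetical helper, and taking the IRQ numbers from the device's msi_irqs directory is an assumption about how to enumerate them.

```shell
#!/usr/bin/env bash
# Sketch of the idea behind binding NIC IRQs to local cores.
# NOT the set_irq_affinity_cpulist.sh source; plan_irq_affinity is a
# hypothetical helper that only prints a plan and changes nothing.

# Usage: plan_irq_affinity CPULIST IRQ...
# CPULIST is comma-separated without ranges. Prints one "IRQ CPU" pair per line.
plan_irq_affinity() {
    local -a cpus
    IFS=, read -r -a cpus <<< "$1"
    shift
    if [ "${#cpus[@]}" -eq 0 ]; then
        echo "empty CPU list" >&2
        return 1
    fi
    local i=0 idx irq
    for irq in "$@"; do
        idx=$((i % ${#cpus[@]}))
        echo "$irq ${cpus[idx]}"
        i=$((i + 1))
    done
}

# Dry run for eth20's MSI/MSI-X IRQs, if the device exists.
# Applying a pair would mean, as root:
#   echo CPU > /proc/irq/IRQ/smp_affinity_list
irq_dir=/sys/class/net/eth20/device/msi_irqs
if [ -d "$irq_dir" ]; then
    # shellcheck disable=SC2046
    plan_irq_affinity 2,6,10,14,18,22,26,30 $(ls "$irq_dir")
fi
```

Printing the plan first makes it possible to review the mapping before anything is written under /proc/irq.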
Alternatively, the NIC's interrupts can be bound to the local cores with the mlnx_tune tool, which applies the tuning automatically to all Mellanox NICs in the system. To do that, run:
# mlnx_tune -p HIGH_THROUGHPUT
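Whichever method is used, the resulting affinity can be checked afterwards. The sketch below is one possible check, assuming the NIC's IRQs are listed under its msi_irqs directory in sysfs. The helper names expand_cpulist and check_irq_affinity are hypothetical, and the optional root-directory arguments exist only for testing.

```shell
#!/usr/bin/env bash
# Sketch: flag NIC IRQs whose affinity includes a CPU outside an allowed list.
# expand_cpulist and check_irq_affinity are hypothetical helper names.

# Expand a kernel-style CPU list such as "0-2,5" to one CPU number per line.
expand_cpulist() {
    local part
    local -a parts
    IFS=, read -r -a parts <<< "$1"
    for part in "${parts[@]}"; do
        if [[ $part == *-* ]]; then
            seq "${part%-*}" "${part#*-}"
        else
            echo "$part"
        fi
    done
}

# Usage: check_irq_affinity IFACE CPULIST [SYSFS_ROOT] [PROC_ROOT]
# Prints one line per offending IRQ; returns 1 if any is found, else 0.
check_irq_affinity() {
    local iface=$1 sysfs=${3:-/sys} proc=${4:-/proc}
    local allowed irq cpu aff bad=0
    allowed=",$(expand_cpulist "$2" | paste -sd, -),"
    for irq in $(ls "$sysfs/class/net/$iface/device/msi_irqs"); do
        # Skip IRQs that are listed for the device but not currently allocated.
        aff=$(cat "$proc/irq/$irq/smp_affinity_list" 2>/dev/null) || continue
        for cpu in $(expand_cpulist "$aff"); do
            if [[ $allowed != *",$cpu,"* ]]; then
                echo "IRQ $irq: affinity $aff includes non-local CPU $cpu"
                bad=1
                break
            fi
        done
    done
    return $bad
}

# Example: check eth20 against the local cores used in this guide.
if [ -d /sys/class/net/eth20/device/msi_irqs ]; then
    if check_irq_affinity eth20 2,6,10,14,18,22,26,30; then
        echo "all eth20 IRQs are bound to local cores"
    fi
fi
```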
The out-of-box (OOB) results with the above tuning are measured with the following setup and iperf client command:
- 8 threads
- TCP window 512KB
- 8KB message size
# iperf -c 184.108.40.206 -P 8 -t 10 -w 512k