OS and VM Performance Tuning for Windows

Version 5

    This post lists various tuning options for servers using Mellanox adapters with Windows OS.

     


    OS Tuning

    1. Make sure that the BIOS is tuned for performance. See "BIOS Performance Tuning Example".

     

    2. Install the adapter in a PCIe slot that provides enough bandwidth. For example:

    • ConnectX-3/ConnectX-3 Pro requires PCIe Gen3 x8 (or wider).
    • ConnectX-4 requires PCIe Gen3 x16.
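
    One way to confirm that the slot negotiated the expected link is to query the adapter hardware information from PowerShell. This is a minimal sketch; the column selection is only a suggestion.

    ```powershell
    # List each adapter with its slot, NUMA node, and negotiated PCIe link.
    # A PCIe Gen3 link is reported with a link speed of 8.0 GT/s.
    Get-NetAdapterHardwareInfo |
        Format-Table Name, Slot, NumaNode, PcieLinkSpeed, PcieLinkWidth -AutoSize
    ```

    A link width lower than expected (for example 4 instead of 8) usually means the card sits in a narrower slot. The NumaNode column is also useful later when choosing RSS cores.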

     

    3. Make sure that the power plan is set to high performance. Go to "Power Options" in the Control Panel and select "High performance".
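
    The same setting can probably be applied from an elevated prompt, which is handy when tuning several servers. This sketch relies on SCHEME_MIN, the built-in powercfg alias for the High performance plan.

    ```powershell
    # Show the power plan that is currently active.
    powercfg /getactivescheme

    # Switch to the High performance plan (alias SCHEME_MIN = minimum power savings).
    powercfg /setactive SCHEME_MIN

    # Confirm the change.
    powercfg /getactivescheme
    ```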

     

    4. In Device Manager, open the adapter properties and run the performance tuning that fits your needs.

     

    Note: These tuning scripts do not work on VEA interfaces; tune those interfaces manually. Make sure to assign the virtual interfaces to the same NUMA node and cores as the physical interface.

     

    5. For higher bandwidth, consider using jumbo frames on the server and on all network switches. For an example, refer to Getting started with ConnectX-4 100Gb/s Adapter for Windows.
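
    As a rough sketch, jumbo frames can be set through the standardized *JumboPacket advanced property. The adapter name port1, the value 9014, and the peer address 192.0.2.10 are assumptions for illustration only; the accepted values depend on the driver, so list them first.

    ```powershell
    # Inspect the current jumbo packet setting and the values the driver accepts.
    Get-NetAdapterAdvancedProperty -Name "port1" -RegistryKeyword "*JumboPacket" |
        Format-List *

    # Set the jumbo packet size (9014 is an assumed value; use one your driver accepts).
    Set-NetAdapterAdvancedProperty -Name "port1" -RegistryKeyword "*JumboPacket" -RegistryValue 9014

    # Check the path end to end: -f forbids fragmentation, -l sets the payload size.
    ping -f -l 8000 192.0.2.10
    ```

    If the ping reports that the packet needs to be fragmented, some device on the path is still using a smaller MTU.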

     

    RSS Tuning

    It is recommended to configure RSS on the server. For an example, see HowTo Configure RSS on ConnectX-3 Pro for Windows 2012 server.

     

    Note: Pay attention to the number of NUMA nodes in your server and configure RSS accordingly. For example:

    • 4xNUMA server (8 cores per NUMA node) with the adapter connected to NUMA node 1: use cores 8-15.
    • 2xNUMA server (16 cores per NUMA node): use cores 0-15 for NUMA node 0 or cores 16-31 for NUMA node 1.
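
    The first case (adapter on NUMA node 1, cores 8-15) could be applied roughly as follows. The adapter name port1 is borrowed from the VMQ example in this post, and the sketch assumes that Hyper-Threading is disabled so that core numbers match logical processor numbers.

    ```powershell
    # Restrict RSS to processors 8-15 on NUMA node 1 (8 processors in total).
    Set-NetAdapterRss -Name "port1" -BaseProcessorNumber 8 -MaxProcessorNumber 15 -MaxProcessors 8 -NumaNode 1

    # Review the resulting RSS configuration.
    Get-NetAdapterRss -Name "port1"
    ```

    For the 2xNUMA case, the same command would use 0 and 15 (NUMA node 0) or 16 and 31 (NUMA node 1) with -MaxProcessors 16.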

     

    Note: If the CPU is based on the Intel Westmere model, configure RSS to use all cores.

     

    Virtualization

    Hypervisor Tuning for the VM

     

    1. Configure VMQ on the interface assigned to the VM. The BaseProcessorNumber and MaxProcessors values must correspond to the values configured for RSS.

    BaseProcessorNumber is the first core to be used, and MaxProcessors is the number of cores to be used. In the example below, these are cores 4-7.

    The interface could be a physical interface or a teaming interface.

    This example is configured on the hypervisor:

    --- For physical interface ---

    PS$ Set-NetAdapterVmq -Name port1 -BaseProcessorNumber 4 -MaxProcessors 4

     

    --- For teaming interface ---

    PS$ Set-NetAdapterVmq -Name Teaming-Interface -BaseProcessorNumber 4 -MaxProcessors 4
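
    To check that the settings took effect, the VMQ configuration and the allocated queues can be read back on the hypervisor. This is a sketch using the same port1 name as above; queues normally only show up once a VM is running and attached to the virtual switch.

    ```powershell
    # Show the VMQ settings of the interface, including base processor and processor count.
    Get-NetAdapterVmq -Name "port1" | Format-List *

    # List the VMQ queues currently allocated and the processors they are mapped to.
    Get-NetAdapterVmqQueue -Name "port1"
    ```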

     

    2. In Hyper-V, set the CPU settings of the VM to the required performance (e.g. 100% CPU).
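
    A possible PowerShell equivalent is sketched below. The VM name TestVM is hypothetical, and mapping "100% CPU" to the Reserve and Maximum values is an assumption about what the dialog sets; adjust both to what you actually need.

    ```powershell
    # Reserve 100% of the assigned virtual processors and allow up to 100% usage.
    # "TestVM" is a placeholder VM name.
    Set-VMProcessor -VMName "TestVM" -Reserve 100 -Maximum 100

    # Read the processor settings back.
    Get-VMProcessor -VMName "TestVM" | Format-List Count, Reserve, Maximum, RelativeWeight
    ```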

     

    VM Tuning

    1. Install all updates for the OS. Restart the host to verify that there are no more updates.

     

    2. Disable the firewall so that the benchmark tools run smoothly.
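
    Inside the VM, this can be done for all profiles with the built-in firewall cmdlets. This is a minimal sketch intended for an isolated benchmark setup; remember to turn the firewall back on afterwards.

    ```powershell
    # Turn the Windows firewall off for all profiles while benchmarking.
    Set-NetFirewallProfile -Profile Domain,Public,Private -Enabled False

    # Check the state of each profile.
    Get-NetFirewallProfile | Format-Table Name, Enabled -AutoSize

    # Re-enable the firewall once the benchmark is finished.
    Set-NetFirewallProfile -Profile Domain,Public,Private -Enabled True
    ```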

     

    3. Consider using Jumbo frames.

     

    4. Enable vRSS:

    PS$ Enable-NetAdapterRss *

    5. Tune the receive buffer size to reach maximum performance. The default is 3MB; it can be raised to 8MB, 16MB, or more.

    PS$ Set-NetAdapterAdvancedProperty -Name "Ethernet 15" -DisplayName 'Receive Buffer Size' -DisplayValue 16MB
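
    Before changing the value, it may help to check what the virtual adapter currently uses and which values it accepts. The sketch below reuses the "Ethernet 15" name from the command above; the exact display name and the available sizes can differ between adapters and driver versions.

    ```powershell
    # Show the current receive buffer size of the adapter.
    Get-NetAdapterAdvancedProperty -Name "Ethernet 15" -DisplayName "Receive Buffer Size"

    # List the display values the driver accepts for this property.
    (Get-NetAdapterAdvancedProperty -Name "Ethernet 15" -DisplayName "Receive Buffer Size").ValidDisplayValues
    ```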

     

    For performance tuning, it is recommended to use the ntttcp or sqlio tools; see examples here: