HowTo Configure RSS on ConnectX-3 Pro for Windows 2012 server

Version 17

    This post shows how to configure and tune RSS on ConnectX-3/ConnectX-3 Pro adapters in servers running Windows Server 2012.

    This is an advanced post, intended to help IT managers understand and configure RSS.

     

    Overview

    Receive Side Scaling (RSS) allows spreading the ingress traffic among many receive paths, each on a dedicated CPU.

    RSS lets receive processing scale with the number of CPUs used, which lowers latency and raises throughput. Scaling is achieved by hashing TCP connections across the RSS CPUs.

    RSS is best suited for servers handling many short-lived TCP connections.

     

    The actual number of RSS queues used is limited by the number of physical cores that RSS uses on the adapter, which can be lower than the total number of CPU cores present in the system.

    Some Windows versions may impose an additional limit on the number of RSS CPUs.

     

    To learn more about RSS for Windows, see here:  Introduction to Receive Side Scaling (Windows Drivers)

     

     

    Setup

    To perform these configurations via PowerShell, use any system with more than 4 physical cores running Windows Server 2012 or later.

     

    Configuration

    1. Make sure you have the latest WinOF driver installed.

     

    2. Get the RSS admin status. Run the Get-NetAdapterRss command:

     

    PS C:\Program Files\Mellanox\WinMFT> Get-NetAdapterRss -Name "Ethernet 10"
    Name                           : Ethernet 10
    InterfaceDescription           : Mellanox ConnectX-3 IPoIB Adapter #3
    Enabled                        : False
    NumberOfReceiveQueues          : 8
    Profile                        : Closest
    BaseProcessor: [Group:Number]  : 0:0
    MaxProcessor: [Group:Number]   : 0:63
    MaxProcessors                  : 8
    RssProcessorArray: [Group:Number/NUMA Distance] :
    IndirectionTable: [Group:Number]                :

     

     

     

    Let's review and explain each line in the status output:

    • Configurable Parameters:
      • Name - The name assigned to the interface. Can be modified (e.g. using PowerShell).
      • InterfaceDescription - The interface name as displayed in Device Manager under "Network adapters".
      • Enabled - The RSS state of this interface. The example above shows that RSS is currently disabled.
        Note: On Mellanox NICs, RSS is enabled by default.
      • NumberOfReceiveQueues - The number of RSS receive queues configured for the interface. See step 4 below.
      • Profile - The method that determines how CPUs are assigned to the NIC. See the RSS Profiles section below.
      • BaseProcessor - The lowest processor (core) number that the OS may assign to the NIC.
        • [Group:Number] - processor group number : processor number
      • MaxProcessor - The highest processor (core) number that the OS may assign to the NIC.
        • [Group:Number] - processor group number : processor number
      • MaxProcessors - The maximum number of processors (cores) that the OS may assign to the NIC.

     

    For example, assume a system with 24 cores in which only 4 cores, taken from cores 13-20, should handle the traffic from this port. In that case, BaseProcessor is 13, MaxProcessor is 20, and MaxProcessors is 4.
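
    As a sketch, the example above could be applied with a single Set-NetAdapterRss call. The adapter name "Ethernet 10" is borrowed from the status output in step 2 and is only a placeholder, and the call assumes all cores sit in processor group 0.

```powershell
# Restrict RSS on this port to at most 4 cores taken from cores 13-20.
# "Ethernet 10" is a placeholder adapter name; replace it with your own.
Set-NetAdapterRss -Name "Ethernet 10" `
    -BaseProcessorNumber 13 `
    -MaxProcessorNumber 20 `
    -MaxProcessors 4

# Read the settings back to confirm what the OS actually applied.
Get-NetAdapterRss -Name "Ethernet 10"
```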

     

    • Result parameters:
      • RssProcessorArray - The processors that the OS can assign to the interface. They lie between BaseProcessor and MaxProcessor.
      • IndirectionTable - Contains 128 entries. Each entry is assigned a processor number from "RssProcessorArray".
        Note: If the link on the interface is down, the indirection table will not be filled.
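
    To make the role of the indirection table concrete, the following self-contained sketch simulates the lookup without touching any adapter. It assumes a 128-entry table filled round-robin from four processors, as in the enabled-adapter output in this post. The hash values are made up and stand in for the hash the NIC computes per connection, and all variable names are illustrative only.

```powershell
# Processors available to RSS (stand-in for RssProcessorArray).
$rssProcessors = 0, 1, 2, 3

# Build a 128-entry indirection table, filled round-robin.
$tableSize = 128
$indirectionTable = for ($i = 0; $i -lt $tableSize; $i++) {
    $rssProcessors[$i % $rssProcessors.Count]
}

# Hypothetical per-connection hash values (the real hash is computed by the NIC).
$sampleHashes = 0x1A2B3C4D, 0x00000085, 0x7FFFFFFE

foreach ($hash in $sampleHashes) {
    # The low-order bits of the hash select a table entry.
    $index = $hash -band ($tableSize - 1)
    $cpu = $indirectionTable[$index]
    "hash 0x{0:X8} -> entry {1,3} -> CPU {2}" -f $hash, $index, $cpu
}
```

    Because every packet of a connection produces the same hash, the whole connection stays on one CPU, while different connections spread across the table.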

     

    3. To enable RSS on a network adapter, use the Enable-NetAdapterRss command:

    PS C:\Users\Administrator> Enable-NetAdapterRss -Name "ETH Adapter # 1"
    PS C:\Users\Administrator> Get-NetAdapterRss -Name "ETH Adapter # 1"


    Name                                            : ETH Adapter # 1
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter
    Enabled                                         : True
    NumberOfReceiveQueues                           : 16
    Profile                                         : Closest
    BaseProcessor: [Group:Number]                   : 0:0
    MaxProcessor: [Group:Number]                    : 0:3
    MaxProcessors                                   : 4
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0  0:1/0  0:2/0  0:3/0
    IndirectionTable: [Group:Number]                : 0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3

    • To disable RSS, use the Disable-NetAdapterRss command.

     

    • Another way to change the RSS admin status (enable or disable) is to use the GUI:

              Open Device Manager -> Network adapters -> Properties -> Advanced tab

     

    3.png
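
    The same toggle can be scripted. The sketch below disables RSS, checks the state, and re-enables it. The adapter name is the one used in this post's examples, and Enabled is the field shown in the status output.

```powershell
$name = "ETH Adapter # 1"   # adapter name from the examples; replace as needed

# Turn RSS off and check the reported state (should now be False).
Disable-NetAdapterRss -Name $name
(Get-NetAdapterRss -Name $name).Enabled

# Turn RSS back on and check again (should now be True).
Enable-NetAdapterRss -Name $name
(Get-NetAdapterRss -Name $name).Enabled
```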

     

     

     

    4. To change the number of RSS queues, use the Set-NetAdapterRss command with the -NumberOfReceiveQueues option.

    PS C:\Users\Administrator> Set-NetAdapterRss -Name "ETH Adapter # 1" -NumberOfReceiveQueues 16
    PS C:\Users\Administrator> Get-NetAdapterRss -Name "ETH Adapt*"


    Name                                            : ETH Adapter # 1
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter
    Enabled                                         : True
    NumberOfReceiveQueues                           : 16
    Profile                                         : Closest
    BaseProcessor: [Group:Number]                   : 0:0
    MaxProcessor: [Group:Number]                    : 0:3
    MaxProcessors                                   : 4
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0  0:1/0  0:2/0  0:3/0
    IndirectionTable: [Group:Number]                : 0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3

     

    Note:

    • Changing the number of available CPUs may affect the number of RSS queues. When the number of available CPUs is lower than the number of currently configured RSS queues, Windows silently lowers the number of RSS queues to match the number of CPUs. In the example above, 16 RSS queues are configured, but only 4 CPUs (0-3) are available.
    • The Mellanox default value is 8. There is no upper limit, but there is no benefit in setting a value higher than the number of CPUs.
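
    A quick way to spot this mismatch is to compare the configured queue count with the number of processors listed in RssProcessorArray. This is a minimal sketch that relies only on the fields shown in the status output; the variable names are illustrative.

```powershell
$rss = Get-NetAdapterRss -Name "ETH Adapter # 1"

$configuredQueues = [int]$rss.NumberOfReceiveQueues
$availableCpus    = @($rss.RssProcessorArray).Count

# The number of queues that can actually be served is bounded by the CPU count.
$usableQueues = [Math]::Min($configuredQueues, [int]$availableCpus)

"Configured queues: $configuredQueues"
"Available CPUs   : $availableCpus"
"Usable queues    : $usableQueues"
```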

     

    5. To change the range of CPUs that can be assigned to the NIC, use the Set-NetAdapterRss command with the -BaseProcessorNumber and -MaxProcessorNumber options.

    PS C:\Users\Administrator> Set-NetAdapterRss -Name "ETH Adapter # 1" -BaseProcessorNumber 2 -MaxProcessorNumber 8
    PS C:\Users\Administrator> Get-NetAdapterRss -Name "ETH Adapt*"

     

    Name                                            : ETH Adapter # 1
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter
    Enabled                                         : True
    NumberOfReceiveQueues                           : 16
    Profile                                         : Closest
    BaseProcessor: [Group:Number]                   : 0:2
    MaxProcessor: [Group:Number]                    : 0:8
    MaxProcessors                                   : 2
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:2/0  0:3/0  0:4/0  0:5/0  0:6/0  0:7/0  0:8/0
    IndirectionTable: [Group:Number]                : 0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
                                                      0:2    0:3    0:2    0:3    0:2    0:3    0:2    0:3
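
    Before applying a range, it can help to check it against the number of processors in the system. The sketch below is one possible guard, not part of the Mellanox procedure. It assumes a single processor group, so that [Environment]::ProcessorCount reflects every logical processor, and $base and $max are illustrative variables.

```powershell
$base = 2
$max  = 8

# Logical processors visible to this process. Numbering is 0-based, so the
# highest valid processor number is one less than this count.
$logicalCpus = [Environment]::ProcessorCount

if ($max -ge $logicalCpus) {
    Write-Warning "MaxProcessorNumber $max exceeds the highest CPU number $($logicalCpus - 1)."
} elseif ($base -gt $max) {
    Write-Warning "BaseProcessorNumber must not be greater than MaxProcessorNumber."
} else {
    Set-NetAdapterRss -Name "ETH Adapter # 1" -BaseProcessorNumber $base -MaxProcessorNumber $max
}
```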

     

    6. Change the number of RSS CPUs.

    When using all available CPUs is undesirable, you can limit the number of concurrently used CPUs.

    In effect, this setting changes the number of RSS queues used concurrently, leaving some queues unused.

    A possible scenario is a low-power system that limits the number of active CPU cores.

    Because RSS queues are pre-allocated (pre-assigned) on different CPUs, the system can quickly move RSS queues from one CPU core to another to balance its power usage.

    To change the number of concurrently used CPU cores, use the Set-NetAdapterRss command with the -MaxProcessors option.

    PS C:\Users\Administrator> Set-NetAdapterRss -Name "ETH Adapter # 1" -MaxProcessors 2
    PS C:\Users\Administrator> Get-NetAdapterRss -Name "ETH Adapt*"


    Name                                            : ETH Adapter # 1
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter
    Enabled                                         : True
    NumberOfReceiveQueues                           : 16
    Profile                                         : Closest
    BaseProcessor: [Group:Number]                   : 0:0
    MaxProcessor: [Group:Number]                    : 0:3
    MaxProcessors                                   : 2
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0  0:1/0  0:2/0  0:3/0
    IndirectionTable: [Group:Number]                : 0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1
                                                      0:0    0:1    0:0    0:1    0:0    0:1    0:0    0:1

     

    In the example above, only CPU cores 0 and 1 are currently used out of the list available in RssProcessorArray, as seen in the indirection table.
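
    Rather than reading the 128 entries by eye, the table can be summarized per processor. This sketch assumes that each IndirectionTable entry exposes ProcessorGroup and ProcessorNumber properties; verify those names on your system (for example with Get-Member) before relying on it.

```powershell
$rss = Get-NetAdapterRss -Name "ETH Adapter # 1"

# Count how many table entries point at each processor.
# ProcessorGroup / ProcessorNumber are assumed property names of each entry.
$rss.IndirectionTable |
    Group-Object -Property ProcessorGroup, ProcessorNumber |
    Sort-Object -Property Name |
    Select-Object -Property Name, Count
```

    With -MaxProcessors 2 as configured above, only two groups should appear, even though RssProcessorArray lists four processors.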

     

    7. RSS Profiles

     

    RSS profile determines how CPUs are assigned to the NIC. The following table lists all currently available RSS profiles.

     

    Profile        Description
    Closest        Behavior is consistent with the behavior of Windows Server 2008 R2.
    ClosestStatic  No dynamic load balancing: traffic is distributed, but not load balanced at runtime.
    NUMA           Assigns RSS processors on a round-robin basis across every NUMA node, so that applications running on NUMA servers scale well.
    NUMAStatic     Default behavior. RSS processor selection is the same as for NUMA scalability, without dynamic load balancing.
    Conservative   RSS uses as few processors as possible to sustain the load. This option helps reduce the number of interrupts.
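
    To see which profile each adapter currently uses, the status output can be reduced to a few columns. This is a small convenience sketch; the column names are the fields shown in the Get-NetAdapterRss output.

```powershell
# Show the profile and RSS state for all adapters that report RSS settings.
Get-NetAdapterRss | Format-Table -Property Name, Profile, Enabled -AutoSize
```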

     

    To change the RSS profile, use the Set-NetAdapterRss command with the -Profile option to specify the new profile.

    PS C:\Users\Administrator> Set-NetAdapterRss -Name "ETH Adapter # 1" -Profile NUMAStatic
    PS C:\Users\Administrator> Get-NetAdapterRss -Name "ETH Adapt*"


    Name                                            : ETH Adapter # 1
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter
    Enabled                                         : True
    NumberOfReceiveQueues                           : 16
    Profile                                         : NUMAStatic
    BaseProcessor: [Group:Number]                   : 0:2
    MaxProcessor: [Group:Number]                    : 0:8
    MaxProcessors                                   : 2
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:2/0  0:3/0  0:4/0  0:5/0  0:6/0  0:7/0  0:8/0
    IndirectionTable: [Group:Number]                : 0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5
                                                      0:4    0:5    0:4    0:5    0:4    0:5    0:4    0:5

     

    Example:

     

    Say we are working on a system with 8 CPU cores and a dual-port NIC, and we want to assign 4 CPU cores to each port, starting from a specific CPU.

    In this example we want to assign cores 0,1,2,3 to the first port and cores 4,5,6,7 to the second port.

     

    For port 1 we configure the following:

    • Set “Maximum number of RSS queues = 4”
    • Set “RSS Base processor number = 0”
    • Set “RSS Max processor number = 3”
    • Set “Maximum number of RSS processors = 4”

    Command:

    PS C:\Users\Administrator> Set-NetAdapterRss -Name "ETH Adapter # 1" -NumberOfReceiveQueues 4 -BaseProcessorNumber 0 -MaxProcessorNumber 3 -MaxProcessors 4

     

    For port 2 we configure the following:

    • Set “Maximum number of RSS queues = 4”
    • Set “RSS Base processor number = 4”
    • Set “RSS Max processor number = 7”
    • Set “Maximum number of RSS processors = 4”

    Command:

    PS C:\Users\Administrator> Set-NetAdapterRss -Name "ETH Adapter # 2" -NumberOfReceiveQueues 4 -BaseProcessorNumber 4 -MaxProcessorNumber 7 -MaxProcessors 4
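
    The two commands above can also be driven from a small table of per-port settings, which is easier to extend to more ports. This is a sketch only. It uses PowerShell splatting to pass each hashtable as named parameters, and the $ports variable is illustrative.

```powershell
# Per-port RSS ranges from the example above.
$ports = @(
    @{ Name = "ETH Adapter # 1"; BaseProcessorNumber = 0; MaxProcessorNumber = 3 },
    @{ Name = "ETH Adapter # 2"; BaseProcessorNumber = 4; MaxProcessorNumber = 7 }
)

foreach ($port in $ports) {
    # Settings shared by both ports: 4 queues and 4 cores each.
    $port.NumberOfReceiveQueues = 4
    $port.MaxProcessors = 4

    # Splat the hashtable so each key becomes a named parameter.
    Set-NetAdapterRss @port
}

# Review the result for both ports.
Get-NetAdapterRss -Name "ETH Adapt*"
```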

     

    The end result would be:

    PS C:\Users\Administrator> Get-NetAdapterRss -Name "ETH Adapt*"

     

    Name                                            : ETH Adapter # 1
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter
    Enabled                                         : True
    NumberOfReceiveQueues                           : 4
    Profile                                         : Closest
    BaseProcessor: [Group:Number]                   : 0:0
    MaxProcessor: [Group:Number]                    : 0:3
    MaxProcessors                                   : 4
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0  0:1/0  0:2/0  0:3/0
    IndirectionTable: [Group:Number]                : 0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3
                                                      0:0    0:1    0:2    0:3    0:0    0:1    0:2    0:3

    Name                                            : ETH Adapter # 2
    InterfaceDescription                            : Mellanox ConnectX-3 Pro Ethernet Adapter #2
    Enabled                                         : True
    NumberOfReceiveQueues                           : 4
    Profile                                         : Closest
    BaseProcessor: [Group:Number]                   : 0:4
    MaxProcessor: [Group:Number]                    : 0:7
    MaxProcessors                                   : 4
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:4/0  0:5/0  0:6/0  0:7/0
    IndirectionTable: [Group:Number]                :

    Note: The indirection table is empty since ETH Adapter # 2 is disconnected.