DataON Windows S2D-3110 Storage Solution with Mellanox Spectrum Switches

Version 21

    This post discusses the DataON storage solution, which uses Windows Server 2016 Storage Spaces Direct (S2D) servers with NVMe storage and two Mellanox Spectrum switches in a VRRP configuration (ToR switches) for the data path.

    This post focuses on the networking side of the solution. For the SSD configuration, refer to this document.

     


    Setup

    The setup includes two Mellanox Spectrum switches configured with VRRP and four DataON servers equipped with NVMe cards and dual-port ConnectX-4 network adapters for multi-path connectivity.

    There are several models of Spectrum switches. In this example we use the SN2700, a 32-port 100G Ethernet switch.

     

    e3.png

     

     

    Overview

    We will configure this setup in 3 phases.

    • IP Connectivity
    • RDMA QoS Configuration (PFC/ECN, buffers) on the switches and servers.
    • Windows Server Configuration, core switch connectivity and other considerations.

     

    In this design, server-to-server traffic uses RDMA (storage traffic), while traffic towards the core switches is TCP/management traffic and not RDMA traffic.

     

    Configuration

    Before you start, make sure that the servers and the switches are installed and powered up.

    For MLNX-OS first time installation see HowTo Get Started with Mellanox switches.

     

    1. IP Connectivity

    In this section we go over the IP connectivity configuration required for the Spectrum switches acting as VRRP ToR switches.

     

    Port Connectivity

    • Ports 1-28 Downlinks to the servers (VLAN interface)
    • Ports 29-30 Uplinks towards the Core switches (Router ports)
    • Ports 31-32 Connected to the other ToR switch (port-channel)

     

    VLANs

    We use both adapter ports for a multi-path solution. Each server is configured with a different VLAN on each port.

    In this example we use VLANs 8 and 9:

    switch (config) # vlan 8-9
    switch (config) # vlan 8 name "Storage1"
    switch (config) # vlan 9 name "Storage2"

     

    Interface

    1. Create LAG (port-channel) in trunk mode on ports 31, 32. This link will be used for VRRP communication between switches.

    switch (config) # interface port-channel 1

    switch (config) # interface port-channel 1 description VRRP Link To other switch

    switch (config) # interface ethernet 1/31 description VRRP Link To other switch
    switch (config) # interface ethernet 1/32 description VRRP Link To other switch

    switch (config) # interface ethernet 1/31 channel-group 1 mode on

    switch (config) # interface ethernet 1/32 channel-group 1 mode on

    switch (config) # interface port-channel 1 switchport mode trunk

    Note: all VLANs are members of trunk ports by default.

     

    2. Configure links 1-28 (downlinks) towards the servers as trunk.

    switch (config) # interface ethernet 1/1 switchport mode trunk

    switch (config) # interface ethernet 1/2 switchport mode trunk

    ...

    switch (config) # interface ethernet 1/28 switchport mode trunk
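
    The elided commands follow the same pattern for every downlink port. As a minimal sketch (Python is used only for illustration, and the helper name trunk_commands is hypothetical), the full list for ports 1-28 could be generated instead of typed by hand:

```python
def trunk_commands(first_port=1, last_port=28, prompt="switch (config) # "):
    """Return one MLNX-OS trunk-mode command per downlink port."""
    return [
        f"{prompt}interface ethernet 1/{port} switchport mode trunk"
        for port in range(first_port, last_port + 1)
    ]


# Print the commands so they can be pasted into the switch CLI.
for line in trunk_commands():
    print(line)
```

    The port range is a parameter, so the same helper covers a different downlink layout.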

     

    Note: All VLANs are members of trunk ports by default. A trunk port allows only tagged traffic. If untagged traffic (e.g. PXE boot) is also needed on those ports, set the links to hybrid mode and configure them to allow all VLANs.

    switch (config) # interface ethernet 1/28 switchport mode hybrid

    switch (config) # interface ethernet 1/28 switchport hybrid allowed-vlan all

     

    Learn more about switchport on Mellanox switches in HowTo Configure Switch Port Types with MLNX-OS.

     

    3. Configure the uplink ports as router ports towards the core switches. Set the IP Address and subnet required on this interface.

    switch (config) # interface ethernet 1/29 no switchport

    switch (config) # interface ethernet 1/29 ip address 10.10.1.1 /24

     

    switch (config) # interface ethernet 1/30 no switchport

    switch (config) # interface ethernet 1/30 ip address 10.10.2.1 /24

     

    Note: In this design, we assume that RDMA traffic will not pass via the core switches.

     

    L3 and VRRP

    We design the network to have two VLANs (multi-path). Each of the switches will be configured as VRRP master for a different VLAN, so both of the switches will be used (active-active).

     

    1. Enable IP routing, and configure VLAN interface for each VLAN (8,9).

    Note: Each ToR switch should be configured with a different IP address. This address is the local IP address of that switch.

     

    ToR 1:

    switch (config) # ip routing vrf default

    switch (config) # interface vlan 8

    switch (config) # interface vlan 9

    switch (config) # interface vlan 8 ip address 192.168.101.2 255.255.255.0

    switch (config) # interface vlan 9 ip address 192.168.102.2 255.255.255.0

     

    ToR 2:

    switch (config) # ip routing vrf default

    switch (config) # interface vlan 8

    switch (config) # interface vlan 9

    switch (config) # interface vlan 8 ip address 192.168.101.3 255.255.255.0             <--- different IP from ToR 1

    switch (config) # interface vlan 9 ip address 192.168.102.3 255.255.255.0             <--- different IP from ToR 1

     

    2. Enable the VRRP protocol on the switch and configure a virtual IP address for each VLAN. Make sure that the VRRP master for each VLAN is a different switch (use the priority parameter; the master priority is 255).

     

    ToR 1:

    switch (config) # protocol vrrp

    switch (config) # interface vlan 8 vrrp 8

    switch (config) # interface vlan 8 vrrp 8 address 192.168.101.1

    switch (config) # interface vlan 9 vrrp 9

    switch (config) # interface vlan 9 vrrp 9 address 192.168.102.1

    switch (config) # interface vlan 9 vrrp 9 priority 200                              <-- ToR 1 will be the VRRP Slave for this subnet

     

    ToR 2:

    switch (config) # protocol vrrp

    switch (config) # interface vlan 8 vrrp 8

    switch (config) # interface vlan 8 vrrp 8 address 192.168.101.1

    switch (config) # interface vlan 9 vrrp 9

    switch (config) # interface vlan 9 vrrp 9 address 192.168.102.1

    switch (config) # interface vlan 8 vrrp 8 priority 200                              <-- ToR 2 will be the VRRP Slave for this subnet
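
    As a sanity check of the addressing used above, the following sketch (Python standard library only; the PLAN dictionary layout and the check_plan helper are assumptions made for illustration) verifies that each switch's VLAN interface address and the VRRP virtual address fall inside the same /24 subnet and that no address is reused:

```python
import ipaddress

# Addresses copied from the ToR 1 / ToR 2 configuration in this section.
PLAN = {
    8: {"subnet": "192.168.101.0/24", "tor1": "192.168.101.2",
        "tor2": "192.168.101.3", "virtual": "192.168.101.1"},
    9: {"subnet": "192.168.102.0/24", "tor1": "192.168.102.2",
        "tor2": "192.168.102.3", "virtual": "192.168.102.1"},
}


def check_plan(plan):
    """Return a list of human-readable problems found in the address plan."""
    problems = []
    for vlan, entry in plan.items():
        net = ipaddress.ip_network(entry["subnet"])
        addrs = [ipaddress.ip_address(entry[key])
                 for key in ("tor1", "tor2", "virtual")]
        for addr in addrs:
            if addr not in net:
                problems.append(f"VLAN {vlan}: {addr} is outside {net}")
        if len(set(addrs)) != len(addrs):
            problems.append(f"VLAN {vlan}: duplicate address")
    return problems


print(check_plan(PLAN) or "address plan is consistent")
```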

     

    To learn more about VRRP configuration, see HowTo Configure VRRP on Mellanox Ethernet Switches.

     

    IP Interfaces on the Servers

     

    1. Make sure to install Windows Server 2016 with the latest WinOF-2 driver on the servers. The servers should be equipped with a dual-port ConnectX-4/5 adapter.

    2. Make sure that security features such as firewalls are disabled so that ping can pass.

    3. Set the IP addresses on the interfaces:

    • Set port 1 connected to ToR 1 with VLAN 8 with the IP range of that VLAN (192.168.101.X)
    • Set port 2 connected to ToR 2 with VLAN 9 with the IP range of that VLAN (192.168.102.X)

     

    Set the IP Address to suit the VRRP virtual address.

     

    • Set the VLAN to 8

     

     

     

    4. Add a backup route, with a higher metric (lower priority), from each network to the other network.

    For example:

    # route add 192.168.101.0 MASK 255.255.255.0 192.168.102.1 METRIC 500

    # route add 192.168.102.0 MASK 255.255.255.0 192.168.101.1 METRIC 500

     

    There are two ways now to reach the 101 network:

    - via 192.168.101.11 (locally connected)

    - via the other port 192.168.102.1 (the virtual router address).

     

    So, if one port is down, the traffic to that network is sent from the second port.

    Note: The second route should have a higher metric (510 in this example). A higher metric means a lower priority, because this route should not be used during normal operation.
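
    The failover behaviour described above can be illustrated with a simplified model. This is only a sketch and not the actual Windows routing algorithm: it assumes both routes have the same prefix length, and the interface names (port1, port2) and the metric of the directly connected route are hypothetical.

```python
def pick_route(routes, up_interfaces):
    """Pick the route with the lowest metric among routes whose interface is up."""
    usable = [r for r in routes if r["interface"] in up_interfaces]
    if not usable:
        return None
    return min(usable, key=lambda r: r["metric"])


# Two ways to reach 192.168.101.0/24 from a dual-port server.
ROUTES_TO_101 = [
    # Directly connected route (metric value is illustrative).
    {"gateway": "on-link", "interface": "port1", "metric": 10},
    # Backup route via the VRRP virtual router on the other VLAN.
    {"gateway": "192.168.102.1", "interface": "port2", "metric": 500},
]

print(pick_route(ROUTES_TO_101, {"port1", "port2"})["gateway"])
print(pick_route(ROUTES_TO_101, {"port2"})["gateway"])
```

    With both ports up, the directly connected route wins; when port1 is down, the model falls back to the backup route via 192.168.102.1.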

     

     

     

    Verification

     

    Test the L3 connectivity. Make sure that ping works between all hosts (all to all).

    1. Ping between the servers (all to all). Make sure that traffic reaches the virtual routers and the local router interfaces on the VLANs.

    2. Ping the core switches and external servers.

    3. Disable a port and verify that the traffic goes via the second port and reaches the desired network (high availability).
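
    To make the all-to-all test systematic, the list of source/destination pairs can be enumerated first. The sketch below assumes four servers with hypothetical host addresses 192.168.101.11 to 192.168.101.14 on VLAN 8; substitute the addresses used in your setup.

```python
from itertools import permutations


def ping_pairs(addresses):
    """Return every ordered (source, destination) pair of distinct addresses."""
    return list(permutations(addresses, 2))


# Hypothetical server addresses on VLAN 8.
servers = [f"192.168.101.{host}" for host in range(11, 15)]

for source, destination in ping_pairs(servers):
    print(f"from {source}: ping {destination}")
```

    Four servers give 12 ordered pairs per VLAN; the same list can be built for the 192.168.102.X addresses on VLAN 9.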

     

    2. RDMA QoS Configuration

     

    There are different ways to set up the RDMA layer required for Windows S2D. To learn more about RDMA and RoCE, see the RDMA/RoCE Solutions page.

     

    The Recommended Network Configuration Examples for RoCE Deployment will give you a good start with the switch configuration for a few selected profiles.

    To understand more about the QoS requirements for RDMA, see Understanding QoS Configuration for RoCE.

     

    Switch Configuration

    For this example, we select Profile 5. Follow Lossless RoCE Configuration for MLNX-OS Switches in DSCP-Based QoS Mode to configure the switches.

     

    • Lossless network: PFC is enabled on priority 3.
    • ECN is enabled on the switch for priority 3.
    • Trust L3 is configured on the switch ports (classify the priority via DSCP field).
    • Buffer pool configuration and priority mapping.
    • CNP traffic will pass on DSCP 48.

     

    Note: In the example below, we configure QoS on all ports. This is not required for the uplink ports, only for the ports that may carry RDMA traffic.

     

    1. For DCQCN congestion control to work, enable ECN for the RoCE traffic that runs over traffic class 3:

    switch (config) # interface ethernet 1/1-1/32 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500

     

    For fair sharing of the switch buffer with other traffic classes, it is recommended to configure ECN on all other traffic classes as well.

     

    2. Buffer pool configuration.

    Allocate buffer pool 0 for lossy traffic and pool 1 for lossless traffic.

    switch (config) # pool ePool0 direction egress size 5242880 type dynamic

    switch (config) # pool iPool0 direction ingress size 5242880 type dynamic

    switch (config) # pool ePool1 direction egress-mc size 16777000 type dynamic

    switch (config) # pool iPool1 direction ingress size 5242880 type dynamic

     

    3. Bind interfaces to switch-priority

    Bind switch priorities 3 and 6 to ingress priority groups (PG) 3 and 6.

    switch (config) # interface ethernet 1/1-1/32 ingress-buffer iPort.pg6 bind switch-priority 6

    switch (config) # interface ethernet 1/1-1/32 ingress-buffer iPort.pg3 bind switch-priority 3

     

    4. Mapping ingress/egress interface to pool configuration

    Allocate a buffer to priority 3 and map it to the lossless pool, and allocate a buffer to priority 6 and map it to the lossy pool:

    switch (config) # interface ethernet 1/1-1/32 ingress-buffer iPort.pg3 map pool iPool1 type lossless reserved 67538 xoff 18432 xon 18432 shared alpha 2

    switch (config) # interface ethernet 1/1-1/32 ingress-buffer iPort.pg6 map pool iPool0 type lossy reserved 10240 shared alpha 8

    switch (config) # interface ethernet 1/1-1/32 egress-buffer ePort.tc3 map pool ePool1 reserved 1500 shared alpha inf

     

    Note: The reserved buffer size may need to be changed according to the port speed and MTU size.

     

    5. Set strict priority for CNPs on traffic class 6:

    switch (config) # interface ethernet 1/1-1/32 traffic-class 6 dcb ets strict

     

    6. Set trust mode L3 (dscp)

    switch (config) # interface ethernet 1/1-1/32 qos trust L3

     

    7. Enable PFC on priority 3 on all ports:

    switch (config) # dcb priority-flow-control enable force

    switch (config) # dcb priority-flow-control priority 3 enable

    switch (config) # interface ethernet 1/1-1/32 dcb priority-flow-control mode on force

     

     

    Server Configuration

    The servers should be configured with the following:

    • PFC is enabled on DSCP 26
    • Windows S2D RDMA traffic is mapped to egress with priority 3
    • ECN is enabled with priority 3.
    • PFC enabled with priority 3.
    • CNP traffic will be sent with DSCP 48.
    • Trust L3 is used (priority to DSCP mapping)

     

     

    1. Install Data Center Bridging Windows Feature.

    PS C:\> Install-WindowsFeature data-center-bridging

     

    Success Restart Needed Exit Code      Feature Result

    ------- -------------- ---------      --------------

    True    No             Success        {Data Center Bridging}

     

    2. Import the PowerShell modules that are required to configure DCB.

    PS C:\> import-module netqos

    PS C:\> import-module dcbqos

    PS C:\> import-module netadapter

     

    3. Enable QoS on the network adapter

    PS C:\>  Set-NetAdapterQos -Enabled 1 *

     

    4. Enable Priority Flow Control (PFC) on the specific priority 3.

    PS C:\> Enable-NetQosFlowControl -Priority 3

     

    5. Locate the registry key for the Mellanox adapter, see HowTo Locate the Windows Registry key for Mellanox Adapters.

    In this example, the registry key is:

    {4d36e972-e325-11ce-bfc1-08002be10318}\0003

     

    You will need that for the next configuration commands.

     

    6. Map DSCP to priority for the RDMA traffic. In this example we are using DSCP 26 to map into a priority 3 (PriorityToDscpMappingTable_3).

    PS C:\>  new-itemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Class\"{4d36e972-e325-11ce-bfc1-08002be10318}"\0003\ -Name "PriorityToDscpMappingTable_3" -PropertyType "String" -Value "26" -Force

    PriorityToDscpMappingTable_3 : 26

    PSPath                       : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4d36e972-e325-11ce-bfc1-08002be10318}\0003\

    PSParentPath                 : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4d36e972-e325-11ce-bfc1-08002be10318}

    PSChildName                  : 0003

    PSDrive                      : HKLM

    PSProvider                   : Microsoft.PowerShell.Core\Registry

     

    7. Create a Quality of Service (QoS) policy and tag each type of traffic with the relevant priority.

    In this example we used SMB port 445 with a CoS Value 3.

    PS c:\> New-NetQosPolicy "SMBDirect" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

    Name           : SMBDirect

    Owner          : Group Policy (Machine)

    NetworkProfile : All

    Precedence     : 127

    JobObject      :

    NetDirectPort  : 445

    PriorityValue  : 3

     

    For testing, you can add another port (e.g. 50000) that will be used later by performance tests (e.g. nd_write_bw).

    PS c:\> New-NetQosPolicy "SMBDirect_testRDMA" -NetDirectPortMatchCondition 50000 -PriorityValue8021Action 3

    Name           : SMBDirect_testRDMA

    Owner          : Group Policy (Machine)

    NetworkProfile : All

    Precedence     : 127

    JobObject      :

    NetDirectPort  : 50000

    PriorityValue  : 3

     

    8. Enable ECN on priority 3, and set the DSCP value of the CNP traffic to 48.

    PS c:\> Mlx5Cmd.exe -Qosconfig -Name RDMA1 -Dcqcn -Enable 3 -set -DcqcnCnpDscp 48

    The command was executed successfully

     

    Other Related commands

    The following commands are not needed in this procedure because VLANs are used, but they should be used in case of RDMA over untagged traffic.

     

    1. Do not add an 802.1Q tag to transmitted packets that are assigned an 802.1p priority but are not assigned a non-zero VLAN ID (that is, priority-tagged packets). The default is 0x0; for DSCP-based PFC, set it to 0x1.

    PS C:\>  new-itemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Class\"{4d36e972-e325-11ce-bfc1-08002be10318}"\0003\ -Name "TxUntagPriorityTag" -PropertyType "String" -Value "1" -Force

     

    2. Map all untagged traffic to the lossless receive queue. The default is 0x0; for DSCP-based PFC, set it to 0x1.

    PS C:\> new-itemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Class\"{4d36e972-e325-11ce-bfc1-08002be10318}"\0003\ -Name "RxUntaggedMapToLossless" -PropertyType "String" -Value "1" -Force

     

     

    Script

    This script assumes a dual-port adapter (ports named RDMA1 and RDMA2):

    Install-WindowsFeature data-center-bridging

    import-module netqos

    import-module dcbqos

    import-module netadapter

    Set-NetAdapterQos -Enabled 1 *

    Enable-NetQosFlowControl -Priority 3

    new-itemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Class\"{4d36e972-e325-11ce-bfc1-08002be10318}"\0003\ -Name "PriorityToDscpMappingTable_3" -PropertyType "String" -Value "26" -Force

    new-itemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Class\"{4d36e972-e325-11ce-bfc1-08002be10318}"\0002\ -Name "PriorityToDscpMappingTable_3" -PropertyType "String" -Value "26" -Force

    New-NetQosPolicy "SMBDirect" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

    New-NetQosPolicy "SMBDirect_testRDMA" -NetDirectPortMatchCondition 50000 -PriorityValue8021Action 3

    Mlx5Cmd.exe -Qosconfig -Name RDMA1 -Dcqcn -Enable 3 -set -DcqcnCnpDscp 48

    Mlx5Cmd.exe -Qosconfig -Name RDMA2 -Dcqcn -Enable 3 -set -DcqcnCnpDscp 48
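
    The script above hard-codes the registry subkeys (0003 and 0002), which differ from server to server. As a hedged sketch, the per-adapter lines could be generated from a list of (adapter name, registry subkey) pairs. The helper name per_adapter_commands is hypothetical, and the pairing of RDMA1 with 0003 and RDMA2 with 0002 is an assumption; check the actual subkeys on each server as described in step 5.

```python
# Registry class path exactly as it is written in the commands of this post.
CLASS_KEY = r'HKLM:\SYSTEM\CurrentControlSet\Control\Class\"{4d36e972-e325-11ce-bfc1-08002be10318}"'


def per_adapter_commands(adapters, dscp=26, priority=3, cnp_dscp=48):
    """Build the registry and Mlx5Cmd lines for each (name, subkey) pair."""
    lines = []
    for name, subkey in adapters:
        # Priority-to-DSCP mapping for the RDMA traffic.
        lines.append(
            f'new-itemProperty -Path {CLASS_KEY}\\{subkey}\\ '
            f'-Name "PriorityToDscpMappingTable_{priority}" '
            f'-PropertyType "String" -Value "{dscp}" -Force'
        )
        # DCQCN/ECN enablement and CNP DSCP value.
        lines.append(
            f"Mlx5Cmd.exe -Qosconfig -Name {name} -Dcqcn -Enable {priority} "
            f"-set -DcqcnCnpDscp {cnp_dscp}"
        )
    return lines


# Assumed pairing of adapter names and registry subkeys (verify per server).
for line in per_adapter_commands([("RDMA1", "0003"), ("RDMA2", "0002")]):
    print(line)
```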

     

    Verification

    Verify RDMA QoS Configuration

     

    1. Get-NetAdapterQoS

    • Verify that PFC is enabled on priority 3
    • Verify that NetDirect is mapped to the required port and priority. For example, ports 445 and 50000 are mapped to priority 3.

     

    PS C:\> Get-NetAdapterQos

     

    Name                       : RDMA1

    Enabled                    : True

    Capabilities               :                       Hardware     Current

                                                       --------     -------

                                 MacSecBypass        : NotSupported NotSupported

                                 DcbxSupport         : IEEE         IEEE

                                 NumTCs(Max/ETS/PFC) : 8/8/8        8/8/8

     

    OperationalTrafficClasses  : TC TSA    Bandwidth Priorities

                                 -- ---    --------- ----------

                                  0 ETS    100%      0-7

     

    OperationalFlowControl     : Priority 3 Enabled

    OperationalClassifications : Protocol  Port/Type Priority

                                 --------  --------- --------

                                 Default             0

                                 NetDirect 50000     3

                                 NetDirect 445       3

     

    Name                       : RDMA2

    Enabled                    : True

    Capabilities               :                       Hardware     Current

                                                       --------     -------

                                 MacSecBypass        : NotSupported NotSupported

                                 DcbxSupport         : IEEE         IEEE

                                 NumTCs(Max/ETS/PFC) : 8/8/8        8/8/8

     

    OperationalTrafficClasses  : TC TSA    Bandwidth Priorities

                                 -- ---    --------- ----------

                                  0 ETS    100%      0-7

     

    OperationalFlowControl     : Priority 3 Enabled

    OperationalClassifications : Protocol  Port/Type Priority

                                 --------  --------- --------

                                 Default             0

                                 NetDirect 50000     3

                                 NetDirect 445       3

     

    2. Get-NetQosFlowControl

    • Verify that PFC is enabled on priority 3

    PS C:\> Get-NetQosFlowControl

     

    Priority   Enabled    PolicySet        IfIndex IfAlias

    --------   -------    ---------        ------- -------

    0          False      Global

    1          False      Global

    2          False      Global

    3          True       Global

    4          False      Global

    5          False      Global

    6          False      Global

    7          False      Global

     

    3. Get-NetQoSPolicy

    • Verify that NetDirect is mapped to the required port and priority. For example, ports 445 and 50000 are mapped to priority 3.

    PS C:\Users\Administrator> Get-NetQosPolicy

     

    Name           : S2D Policy1

    Owner          : Group Policy (Machine)

    NetworkProfile : All

    Precedence     : 127

    JobObject      :

    NetDirectPort  : 50000

    PriorityValue  : 3

     

    Name           : SMB

    Owner          : Group Policy (Machine)

    NetworkProfile : All

    Precedence     : 127

    JobObject      :

    NetDirectPort  : 445

    PriorityValue  : 3

     

    4. Check DCQCN/ECN Configuration via Mlx5Cmd.exe command.

    • Check that DCQCN is enabled on priority 3 for both the RP and the NP
    • Verify that the DSCP CNP is mapped to DSCP 48

    PS C:\Users\Administrator> Mlx5Cmd.exe -Qosconfig -Name RDMA1 -Dcqcn -get

    DCQCN RP attributes for adapter "RDMA1":

            DcqcnRPEnablePrio0: 1

            DcqcnRPEnablePrio1: 1

            DcqcnRPEnablePrio2: 1

            DcqcnRPEnablePrio3: 1

            DcqcnRPEnablePrio4: 1

            DcqcnRPEnablePrio5: 1

            DcqcnRPEnablePrio6: 1

            DcqcnRPEnablePrio7: 1

            DcqcnClampTgtRate: 0

            DcqcnClampTgtRateAfterTimeInc: 1

            DcqcnRpgTimeReset: 300

            DcqcnRpgByteReset: 32767

            DcqcnRpgThreshold: 5

            DcqcnRpgAiRate: 5

            DcqcnRpgHaiRate: 50

            DcqcnAlphaToRateShift: 11

            DcqcnRpgMinDecFac: 50

            DcqcnRpgMinRate: 1

            DcqcnRateToSetOnFirstCnp: 0

            DcqcnDceTcpG: 4

            DcqcnDceTcpRtt: 1

            DcqcnRateReduceMonitorPeriod: 4

            DcqcnInitialAlphaValue: 1023

     

    DCQCN NP attributes for adapter "RDMA1":

            DcqcnNPEnablePrio0: 1

            DcqcnNPEnablePrio1: 1

            DcqcnNPEnablePrio2: 1

            DcqcnNPEnablePrio3: 1

            DcqcnNPEnablePrio4: 1

            DcqcnNPEnablePrio5: 1

            DcqcnNPEnablePrio6: 1

            DcqcnNPEnablePrio7: 1

            DcqcnCnpDscp: 48

            DcqcnCnpPrioMode: 1

            DcqcnCnp802pPrio: 7

    The command was executed successfully

    PS C:\Users\Administrator>
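
    The two checks listed for this step can also be automated by parsing the saved command output. The following is a minimal sketch under the assumption that the output keeps the "Name: value" layout shown above; the helper names parse_attributes and dcqcn_ok are hypothetical, and the sample is shortened to the relevant lines.

```python
def parse_attributes(output):
    """Parse 'Name: value' lines with integer values into a dictionary."""
    attrs = {}
    for line in output.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        # Header lines end with ':' and have no value, so they are skipped.
        if sep and value.isdigit():
            attrs[name.strip()] = int(value)
    return attrs


def dcqcn_ok(attrs, priority=3, cnp_dscp=48):
    """True if DCQCN is enabled for RP and NP and CNPs use the given DSCP."""
    return (attrs.get(f"DcqcnRPEnablePrio{priority}") == 1
            and attrs.get(f"DcqcnNPEnablePrio{priority}") == 1
            and attrs.get("DcqcnCnpDscp") == cnp_dscp)


# Shortened sample of the output shown above.
SAMPLE = """\
DCQCN RP attributes for adapter "RDMA1":
        DcqcnRPEnablePrio3: 1
DCQCN NP attributes for adapter "RDMA1":
        DcqcnNPEnablePrio3: 1
        DcqcnCnpDscp: 48
The command was executed successfully
"""

print(dcqcn_ok(parse_attributes(SAMPLE)))
```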

     

    5. Check the Priority to DSCP mapping.

    • Get the PCI location, for example 138.0.0

     

    Get the regKeys configured.

    • Verify that DSCP is mapped to 26

     

    PS C:\Users\Administrator> Mlx5Cmd.exe -RegKeys -bdf 138.0.0

    NIC 1:

        Adapter: Mellanox ConnectX-4 Adapter

        Location (PCI bus, device, function): (138,0,0)

            Registry Key                                 Value          Default

     

            *IPChecksumOffloadIPv4                       3              3

            *TCPUDPChecksumOffloadIPv4                   3              3

            *TCPUDPChecksumOffloadIPv6                   3              3

            *EncapsulatedPacketTaskOffload               1              1

            *EncapsulatedPacketTaskOffloadNvgre          1              1

            *EncapsulatedPacketTaskOffloadVxlan          1              1

            *VxlanUDPPortNumber                          4789           4789

            *LsoV2IPv4                                   1              1

            *LsoV2IPv6                                   1              1

            *TransmitBuffers                             2048           2048

            TxIntModerationProfile                       1              1

            *RSS                                         1              1

            *ReceiveBuffers                              512            512

            *NumRssQueues                                8              128

            RecvCompletionMethod                         1              1

            *RscIPv4                                     1              1

            *RscIPv6                                     1              1

            RxIntModerationProfile                       1              1

            RxIntModeration                              2              2

            *VMQ                                         1              1

            *VMQVlanFiltering                            1              1

            *Sriov                                       1              0

            *RssOnHostVPorts                             0              0

            *QOS                                         1              0

            *FlowControl                                 3              3

            DcbxMode                                     2              2

            PriorityToDscpMappingTable_3                 26             3

            *PriorityVLANTag                             3              3

            VlanId                                       8              0

            *JumboPacket                                 1514           1514

            *EncapOverhead                               0              0

            PortType                                     1              None

            *InterruptModeration                         1              1

            *PacketDirect                                1              0

            *NetworkDirect                               1              1

     

    Benchmark testing (Basic)

    1. Run RDMA traffic between two ports.

     

    For example:

     

    Client:

    PS C:\> nd_write_bw -D 10 -C 192.168.101.12 -p 50000

     

    Server:

    PS C:\> nd_write_bw -D 10 -S 192.168.101.12 -p 50000

    2. Open the Performance Monitor tool (perfmon) and add the following counter sets:

    • Mellanox WinOF-2 Congestion Control
    • Mellanox WinOF-2 Port QoS
    • RDMA Activity

     

    Check performance.

     

    Congestion Control Verification

     

    1. Create synthetic congestion in the network (for example, lower the speed of one port to 10G), open the Performance Monitor (perfmon) tool, and run the benchmark test.

     

    2. Check that the Congestion Control counters are progressing on the Notification Point - NP (receiver) and the Reaction Point - RP (sender).

     

    Reaction Point - RP (sender) example:

     

     

    Notification Point - NP (receiver) example:

     

    If you see those counters advancing, DCQCN is working in the network (upon congestion, the switch marks the ECN bits in the IP ToS field).

    Note: PFC counters (pause counters) are not expected to advance.

     

    PFC Verification

    1. Disable ECN on one of the switch ports.

    switch (config) # no interface ethernet 1/1 traffic-class 3 congestion-control ecn

     

    2.  Run the benchmark test, and verify that the PFC counters are progressing. The Congestion Control counters should not be progressing.

     

     

    3. Enable ECN back on the switch.

    switch (config) # interface ethernet 1/1 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500

    Packet Format Validation

    1. Capture RDMA traffic on one of the servers using Mlx5Cmd.exe.

     

    For example:

    PS C:\> Mlx5Cmd.exe -Sniffer -name RDMA1 -start -filename  testing_rdma.pcap

     

    See also HowTo Capture RDMA traffic on mlx5 driver using mlx5cmd (Windows).

     

    2. Run benchmark test.

     

    3. Open the file in Wireshark.

     

    4. Verify that the RDMA traffic is sent with DSCP 26 (as configured)

    • DSCP 26
    • ECN is not 00.

     

     

    5. Verify that the CNP traffic is sent with DSCP 48 (as configured)

    • DSCP 48
    • RDMA OpCode is 0x81
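
    When reading the capture, it helps to remember that the DSCP value occupies the upper six bits of the IPv4 ToS byte and the ECN field occupies the lower two bits. The sketch below (function names are hypothetical) splits a raw ToS byte into both fields, so the values from steps 4 and 5 can be checked even when only the raw byte is displayed:

```python
# ECN codepoints as defined for the two low-order bits of the ToS byte.
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def decode_tos(tos):
    """Split an IPv4 ToS byte into (DSCP value, ECN codepoint name)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    return tos >> 2, ECN_NAMES[tos & 0b11]


def encode_tos(dscp, ecn):
    """Build a ToS byte from a DSCP value (0-63) and an ECN codepoint (0-3)."""
    return (dscp << 2) | ecn


print(decode_tos(0x6A))  # RDMA data packet: DSCP 26 with ECN set to ECT(0)
print(decode_tos(0xC0))  # DSCP 48, as used for CNP traffic in this setup
```

    For example, a ToS byte of 0x6A decodes to DSCP 26 with a non-zero ECN field (ECT(0)), which is what step 4 expects; after the switch marks congestion, the same flow shows 0x6B (DSCP 26, CE).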

     

    Read more on CNP packet format in RoCEv2 CNP Packet Format Example.