HowTo Configure RoCE in Windows Environment (Global Pause)

Version 14

    This post shows how to configure RoCE in a simple lab environment running Windows Server 2012, using Global Pause flow control.

    For production setups, PFC is recommended; refer to HowTo Configure SMB Direct (RoCE) over PFC on Windows 2012 Server.

     


    Setup

    A basic setup with two servers and a switch running at 40GbE or more is recommended. Each server should be equipped with a Mellanox ConnectX-3 Pro adapter card. Any 40GbE Ethernet switch is suitable (e.g. SX1036). No special switch configuration is required. The ports can be access ports (switchport type) with pvid=1 (access VLAN).


    Prerequisite

    • Install Windows 2012 R2 on both servers
    • Install the Mellanox WinOF driver on both servers

     

    Configuration

    1. Set up the interface IP addresses for the 40GbE ports:

    Open the Network Connections window. Locate the Local Area Connections that use Mellanox devices, right-click a Mellanox Local Area Connection and click Properties. Select Internet Protocol Version 4 (TCP/IPv4) from the scroll list, click Properties, and fill in the static IP address.

    In this example, I used:

    • 11.11.1.1 for server 1
    • 11.11.1.2 for server 2

    PS C:\> ipconfig

    Windows IP Configuration


    Ethernet adapter Ethernet 16:

       Media State . . . . . . . . . . . : Media disconnected
       Connection-specific DNS Suffix  . :

    Ethernet adapter Ethernet 15:

       Connection-specific DNS Suffix  . :
       Link-local IPv6 Address . . . . . : fe80::f652:14ff:fe17:1fc1%29
       IPv4 Address. . . . . . . . . . . : 11.11.1.2
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . :
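
    The same address can also be assigned from PowerShell instead of the GUI. The following is a minimal sketch using the in-box NetTCPIP cmdlets; the adapter name "Ethernet 15" is taken from the ipconfig output above and is an assumption for any other system.

```powershell
# Assumption: the Mellanox port is named "Ethernet 15", as in the ipconfig output above.
# List the adapters first to confirm the name on your system.
Get-NetAdapter | Format-Table Name, InterfaceDescription, Status, LinkSpeed

# Assign the static address used in this post for server 2 (/24 = 255.255.255.0).
New-NetIPAddress -InterfaceAlias "Ethernet 15" -IPAddress 11.11.1.2 -PrefixLength 24

# Confirm the result.
Get-NetIPAddress -InterfaceAlias "Ethernet 15" -AddressFamily IPv4
```

    On server 1, use 11.11.1.1 and that server's own adapter name.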

     

    2. Make sure that the servers can ping each other:

    PS C:\>  ping 11.11.1.1

    Pinging 11.11.1.1 with 32 bytes of data:
    Reply from 11.11.1.1: bytes=32 time<1ms TTL=128
    Reply from 11.11.1.1: bytes=32 time<1ms TTL=128
    Reply from 11.11.1.1: bytes=32 time<1ms TTL=128
    Reply from 11.11.1.1: bytes=32 time<1ms TTL=128

    Ping statistics for 11.11.1.1:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 0ms, Maximum = 0ms, Average = 0ms
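
    If the connectivity check is to be scripted, the in-box Test-Connection cmdlet can return a simple true/false result. This is only a sketch; the address is the one used for server 1 in this post.

```powershell
# Sends 4 echo requests to server 1; with -Quiet the cmdlet returns a Boolean
# instead of the per-reply objects.
Test-Connection -ComputerName 11.11.1.1 -Count 4 -Quiet
```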

     

    3. Make sure that Flow Control (Global Pause) is configured on the switch and on the server.

     

    For the SX1036, use the following commands on each interface (for example, interface 1/1):

    Note: Flow control is disabled by default on the switch.

     

    switch (config) # interface ethernet 1/1 flowcontrol send on force

    switch (config) # interface ethernet 1/1 flowcontrol receive on force

     

    4. Make sure that Flow Control is enabled on the server's port (port configuration -> Advanced tab)

    [Screenshot: flow control enabled in the adapter's Advanced tab]
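
    The same setting can usually be inspected from PowerShell. The sketch below assumes that the adapter is named "Ethernet 15" (as in step 1) and that the driver exposes the standardized NDIS keyword *FlowControl; if it does not, the first command returns nothing and the GUI should be used instead.

```powershell
# Assumptions: adapter name "Ethernet 15"; the driver exposes the standardized
# keyword *FlowControl (0 = disabled, 1 = Tx, 2 = Rx, 3 = Rx & Tx enabled).
Get-NetAdapterAdvancedProperty -Name "Ethernet 15" -RegistryKeyword "*FlowControl"

# Enable pause frames in both directions if they are not already enabled.
# Changing an advanced property may briefly restart the adapter.
Set-NetAdapterAdvancedProperty -Name "Ethernet 15" -RegistryKeyword "*FlowControl" -RegistryValue 3
```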

     

     

    5. (Optional) If you wish to add a VLAN tag to the traffic, you can configure it on the adapter as well.

    Note: This step assumes that you use Flow Control (global pause) and not PFC. If you wish to use PFC, refer to HowTo Configure SMB Direct (RoCE) over PFC on Windows 2012 Server.

     

    [Screenshot: VLAN 1 configured on the adapter]

     

    Note: Don't forget to enable VLAN tagging on the switch.

    When using a Mellanox switch, use the following commands for each port.

    Additional tagging options are described in the MLNX-OS user manual.

     

    switch (config) # interface ethernet 1/1 switchport mode trunk

    switch (config) # vlan 1

    6. Configure RoCE mode:

    To see the current RoCE mode configuration, run the Get-MlnxDriverCoreSetting command. In the output below, RoceMode is 1.0, which was the default for the driver used in this example:

     

    PS C:\>  Get-MlnxDriverCoreSetting


    Caption               : DriverCoreSettingData 'mlx4_bus'
    Description           : Mellanox Driver Option Settings
    ElementName           : mlx4_bus
    InstanceID            : mlx4_bus
    Name                  : mlx4_bus
    Source                : 3
    SystemName            : GEN-L-VRT-002
    LogMttsPerSeg         : 3
    LogNumCq              : 16
    LogNumMac             : 7
    LogNumMcg             : 13
    LogNumMpt             : 19
    LogNumMtt             : 20
    LogNumQp              : 21
    LogNumRdmaRc          : 4
    LogNumSrq             : 16
    LogNumVlan            : 7
    MaximumWorkingThreads : 4
    RoceMode              : 1.0
    Set4kMtu              : True
    SriovEnable           : False
    SriovPort1NumVFs      :
    SriovPort2NumVFs      :
    SriovPortMode         :
    PSComputerName        :

     

    The following RoCE modes are supported:

    • RoCE V1 MAC based (legacy) : 1
    • RoCE V2 IP based (routable) : 2

     

    To configure the RoCE mode (in this example RoCE v1), run the Set-MlnxDriverCoreSetting command:

      

    PS C:\> Set-MlnxDriverCoreSetting -RoceMode 1

     

    Note: The default RoCE mode is 1 (RoCE v1) in WinOF 4.90 and older. Starting with WinOF 5.00, the default is 0 (disabled).
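
    Because the default differs between WinOF versions, it can help to check the mode before changing it. The following is a rough sketch built only from the two Mellanox cmdlets shown above plus the in-box Get-NetAdapterRdma cmdlet; the adapter name "Ethernet 15" and the assumption that the RoceMode value converts cleanly to a number (it is printed as 1.0 above) are mine.

```powershell
# Read the current mode; the cast normalizes a value such as "1.0" to a number.
$mode = [double](Get-MlnxDriverCoreSetting).RoceMode

# Switch to RoCE v1 only if another mode (for example 0, disabled) is active.
if ($mode -ne 1) {
    Set-MlnxDriverCoreSetting -RoceMode 1
}

# In-box cmdlet: Enabled should be True for the Mellanox port.
# Assumption: the port is named "Ethernet 15", as in step 1.
Get-NetAdapterRdma -Name "Ethernet 15"
```

    Both servers should end up with the same RoCE mode.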

     

    7. Verify that RoCE is running between the hosts:

    To do so, you can use the nd_send_bw command (on both the client and the server).

     

    Server 1 acts as the server while Server 2 acts as the client.

     

    Run on server 1:

    PS C:\> nd_send_bw -S 11.11.1.1

    Listening for incoming connection request... Connection accepted.

    PS C:\>

     

    Run on server 2:

    PS C:\> nd_send_bw -C 11.11.1.1

    #bytes #iterations    MR [Mpps]     Gb/s     CPU Util.
    65536     100000       0.070        36.61    100.00

    Test finished. Releasing resources...
    PS C:\>

     

     

    Any bandwidth above 36Gb/s is considered good. Additional tuning can be applied (refer to the Mellanox performance tuning guide for network adapters).
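
    When the test is repeated many times, the threshold check can be scripted. The sketch below parses a result line in the column layout shown above (#bytes, #iterations, MR [Mpps], Gb/s, CPU Util.); Get-NdSendBwGbps is a hypothetical helper name, not part of WinOF, and other nd_send_bw versions may print a different layout.

```powershell
# Hypothetical helper: extracts the Gb/s column from an nd_send_bw result line.
# Assumes the layout: #bytes, #iterations, MR [Mpps], Gb/s, CPU Util.
function Get-NdSendBwGbps {
    param([string]$ResultLine)
    $fields = -split $ResultLine      # unary -split breaks the line on whitespace
    return [double]$fields[3]         # the fourth column is Gb/s
}

# Result line copied from the sample run above.
$line = '65536     100000       0.070        36.61    100.00'
$gbps = Get-NdSendBwGbps -ResultLine $line

# 36 Gb/s is the "good" threshold quoted in this post for a 40GbE link.
if ($gbps -gt 36) { "OK: $gbps Gb/s" } else { "Below target: $gbps Gb/s" }

# Cross-check of the MR [Mpps] column: bits per second divided by bits per message.
$mpps = ($gbps * 1e9) / (65536 * 8) / 1e6
```

    For the sample line, the computed message rate is about 0.070 Mpps, which matches the MR column reported by the tool.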

     

    8. Performance test via RamDisk.

    As HDDs are much slower than the network, and a single drive may become the bottleneck, it may be wise to perform additional performance tests based on RAM disks.

    An example can be found in Ram Disk Application for Windows Environment (imdisk, sqlio).

     

    Lossless Network

    Any network that is planned to carry RDMA traffic should be designed as a lossless network (no packet loss). There are various approaches related to QoS, flow control and PFC; these will be added later on.