Getting Started with ConnectX-5 100Gb/s Adapter for Windows

Version 4

    This post walks through the basic steps for installing and configuring the Mellanox ConnectX-5 100Gb/s adapter on Windows Server 2016.

    It is intended for beginners.

     

    References

    • WinOF-2 User Manual

     

    Setup

    The basic setup consists of:

    • Two servers equipped with PCIe Gen3 x16 slots
    • Two Mellanox ConnectX-5 adapter cards
    • One 100Gb/s cable

     

    In this setup, Windows Server 2016 was installed on both servers.

     

    Prerequisites

    If you plan to run a performance test, tune the BIOS for high performance first.

    Refer to the Mellanox Tuning Guide for a BIOS performance tuning example.

     

    Configuration

    1. Install the latest WinOF-2 driver, available from Mellanox.com.

    2. Install the latest MFT (Mellanox Firmware Tools) package, available from Mellanox.com.

    3. Check that the adapter is recognized in Device Manager.

     

    4. The default link protocol for ConnectX-5 VPI is InfiniBand. To change it to Ethernet, run the following commands with the MFT mlxconfig tool:

     

    a. Get the MST status to find the device name:

    PS C:\Users\Administrator> mst status
    MST devices:
    ------------
    mt4119_pciconf0

    Note: 4119 (0x1017) is the PCI device ID of the ConnectX-5.

     

    b. Query the adapter for the current port type of each ConnectX-5 VPI port.

    LINK_TYPE_P1 and LINK_TYPE_P2 hold the link type of Port 1 and Port 2. The options are 1 (InfiniBand) or 2 (Ethernet). In this case, both ports are configured as InfiniBand.

     

     

    PS C:\Users\Administrator> mlxconfig -d mt4119_pciconf0 q

    Device #1:
    ----------

    Device type:    ConnectX5
    PCI device:     mt4119_pciconf0

    Configurations:                              Next Boot
             NUM_OF_VFS                          0
             SRIOV_EN                            False(0)
             PF_LOG_BAR_SIZE                     5
             VF_LOG_BAR_SIZE                     1
             NUM_PF_MSIX                         63
             NUM_VF_MSIX                         11
             INT_LOG_MAX_PAYLOAD_SIZE            AUTOMATIC(0)
             CQE_COMPRESSION                     BALANCED(0)
             LRO_LOG_TIMEOUT0                    6
             LRO_LOG_TIMEOUT1                    7
             LRO_LOG_TIMEOUT2                    8
             LRO_LOG_TIMEOUT3                    12
             LOG_DCR_HASH_TABLE_SIZE             11
             DCR_LIFO_SIZE                       16384
             ROCE_NEXT_PROTOCOL                  254
             LLDP_NB_DCBX_P1                     True(1)
             LLDP_NB_RX_MODE_P1                  OFF(0)
             LLDP_NB_TX_MODE_P1                  OFF(0)
             LLDP_NB_DCBX_P2                     True(1)
             LLDP_NB_RX_MODE_P2                  OFF(0)
             LLDP_NB_TX_MODE_P2                  OFF(0)
             CLAMP_TGT_RATE_AFTER_TIME_INC_P1    True(1)
             CLAMP_TGT_RATE_P1                   False(0)
             RPG_TIME_RESET_P1                   600
             RPG_BYTE_RESET_P1                   32767
             RPG_THRESHOLD_P1                    5
             RPG_MAX_RATE_P1                     0
             RPG_AI_RATE_P1                      5
             RPG_HAI_RATE_P1                     50
             RPG_GD_P1                           11
             RPG_MIN_DEC_FAC_P1                  50
             RPG_MIN_RATE_P1                     1
             RATE_TO_SET_ON_FIRST_CNP_P1         100
             DCE_TCP_G_P1                        4
             DCE_TCP_RTT_P1                      1
             RATE_REDUCE_MONITOR_PERIOD_P1       4
             INITIAL_ALPHA_VALUE_P1              0
             MIN_TIME_BETWEEN_CNPS_P1            0
             CNP_802P_PRIO_P1                    0
             CNP_DSCP_P1                         7
             CLAMP_TGT_RATE_AFTER_TIME_INC_P2    True(1)
             CLAMP_TGT_RATE_P2                   False(0)
             RPG_TIME_RESET_P2                   600
             RPG_BYTE_RESET_P2                   32767
             RPG_THRESHOLD_P2                    5
             RPG_MAX_RATE_P2                     0
             RPG_AI_RATE_P2                      5
             RPG_HAI_RATE_P2                     50
             RPG_GD_P2                           11
             RPG_MIN_DEC_FAC_P2                  50
             RPG_MIN_RATE_P2                     1
             RATE_TO_SET_ON_FIRST_CNP_P2         100
             DCE_TCP_G_P2                        4
             DCE_TCP_RTT_P2                      1
             RATE_REDUCE_MONITOR_PERIOD_P2       4
             INITIAL_ALPHA_VALUE_P2              0
             MIN_TIME_BETWEEN_CNPS_P2            0
             CNP_802P_PRIO_P2                    0
             CNP_DSCP_P2                         7
             LINK_TYPE_P1                        IB(1)
             LINK_TYPE_P2                        IB(1)
             KEEP_ETH_LINK_UP_P1                 True(1)
             KEEP_IB_LINK_UP_P1                  False(0)
             KEEP_LINK_UP_ON_BOOT_P1             False(0)
             KEEP_LINK_UP_ON_STANDBY_P1          False(0)
             KEEP_ETH_LINK_UP_P2                 True(1)
             KEEP_IB_LINK_UP_P2                  False(0)
             KEEP_LINK_UP_ON_BOOT_P2             False(0)
             KEEP_LINK_UP_ON_STANDBY_P2          False(0)
             ROCE_CC_PRIO_MASK_P1                0
             ROCE_CC_ALGORITHM_P1                ECN(0)
             ROCE_CC_PRIO_MASK_P2                0
             ROCE_CC_ALGORITHM_P2                ECN(0)
             DCBX_IEEE_P1                        True(1)
             DCBX_CEE_P1                         True(1)
             DCBX_WILLING_P1                     True(1)
             DCBX_IEEE_P2                        True(1)
             DCBX_CEE_P2                         True(1)
             DCBX_WILLING_P2                     True(1)
             NUM_OF_VL_P1                        4_VLS(3)
             NUM_OF_TC_P1                        8_TCS(0)
             NUM_OF_PFC_P1                       8
             NUM_OF_VL_P2                        4_VLS(3)
             NUM_OF_TC_P2                        8_TCS(0)
             NUM_OF_PFC_P2                       8
             DUP_MAC_ACTION_P1                   LAST_CFG(0)
             DUP_MAC_ACTION_P2                   LAST_CFG(0)
             PORT_OWNER                          True(1)
             ALLOW_RD_COUNTERS                   True(1)
             IP_VER                              IPv4(0)
             BOOT_VLAN                           0
             BOOT_VLAN_EN                        False(0)
             BOOT_OPTION_ROM_EN                  False(0)
             BOOT_PKEY                           0

     

    Note that the LINK_TYPE_P1 and LINK_TYPE_P2 are equal to 1 (InfiniBand) by default.
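
    When the query output is long, it can help to pull out only the LINK_TYPE lines. The following is a minimal sketch, not part of MFT: the function name parse_link_types is hypothetical, and it assumes the output has been captured as text in the format shown above.

```python
import re

# Sample lines copied from the query output above.
SAMPLE = """\
         LINK_TYPE_P1                        IB(1)
         LINK_TYPE_P2                        IB(1)
         KEEP_ETH_LINK_UP_P1                 True(1)
"""

def parse_link_types(query_output):
    """Return {port_number: (name, code)} for every LINK_TYPE_P<n> line."""
    pattern = re.compile(r"^\s*LINK_TYPE_P(\d+)\s+(\w+)\((\d+)\)\s*$")
    result = {}
    for line in query_output.splitlines():
        match = pattern.match(line)
        if match:
            port, name, code = match.groups()
            result[int(port)] = (name, int(code))
    return result

print(parse_link_types(SAMPLE))  # {1: ('IB', 1), 2: ('IB', 1)}
```

    A code of 1 means InfiniBand and 2 means Ethernet, matching the LINK_TYPE options described in this step.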

     

    c. Change the port type to Ethernet (LINK_TYPE = 2):

    PS C:\Users\Administrator> mlxconfig -d mt4119_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

    Device #1:
    ----------

    Device type:    ConnectX5
    PCI device:     mt4119_pciconf0

    Configurations:          Current New
    ...
             LINK_TYPE_P1    IB(1)   ETH(2)
             LINK_TYPE_P2    IB(1)   ETH(2)
    ...
    Apply new Configuration? ? (y/n) [n] : y
    Applying... Done!
    -I- Please reboot machine to load new configurations.

     

    d. Reboot the server, or reset the firmware with mlxfwreset:

    PS C:\Users\Administrator> mlxfwreset --device mt4119_pciconf0 reset

    Minimal reset level for device, mt4119_pciconf0:

    3: Driver restart and PCI reset
    Continue with reset?[y/N] y
    -I- Stopping Driver                         -Done
    -I- Sending Reset Command To Fw             -Done
    -I- Resetting PCI                           -Done
    -I- Starting Driver                         -Done
    -I- Restarting MST                          -Done
    -I- FW was loaded successfully.

     

    e. After the reboot or firmware reset, verify that LINK_TYPE is configured as Ethernet (ETH(2)):

    PS C:\Users\Administrator> mlxconfig -d mt4119_pciconf0 q

    Device #1:
    ----------

    Device type:    ConnectX5
    PCI device:     mt4119_pciconf0

    Configurations:                              Current
      ..
             LINK_TYPE_P1                        ETH(2)
             LINK_TYPE_P2                        ETH(2)
      ..

     

    5. Configure an IP address on the ConnectX-5 port of each server. Both addresses must be in the same subnet.

    For example:

    Server S1: 12.12.12.7/24

    Server S2: 12.12.12.8/24
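
    Before running traffic, it is worth confirming that the two addresses really share a subnet. The sketch below uses Python's standard ipaddress module; the helper name same_subnet is hypothetical and is used only for illustration.

```python
import ipaddress

def same_subnet(a, b):
    """True when two 'address/prefix' strings fall in the same IP network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network

s1 = "12.12.12.7/24"  # Server S1 from the example above
s2 = "12.12.12.8/24"  # Server S2 from the example above

print(same_subnet(s1, s2))  # True
```

    If the check returns False, the two ports cannot reach each other directly over the back-to-back cable without additional routing.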

     

    6. For higher bandwidth in basic performance tests, it is recommended that you set the MTU of the ports to jumbo frames (9K).

    [Figure: ConnectX-5_jumbo.JPG]
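
    The benefit of jumbo frames for TCP traffic can be estimated from per-packet overhead. The sketch below is a rough calculation under stated assumptions: IPv4 and TCP headers without options (40 bytes), no VLAN tag, and 38 bytes of Ethernet overhead per frame (preamble, header, FCS, and inter-frame gap). The MTU values 1500 and 9000 are illustrative; the exact value behind the "9K" setting may differ on your adapter.

```python
ETH_OVERHEAD = 38    # preamble+SFD 8, Ethernet header 14, FCS 4, inter-frame gap 12
IP_TCP_HEADERS = 40  # IPv4 header 20 + TCP header 20, assuming no options

def tcp_payload_efficiency(mtu):
    """Fraction of the wire rate that carries TCP payload at a given MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)

for mtu in (1500, 9000):
    eff = tcp_payload_efficiency(mtu)
    print(f"MTU {mtu}: {eff:.2%} payload, about {100 * eff:.1f} Gb/s on a 100 Gb/s link")
```

    Under these assumptions the payload share rises from roughly 95% to roughly 99% of the line rate, and the number of packets the host must process per second drops by about a factor of six.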

     

    7. Disable the firewall (optional).

    [Figure: firewall.PNG]

     

    Performance Testing

    1. Run performance tests. Here is an example with nd_send_bw:

     

    • Run on the first server (12.12.12.7), which listens for the connection:

    PS C:\Users\Administrator> nd_send_bw -a -S 12.12.12.7

    Listening for incoming connection request...

    Connection accepted.

     

    • Run on the other server, which connects to 12.12.12.7:

    PS C:\Users\Administrator> nd_send_bw -a -C 12.12.12.7

     

    #bytes    #iterations  MR [Mmps]    Gb/s     CPU Util.
    1         100000       1.369        0.01     85.55
    2         100000       1.371        0.02     100.00
    4         100000       1.368        0.04     100.00
    8         100000       1.338        0.09     83.65
    16        100000       1.370        0.18     85.60
    32        100000       1.367        0.35     100.00
    64        100000       0.666        0.34     93.65
    128       100000       0.464        0.47     100.00
    256       100000       4.159        8.52     100.00
    512       100000       4.157        17.03    64.85
    1024      100000       4.143        33.94    100.00
    2048      100000       4.120        67.50    64.27
    4096      100000       2.917        95.59    100.00
    8192      100000       1.459        95.60    91.17
    16384     100000       0.729        95.61    100.00
    32768     100000       0.365        95.61    96.87
    65536     100000       0.182        95.61    100.00
    131072    100000       0.091        95.61    99.72
    262144    100000       0.046        95.61    99.73
    524288    100000       0.023        95.61    100.00
    1048576   100000       0.011        95.61    100.00
    2097152   100000       0.006        95.60    100.00
    4194304   100000       0.003        95.61    100.00
    8388608   100000       0.001        95.61    100.00

     

     

    Test finished. Releasing resources...
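
    The Gb/s column in the nd_send_bw output follows from the message size and the message rate (MR, in millions of messages per second). The sketch below reproduces that arithmetic for three rows copied from the output; the function name gbps is hypothetical, and small differences are expected because the printed message rate is rounded.

```python
def gbps(message_bytes, mmps):
    """Bandwidth in Gb/s from message size (bytes) and message rate (millions/s)."""
    return message_bytes * mmps * 8 / 1000

# (bytes, MR [Mmps], reported Gb/s) rows copied from the nd_send_bw output.
rows = [(256, 4.159, 8.52), (1024, 4.143, 33.94), (4096, 2.917, 95.59)]

for size, rate, reported in rows:
    print(f"{size:>5} bytes: computed {gbps(size, rate):.2f} Gb/s, reported {reported:.2f} Gb/s")
```

    Reading the table this way shows that small messages are limited by message rate, while messages of 4096 bytes and larger are limited by the link at about 95.6 Gb/s.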

     

    2. Here is an example with ntttcp:

     

     

    Note: Run this tool from the Command Prompt (cmd) rather than from PowerShell.

    Note: Start the receiver side (-r) first, then the sender side (-s).

    On the server (receiver), run:

    ntttcp.exe -r -m 28,*,12.12.12.8 -rb 2M -a 16 -t 5

    On the client (sender), run:

    ntttcp.exe -s -m 28,*,12.12.12.8 -l 512K -a 2 -t 5

    Note: The -a flag sets the number of overlapped I/Os, that is, the maximum number of I/Os that can be in flight without waiting for the first one to be acknowledged. The default is 2. It is recommended that you set this parameter to the number of CPU cores on the receiving server (for example, -a 16 for 16 cores).

    You might need to experiment with the tool's parameters to reach higher bandwidth.
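
    When experimenting with different parameters, it can be convenient to generate the receiver and sender command lines from one place so that the thread count and receiver IP always match. The sketch below is only a string builder: the function ntttcp_command is hypothetical, and it uses only the flags that appear in the commands above.

```python
def ntttcp_command(role, threads, receiver_ip, async_ios, seconds, extra):
    """Assemble an ntttcp.exe command line; role is 'r' (receiver) or 's' (sender)."""
    if role not in ("r", "s"):
        raise ValueError("role must be 'r' or 's'")
    parts = ["ntttcp.exe", f"-{role}", "-m", f"{threads},*,{receiver_ip}"]
    parts += list(extra)                                 # role-specific flags, e.g. -rb or -l
    parts += ["-a", str(async_ios), "-t", str(seconds)]  # overlapped I/Os and test duration
    return " ".join(parts)

receiver = ntttcp_command("r", 28, "12.12.12.8", 16, 5, ["-rb", "2M"])
sender = ntttcp_command("s", 28, "12.12.12.8", 2, 5, ["-l", "512K"])

print(receiver)
print(sender)
```

    With the values shown, the two printed lines match the receiver and sender commands given above.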

     

    Here is an example:

    PS C:\Users\Administrator> cmd
    C:\Users\Administrator>cd Desktop\NTttcp-v5.31\x64
    C:\Users\Administrator\Desktop\NTttcp-v5.31\x64>ntttcp.exe -r -m 28,*,12.12.12.8 -rb 2M -a 16 -t 5

    Copyright Version 5.31
    Network activity progressing...

    Thread  Time(s) Throughput(KB/s) Avg B / Compl
    ======  ======= ================ =============
         0    5.016       481582.137     65536.000
         1    5.016       456165.869     65536.000
         2    5.016       642985.646     65530.799
         3    5.016       362054.226     65476.006
         4    5.016       434615.630     65526.382
         5    5.016       211864.747     64728.688
         6    5.016       282373.206     64714.314
         7    5.016       273250.399     65228.376
         8    5.016       322279.970     64511.055
         9    5.016       366213.716     65299.392
        10    5.016       277920.255     64710.116
        11    5.016       461984.051     65534.190
        12    5.016       344650.718     65330.422
        13    5.016       201391.044     64494.154
        14    5.016       665110.048     65527.201
        15    5.016      1079732.057     65475.650
        16    5.016       307968.102     64496.552
        17    5.016       348712.740     65226.604
        18    5.016       633480.064     65536.000
        19    5.016       314475.279     64937.919
        20    5.016       774953.748     65514.427
        21    5.016       298692.185     65463.294
        22    5.016       247821.372     65113.598
        23    5.016       481352.472     65536.000
        24    5.016       272829.346     65306.939
        25    5.016       202311.565     63814.167
        26    5.016       739215.311     65520.167
        27    5.016       464969.697     65530.605

    #####  Totals:  #####

       Bytes(MEG)    realtime(s) Avg Frame Size Throughput(MB/s)
    ================ =========== ============== ================
        58541.009071       5.017       9536.285        11668.529

    Throughput(Buffers/s) Cycles/Byte       Buffers
    ===================== =========== =============
               186696.461       1.557    936656.145

    DPCs(count/s) Pkts(num/DPC)   Intr(count/s) Pkts(num/intr)
    ============= ============= =============== ==============
        32394.060        39.607      295854.295          4.337

    Packets Sent Packets Received Retransmits Errors Avg. CPU %
    ============ ================ =========== ====== ==========
          594139          6436961           0      0     14.827

     

    C:\Users\Administrator\Desktop\NTttcp-v5.31\x64>
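
    To compare the ntttcp result with the 100Gb/s link speed, the Throughput(MB/s) total can be converted to Gb/s. The sketch below assumes that ntttcp's "MB" means 2**20 bytes. That assumption is not stated in the output, but it is supported by it: dividing the byte rate by Throughput(Buffers/s) gives a buffer size of 65536 bytes, which matches the Avg B / Compl column. The function name mb_per_s_to_gbps is hypothetical.

```python
MIB = 2 ** 20  # assumption: ntttcp reports "MB" as 2**20 bytes

def mb_per_s_to_gbps(throughput_mb_per_s):
    """Convert a Throughput(MB/s) value to gigabits per second."""
    return throughput_mb_per_s * MIB * 8 / 1e9

total_mb_per_s = 11668.529   # Throughput(MB/s) from the totals above
buffers_per_s = 186696.461   # Throughput(Buffers/s) from the totals above

print(round(mb_per_s_to_gbps(total_mb_per_s), 2))    # 97.88
print(round(total_mb_per_s * MIB / buffers_per_s))   # implied buffer size: 65536
```

    Under that assumption, the run above reached roughly 97.9 Gb/s of TCP throughput.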