HowTo Configure SR-IOV for ConnectX-4 with ESX 5.5/6.0 Native (Ethernet)

Version 9

    This post describes how to configure SR-IOV (Ethernet) for Mellanox ConnectX-4 adapters using the ESX 5.5/6.0 native driver.

    Note: Setting up a VM is out of the scope of this post.

     

     


    Overview

    SR-IOV configuration consists of the following steps:

    1. Enable Virtualization (SR-IOV) in the BIOS (prerequisites).

    2. Enable SR-IOV in the firmware.

    3. Enable SR-IOV in the native driver (nmlx5_core).

    4. Map the Virtual Machine (VM) to the relevant port via SR-IOV.

     

    Setup and Prerequisites

    Ensure that the following steps have been taken:

    1. Make sure that two servers are connected via an Ethernet switch.

     

    2. Make sure that SR-IOV is enabled in the BIOS of the specific server. Each server has different BIOS configuration options for virtualization. See HowTo Set Dell PowerEdge R730 BIOS parameters to support SR-IOV for BIOS configuration examples.

     

    3. Make sure that the latest native driver is installed on the hypervisor. Refer to Mellanox.com.

    4. Install Mellanox Firmware Tools (MFT) for ESX. Refer to HowTo Install MFT for ESX VMWare.

     

    Configuration

    I. Enable SR-IOV on the Firmware

     

    1. Start the MST service and check its status.

    # /opt/mellanox/bin/mst start

    Module nmst loaded successfully

     

    # /opt/mellanox/bin/mst status

     

    MST devices:
    ------------
    mt4115_pciconf0
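
    When scripting these steps, the MST device name can be read from the mst status output instead of being typed by hand. The following is a minimal sketch, assuming the output has the same layout as shown above, with device names of the form mt<digits>_pciconf<digits>. The function name mst_devices is hypothetical and not part of MFT.

```shell
# Print every MST device name found in `mst status` output read from stdin.
# Assumes names look like mt4115_pciconf0, as in the sample output above.
mst_devices() {
    grep -o 'mt[0-9][0-9]*_pciconf[0-9][0-9]*'
}
```

    A possible use on the hypervisor is: /opt/mellanox/bin/mst status | mst_devices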

     

    2. Query the current configuration of the device.

    # /opt/mellanox/bin/mlxconfig -d mt4115_pciconf0 q

     

    Device #1:

    ----------

     

    Device type:    ConnectX4      

    PCI device:     mt4115_pciconf0

     

    Configurations:                              Current

             SRIOV_EN                            False(0)        

             NUM_OF_VFS                          0              

             PF_LOG_BAR_SIZE                     5              

             VF_LOG_BAR_SIZE                     5              

             NUM_PF_MSIX                         63             

             NUM_VF_MSIX                         11             

             LINK_TYPE_P1                        ETH(2)         

             LINK_TYPE_P2                        ETH(2)         

             LOG_DCR_HASH_TABLE_SIZE             14             

             DCR_LIFO_SIZE                       16384          

             ROCE_NEXT_PROTOCOL                  254            

             ROCE_CC_ALGORITHM_P1                ECN(0)         

             ROCE_CC_PRIO_MASK_P1                0              

             ROCE_CC_ALGORITHM_P2                ECN(0)         

             ROCE_CC_PRIO_MASK_P2                0              

             CLAMP_TGT_RATE_P1                   0              

             CLAMP_TGT_RATE_AFTER_TIME_INC_P1    1              

             RPG_TIME_RESET_P1                   600            

             RPG_BYTE_RESET_P1                   32767          

             RPG_THRESHOLD_P1                    5              

             RPG_MAX_RATE_P1                     0              

             RPG_AI_RATE_P1                      5              

             RPG_HAI_RATE_P1                     50             

             RPG_GD_P1                           11             

             RPG_MIN_DEC_FAC_P1                  50             

             RPG_MIN_RATE_P1                     1              

             RATE_TO_SET_ON_FIRST_CNP_P1         100            

             DCE_TCP_G_P1                        4              

             DCE_TCP_RTT_P1                      1              

             RATE_REDUCE_MONITOR_PERIOD_P1       4              

             INITIAL_ALPHA_VALUE_P1              0              

             MIN_TIME_BETWEEN_CNPS_P1            0              

             CNP_DSCP_P1                         0              

             CNP_802P_PRIO_P1                    7              

             CLAMP_TGT_RATE_P2                   0              

             CLAMP_TGT_RATE_AFTER_TIME_INC_P2    1              

             RPG_TIME_RESET_P2                   600            

             RPG_BYTE_RESET_P2                   32767          

             RPG_THRESHOLD_P2                    5              

             RPG_MAX_RATE_P2                     0              

             RPG_AI_RATE_P2                      5              

             RPG_HAI_RATE_P2                     50             

             RPG_GD_P2                           11             

             RPG_MIN_DEC_FAC_P2                  50             

             RPG_MIN_RATE_P2                     1              

             RATE_TO_SET_ON_FIRST_CNP_P2         100            

             DCE_TCP_G_P2                        4              

             DCE_TCP_RTT_P2                      1              

             RATE_REDUCE_MONITOR_PERIOD_P2       4              

             INITIAL_ALPHA_VALUE_P2              0              

             MIN_TIME_BETWEEN_CNPS_P2            0              

             CNP_DSCP_P2                         0              

             CNP_802P_PRIO_P2                    7              

             PORT_OWNER                          True(1)        

             ALLOW_RD_COUNTERS                   True(1)        

             IP_VER                              IPv4(0)       
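
    Only SRIOV_EN and NUM_OF_VFS matter for this procedure, so it can help to filter them out of the long query output. The sketch below assumes the two-column "name value" layout shown above. The helper name mlxconfig_value is hypothetical.

```shell
# Print the current value of one mlxconfig parameter, read from the
# output of `mlxconfig -d <device> q` on stdin.
# $1 is the parameter name, for example SRIOV_EN or NUM_OF_VFS.
mlxconfig_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}
```

    A possible use is: /opt/mellanox/bin/mlxconfig -d mt4115_pciconf0 q | mlxconfig_value SRIOV_EN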

     

    3. Enable SR-IOV and set the desired number of Virtual Functions (VFs).

    • SRIOV_EN=1
    • NUM_OF_VFS=8   ; This example configures eight VFs per port.

    # /opt/mellanox/bin/mlxconfig -d mt4115_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=8

    Device #1:
    ----------

    Device type:    ConnectX4
    PCI device:     mt4115_pciconf0

    Configurations:                              Current         New
             SRIOV_EN                            False(0)        True(1)
             NUM_OF_VFS                          0               8

    Apply new Configuration? ? (y/n) [n] : y
    Applying... Done!
    -I- Please reboot machine to load new configurations.

    Note: mlxconfig must be run for each PCI device (adapter). The driver configuration, in contrast, is per module, so it applies to all adapters installed on the server.

    4. Reboot the server.

     

    Note: At this point, the VFs are not seen when using lspci. Only when SR-IOV is enabled on the driver will you be able to see them.

    # lspci -d | grep Mellanox

    0000:06:00.0 Network controller: Mellanox Technologies MT27770 Family [ConnectX-4] [vmnic4]

    0000:06:00.1 Network controller: Mellanox Technologies MT27770 Family [ConnectX-4] [vmnic5]

     

    II. Enable SR-IOV on the Driver

     

    1. Get the module parameter list as follows:

    # esxcli system module parameters list -m nmlx5_core

    Name                 Type  Value  Description
    -------------------  ----  -----  -------------
    device_rss           int          This parameter is Obsolete - Please use 'drss' parameter
                                      Enable device RSS steering mode
                                      Values : 1 - enabled, 0 - disabled
                                      Default: 0
    drss                 int          Number of HW queues for DEFQ RSS
                                      Values : 2-16, 0 - disabled
                                      When this value is != 0, DEFQ RSS is enabled with 1 RSS Uplink queue that manages 'drss' HW queues.
                                      When this value is set to 16 the driver works in DEVICE RSS mode.
                                      Note: The value must be a power of 2.
                                      Note: Currently only DEVICE RSS or NETQ are supported (only values 0, 16).
                                      Default: 0
    enable_nmlx_debug    int          Enable debug prints
                                      Values : 1 - enabled, 0 - disabled
                                      Default: 0
    max_vfs              int   0      Number of PCI VFs to initialize
                                      Values : 0 - disabled
                                      Default: 0
    mst_recovery         int          Enable recovery mode(only NMST module is loaded)
                                      Values : 1 - enabled, 0 - disabled
                                      Default: 0
    supported_num_ports  int          Total number of supported ports
                                      Values : 2 to 8
                                      Default: 4

     

    2. Enable SR-IOV in the driver and set the max_vfs module parameter.

    # esxcli system module parameters set -m nmlx5_core -p max_vfs=4

     

    Note 1: Configure at least one more VF in the firmware (NUM_OF_VFS) than in the driver (max_vfs). In this example, eight VFs are configured in the firmware and four in the driver.

    Note 2: mlxconfig must be run for each PCI device (adapter). The driver configuration, in contrast, is per module, so it applies to all adapters installed on the server.

    Note 3: A change to the number of VFs is persistent across reboots.
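
    The rule in Note 1 is a simple comparison and can be checked before max_vfs is set. The following is a minimal sketch, assuming both counts are already known as plain integers. The function name vf_counts_ok is hypothetical.

```shell
# Succeed only if the firmware VF count exceeds the driver VF count,
# as Note 1 recommends.
# $1 = NUM_OF_VFS (firmware), $2 = max_vfs (driver).
vf_counts_ok() {
    [ "$1" -gt "$2" ]
}
```

    With the values from this post, vf_counts_ok 8 4 succeeds, while vf_counts_ok 4 4 fails.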

     

    3. Reboot the server so that the new module parameter takes effect, then check the PCI bus and verify that the VFs are listed (with the same number of VFs on each port).

    # lspci -d | grep Mellanox

    0000:21:00.0 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4] [vmnic2]

    0000:21:00.1 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4] [vmnic3]

    0000:21:00.2 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.0_VF_0]

    0000:21:00.3 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.0_VF_1]

    0000:21:00.4 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.0_VF_2]

    0000:21:00.5 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.0_VF_3]

    0000:21:00.7 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.1_VF_0]

    0000:21:01.0 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.1_VF_1]

    0000:21:01.1 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.1_VF_2]

    0000:21:01.2 Network controller: Mellanox Technologies MT27700 Family [ConnectX-4 VF] [PF_0.33.1_VF_3]

     

    At this point you can see four VFs and one Physical Function (PF) per port.
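
    Counting the VFs by eye is error-prone when more of them are configured. The sketch below tallies VFs per PF, assuming each VF line ends with a label such as [PF_0.33.0_VF_0], as in the listing above. The function name vfs_per_pf is hypothetical.

```shell
# Count VFs per physical function from `lspci` output on stdin.
# Assumes VF lines end with a label such as [PF_0.33.0_VF_0];
# prints one "<PF label> <count>" line per PF, sorted by label.
vfs_per_pf() {
    sed -n 's/.*\[\(PF_[0-9.]*\)_VF_[0-9]*].*/\1/p' | sort | uniq -c |
        awk '{ print $2, $1 }'
}
```

    Fed the listing above, this should report a count of 4 for each of PF_0.33.0 and PF_0.33.1. PF lines such as [vmnic2] carry no VF label and are ignored.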

     

    III. Add Network Adapter to the VM in SR-IOV Mode

    Note 1: Before you start, power off the VM.

    Note 2: Install the Mellanox driver for the guest OS (for example, MLNX_OFED or WinOF) on the guest VM.

    Note 3: Make sure the VM hardware version is 10 or above, and upgrade it if needed through the Compatibility section. Otherwise, SR-IOV will not appear as an option in the network adapter selection.

     

    1. Select the VM and go to "Edit Settings".

     

    vm1.png

     

    2. Add the new Network device by going to the bottom of the screen and selecting Network.

    vm2.png

     

    3. Under Adapter Type select the SR-IOV passthrough connectivity option.

    vm3.png

     

     

    4. Check the Reserve all guest memory (All locked) checkbox.

    vm4.png

     

    5. With the New Network option selected, attach the VM to the desired network by selecting VM_Network1 from the combo box at the bottom of the screen.

     

    vm5.png

     

    Notes on MAC address and MTU:

        a. You can keep the automatically generated MAC address (the default) or change it manually.

        b. The hypervisor MTU should be greater than or equal to the guest VM MTU; otherwise, packets may be dropped. You may modify “Set Guest OS MTU change” to allow changing the MTU from the guest.

     

    6. Open the VM console (a Linux command line or Windows) and make sure that the interface is present. Configure the IP address and check network connectivity.

     

    Troubleshooting

    1. At least one more VF must be configured in the firmware than in the driver. In this example, eight VFs are configured in the firmware and four in the driver.

    2. mlxconfig must be run for each PCI device (adapter). The driver configuration, in contrast, is per module, so it applies to all adapters installed on the server.

    3. Make sure the VM hardware version is 10 or above, and upgrade it if needed through the Compatibility section. Otherwise, SR-IOV will not appear as an option in the network adapter selection.