2 Replies Latest reply on Jun 12, 2014 11:13 PM by alexmercer

    Poor InfiniBand performance on VMware ESXi 5.1

      Hello,

      I have a really weird problem with the InfiniBand connection between ESXi hosts.

      Here is my setup:

       

      HP C7000 with BL685c G1 blades and an HP 4x DDR IB Switch Module. The blades are running VMware ESXi 5.1.0 U2 (custom HP image), and I have also installed the Mellanox drivers (MLNX-OFED-ESX-1.8.1.0) and ib-opensm on each of the hosts (http://www.hypervisor.fr/?p=4662). Here are the vmnics:

       

      # esxcli network nic list | grep 10G

      vmnic_ib0  0000:047:00.0  ib_ipoib  Up    20000  Full    00:23:7d:94:d8:7d  4092  Mellanox Technologies MT25418 [ConnectX VPI - 10GigE / IB DDR, PCIe 2.0 2.5GT/s]

      vmnic_ib1  0000:047:00.0  ib_ipoib  Up    20000  Full    00:23:7d:94:d8:7e  1500  Mellanox Technologies MT25418 [ConnectX VPI - 10GigE / IB DDR, PCIe 2.0 2.5GT/s]

       

       

      I have created a VMkernel port and a vSwitch; both the port group and the vSwitch are set up for a 4K MTU. I have also configured mlx4_core to support a 4K MTU:

       

      # esxcli system module parameters list -m=mlx4_core | grep mtu_4k

      mtu_4k                  int           1       configure 4k mtu (mtu_4k > 0)
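      For completeness, here is roughly how the parameter gets set (a sketch of the standard esxcli form; a host reboot is needed before the module picks it up):

      ```shell
      # Set the 4K MTU flag on the mlx4_core module (takes effect after reboot).
      esxcli system module parameters set -m mlx4_core -p "mtu_4k=1"

      # Verify the parameter stuck:
      esxcli system module parameters list -m mlx4_core | grep mtu_4k
      ```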

       

       

      And here is the problem. With MTU=1500:

       

      /opt/iperf/bin # ./iperf -s

      ------------------------------------------------------------

      Server listening on TCP port 5001

      TCP window size: 64.0 KByte (default)

      ------------------------------------------------------------

      [  4] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 61140

      [ ID] Interval       Transfer     Bandwidth

      [  4]  0.0-10.0 sec  3.98 GBytes  3.42 Gbits/sec

      [  5] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 58854

      [  5]  0.0-10.0 sec  4.53 GBytes  3.89 Gbits/sec

      [  4] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 51600

      [  4]  0.0-10.0 sec  3.66 GBytes  3.15 Gbits/sec

      [  5] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 60066

      [  5]  0.0-10.0 sec  4.52 GBytes  3.88 Gbits/sec

      [  4] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 50728

      [  4]  0.0-10.0 sec  4.71 GBytes  4.04 Gbits/sec

      [  5] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 58792

      [  5]  0.0-10.0 sec  4.54 GBytes  3.90 Gbits/sec

       

      With MTU=2000:

       

       

      /opt/iperf/bin # ./iperf -s

      ------------------------------------------------------------

      Server listening on TCP port 5001

      TCP window size: 64.0 KByte (default)

      ------------------------------------------------------------

      [  4] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 62523

      [ ID] Interval       Transfer     Bandwidth

      [  4]  0.0-10.0 sec  5.35 GBytes  4.59 Gbits/sec

      [  5] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 56491

      [  5]  0.0-10.0 sec  5.43 GBytes  4.66 Gbits/sec

      [  4] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 63144

      [  4]  0.0-10.0 sec  4.41 GBytes  3.79 Gbits/sec

      [  5] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 53978

      [  5]  0.0-10.0 sec  4.43 GBytes  3.81 Gbits/sec

      [  4] local 192.168.13.39 port 5001 connected with 192.168.13.36 port 61886

      [  4]  0.0-10.0 sec  5.38 GBytes  4.62 Gbits/sec

       

       

      With MTU=4092:

       

       

      /opt/iperf/bin # ./iperf -c 192.168.13.39

      ------------------------------------------------------------

      Client connecting to 192.168.13.39, TCP port 5001

      TCP window size: 75.5 KByte (default)

      ------------------------------------------------------------

      [  3] local 192.168.13.36 port 50673 connected with 192.168.13.39 port 5001

      [ ID] Interval       Transfer     Bandwidth

      [  3]  0.0-79.5 sec  8.00 GBytes   864 Mbits/sec

      /opt/iperf/bin # ./iperf -c 192.168.13.39

      ------------------------------------------------------------

      Client connecting to 192.168.13.39, TCP port 5001

      TCP window size: 75.5 KByte (default)

      ------------------------------------------------------------

      [  3] local 192.168.13.36 port 49604 connected with 192.168.13.39 port 5001

      [ ID] Interval       Transfer     Bandwidth

      [  3]  0.0-79.5 sec  8.00 GBytes   864 Mbits/sec

      /opt/iperf/bin # ./iperf -c 192.168.13.39

      ------------------------------------------------------------

      Client connecting to 192.168.13.39, TCP port 5001

      TCP window size: 35.5 KByte (default)

      ------------------------------------------------------------

      [  3] local 192.168.13.36 port 58764 connected with 192.168.13.39 port 5001

      [ ID] Interval       Transfer     Bandwidth

      [  3]  0.0-79.5 sec  8.00 GBytes   864 Mbits/sec

       

       

      All the testing was done with iperf. Any suggestions as to why I get slower throughput with MTU=4092 than with MTU=2000? AFAIK the speed should increase with a higher MTU (I can see that trend in the difference between MTU=1500 and MTU=2000).
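      In case it matters, every run above was a single stream with the default window; something along these lines (standard iperf2 flags) should show whether the window size or a single stream is the bottleneck:

      ```shell
      # Four parallel streams with a larger TCP window, 30-second run
      ./iperf -c 192.168.13.39 -w 256K -P 4 -t 30
      ```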

       

       

      Any input is welcome.

        • Re: Poor InfiniBand performance on VMware ESXi 5.1
          andre

          A few ideas where to look:

          1. Most likely you do not have a 4K MTU set on the IB fabric itself. You need to make sure opensm is configured for a 4K MTU; it is likely set to the default of 2044. If you have just one partition, add the following line to /etc/opensm/partitions.conf and then restart opensm:

                         pkey0=0x7fff,ipoib,mtu=5 : ALL=full;

          If your opensm runs on the switch, you will need to upload this file to the switch.
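          For a host-based opensm the whole step looks roughly like this (a sketch; the config path and restart command depend on how ib-opensm was packaged for your ESXi build):

          ```shell
          # Allow a 4K MTU on the default partition (mtu=5 means 4096 in opensm).
          echo "pkey0=0x7fff,ipoib,mtu=5 : ALL=full;" > /etc/opensm/partitions.conf

          # Restart opensm so the subnet is re-swept with the new partition settings
          # (init script name assumed; adjust to your installation).
          /etc/init.d/opensmd restart
          ```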

           

          2. Your switch may or may not support a 4K MTU.

           

          3. Your card (quite old) and driver may or may not support a 4K MTU. See page 18 of http://www.mellanox.com/related-docs/prod_software/Mellanox_IB_OFED_Driver_for_VMware_vSphere_User_Manual_Rev_1_8_2_4.pdf
          It says that the “maximum value of JF supported by the InfiniBand device is: 2044 bytes for the InfiniHost III family and 4052 / 4092 bytes for ConnectX® IB family”.
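          For what it's worth, the 2044 and 4092 figures fall out of subtracting the 4-byte IPoIB encapsulation header from the 2K and 4K IB MTUs (my reading, not stated in the manual):

          ```shell
          # IPoIB MTU = IB path MTU - 4-byte IPoIB encapsulation header
          echo $((2048 - 4))   # 2044 (InfiniHost III family, 2K IB MTU)
          echo $((4096 - 4))   # 4092 (ConnectX family, 4K IB MTU)
          # 4052 presumably also reserves the 40-byte GRH (4096 - 40 - 4),
          # but that is a guess.
          ```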

           

          4. In any case it also makes sense to go with the latest driver: http://www.mellanox.com/downloads/Drivers/MLNX-OFED-ESX-1.8.2.4-10EM-500.0.0.472560.zip

            • Re: Poor InfiniBand performance on VMware ESXi 5.1

              1. I do have a partition, with a partitions.conf containing the following:

              Default=0x7fff,ipoib,mtu=5:ALL=full;

              2. By its specifications the HP 4x DDR IB Switch Module supports a 4K MTU, but it's not really manageable, so it might be the switch's fault after all. I am waiting for a new Topspin switch, so if that is the problem, it will be resolved.

              3 and 4. I am going to update the drivers today, though I can see the InfiniBand ports as ConnectX family adapters, so I guess the firmware and the HCA itself support 4K. Still, I will do some research in that direction too.


              Thanks a lot. I will keep you guys updated, and if anyone has any other ideas, please shoot. I really want to get this thing going so I can test the virtual storage.