1 Reply Latest reply on Apr 4, 2017 3:03 PM by yairi

    Poor IPoIB performance on two Mellanox MHGH28-XTC ConnectX VPI Dual-Port 20Gb/s

    funtiesto

      Hi

       

      I set up a new fileserver running Gentoo Linux in a Supermicro case, with a Xeon, 12GB ECC RAM and a 21TB 9265 RAID6 array (read/write ~800MB/s).

      Supermicro SC846 24x SATA Storage Server 9650SE-24ML SAS846A 2x PWS-1K21P-1R | eBay

       

      I figured InfiniBand would be the best way to connect it to my workstation, which runs Windows 8.1 x64:

      ASUS P8B WS Professional, i3 3240, 16GB ECC RAM and 6TB 9265 RAID5 (read/write ~600MB/s)

       

      I bought two Mellanox MHGH28-XTC cards and a genuine Mellanox InfiniBand CX4 cable (SFF-8470 to SFF-8470, 3m/10ft, MCC4L30-003).

       

      On Gentoo I compiled the kernel modules that provide InfiniBand support (mlx4_ib, ib_umad, ib_ipoib, ib_uverbs). The controller is visible to the system, and the ib0 and ib1 interfaces can be configured.

       

      On Windows 8.1 I had more problems. As I eventually learned from Mellanox support, the last working driver for this ConnectX 1 controller is MLNX_VPI_WinOF-3_2_0_wlh_x64 (I had to run it in Windows 7 compatibility mode). The controller installed successfully and two IPoIB adapters appeared.

       

      I set up OpenSM.

      At first I used the default settings:

      Windows: port type InfiniBand (16 Gbits/s)

      Gentoo fileserver: port type InfiniBand (16 Gbits/s), ib0 in datagram mode

      ibstat showed the link as up and active

      ping works OK
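
      The 16 Gbits/s shown for both ports is consistent with a 4x DDR link, assuming the usual InfiniBand figures of 5 Gbits/s signalling per DDR lane and 8b/10b encoding. Those inputs are assumptions based on the card's 20Gb/s rating, not values read from the driver, and the helper name is hypothetical. A quick sketch of the arithmetic:

```shell
#!/bin/sh
# Sketch: usable data rate of an InfiniBand link that uses 8b/10b encoding.
# Assumed inputs (not read from the driver): lane count and per-lane
# signalling rate in whole Gbit/s. ib_data_rate_gbps is a hypothetical name.
ib_data_rate_gbps() {
    lanes="$1"
    lane_gbps="$2"
    # 8b/10b carries 8 data bits in every 10 bits on the wire.
    echo $(( lanes * lane_gbps * 8 / 10 ))
}

# 4x DDR: 4 lanes at 5 Gbit/s each
ib_data_rate_gbps 4 5
```

      For 4 lanes at 5 Gbits/s this prints 16, so the reported port speed already has the encoding overhead taken out of the 20Gb/s headline figure.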

       

      But the iperf and file copy results are strange:

      Copy over samba from fileserver to Windows: ~1MB/s (terribly slow)

      Copy over samba from Windows to fileserver: ~200MB/s (still too slow)

      Windows iperf -s, fileserver iperf -c (from fileserver to Windows): ~3Mbits/s (terribly slow)

      Windows iperf -s, fileserver iperf -c -R (from Windows to fileserver): ~6Gbits/s (still too slow)

       

      C:\Windows\system32>ping 192.168.255.1

       

      Pinging 192.168.255.1 with 32 bytes of data:

      Reply from 192.168.255.1: bytes=32 time<1ms TTL=128

      Reply from 192.168.255.1: bytes=32 time<1ms TTL=128

       

      Ping statistics for 192.168.255.1:

      Packets: Sent = 2, Received = 2, Lost = 0 (0% loss),

      Approximate round trip times in milli-seconds:

      Minimum = 0ms, Maximum = 0ms, Average = 0ms

      Control-C

      ^C

      C:\Windows\system32>ping 192.168.255.2

       

      Pinging 192.168.255.2 with 32 bytes of data:

      Reply from 192.168.255.2: bytes=32 time<1ms TTL=64

      Reply from 192.168.255.2: bytes=32 time<1ms TTL=64

       

      Ping statistics for 192.168.255.2:

      Packets: Sent = 2, Received = 2, Lost = 0 (0% loss),

      Approximate round trip times in milli-seconds:

      Minimum = 0ms, Maximum = 0ms, Average = 0ms

      Control-C

      ^C

      C:\Windows\system32>iperf3 -s

      -----------------------------------------------------------

      Server listening on 5201

      -----------------------------------------------------------

      iperf3: interrupt - the server has terminated

       

      C:\Windows\system32>iperf3 -c 192.168.255.2

      Connecting to host 192.168.255.2, port 5201

      [ 4] local 192.168.255.1 port 49229 connected to 192.168.255.2 port 5201

      [ ID] Interval Transfer Bandwidth

      [ 4] 0.00-1.00 sec 1.04 GBytes 8.94 Gbits/sec

      [ 4] 1.00-2.00 sec 996 MBytes 8.35 Gbits/sec

      [ 4] 2.00-3.00 sec 748 MBytes 6.28 Gbits/sec

      [ 4] 3.00-4.00 sec 774 MBytes 6.49 Gbits/sec

      [ 4] 4.00-5.00 sec 776 MBytes 6.51 Gbits/sec

      [ 4] 5.00-6.00 sec 1.01 GBytes 8.69 Gbits/sec

      [ 4] 6.00-7.00 sec 808 MBytes 6.77 Gbits/sec

      [ 4] 7.00-8.00 sec 703 MBytes 5.90 Gbits/sec

      [ 4] 8.00-9.00 sec 740 MBytes 6.21 Gbits/sec

      [ 4] 9.00-10.00 sec 810 MBytes 6.79 Gbits/sec

      - - - - - - - - - - - - - - - - - - - - - - - - -

      [ ID] Interval Transfer Bandwidth

      [ 4] 0.00-10.00 sec 8.26 GBytes 7.09 Gbits/sec sender

      [ 4] 0.00-10.00 sec 8.26 GBytes 7.09 Gbits/sec receiver

       

      iperf Done.

       

      C:\Windows\system32>iperf3 -c 192.168.255.2 -R

      Connecting to host 192.168.255.2, port 5201

      Reverse mode, remote host 192.168.255.2 is sending

      [ 4] local 192.168.255.1 port 49231 connected to 192.168.255.2 port 5201

      [ ID] Interval Transfer Bandwidth

      [ 4] 0.00-1.02 sec 433 KBytes 3.49 Mbits/sec

      [ 4] 1.02-2.00 sec 278 KBytes 2.31 Mbits/sec

      [ 4] 2.00-3.00 sec 384 KBytes 3.14 Mbits/sec

      [ 4] 3.00-4.00 sec 429 KBytes 3.51 Mbits/sec

      [ 4] 4.00-5.00 sec 313 KBytes 2.56 Mbits/sec

      [ 4] 5.00-6.00 sec 399 KBytes 3.27 Mbits/sec

      [ 4] 6.00-7.00 sec 358 KBytes 2.93 Mbits/sec

      [ 4] 7.00-8.00 sec 384 KBytes 3.14 Mbits/sec

      [ 4] 8.00-9.00 sec 247 KBytes 2.02 Mbits/sec

      [ 4] 9.00-10.00 sec 452 KBytes 3.70 Mbits/sec

      - - - - - - - - - - - - - - - - - - - - - - - - -

      [ ID] Interval Transfer Bandwidth Retr

      [ 4] 0.00-10.00 sec 3.75 MBytes 3.15 Mbits/sec 591 sender

      [ 4] 0.00-10.00 sec 3.59 MBytes 3.01 Mbits/sec receiver

       

      iperf Done.

       

      C:\Windows\system32>
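
      What stands out in the reverse run is the 591 retransmits for only 3.75 MBytes transferred. When repeating the test from the Linux side, a small filter like the sketch below can reduce each run to its summary lines so that runs are easier to compare. It assumes the text layout shown in the transcript above, and iperf3_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: reduce iperf3 text output to one line per summary row.
# Assumes the layout shown in the transcript (summary rows end in
# "sender" or "receiver"). iperf3_summary is a hypothetical helper name.
iperf3_summary() {
    awk '$NF == "sender" || $NF == "receiver" {
        if ($(NF-1) ~ /^[0-9]+$/)
            # Row has a Retr column just before the role.
            print $NF, $(NF-3), $(NF-2), "retr=" $(NF-1)
        else
            # No Retr column on this row.
            print $NF, $(NF-2), $(NF-1)
    }'
}

# Usage on the Linux side:
#   iperf3 -c 192.168.255.1 | iperf3_summary
```

      Fields are counted from the end of the line so that the width of the stream ID column does not matter.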

       

      Then I switched ib0 to connected mode:

      echo connected > /sys/class/net/ib0/mode

      and this changed nothing.

      Then I aligned the MTU to 4092

      and this changed nothing either.
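
      Since neither change had any visible effect, it may be worth reading the values back to confirm that the kernel actually accepted them. The sketch below prints the mode and MTU that sysfs reports for an interface. ipoib_status is a hypothetical helper, and its optional second argument (an alternative sysfs root) is only there so it can be tried against a copy of the files.

```shell
#!/bin/sh
# Sketch: print the IPoIB mode and MTU the kernel currently reports for an
# interface, to confirm that a change really took effect.
# ipoib_status is a hypothetical helper; the optional second argument
# (sysfs root) exists only so the function can be tried against a copy.
ipoib_status() {
    iface="$1"
    sysfs="${2:-/sys}"
    dir="$sysfs/class/net/$iface"
    if [ ! -r "$dir/mode" ]; then
        echo "$iface: no IPoIB mode file under $dir" >&2
        return 1
    fi
    echo "$iface mode=$(cat "$dir/mode") mtu=$(cat "$dir/mtu")"
}

# Usage on the fileserver; prints a line of the form
# "ib0 mode=<mode> mtu=<mtu>":
#   ipoib_status ib0
```

      On Linux, connected mode accepts a much larger IPoIB MTU than datagram mode (up to 65520), so an MTU of 4092 does not by itself show which mode is active. Reading the mode file back is the more direct check.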

       

       

      Then I decided to test Windows to Windows (and finally Linux to Linux).

      With Windows 8.1 x64 running on both machines and the adapters set to InfiniBand port type, I got:

      Copy over samba (both directions same results) ~600MB/s (acceptable)

      iperf (both directions same results) 8Gbits/s (acceptable)

       

      Then I changed the port type to Ethernet (10Gbits/s). The results were pretty much the same as above.

       

      Finally I ran Gentoo on the fileserver and Ubuntu 16.04 on the workstation.

      With out-of-the-box settings:

      ib0 mode datagram, port type Infiniband

      Copy over samba (both directions same results) ~1MB/s (terribly slow)

      iperf (both directions same results) 3Mbits/s (terribly slow)

       

      Then I changed ib0 to connected mode on both ends, and the results were good:

      Copy over samba (both directions same results) ~600MB/s (good; more likely a RAID6 limitation)

      iperf (both directions same results) 12Gbits/s (good)

       

      Now I am stuck. The poor performance in the Windows - Linux configuration seems to be caused by a software issue.

      I had the idea of changing the interface mode to connected on Windows, but I don't know how.

      I also thought of setting up the connection with the Ethernet port type, but changing this on Windows results in "cable disconnected".

      I found out how to change this on Linux:

      echo eth > /sys/bus/pci/devices/0000\:01\:00.0/mlx4_port1

      But this results in "Physical state: Disabled" in ibstat, and the cable still shows as disconnected. Maybe I am missing the module for the Ethernet port type?
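
      On Linux, Ethernet operation of mlx4 ports is normally handled by the mlx4_en module, which is not among the modules compiled earlier (mlx4_ib, ib_umad, ib_ipoib, ib_uverbs). Assuming that is the missing piece, a minimal check could look like the sketch below. module_present is a hypothetical helper, and its optional second argument (an alternative sysfs root) is only there so it can be tried against a copy.

```shell
#!/bin/sh
# Sketch: report whether a kernel module is known to the running kernel by
# looking for its directory under /sys/module.
# module_present is a hypothetical helper; the optional second argument
# (sysfs root) exists only so the function can be tried against a copy.
module_present() {
    name="$1"
    sysfs="${2:-/sys}"
    [ -d "$sysfs/module/$name" ]
}

# Usage on the fileserver:
#   module_present mlx4_en || echo "mlx4_en is not loaded"
```

      If the module turns out to be absent, enabling CONFIG_MLX4_EN in the kernel configuration and running modprobe mlx4_en would be the next thing to try. Whether this particular card and firmware then bring up an Ethernet link is a separate question that this check does not answer.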

       

       

      Why, with stock settings, is Windows > fileserver slow but acceptable (200MB/s, roughly 1.6Gbits/s), while the opposite direction is so terribly slow?

      I hope this is not an incompatibility between the two OSes, but the fact that the Linux - Linux and Windows - Windows configurations do not show the issue makes me wonder.
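
      To put the samba figures on the same scale as the iperf figures, the copy rates can be converted from MB/s to Gbits/s. A minimal sketch, assuming decimal units (1 MB = 10^6 bytes) and using a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: convert a file-copy rate in MB/s to Gbit/s (decimal units,
# 1 MB = 10^6 bytes). mbytes_to_gbits is a hypothetical helper name.
mbytes_to_gbits() {
    awk -v mb="$1" 'BEGIN { printf "%.1f\n", mb * 8 / 1000 }'
}

mbytes_to_gbits 200   # Windows > fileserver over samba
mbytes_to_gbits 600   # Linux - Linux over samba in connected mode
```

      This prints 1.6 and 4.8. The 200MB/s samba copy therefore sits well below the ~6Gbits/s that iperf reaches in the same direction, so that direction may be limited by something other than the link itself.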

       

      I appreciate any help.

       

      br

      Paweł