18 Replies Latest reply on Aug 22, 2013 8:53 PM by drolfe

    Getting eIPoIB to work ?

    drolfe

      OK, I've been trying to set up eIPoIB. My InfiniBand network is up, ib0 is configured for IPoIB, and I can see the new eth2 interface (the virtual eIPoIB device), but there is nothing in its vifs file, so I can't ping from compute node to compute node over eth2.

       

      Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-39-generic x86_64)
      
      
       * Documentation:  https://help.ubuntu.com/
      
      
        System information as of Thu May  9 22:31:11 EST 2013
      
      
        System load:  1.83              Users logged in:     0
        Usage of /:   2.7% of 59.39GB   IP address for eth0: 192.168.10.101
        Memory usage: 0%                IP address for ib0:  10.10.10.101
        Swap usage:   0%                IP address for eth2: 20.20.20.101
        Processes:    112
      
      
        Graph this data and manage this system at https://landscape.canonical.com/
      
      
      Last login: Thu May  9 21:55:22 2013 from maas.local
      ubuntu@blade01:~$ sudo su -
      root@blade01:~# cat /sys/class/net/eth2/eth/vifs
      root@blade01:~#
      root@blade01:~#
      root@blade01:~# ibstat
      CA 'mlx4_0'
              CA type: MT25418
              Number of ports: 2
              Firmware version: 2.8.0
              Hardware version: a0
              Node GUID: 0x001b78ffff33ee58
              System image GUID: 0x001b78ffff33ee5b
              Port 1:
                      State: Active
                      Physical state: LinkUp
                      Rate: 20
                      Base lid: 5
                      LMC: 0
                      SM lid: 3
                      Capability mask: 0x02510868
                      Port GUID: 0x001b78ffff33ee59
                      Link layer: InfiniBand
              Port 2:
                      State: Active
                      Physical state: LinkUp
                      Rate: 20
                      Base lid: 2
                      LMC: 0
                      SM lid: 1
                      Capability mask: 0x02510868
                      Port GUID: 0x001b78ffff33ee5a
                      Link layer: InfiniBand
      root@blade01:~# ifconfig ib0
      ib0       Link encap:UNSPEC  HWaddr A0-00-01-00-FE-80-00-00-00-00-00-00-00-00-00-00
                inet addr:10.10.10.101  Bcast:10.10.10.255  Mask:255.255.255.0
                inet6 addr: fe80::21b:78ff:ff33:ee59/64 Scope:Link
                UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
                RX packets:6 errors:0 dropped:0 overruns:0 frame:0
                TX packets:30 errors:0 dropped:8 overruns:0 carrier:0
                collisions:0 txqueuelen:1024
                RX bytes:1515 (1.5 KB)  TX bytes:5752 (5.7 KB)
      
      
      root@blade01:~# ping 10.10.10.102
      PING 10.10.10.102 (10.10.10.102) 56(84) bytes of data.
      64 bytes from 10.10.10.102: icmp_req=2 ttl=64 time=2.24 ms
      64 bytes from 10.10.10.102: icmp_req=3 ttl=64 time=0.033 ms
      ^C
      --- 10.10.10.102 ping statistics ---
      3 packets transmitted, 2 received, 33% packet loss, time 2000ms
      rtt min/avg/max/mdev = 0.033/1.141/2.249/1.108 ms
      root@blade01:~# ifconfig eth2
      eth2      Link encap:Ethernet  HWaddr 00:1b:78:33:ee:59
                inet addr:20.20.20.101  Bcast:20.20.20.255  Mask:255.255.255.0
                inet6 addr: fe80::21b:78ff:fe33:ee59/64 Scope:Link
                UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                RX packets:0 errors:0 dropped:0 overruns:0 frame:0
                TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:0
                RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
      
      
      root@blade01:~# ping 20.20.20.102
      PING 20.20.20.102 (20.20.20.102) 56(84) bytes of data.
      ^C
      --- 20.20.20.102 ping statistics ---
      3 packets transmitted, 0 received, 100% packet loss, time 2015ms
      
      
      root@blade01:~# cat /sys/class/net/eth2/eth/vifs
      root@blade01:~# ????
      root@blade01:~# ethtool -i eth2
      driver: eth_ipoib
      version: 1.0.0
      firmware-version: 1
      bus-info: ib0
      supports-statistics: yes
      supports-test: no
      supports-eeprom-access: no
      supports-register-dump: no
      root@blade01:~#
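
For anyone retracing these checks, here is a minimal sketch that dumps the eIPoIB state of one interface in a single call. It assumes the sysfs layout seen in this thread (a `vifs` file, and a `slaves` file next to it, under `/sys/class/net/<ethX>/eth/`). The function name `eipoib_state` and its optional second argument (an alternative sysfs root, handy for dry runs) are made up for illustration and are not part of any Mellanox tool.

```shell
#!/bin/sh
# Print the slaves and vifs tables of an eIPoIB interface.
# Usage: eipoib_state <ethX> [sysfs_net_root]
eipoib_state() {
    dev=$1
    root=${2:-/sys/class/net}      # second argument is a hypothetical override
    dir="$root/$dev/eth"
    if [ ! -d "$dir" ]; then
        echo "$dev: no eth/ directory, probably not an eIPoIB interface" >&2
        return 1
    fi
    for f in slaves vifs; do
        # Read the content rather than testing the file size:
        # sysfs attributes usually report a fixed size even when empty.
        content=$(cat "$dir/$f" 2>/dev/null)
        if [ -n "$content" ]; then
            echo "$f:"
            printf '%s\n' "$content" | sed 's/^/  /'
        else
            echo "$f: (empty)"
        fi
    done
}
```

Calling `eipoib_state eth2` on the node above would be expected to report both tables as empty, which matches the failed ping on 20.20.20.102.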
      
      
      
        • Re: Getting eIPoIB to work ?
          drolfe

          OK, thanks to a Google search: with another three commands on each node I can now use the eIPoIB interfaces.

           

          The first command creates the ib0 sub-interface, in this case ib0.1:

           

          root@blade01:~# echo .1 > /sys/class/net/ib0/create_child
          
          
          root@blade01:~# ifconfig ib0.1
          ib0.1     Link encap:UNSPEC  HWaddr A0-00-01-10-FE-80-00-00-00-00-00-00-00-00-00-00
                    UP BROADCAST RUNNING SLAVE MULTICAST  MTU:2044  Metric:1
                    RX packets:1986136 errors:0 dropped:0 overruns:0 frame:0
                    TX packets:5610181 errors:0 dropped:0 overruns:0 carrier:0
                    collisions:0 txqueuelen:1024
                    RX bytes:79455932 (79.4 MB)  TX bytes:24752145636 (24.7 GB)
          
          
          root@blade01:~#
          

           

          Then the last two commands complete the enslavement:

           

          root@blade01:~# echo +ib0.1 > /sys/class/net/eth2/eth/slaves
          root@blade01:~# echo +ib0.1 00:1b:78:33:6e:95 > /sys/class/net/eth2/eth/vifs
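
The three manual steps can be collected into one function so they are easy to repeat on each node. This is only a sketch of the sequence shown in this post: `eipoib_enslave` and its optional sysfs-root argument are invented names, and the child suffix and MAC address must be the right values for your own node.

```shell
#!/bin/sh
# Manual eIPoIB setup, mirroring the three commands in this post.
# Usage: eipoib_enslave <ibX> <suffix> <ethX> <mac> [sysfs_net_root]
eipoib_enslave() {
    ib=$1; suffix=$2; eth=$3; mac=$4
    root=${5:-/sys/class/net}      # fifth argument is a hypothetical override
    child="$ib.$suffix"

    # 1. Create the child interface (ib0.1 in the post) unless it already exists.
    if [ ! -e "$root/$child" ]; then
        echo ".$suffix" > "$root/$ib/create_child" || return 1
    fi
    # 2. Enslave the child to the eIPoIB Ethernet device.
    echo "+$child" > "$root/$eth/eth/slaves" || return 1
    # 3. Register the MAC address served through that slave.
    echo "+$child $mac" > "$root/$eth/eth/vifs" || return 1
}
```

Run as root, `eipoib_enslave ib0 1 eth2 00:1b:78:33:6e:95` issues the same three writes as above.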
          

           

          Speed tests are similar to normal IPoIB without a larger MTU or SDP set up:

           

          root@blade01:~# netperf -H 20.20.20.102 -c -C -- -m 1400
          MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 20.20.20.102 (20.20.20.102) port 0 AF_INET : demo
          Recv   Send    Send                          Utilization       Service Demand
          Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
          Size   Size    Size     Time     Throughput  local    remote   local   remote
          bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
          
          
           87380  65536   1400    10.00      5388.11   21.91    25.73    1.333   1.565
          root@blade01:~#
          
            • Re: Getting eIPoIB to work ?
              drolfe

              My question now is: why wasn't the ib0.1 sub-interface created and enslaved for me? The user manual says:

               

              "The IPoIB daemon (ipoibd) detects the new VIFs and creates a new IPoIB instances, as a result number of IPoIB interfaces (ibX.Y) are shown as being created/destroyed, and are being enslaved to the corresponding ethX interface to serve any active VIF in the system according to the set configuration, This process is done automatically by the ipoibd service."

               

              Or do you have to set up the PIF manually, with only the VIFs being set up automatically?

                • Re: Getting eIPoIB to work ?
                  ali

                  Hi drolfe,

                  Indeed, the ipoib daemon (ipoibd) should create and enslave the ibX.Y interfaces automatically. No user intervention is needed.

                   

                  Please make sure that it's running.

                  You can also try to edit /etc/init.d/ipoibd and enable the debug flag to see if it's really running, and whether it has any unexpected issues.

                    • Re: Getting eIPoIB to work ?

                      ipoibd in the public beta is broken on Ubuntu. I needed to hack it in order to make it run. Even when it is properly configured, the daemon exits immediately because of a check for whether the operating system is supported, and Ubuntu is not. If you comment out this guard, the daemon starts, but it calls executables through hardcoded paths, and those executables are in different locations on Ubuntu than on Red Hat. Once these are fixed, the daemon works properly.
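
To illustrate the kind of change described here (this is not the actual ipoibd source, which isn't reproduced in this thread): rather than calling a tool by an absolute path that only exists on one distribution, a script can look the tool up on `PATH`. `find_tool` is a hypothetical helper name.

```shell
#!/bin/sh
# Resolve an external tool via PATH instead of a hardcoded location.
# Prints the resolved path (or the bare name for shell builtins) and
# returns non-zero if the tool cannot be found.
find_tool() {
    tool=$1
    path=$(command -v "$tool" 2>/dev/null) || {
        echo "required tool not found: $tool" >&2
        return 1
    }
    echo "$path"
}
```

A script would then use something like `IFCONFIG=$(find_tool ifconfig) || exit 1` once at startup and call `"$IFCONFIG"` afterwards, so the same script runs whether the binary lives in /sbin, /usr/sbin or elsewhere.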

                        • Re: Getting eIPoIB to work ?
                          drolfe

                          Thanks for the info. I don't have access to my lab on the weekend, but I'll check it out when I'm back in the office. Do you have the details of what you changed on your Ubuntu system?

                            • Re: Getting eIPoIB to work ?

                              It was basically the initial distribution test, and then I removed the full paths in all of the calls to external commands.

                                • Re: Getting eIPoIB to work ?
                                  drolfe

                                  OK, I'll test it on Monday and post back the results. Thanks again.

                                    • Re: Getting eIPoIB to work ?

                                      Hi drolfe,

                                      I'm also trying to get eIPoIB to work on Ubuntu 12.04.

                                       

                                      In the end, did you succeed in getting eIPoIB working? Can you add your InfiniBand HCA as an interface of a bridge device?

                                       

                                      I'm stuck compiling the eth_ipoib kernel module.

                                      Did you start from the OFED download? Which version?

                                       

                                      Can you please paste the procedure here, or better, your shell history of the commands used to download, edit, and compile both the kernel module and the ipoibd daemon?

                                       

                                      Thanks very much in advance,

                                      Giovanni

                                        • Re: Getting eIPoIB to work ?
                                          drolfe

                                          www.mellanox.com/page/products_dyn?product_family=26

                                           

                                          First, start with the Mellanox driver 2.0, which now supports Ubuntu out of the box.

                                           

                                          I'm told the binary path issue should be fixed in the next release.

                                            • Re: Getting eIPoIB to work ?

                                              Unfortunately I have an InfiniHost device with the MT23108 chipset.

                                              From the release notes I understand that it is deprecated from MLNX_OFED 1.5.3 onward; see the release notes for 2.0, 1.5.3 and 1.5.2 below.

                                               

                                              Does anyone know if InfiniHost MT23108 devices are really unsupported after 1.5.2?

                                              What can I do to make eth_ipoib work with InfiniHost devices?

                                               

                                              Thanks,

                                              Giovanni.

                                               

                                              ===============================================================================

                                              http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_Release_Notes_2_0-2_0_5.txt

                                              MLNX_OFED_LINUX 2.0 supports the following adapters:

                                                - Mellanox Technologies HCAs:

                                                - ConnectX-3 (Rev 2.11.0500 and above)

                                                - ConnectX-2 (Rev 2.9.1200 and above)

                                                - Connect-IB (Rev 10.0.2400 and above)

                                              ===============================================================================

                                              http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_Release_Notes_1_5_3-4_0_35.txt

                                              Mellanox supports the following adapters with MLNX_OFED_LINUX 1.5.3:

                                                - Mellanox Technologies HCAs (SDR and DDR Modes are Supported):

                                                - ConnectX-3 (fw-4099 Rev 2.11.0500)

                                                - ConnectX-2/ConnectX-2 EN (fw-ConnectX2 Rev 2.9.1200)

                                              ===============================================================================

                                              Mellanox supports the following adapters with OFED 1.5.2:

                                                 - Mellanox Technologies HCAs (SDR and DDR Modes are Supported):

                                                   - InfiniHost(R) (fw-23108 Rev 3.5.000)

                                                   - InfiniHost(R) III Ex (MemFree: fw-25218 Rev 5.3.000

                                                                       with memory: fw-25208 Rev 4.8.200)

                                                   - InfiniHost(R) III Lx (fw-25204 Rev 1.2.000)

                                                   - ConnectX(R) and ConnectX EN (fw-25408 Rev 2.8.0600)

                                                   - ConnectX-2 (fw-ConnectX2 Rev 2.8.0600)

                                                   - ConnectX-2 EN (fw-ConnectX2 Rev 2.8.0600)

                                              Note: InfiniHost adapters will be deprecated in the next MLNX_OFED release.

                                    • Re: Getting eIPoIB to work ?
                                      ali

                                      Thanks nldesai,

                                      We've captured your feedback, and will fix ipoibd in the next release.

                              • Re: Getting eIPoIB to work ?

                                Hi,

                                 

                                I would also like to get eIPoIB up and running. I have a couple of servers with CentOS 6.4 (2.6.32-358.6.2.el6.x86_64) and ConnectX-3 dual-port adapters. I've installed the latest OFED-2 (I had to add support for my kernel with "./mlnx_add_kernel_support.sh -m . -v"). IPoIB works fine.

                                 

                                Then I followed the instructions in the manual: I enabled eIPoIB by adding "E_IPOIB_LOAD=yes" to /etc/infiniband/openib.conf and restarted the InfiniBand drivers with /etc/init.d/openibd restart.
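
A small sketch of making that config change repeatable, assuming openib.conf consists of plain `KEY=value` lines as the quoted setting suggests. `set_conf_key` is a hypothetical helper, and `sed -i` here is the GNU sed form found on CentOS and Ubuntu.

```shell
#!/bin/sh
# Set KEY=VALUE in a shell-style config file: replace an existing
# assignment if there is one, otherwise append a new line.
# Usage: set_conf_key <file> <key> <value>
set_conf_key() {
    file=$1; key=$2; value=$3
    if grep -q "^$key=" "$file"; then
        sed -i "s|^$key=.*|$key=$value|" "$file"
    else
        echo "$key=$value" >> "$file"
    fi
}
```

For the step in this post that would be `set_conf_key /etc/infiniband/openib.conf E_IPOIB_LOAD yes`, run as root, followed by the same `/etc/init.d/openibd restart`. Running it twice leaves a single `E_IPOIB_LOAD=yes` line.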

                                 

                                Then the manual says "When eth_ipoib is loaded,"... OK, how do I know if eth_ipoib is loaded? Is this supposed to be a kernel module?

                                 

                                # modprobe eth_ipoib

                                FATAL: Module eth_ipoib not found
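
One generic way to answer the "is it loaded?" part, sketched under the assumption that eth_ipoib is an ordinary kernel module (the `ethtool -i eth2` output earlier in the thread reports `driver: eth_ipoib`): loaded modules show up as directories under /sys/module. `module_loaded` and its optional second argument are illustrative names only.

```shell
#!/bin/sh
# Return success if a kernel module is currently loaded (or built in
# with parameters), judged by its directory under /sys/module.
# Usage: module_loaded <name> [sys_module_root]
module_loaded() {
    mod=$1
    root=${2:-/sys/module}         # second argument is a hypothetical override
    [ -d "$root/$mod" ]
}
```

For example, `module_loaded eth_ipoib && echo loaded || echo "not loaded"`. Separately, `modinfo eth_ipoib` reports whether a module file is installed for the running kernel at all, which is the condition the modprobe error above is complaining about.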

                                 

                                Also, OFED didn't install /etc/init.d/ipoibd, but I found one in /usr/src/ofa_kernel-2.0/ofed_scripts/ipoibd ...

                                 

                                Obviously something hasn't been installed correctly. Do I have to compile and install it manually from /usr/src/ofa_kernel-2.0/drivers/net/eipoib/?

                                 

                                Thanks for help!

                                • Re: Getting eIPoIB to work ?

                                  Hi,

                                  I'm having the same issue as kenshiro, although I'm using CentOS 6.3 64-bit with MLNX_OFED_LINUX-2.0-2.0.5-rhel6.3-x86_64. Have there been any updates?

                                   

                                  When I try to start the ipoibd daemon manually, it gives me the error "This OS is not supported".

                                   

                                  Thanks,

                                  Iliyas.

                                  • Re: Getting eIPoIB to work ?
                                    ali

                                    To bypass the ipoibd service, please follow this document:

                                    http://community.mellanox.com/docs/DOC-1316