11 Replies Latest reply on Jun 3, 2014 9:02 AM by javi3388

    Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

      Hi all,

       

      We are trying to use a ConnectX-3 VPI adapter card under XenServer 6.2. We followed these steps:

      1) Install XenServer 6.2 on our system (Supermicro 1027GR-TRF)
      2) Install XenServer 6.2 updates (service pack 1) following the directions from http://support.citrix.com/article/CTX138115#XenServer 6.2

      3) Install MLNX_OFED 2.1-1.0.6 and update firmware (./mlnxofedinstall --force-fw-update). Installation and update have been completed without errors. The firmware version is 2.30.8000.

      After that, we restarted the system and the openibd service, but the InfiniBand network interfaces were not detected:

      [root@xenserver ~]# /etc/init.d/openibd restart

      hostname: `Host' unknown

      Unloading HCA driver:                                      [  OK  ]
      Loading HCA driver and Access Layer:                       [  OK  ]
      Setting up InfiniBand network interfaces:
      Setting up service network . . .                           [  done  ]

      That is, the system does not detect the ib0 network interface.
      Do you know if XenServer 6.2 works with ConnectX-3? Could you please give us some information about that? We have spent a lot of time on this and now wonder whether it really works.
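
A quick way to confirm this symptom is to check whether the kernel has registered the interface at all. The following is a minimal sketch, assuming the usual Linux sysfs layout under /sys/class/net. The function name `iface_present` and its optional second argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# iface_present NAME [SYSFS_NET_DIR]
# Succeeds if a network interface called NAME is registered with the kernel.
# SYSFS_NET_DIR defaults to /sys/class/net; it is a parameter only so the
# check can be exercised against a scratch directory (an assumption made
# for this sketch, not part of any Mellanox tooling).
iface_present() {
    name=$1
    net_dir=${2:-/sys/class/net}
    [ -e "$net_dir/$name" ]
}

# Example: report on ib0, the interface that is missing in the post above.
if iface_present ib0; then
    echo "ib0 is present"
else
    echo "ib0 is missing"
fi
```

If ib0 is reported missing even though openibd printed OK, the driver may have loaded without binding to the card, so the kernel log is the next place to look.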


        • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

          Have you tried version 1_5_3-4_0_42 instead? You may need to contact Mellanox support for a XenServer driver.

            • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

              Hi iliyasa, thank you for your reply.

              I cannot install version 1.5.3-4.0.42. I have installed the Driver Development Kit for XenServer 6.2.0 (Service Pack 1) and I am trying to build the OFED modules against the 2.6.32.43-0.4.1.xs1.8.0.847.170785xen kernel, but I get the following error:

              ./mlnx_add_kernel_support.sh -m /mnt/tmp/MLNX_OFED_LINUX-1.5.3-4.0.42-xenserver-i686/ -t /mnt/tmp/temp --make-tgz -v

              Note: This program will create MLNX_OFED_LINUX TGZ for rhel5.7 under /tmp directory.

                    All Mellanox, OEM, OFED, or Distribution IB packages will be removed.

              Do you want to continue?[y/N]:y

              See log file /tmp/mlnx_ofed_iso.4237.log

               

               

              Detected MLNX_OFED_LINUX-1.5.3-4.0.42

              Running cp -a /mnt/tmp/MLNX_OFED_LINUX-1.5.3-4.0.42-xenserver-i686/ /mnt/tmp/temp/mlnx_iso.4237/MLNX_OFED_LINUX-1.5.3-4.0.42-rhel5.7-i686

              Running tar xzf /mnt/tmp/temp/mlnx_iso.4237/MLNX_OFED_LINUX-1.5.3-4.0.42-rhel5.7-i686/src/MLNX_OFED_SRC-1.5.3-4.0.42.tgz

              Building OFED RPMs. Please wait...

              Running MLNX_OFED_SRC-1.5.3-4.0.42/install.pl -c /mnt/tmp/temp/mlnx_iso.4237/ofed.conf --kernel 2.6.32.43-0.4.1.xs1.8.0.847.170785xen --kernel-sources /lib/modules/2.6.32.43-0.4.1.xs1.8.0.847.170785xen/build/ --builddir /mnt/tmp/temp/mlnx_iso.4237 --disable-kmp

               

               

              ERROR: Failed executing "MLNX_OFED_SRC-1.5.3-4.0.42/install.pl -c /mnt/tmp/temp/mlnx_iso.4237/ofed.conf --kernel 2.6.32.43-0.4.1.xs1.8.0.847.170785xen --kernel-sources /lib/modules/2.6.32.43-0.4.1.xs1.8.0.847.170785xen/build/ --builddir /mnt/tmp/temp/mlnx_iso.4237 --disable-kmp"

              ERROR: See /tmp/mlnx_ofed_iso.4237.log

               

              The log file does not show additional info:

              cxgb3 is not available on this platform

              qib is not available on this platform

              knem is not available on this platform

              ib-bonding is not available on this platform

               

              Below is the list of OFED packages that you have chosen

              (some may have been added by the installer due to package dependencies):

              ofed-scripts

              kernel-ib

              kernel-ib-devel

              kernel-mft

               

               

              Build ofed-scripts RPM

              Running  rpmbuild --rebuild  --define '_topdir /mnt/tmp/temp/mlnx_iso.4237/OFED_topdir' --define 'dist %{nil}' --target i386 --define '_prefix /usr' --define '_exec_prefix /usr' --define '_sysconfdir /etc' --define '_usr /usr' /mnt/tmp/temp/mlnx_iso.4237/MLNX_OFED_SRC-1.5.3-4.0.42/SRPMS/ofed-scripts-1.5.3-OFED.1.5.3.4.0.42.src.rpm

              Install ofed-scripts RPM:

              Running rpm -iv  /mnt/tmp/temp/mlnx_iso.4237/MLNX_OFED_SRC-1.5.3-4.0.42/RPMS/centos-release-5-7.el5.centos/i686/ofed-scripts-1.5.3-OFED.1.5.3.4.0.42.i386.rpm

              Build ofa_kernel RPM

              Running rpmbuild --rebuild  --define '_topdir /mnt/tmp/temp/mlnx_iso.4237/OFED_topdir' --define '_target_cpu i686' --nodeps --define '_dist .rhel5u7' --define 'configure_options   --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod --with-mthca-mod --with-mlx4-mod --with-mlx4_en-mod --with-mlx4_ib-mod --with-mlx4_vnic-mod --with-nes-mod --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-rds-mod' --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' --define 'KVERSION 2.6.32.43-0.4.1.xs1.8.0.847.170785xen' --define 'K_SRC /lib/modules/2.6.32.43-0.4.1.xs1.8.0.847.170785xen/build/' --define 'network_dir /etc/sysconfig/network-scripts' --define '_prefix /usr' --define '__arch_install_post %{nil}' /mnt/tmp/temp/mlnx_iso.4237/MLNX_OFED_SRC-1.5.3-4.0.42/SRPMS/ofa_kernel-1.5.3-OFED.1.5.3.4.0.42.g3cb72fe.src.rpm

              kernel-ib was not created

               

            • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

              Hi,

              Actually I meant MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686. I have tried this driver and XenServer was able to pick up the card, but the driver put it into Ethernet mode. I'm trying to see why it did this, but this may be a separate issue or a related one. I'll try with a different card and follow up.

               

              Edit: I tried to change it to IB, but it reports "Illegal port configuration attempted".

                • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

                  OK, I've had success with a different card; the one I was using before was an Ethernet card.

                   

                  [root@xenserver6 ~]# /root/MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686/mlnxofedinstall --fw-update-only

                  Logs dir: /tmp/MLNX_OFED_LINUX-2.2-1.0.1.10792.logs

                  Attempting to perform Firmware update...

                  Querying Mellanox devices firmware ...

                   

                   

                  Device #1:

                  ----------

                   

                   

                    Device Type:      ConnectX3

                    Part Number:      MCX353A-FCB_A2-A5

                    Description:      ConnectX-3 VPI adapter card; single-port QSFP; FDR IB (56Gb/s) and 40GigE; PCIe3.0 x8 8GT/s; RoHS R6

                    PSID:             MT_1100120019

                    PCI Device Name:  0000:04:00.0

                    Versions:         Current        Available

                       FW             2.30.8000      2.31.5050

                       PXE            3.4.0151       3.4.0225

                   

                   

                    Status:           Update required

                   

                   

                  ---------

                  Found 1 device(s) requiring firmware update...

                   

                   

                  Device #1: Updating FW ... Done

                   

                   

                  Restart needed for updates to take effect.

                  Log File: /tmp/MLNX_OFED_LINUX-2.2-1.0.1.10792.logs/fw_update.log

                  Please reboot your system for the changes to take effect.

                  [root@xenserver6 ~]#

                   

                   

                  [root@xenserver6 ~]# /sbin/connectx_port_config -s

                  --------------------------------

                  Port configuration for PCI device: 0000:04:00.0 is:

                  auto (ib)

                  --------------------------------

                  [root@xenserver6 ~]#

                   

                   

                  [root@xenserver6 ~]# flint -d /dev/mst/mt4099_pci_cr0 q

                  Image type:      FS2

                  FW Version:      2.31.5050

                  FW Release Date: 30.4.2014

                  Product Version: 02.31.50.50

                  Rom Info:        type=PXE  version=3.4.225 devid=4099 proto=VPI

                  Device ID:       4099

                  Description:     Node             Port1            Port2            Sys image

                  GUIDs:           0002c903001e8a20 0002c903001e8a21 0002c903001e8a22 0002c903001e8a23

                  MACs:                                 0002c91e8a20     0002c91e8a21

                  VSD:

                  PSID:            MT_1100120019

                  [root@xenserver6 ~]#

                   

                   

                  [root@xenserver6 ~]# ibstat

                  CA 'mlx4_0'

                          CA type: MT4099

                          Number of ports: 1

                          Firmware version: 2.31.5050

                          Hardware version: 1

                          Node GUID: 0x0002c903001e8a20

                          System image GUID: 0x0002c903001e8a23

                          Port 1:

                                  State: Down

                                  Physical state: Polling

                                  Rate: 10

                                  Base lid: 0

                                  LMC: 0

                                  SM lid: 0

                                  Capability mask: 0x02514868

                                  Port GUID: 0x0002c903001e8a21

                                  Link layer: InfiniBand

                  [root@xenserver6 ~]#
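
The three fields worth reading in the ibstat output above are State, Physical state and Link layer. The following is a minimal sketch that condenses them into one line per port. The function name `ibstat_summary` is hypothetical, and the parsing assumes the field labels exactly as printed above.

```shell
#!/bin/sh
# ibstat_summary: read `ibstat` output on stdin and print one line per port:
#   <CA> port <N>: <State> / <Physical state> / <Link layer>
# Field labels are assumed to match the ibstat output shown in this thread.
ibstat_summary() {
    awk '
        # "CA <name>" header line (but not the "CA type:" line); strip quotes.
        $1 == "CA" && $2 != "type:"       { ca = $2; gsub(/[^A-Za-z0-9_]/, "", ca) }
        # "Port 1:" header line (but not "Port GUID:").
        $1 == "Port" && $2 ~ /^[0-9]+:$/  { port = $2; sub(/:$/, "", port) }
        $1 == "State:"                    { state = $2 }
        $1 == "Physical" && $2 == "state:" { phys = $3 }
        # "Link layer:" is the last field of a port, so print the summary here.
        $1 == "Link" && $2 == "layer:"    { printf "%s port %s: %s / %s / %s\n", ca, port, state, phys, $3 }
    '
}

# Only run against the live tool if it is installed.
if command -v ibstat >/dev/null 2>&1; then
    ibstat | ibstat_summary
fi
```

For the output above this prints `mlx4_0 port 1: Down / Polling / InfiniBand`. The port is in InfiniBand mode but has not yet established a link, which typically points to cabling or the remote end rather than the driver.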

                   

                   

                  [root@xenserver6 ~]# ifconfig -a

                  eth0      Link encap:Ethernet  HWaddr 00:25:90:C9:31:44

                            UP BROADCAST  MTU:1500  Metric:1

                            RX packets:0 errors:0 dropped:0 overruns:0 frame:0

                            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

                            collisions:0 txqueuelen:1000

                            RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

                   

                   

                  eth1      Link encap:Ethernet  HWaddr 00:25:90:C9:31:45

                            inet6 addr: fe80::225:90ff:fec9:3145/64 Scope:Link

                            UP BROADCAST RUNNING PROMISC  MTU:1500  Metric:1

                            RX packets:731 errors:0 dropped:0 overruns:0 frame:0

                            TX packets:382 errors:0 dropped:0 overruns:0 carrier:0

                            collisions:0 txqueuelen:1000

                            RX bytes:154168 (150.5 KiB)  TX bytes:50319 (49.1 KiB)

                   

                   

                  ib0       Link encap:InfiniBand  HWaddr A0:00:01:00:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00

                            BROADCAST MULTICAST  MTU:4092  Metric:1

                            RX packets:0 errors:0 dropped:0 overruns:0 frame:0

                            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

                            collisions:0 txqueuelen:128

                            RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

                   

                   

                  lo        Link encap:Local Loopback

                            inet addr:127.0.0.1  Mask:255.0.0.0

                            inet6 addr: ::1/128 Scope:Host

                            UP LOOPBACK RUNNING  MTU:16436  Metric:1

                            RX packets:24 errors:0 dropped:0 overruns:0 frame:0

                            TX packets:24 errors:0 dropped:0 overruns:0 carrier:0

                            collisions:0 txqueuelen:0

                            RX bytes:14979 (14.6 KiB)  TX bytes:14979 (14.6 KiB)

                   

                   

                  xenbr0    Link encap:Ethernet  HWaddr 00:25:90:C9:31:44

                            inet6 addr: fe80::225:90ff:fec9:3144/64 Scope:Link

                            UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1

                            RX packets:0 errors:0 dropped:0 overruns:0 frame:0

                            TX packets:6 errors:0 dropped:0 overruns:0 carrier:0

                            collisions:0 txqueuelen:0

                            RX bytes:0 (0.0 b)  TX bytes:468 (468.0 b)

                   

                   

                  xenbr1    Link encap:Ethernet  HWaddr 00:25:90:C9:31:45

                            inet addr:192.168.50.1  Bcast:192.168.50.255  Mask:255.255.255.0

                            inet6 addr: fe80::225:90ff:fec9:3145/64 Scope:Link

                            UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1

                            RX packets:732 errors:0 dropped:0 overruns:0 frame:0

                            TX packets:389 errors:0 dropped:0 overruns:0 carrier:0

                            collisions:0 txqueuelen:0

                            RX bytes:154228 (150.6 KiB)  TX bytes:52753 (51.5 KiB)

                   

                   

                  [root@xenserver6 ~]#

                   

                  Attachment: Xencenter-NICs.JPG.jpg

                    • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

                      Hi, I have tried MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686 with the firmware update, but I got the same results as shown in the first post. So I have some questions:

                      Did you use XenServer 6.2 (Service Pack 1)?

                      What command did you use to install OFED? (I used ./mlnxofedinstall --force-fw-update)

                      Did you use another additional driver or only the mlnx_ofed software stack?

                       

                       

                      Thank you in advance.

                        • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

                          Yes, XenServer SP1 with all updates, and I installed just by running ./mlnxofedinstall

                           

                          Did you install the additional packages required by the driver before installing MLNX_OFED 2.2?

                          yum install pciutils python libxml2-python libnl expat glib2 tcl bc libstdc++ tk
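
To see which of these prerequisites are still missing, the output of `rpm -q` can be filtered. The following is a minimal sketch, assuming rpm's English-locale message "package NAME is not installed". The function name `missing_packages` and the `PREREQS` variable are hypothetical helpers introduced here.

```shell
#!/bin/sh
# missing_packages: read `rpm -q` output on stdin and print the names of
# packages that rpm reports as "package NAME is not installed".
# Assumes the English-locale wording of that message.
missing_packages() {
    awk '$1 == "package" && /is not installed$/ { print $2 }'
}

# Prerequisite list copied from the yum command above.
PREREQS="pciutils python libxml2-python libnl expat glib2 tcl bc libstdc++ tk"

# Only query the package database if rpm is available on this host.
if command -v rpm >/dev/null 2>&1; then
    # $PREREQS is intentionally unquoted so it splits into separate arguments.
    rpm -q $PREREQS | missing_packages
fi
```

An empty result means rpm found every package in the list.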



                            • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

                              Hi. I have the same problem. Can you help me?

                              I also use XenServer 6.2 SP1 and have installed MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686,

                              but my server cannot find ib0.

                               

                              Here is some output:

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# ./mlnxofedinstall --fw-update-only

                              Logs dir: /tmp/MLNX_OFED_LINUX-2.2-1.0.1.10272.logs

                              Attempting to perform Firmware update...

                              Querying Mellanox devices firmware ...

                               

                               

                              Device #1:

                              ----------

                               

                               

                                Device Type:      InfiniHostIIILx

                                Part Number:      --

                                Description:     

                                PSID:            

                                PCI Device Name:  0000:07:00.0

                                Versions:         Current        Available    

                                   FW             --                          

                               

                               

                                Status:           Failed to open device

                               

                               

                              ---------

                              -E- Failed to query 0000:07:00.0 device, error : No such file or directory MFE_OLD_DEVICE_TYPE

                               

                               

                              Log File: /tmp/MLNX_OFED_LINUX-2.2-1.0.1.10272.logs/fw_update.log

                              Failed to update Firmware.

                              See /tmp/MLNX_OFED_LINUX-2.2-1.0.1.10272.logs/fw_update.log

                              To load the new driver, run:

                              /etc/init.d/openibd restart

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# mst start

                              Starting MST (Mellanox Software Tools) driver set

                              [warn] mst_pci is already loaded, skipping

                              [warn] mst_pciconf is already loaded, skipping

                              Create devices

                              -W- Missing "lsusb" command, skipping MTUSB devices detection

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# mst status

                              MST modules:

                              ------------

                                  MST PCI module loaded

                                  MST PCI configuration module loaded

                               

                               

                              MST devices:

                              ------------

                              /dev/mst/mt25204_pciconf0        - PCI configuration cycles access.

                                                                 domain:bus:dev.fn=0000:07:00.0 addr.reg=88 data.reg=92

                                                                 Chip revision is: A0

                              /dev/mst/mt25204_pci_cr0         - PCI direct access.

                                                                 domain:bus:dev.fn=0000:07:00.0 bar=0xc4100000 size=0x100000

                                                                 Chip revision is: A0

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# flint -d /dev/mst/mt25204_pci_cr0 q

                              -E- Cannot open Device: /dev/mst/mt25204_pci_cr0. Operation not permitted MFE_OLD_DEVICE_TYPE

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# hca_self_test.ofed

                               

                               

                              ---- Performing Adapter Device Self Test ----

                              Number of CAs Detected ................. 1

                              PCI Device Check ....................... PASS

                              Kernel Arch ............................ i686

                              Host Driver Version .................... MLNX_OFED_LINUX-2.2-1.0.1 (OFED-2.2-1.0.0): 2.6.32.43-0.4.1.xs1.8.0.847.170785xen

                              Host Driver RPM Check .................. PASS

                              Firmware on CA #0 HCA .................. v1.2.0

                              Firmware Check on CA #0 (HCA) .......... NA

                                  REASON: NO required fw version

                              Host Driver Initialization ............. PASS

                              Number of CA Ports Active .............. 1

                              Port State of Port #1 on CA #0 (HCA)..... UP 4X (InfiniBand)

                              Error Counter Check on CA #0 (HCA)...... PASS

                              Kernel Syslog Check .................... PASS

                              Node GUID on CA #0 (HCA) ............... 00:08:f1:04:03:99:2c:d4

                              ------------------ DONE ---------------------

                               

                               

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# ibstat

                              CA 'mthca0'

                                CA type: MT25204

                                Number of ports: 1

                                Firmware version: 1.2.0

                                Hardware version: a0

                                Node GUID: 0x0008f10403992cd4

                                System image GUID: 0x0008f10403992cd7

                                Port 1:

                                State: Active

                                Physical state: LinkUp

                                Rate: 10

                                Base lid: 2

                                LMC: 0

                                SM lid: 1

                                Capability mask: 0x02510a68

                                Port GUID: 0x0008f10403992cd5

                                Link layer: InfiniBand

                              [root@Epiclesis MLNX_OFED_LINUX-2.2-1.0.1-xenserver6.x-i686]# lspci|grep Mella

                              07:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev a0)


                              I would really appreciate any help. Thanks.

                              • Re: Trouble with ConnectX-3 VPI adapter card over XenServer 6.2 (Service Pack 1)

                                Hello Iliyasa,

                                 

                                Yes, I installed the additional packages required by the driver. I used the CentOS Base.repo for this; is that correct?

                                 

                                There seems to be a problem loading the mlx4_core module. The dmesg output looks like this:

                                [ 2668.552338] mlx4_core: Mellanox ConnectX core driver v1.1 (Apr 29 2014)

                                [ 2668.552341] mlx4_core: Initializing 0000:01:00.0

                                [ 2668.552429] mlx4_core 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16

                                [ 2668.552538] mlx4_core 0000:01:00.0: setting latency timer to 64

                                [ 2674.974496] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.976285] mlx4_core 0000:01:00.0: irq 1265 (313) for MSI/MSI-X

                                [ 2674.976287] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.977975] mlx4_core 0000:01:00.0: irq 1264 (312) for MSI/MSI-X

                                [ 2674.977977] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.979637] mlx4_core 0000:01:00.0: irq 1263 (311) for MSI/MSI-X

                                [ 2674.979639] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.981309] mlx4_core 0000:01:00.0: irq 1262 (310) for MSI/MSI-X

                                [ 2674.981311] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.982980] mlx4_core 0000:01:00.0: irq 1261 (309) for MSI/MSI-X

                                [ 2674.982982] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.984659] mlx4_core 0000:01:00.0: irq 1260 (308) for MSI/MSI-X

                                [ 2674.984661] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.986341] mlx4_core 0000:01:00.0: irq 1259 (307) for MSI/MSI-X

                                [ 2674.986343] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.988054] mlx4_core 0000:01:00.0: irq 1258 (306) for MSI/MSI-X

                                [ 2674.988056] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.989742] mlx4_core 0000:01:00.0: irq 1257 (305) for MSI/MSI-X

                                [ 2674.989744] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2674.991413] mlx4_core 0000:01:00.0: irq 1256 (304) for MSI/MSI-X

                                [ 2735.064254] mlx4_core 0000:01:00.0: command CONF_SPECIAL_QP (0x23) timed out: in_param=0x0, in_mod=0x40, op_mod=0x0, get_status err=0, status_reg=0x23006000, go_bit=0, t_bit=1, toggle=0x0

                                [ 2735.064266] mlx4_core 0000:01:00.0: Failed to initialize queue pair table (err=1), aborting.

                                [ 2735.106905] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.108107] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.109302] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.110502] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.111699] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.112901] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.114100] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.115299] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.116498] mlx4_core 0000:01:00.0: get owner: 7ff0

                                [ 2735.117701] mlx4_core 0000:01:00.0: get owner: 7ff0
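
The two lines that matter in the log above (the CONF_SPECIAL_QP timeout and the queue pair table failure) are buried among the repeated IRQ and "get owner" messages. The following is a minimal sketch for pulling out only the failure lines. The function name `mlx4_errors` and the keyword list are assumptions chosen to match the wording in this log, not an exhaustive list of driver errors.

```shell
#!/bin/sh
# mlx4_errors: read kernel log text on stdin and print only the mlx4_core
# lines that mention a timeout, a failure, or an abort.
# The keyword list is an assumption based on the messages in this thread.
mlx4_errors() {
    grep 'mlx4_core' | grep -E -i 'timed out|failed|aborting'
}

# Only run against the live kernel log if dmesg is available.
if command -v dmesg >/dev/null 2>&1; then
    dmesg | mlx4_errors || echo "no mlx4_core errors found"
fi
```

Applied to the log above, this leaves only the lines stamped 2735.064254 and 2735.064266, which are the ones to quote when asking for support.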

                                 

                                Have you had any similar problem?

                                 

                                Thanks in advance.