2 Replies Latest reply on Mar 31, 2018 11:23 PM by samerka

    mlx4_core : Missing UAR, aborting

    semenmsu@gmail.com

      OS: Ubuntu 16.04.3 LTS

       

      $ lspci | grep Mellanox

      01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

       

      $ mstflint -d 01:00.0 q

      Image type:        FS2
      FW Version:        2.42.5000
      FW Release Date:   5.9.2017
      Product Version:   02.42.50.00
      Rom Info:          type=PXE version=3.4.752 devid=4103
      Device ID:         4103
      PSID:              MT_2340111023

       

      I have installed all the drivers and ran '/etc/init.d/mlnx-en.d restart', which printed:

      Unloading NIC driver: [ OK ]

      Loading NIC driver:     [ OK ]

       

      But I can't see the Mellanox interface in 'ifconfig -a'.

      When I now run 'dmesg | grep mlx', I get:

      [0.991278] mlx_compat: loading out-of-tree module taints kernel.
      [0.991286] mlx_compat: module verification failed: signature and/or required key missing - tainting kernel
      [0.992456] mlx4_core: Mellanox ConnectX core driver v4.1-1.0.2 (27 Jun 2017)
      [0.992479] mlx4_core: Initializing 0000:01:00.0
      [0.992621] mlx4_core 0000:01:00.0: Missing UAR, aborting

       

      What is the problem here?

       

      P.S. I had no problem with this NIC on my computer when I was using Ubuntu 14.04 LTS.

        • Re: mlx4_core : Missing UAR, aborting
          yuriis

          Is this issue still relevant for you?

          I had the same issue on a ConnectX-3 Pro, caused by the large number of VFs I had put into the HCA configuration file.

           

          Symptoms (dmesg output):

           

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: [15b3:1007] type 00 class 0x028000

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x10: [mem 0xf7900000-0xf79fffff 64bit]

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x18: [mem 0xf4000000-0xf47fffff 64bit pref]

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x30: [mem 0xf7800000-0xf78fffff pref]

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: reg 0x134: [mem 0x00000000-0x007fffff 64bit pref]

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: VF(n) BAR2 space: [mem 0x00000000-0x03ffffff 64bit pref] (contains BAR2 for 8 VFs)

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: BAR 2: no space for [mem size 0x00800000 64bit pref]

          Mar 31 17:36:17 macpro kernel: pci 0000:06:00.0: BAR 2: failed to assign [mem size 0x00800000 64bit pref]

          ...

          Mar 31 21:09:48 macpro kernel: mlx4_core: Mellanox ConnectX core driver v4.3-1.0.1

          Mar 31 21:09:48 macpro kernel: mlx4_core: Initializing 0000:06:00.0

          Mar 31 21:09:48 macpro kernel: mlx4_core 0000:06:00.0: enabling device (0100 -> 0102)

          Mar 31 21:09:48 macpro kernel: mlx4_core 0000:06:00.0: Missing UAR, aborting
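For anyone comparing against their own logs: the pattern above is a BAR assignment failure during PCI enumeration, followed later by the driver abort. Below is a minimal sketch for pulling those lines out of a kernel log, plus the arithmetic behind the VF BAR2 window shown above (8 VFs at 0x00800000 bytes each). The function names are hypothetical helpers, not part of any Mellanox tool.

```shell
# Print kernel-log lines that show a failed BAR assignment or the mlx4 abort.
# Reads the log on stdin, so it works with `dmesg` output or a saved log file.
find_uar_symptoms() {
    grep -E 'BAR [0-9]+: (no space for|failed to assign)|Missing UAR'
}

# Total address space requested for VF BARs, in MiB.
# $1 = size of one VF BAR in bytes (hex with 0x prefix is accepted)
# $2 = number of VFs
vf_bar_total_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Usage on the affected host:
#   dmesg | find_uar_symptoms
#   vf_bar_total_mib 0x00800000 8
```

With the values from the log above, `vf_bar_total_mib 0x00800000 8` gives 64, which matches the 0x00000000-0x03ffffff VF BAR2 window.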

          Solution:

           

          1. Collect the FW-related information about your HCA; you need to know the PSID:

           

          # mstflint -d 06:00.0 q full

          Image type:            FS2

          FW Version:            2.42.8016

          FW Release Date:       21.3.2018

          MIC Version:           2.0.0

          Config Sectors:        2

          PRS Name:              cx3pro_MCX354A_fdr_09v.prs

          Rom Info:              type=PXE version=3.4.752

          Device ID:             4103

          ...

          PSID:                  MT_1090111019

           

          2. Download the FW for your particular HCA and unpack it.

             - In my case it is ConnectX3Pro-rel-2_42_6000-web.tgz.

             - The archive contains a set of *.ini files for the different HCAs based on the same chip.

             - Find the configuration file for your PSID:

           

          # grep MT_1090111019 *.ini

          MCX354A-FCC_Ax.ini:PSID = MT_1090111019
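Steps 1 and 2 can be chained so the PSID does not have to be copied by hand. This is a rough sketch, assuming the `PSID:` line format shown in the mstflint output above and that the unpacked .ini files sit in one directory; `extract_psid` and `find_ini_for_psid` are illustrative names.

```shell
# Extract the PSID value from `mstflint ... q` output read on stdin.
extract_psid() {
    awk '$1 == "PSID:" { print $2; exit }'
}

# List the .ini files in directory $2 whose contents mention PSID $1.
find_ini_for_psid() {
    grep -lF -- "$1" "$2"/*.ini
}

# Usage on the host (device address taken from the example above):
#   psid=$(mstflint -d 06:00.0 q | extract_psid)
#   find_ini_for_psid "$psid" .
```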

           

          3. Dump the HCA configuration file and compare it to the original configuration file:

               # mstflint -d 06:00.0 dc current.ini

               # diff -u MCX354A-FCC_Ax.ini  current.ini

          4. Next, add a few parameters to the [HCA] section to disable SR-IOV and reduce the number of VFs to 4 (a safe value in my case). Save the edited copy as current-sriov.ini, the file name passed to mlxburn in step 5:

          [HCA]

          hca_header_subsystem_id = 0x0003

          hca_header_device_id = 0x1007

          dpdp_en = true

          eth_xfi_en = true

          mdio_en_port1 = 0

          num_pfs = 1

          total_vfs = 4

          sriov_en = false
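If the edit has to be repeated on several cards, it can be scripted instead of done in an editor. Below is a sketch that sets one key inside the [HCA] section, assuming the file uses the `key = value` layout shown above; `set_hca_param` is a hypothetical helper, and the result should still be checked with diff before building an image from it.

```shell
# Set "key = value" inside the [HCA] section of an ini file and print the
# result to stdout. $1 = ini file, $2 = key, $3 = value.
# An existing key is rewritten in place; a missing key is added at the end
# of the [HCA] section. Other sections are left untouched.
set_hca_param() {
    awk -v key="$2" -v val="$3" '
        # Emit the key if we are leaving [HCA] without having seen it.
        function emit_missing() {
            if (in_hca && !done) { print key " = " val; done = 1 }
        }
        /^\[/ { emit_missing(); in_hca = ($0 ~ /^\[HCA\]/) }
        in_hca {
            k = $0
            sub(/[ \t]*=.*/, "", k)   # drop "= value"
            sub(/^[ \t]+/, "", k)     # drop leading whitespace
            if (k == key) { print key " = " val; done = 1; next }
        }
        { print }
        END { emit_missing() }
    ' "$1"
}

# Usage:
#   set_hca_param current.ini total_vfs 4 > step1.ini
#   set_hca_param step1.ini sriov_en false > current-sriov.ini
#   diff -u current.ini current-sriov.ini
```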

           

          5. Create the new firmware image, which includes the new configuration file

          # mlxburn -fw ./fw-ConnectX3Pro-rel.mlx -c current-sriov.ini \

             -wrimage ./fw-ConnectX3Pro-rel-2_42_8016-MCX354A-FCC_Ax-FlexBoot-3.4.752.bin

          6. Burn it:

          # mlxfwmanager -d /dev/mst/mt4103_pciconf0 -i fw-ConnectX3Pro-rel-2_42_8016-MCX354A-FCC_Ax-FlexBoot-3.4.752.bin -u

          7. Reboot and check that mlx4_core driver loads now for your HCA.
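One way to do the check after the reboot is to list the network interfaces whose PCI device is bound to mlx4_core. This is a sketch under the assumption that the usual sysfs layout (/sys/class/net/<iface>/device/driver) applies; the helper name and the optional sysfs-root argument are made up here, the latter only so the function can be tried against a fake directory tree.

```shell
# Print the names of network interfaces whose underlying device is bound to
# the given driver. $1 = driver name, $2 = sysfs root (defaults to /sys).
list_ifaces_for_driver() {
    for dev in "${2:-/sys}"/class/net/*; do
        # Virtual interfaces such as lo have no device/driver link; skip them.
        link=$(readlink "$dev/device/driver" 2>/dev/null) || continue
        [ "$(basename "$link")" = "$1" ] && basename "$dev"
    done
    return 0
}

# Usage after the reboot:
#   dmesg | grep mlx4_core
#   list_ifaces_for_driver mlx4_core
```

An empty result together with "Missing UAR, aborting" in dmesg means the driver still failed to initialize.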

          • Re: mlx4_core : Missing UAR, aborting
            samerka

            Hi,

            The issue is related to PCI resource allocation in Ubuntu 16.04.3: the Ubuntu kernel cannot allocate enough memory to bring up the network interfaces during boot.

            Please add the following kernel parameter through grub2:

            - Edit /etc/default/grub

            - Add pci=realloc=off to GRUB_CMDLINE_LINUX_DEFAULT, for example GRUB_CMDLINE_LINUX_DEFAULT="pci=realloc=off"

            - Run update-grub

            - Reboot
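The edit to /etc/default/grub can also be scripted. Below is a minimal sketch that appends the parameter to an existing GRUB_CMDLINE_LINUX_DEFAULT line instead of replacing it, so options such as "quiet splash" are kept. It assumes the value is double-quoted, as in the stock Ubuntu file, and that the parameter contains no `|` or `&` characters; `add_kernel_param` is an illustrative name.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# $1 = grub defaults file, $2 = parameter. Keeps a backup in "$1.bak".
add_kernel_param() {
    # Already present on the GRUB_CMDLINE_LINUX_DEFAULT line: nothing to do.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$1" | grep -qF -- "$2"; then
        return 0
    fi
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT="' "$1"; then
        # Insert the parameter just before the closing double quote.
        sed -i.bak "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $2\"|" "$1"
    else
        # No such line yet: add one.
        cp "$1" "$1.bak"
        printf 'GRUB_CMDLINE_LINUX_DEFAULT="%s"\n' "$2" >> "$1"
    fi
}

# Usage (as root), followed by the remaining steps from the list above:
#   add_kernel_param /etc/default/grub pci=realloc=off
#   update-grub
#   reboot
```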

             

            Thanks,

            Samer

             
