6 Replies Latest reply on May 17, 2017 12:17 PM by sophie

    Host Driver Initialization error (FAIL)

    guodong

      Hello Mellanox Support,

      I installed a fresh copy of Ubuntu 16.04.2 LTS and the Mellanox OFED 16.04 Linux driver from the ISO file.

      The installation went smoothly and completed successfully.

      However, when I run hca_self_test.ofed, it reports:

      "Host Driver Initialization FAILed"

       

      Running sudo /etc/init.d/openibd restart also reports a driver loading error.

       

      Please see the enclosed error pictures and the syslog, as requested.

       

      Please advise.

        • Re: Host Driver Initialization error (FAIL)
          sophie

          Hi Guodong,

           

          Would you happen to have a service contract with Mellanox?

           

          Thank you,

          Sophie.

          • Re: Host Driver Initialization error (FAIL)
            guodong

            Hello Sophie,

             

            We recently purchased some ConnectX-3 boards through your distributor, but we have no service contract.

             

            Could you elaborate on how your service contract works?

             

            It was my understanding that the Support Community is here to support customers?

             

            Thank you.

            Mei Guodong

            • Re: Host Driver Initialization error (FAIL)
              sophie

              Hi Guodong,

               

              Indeed, our Community website is here to assist our customers. If you would like to inquire further about our support contract options, please send an email to "contracts@mellanox.com".

               

              Regards,

              Sophie.

              • Re: Host Driver Initialization error (FAIL)
                sophie

                Hi Guodong,

                 

                Looking at the sysinfo-snapshot provided, none of our modules are loaded.

                For comparison, this is what the output looks like when they are loaded:

                # lsmod | egrep -i "ib|mlx*"

                ib_ucm                 22642  0

                ib_ipoib              159750  0

                ib_cm                  52470  3 ib_ucm,rdma_cm,ib_ipoib

                ib_uverbs              71505  2 rdma_ucm,ib_ucm

                ib_umad                22283  6

                mlx4_en               134317  0

                mlx4_ib               193439  0

                mlx4_core             353345  2 mlx4_en,mlx4_ib

                mlx5_ib               188935  0

                ib_core               250100  10 rdma_ucm,ib_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_uverbs,ib_umad,mlx4_ib,mlx5_ib

                ipv6                  361510  69 bridge,ip6t_REJECT,rdma_cm,ib_ipoib,ib_core

                mlx5_core             547647  1 mlx5_ib

                mlx_compat             17075  14 rdma_ucm,ib_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_uverbs,ib_umad,mlx4_en,mlx4_ib,mlx4_core,mlx5_ib,ib_core,mlx5_core

                ptp                    18580  3 mlx4_en,mlx5_core,igb

                libahci                32073  1 ahci

                libsas                 84132  1 isci

                scsi_transport_sas     40863  2 isci,libsas
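
                As a rough sketch, the same check can be scripted. This assumes lsmod-style input with the module name in the first column. The function name list_mlx_modules is hypothetical and not part of MLNX_OFED. It keeps only names that start with "mlx" or "ib_", so unrelated matches such as libahci are dropped.

```shell
#!/bin/sh
# list_mlx_modules: hypothetical helper, not part of MLNX_OFED.
# Reads lsmod-style text on stdin and prints the names of modules whose
# name (first column) starts with "mlx" or "ib_".
list_mlx_modules() {
    awk '$1 ~ /^(mlx|ib_)/ { print $1 }'
}

# Usage on a live system:
#   lsmod | list_mlx_modules
# An empty result suggests that none of these modules are loaded.
```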

                 

                The syslog file shows odd messages about the driver versions at load time, and then the drivers fail to load:

                 

                May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX4 HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

                May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX4_IB HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

                May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX4_EN HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

                May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX5 HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]

                May 13 11:30:56 dynamicc4 openibd[908]: Loading Mellanox MLX5_IB HCA driver:#033[60G[#033[1;31mFAILED#033[0;39m]
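
                The "#033[...". fragments in these lines are terminal color and cursor escape sequences that syslog has written out as plain text. A small sketch for making such lines easier to read, assuming the escapes always appear in this "#033[" form (the function name strip_syslog_ansi is hypothetical, and the log path in the usage comment is an assumption):

```shell
#!/bin/sh
# strip_syslog_ansi: hypothetical helper.
# Removes escape sequences that syslog has rendered as "#033[<params><letter>",
# for example "#033[60G" or "#033[1;31m".
strip_syslog_ansi() {
    sed -E 's/#033\[[0-9;]*[A-Za-z]//g'
}

# Usage (log path is an assumption; adjust for your system):
#   grep openibd /var/log/syslog | strip_syslog_ansi
```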

                 

                Can you validate the module versions?

                 

                For example:

                # modinfo mlx4_core | grep -i version

                version:        4.0-2.0.0

                srcversion:     8D664781D9FEAD80E98F82E

                vermagic:       3.10.105-1.el6.elrepo.x86_64 SMP mod_unload modversions

                 

                Also, can you compare the srcversion reported by modinfo with that of the loaded module? They should be the same.

                 

                For example:

                # cat /sys/module/mlx4_core/srcversion

                8D664781D9FEAD80E98F82E
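
                A minimal sketch of that comparison as a script. The helper names compare_srcversion and check_module are hypothetical. It assumes modinfo is available and that the module is currently loaded, since /sys/module/<name>/srcversion only exists for loaded modules.

```shell
#!/bin/sh
# compare_srcversion: hypothetical helper. Takes the srcversion of the module
# file on disk ($1) and of the loaded module ($2) and reports whether they agree.
compare_srcversion() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "unknown"      # one of the two values could not be read
        return 2
    fi
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch"
        return 1
    fi
}

# check_module: gathers both values for one module name and compares them.
# "modinfo -F srcversion" prints only the srcversion field.
check_module() {
    on_disk=$(modinfo -F srcversion "$1" 2>/dev/null)
    loaded=$(cat "/sys/module/$1/srcversion" 2>/dev/null)
    printf '%s: ' "$1"
    compare_srcversion "$on_disk" "$loaded"
}

# Usage:
#   check_module mlx4_core
```

                A result of "unknown" here would be consistent with the module not being loaded at all.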

                 

                I would also suggest verifying the content of your initramfs image (lsinitrd, or lsinitramfs on Ubuntu) and checking the modules and versions of our MLX drivers.

                Make sure all Inbox drivers have been removed.
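
                One way to sketch the initramfs check on Ubuntu, where initramfs-tools provides lsinitramfs. The function name filter_mlx_ko is hypothetical, and the image path in the usage comment is the usual Ubuntu location but is an assumption for any given system.

```shell
#!/bin/sh
# filter_mlx_ko: hypothetical helper. Reads a list of file paths on stdin and
# keeps kernel module files whose file name starts with "mlx" or "ib_".
# An optional compression suffix after ".ko" (such as ".xz") is allowed.
filter_mlx_ko() {
    grep -E '/(mlx|ib_)[^/]*\.ko(\.[a-z]+)?$'
}

# Usage on Ubuntu (image path is an assumption):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | filter_mlx_ko
```

                The directory of each listed file indicates which driver tree was packed into the image.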

                 

                Sophie.

                  • Re: Host Driver Initialization error (FAIL)
                    guodong

                    Hi Sophie,

                     

                    Thank you for your kind support!

                    This happened when I installed the driver from the driver source code (install.pl). Although the installation procedure went smoothly, it probably did not load the drivers successfully, which is why the error was reported.

                    I then did a fresh installation using the mlnxofedinstall script; it works, and I can use the card now.

                    Please consider this case closed.

                    BTW, one more question: are the Linux InfiniBand drivers provided by Mellanox in the package, or do they come from the Linux distribution?

                    I checked /lib/modules/kernels/drivers/infiniband/, and the kernel drivers there have old dates (not the compilation date), so I think they were not rebuilt from source?

                     

                    Thank you

                    Mei Guodong

                  • Re: Host Driver Initialization error (FAIL)
                    sophie

                    Hi Guodong,

                     

                    You are very welcome.

                    The modules under /lib/modules/<kernel>/extra/mlnx-ofa_kernel/drivers/infiniband/core are provided by the Mellanox OFED driver and are the ones in use once you install our drivers.

                    The modules under /lib/modules/kernel/drivers/infiniband/core come from the Inbox driver (shipped with the OS). They were compiled when the OS/kernel was originally built, which is why their dates are old, and they are no longer used by the kernel.
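
                    A small sketch for telling the two trees apart for a given module, based on the two directory layouts described above. The function name classify_module_path and its labels are hypothetical. Any other location is reported as "other", since packaging can differ between distributions.

```shell
#!/bin/sh
# classify_module_path: hypothetical helper. Given the path of a .ko file,
# reports which tree it lives in.
classify_module_path() {
    case "$1" in
        */extra/mlnx-ofa_kernel/*) echo "mlnx_ofed" ;;  # Mellanox OFED tree
        */kernel/drivers/*)        echo "inbox" ;;      # distribution (Inbox) tree
        *)                         echo "other" ;;      # anything else
    esac
}

# Usage: "modinfo -n" prints the file name of the module found for that name.
#   classify_module_path "$(modinfo -n ib_core)"
```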

                     

                    Regards,

                    Sophie.