HowTo Compile MLNX_OFED for a Different Linux Kernel Distribution

Version 3

    In many cases servers are installed not with vanilla Linux OS distributions, but with variants of those distributions.

    This post shows how to compile and install MLNX_OFED for a kernel that differs from the one shipped with the distribution.

     


    Setup

    In this example, the server runs RHEL 7.1, but with a kernel other than the one shipped with the release.

    Stock RHEL 7.1 ships with kernel version 3.10.0-229.el7.x86_64, whereas this server runs kernel version 3.10.0-229.11.1.el7.x86_64.

    # uname -a

    Linux mti-mar-s5 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
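
    The comparison above can be sketched as a small shell check. This is a minimal illustration, not part of MLNX_OFED: the helper name kernel_status is hypothetical, and the stock release string is the one quoted in this post for RHEL 7.1.

```shell
#!/bin/sh
# Stock kernel release for RHEL 7.1, as quoted in this post.
STOCK_KERNEL="3.10.0-229.el7.x86_64"

# kernel_status RELEASE
# Prints "stock" if RELEASE matches the stock kernel, otherwise "different".
# (kernel_status is a hypothetical helper name, not part of MLNX_OFED.)
kernel_status() {
    if [ "$1" = "$STOCK_KERNEL" ]; then
        echo "stock"
    else
        echo "different"
    fi
}

# Report on the kernel that is running right now.
echo "Running kernel $(uname -r): $(kernel_status "$(uname -r)")"
```

    A result of "different" suggests that the prebuilt MLNX_OFED kernel modules may not match the running kernel.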

     

    On this server, the MLNX_OFED 3.2 installation completed with errors.

     

    Observations

    Installing the MLNX_OFED package built for RHEL 7.1 on this kernel may fail. The two scenarios below indicate an unsuccessful MLNX_OFED installation.

    1. After installation, driver restart fails.

    # /etc/init.d/openibd restart

    Unloading HCA driver:                                      [  OK  ]

    Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping...

    Module mlx4_ib belong to kernel which is not a part of MLNX[FAILED]kipping... 

    Module mlx4_en belong to kernel which is not a part of MLNX[FAILED]kipping... 

    Module mlx5_core belong to kernel which is not a part of ML[FAILED] skipping...

    Module mlx5_ib belong to kernel which is not a part of MLNX[FAILED]kipping... 

    Module ib_umad belong to kernel which is not a part of MLNX[FAILED]kipping... 

    Module ib_uverbs belong to kernel which is not a part of ML[FAILED] skipping...

    Module ib_ipoib belong to kernel which is not a part of MLN[FAILED]skipping...

    Loading HCA driver and Access Layer:                       [  OK  ]           

    Module rdma_cm belong to kernel which is not a part of MLNX[FAILED]kipping... 

    Module ib_ucm belong to kernel which is not a part of MLNX_[FAILED]ipping...  

    Module rdma_ucm belong to kernel which is not a part of MLN[FAILED]skipping...
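
    To spot such failures without reading the whole output, the restart output can be piped through a small filter. The sketch below assumes only that failed steps carry the literal marker [FAILED], as in the output above; count_failed is a hypothetical helper name.

```shell
#!/bin/sh
# count_failed: reads command output on stdin and prints the number of
# lines that contain the literal status marker "[FAILED]".
# (count_failed is a hypothetical helper name, not part of MLNX_OFED.)
count_failed() {
    # -F treats the pattern as a fixed string, -c prints the match count.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function usable under "set -e".
    grep -c -F '[FAILED]' || true
}

# Intended use on a real system (run as root):
#   /etc/init.d/openibd restart 2>&1 | count_failed
```

    A count greater than zero means that at least one module was skipped during the restart.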

     

    2. The kernel modules were not loaded correctly.

    The filename reported by modinfo shows that the driver comes from the "kernel" directory (the inbox driver) and not from the "extra" directory.

    # modinfo mlx4_core   | head

    filename: /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko

    version: 2.2-1

    license:        Dual BSD/GPL

    description:    Mellanox ConnectX HCA low-level driver                                                       

    author:         Roland Dreier                                                                                

    rhelversion: 7.1

    srcversion: 5151013CD3D852DD21274AB                                                                      

    alias:          pci:v000015B3d00001010sv*sd*bc*sc*i*

     

    Alternatively, the driver may be located under the "weak-updates" directory:

    # modinfo mlx4_core | head

    filename:       /lib/modules/3.10.0-229.11.1.el7.x86_64/weak-updates/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko

    version:        3.2-1.0.1

    license:        Dual BSD/GPL

    description:    Mellanox ConnectX HCA low-level driver

    author:         Roland Dreier

    rhelversion:    7.1

    srcversion:     A77A50A91D147BDC3E48BA9

    alias:          pci:v000015B3d00001010sv*sd*bc*sc*i*

    alias:          pci:v000015B3d0000100Fsv*sd*bc*sc*i*

    alias:          pci:v000015B3d0000100Esv*sd*bc*sc*i*
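
    The two cases above differ only in the directory component of the module path, so the check can be sketched as a path classifier. This is an illustrative helper (module_location is a hypothetical name); it inspects only the path string and makes no claim about how MLNX_OFED itself decides where to install.

```shell
#!/bin/sh
# module_location PATH
# Prints which module tree a .ko path belongs to:
#   extra, weak-updates, kernel, or unknown.
# (module_location is a hypothetical helper name.)
module_location() {
    case "$1" in
        */extra/*)        echo "extra" ;;
        */weak-updates/*) echo "weak-updates" ;;
        */kernel/*)       echo "kernel" ;;
        *)                echo "unknown" ;;
    esac
}

# Intended use on a real system ("modinfo -n" prints only the filename):
#   module_location "$(modinfo -n mlx4_core)"
```

    For the outputs shown above, the helper prints "kernel" for the first path and "weak-updates" for the second.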

     

    Configuration

    1. Uninstall the failed MLNX_OFED installation.

    # ./uninstall.sh

     

    This program will uninstall all MLNX_OFED_LINUX-3.2-1.0.1.1 packages on your machine.

     

    Do you want to continue?[y/N]:y

     

    rpm -e --allmatches --nodeps  kmod-mlnx-ofa_kernel libmlx5 libsdp-devel ibutils2 knem-mlnx libibcm-devel opensm-devel dump_pr libibverbs ibsim mstflint mxm mpitests_openmpi libibverbs-utils librdmacm-utils srptools hcoll mlnx-ethtool kmod-knem-mlnx libibverbs-devel-static libibumad librdmacm opensm-static sdpnetstat ar_mgr openmpi mlnx-ofa_kernel kmod-isert libmlx4-devel libibmad libsdp dapl-devel-static infiniband-diags-compat mpitests_mvapich2 kmod-srp libmlx4 libibumad-static librdmacm-devel dapl-devel rds-tools infiniband-diags libibprof mlnx-ofa_kernel-devel libmlx5-devel libibmad-static opensm-libs perftest ibutils fca libibmad-devel dapl-utils qperf mlnxofed-docs libibverbs-devel ibacm mvapich2 kmod-kernel-mft-mlnx libibcm opensm cc_mgr kmod-iser libibumad-devel dapl ibdump rds-devel infiniband-diags-compat rds-tools infiniband-diags ofed-scripts mft kmod-kernel-mft-mlnx libmlx5-devel-1.0.2mlnx1-OFED.3.2.0.1.1.32101.x86_64 libibverbs-devel-1.1.8mlnx1-OFED.3.2.0.1.2.32101.x86_64 libibverbs-utils-1.1.8mlnx1-OFED.3.2.0.1.2.32101.x86_64 fca-2.5.2431-1.32101.x86_64 libibprof-1.1.22-1.32101.x86_64 mxm-3.4.3076-1.32101.x86_64 mvapich2-2.2a-1.32101.x86_64 hcoll-3.4.806-1.32101.x86_64 libibmad-devel-1.3.12.MLNX20151122.d140cb1-0.1.32101.x86_64 opensm-devel-4.6.1.MLNX20160112.774e977-0.1.32101.x86_64 opensm-static-4.6.1.MLNX20160112.774e977-0.1.32101.x86_64 libibumad-devel-1.3.10.2.MLNX20150406.966500d-0.1.32101.x86_64 libibumad-static-1.3.10.2.MLNX20150406.966500d-0.1.32101.x86_64 ibutils-1.5.7.1-0.12.gdcaeae2.32101.x86_64 infiniband-diags-1.6.6.MLNX20151130.7f0213e-0.1.32101.x86_64 infiniband-diags-compat-1.6.6.MLNX20151130.7f0213e-0.1.32101.x86_64 librdmacm-utils-1.0.21mlnx-OFED.3.1.1.5.5.32101.x86_64 librdmacm-devel-1.0.21mlnx-OFED.3.1.1.5.5.32101.x86_64 mlnx-ofa_kernel-devel-3.2-OFED.3.2.1.0.1.1.gc05c99f.rhel7u1.x86_64 libibmad-static-1.3.12.MLNX20151122.d140cb1-0.1.32101.x86_64 libsdp-devel-1.1.108-OFED.3.0.8.gfbd01df.32101.x86_64 
libmlx4-devel-1.0.6mlnx1-OFED.3.2.0.1.1.32101.x86_64 opensm-4.6.1.MLNX20160112.774e977-0.1.32101.x86_64 libibcm-devel-1.0.5mlnx2-OFED.3.0.11.gd7d485d.32101.x86_64 dapl-devel-2.1.7mlnx-OFED.3.2.0.0.9.32101.x86_64 dapl-utils-2.1.7mlnx-OFED.3.2.0.0.9.32101.x86_64

    Uninstall finished successfully

     

    2. Rebuild and install MLNX_OFED for the running kernel using the --add-kernel-support option.

    # ./mlnxofedinstall --add-kernel-support

    Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.1 under /tmp/MLNX_OFED_LINUX-3.2-1.0.1.1-3.10.0-229.11.1.el7.x86_64 directory.

    See log file /tmp/MLNX_OFED_LINUX-3.2-1.0.1.1-3.10.0-229.11.1.el7.x86_64/mlnx_ofed_iso.6743.log

     

     

    Building OFED RPMS . Please wait...

    ...
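
    Because this step compiles kernel modules, it may help to confirm beforehand that build prerequisites for the running kernel are present. The sketch below rests on an assumption not stated in this post: that the kernel-devel package matching the running kernel is needed for the build. The helper name kernel_devel_pkg is hypothetical.

```shell
#!/bin/sh
# kernel_devel_pkg RELEASE
# Prints the kernel-devel package name that matches a kernel release.
# (kernel_devel_pkg is a hypothetical helper name.)
kernel_devel_pkg() {
    echo "kernel-devel-$1"
}

pkg="$(kernel_devel_pkg "$(uname -r)")"

# Query the RPM database only where rpm is available.
# Assumption: the build needs kernel-devel for the running kernel.
if command -v rpm >/dev/null 2>&1; then
    if rpm -q "$pkg" >/dev/null 2>&1; then
        echo "$pkg is installed"
    else
        echo "$pkg is missing; consider installing it before running mlnxofedinstall"
    fi
else
    echo "rpm not found; check for $pkg manually"
fi
```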

     

    Verification

    1. Restart the driver and make sure that there are no errors.

    # /etc/init.d/openibd restart                                                                                                                     

    Unloading HCA driver:                                      [  OK  ]                                                                                                                                           

    Loading HCA driver and Access Layer:                       [  OK  ]  

     

    2. Check the modinfo output for the Mellanox kernel modules. For example:

    # modinfo mlx4_core | head

    filename:       /lib/modules/3.10.0-229.11.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko

    version:        3.2-1.0.1

    license:        Dual BSD/GPL

    description:    Mellanox ConnectX HCA low-level driver

    author:         Roland Dreier

    rhelversion:    7.1

    srcversion:     A77A50A91D147BDC3E48BA9

    alias:          pci:v000015B3d00001010sv*sd*bc*sc*i*

    alias:          pci:v000015B3d0000100Fsv*sd*bc*sc*i*

    alias:          pci:v000015B3d0000100Esv*sd*bc*sc*i*

     

    Make sure that the reported filename is located under the "extra" directory, as in the output above.
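
    To repeat this check for several modules at once, the paths can be fed to a small filter that reports anything outside the "extra" directory. This is a sketch only: report_outside_extra is a hypothetical helper name, and the module list in the usage comment is taken from the restart output earlier in this post.

```shell
#!/bin/sh
# report_outside_extra: reads module file paths on stdin (one per line),
# prints every path that is not under an "extra" directory, and returns
# non-zero if at least one such path was seen.
# (report_outside_extra is a hypothetical helper name.)
report_outside_extra() {
    bad=0
    while IFS= read -r path; do
        case "$path" in
            */extra/*) ;;  # built for the running kernel, nothing to report
            *) echo "not under extra: $path"; bad=1 ;;
        esac
    done
    return "$bad"
}

# Intended use on a real system:
#   for m in mlx4_core mlx4_ib mlx4_en mlx5_core mlx5_ib; do
#       modinfo -n "$m"
#   done | report_outside_extra
```

    No output and a zero exit status mean that every listed module resolves to the "extra" directory.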