8 Replies Latest reply on Oct 13, 2014 4:46 PM by rc-support

    el5.10 ofed build problem.




      I updated to RHEL 5.10 and I'm trying to build ofed modules against the latest 2.6.18-371.1.2.el5 kernel.

      ./mlnx_add_kernel_support.sh -k 2.6.18-371.1.2.el5 -m /mnt/ofed --make-tgz


      Below is the list of OFED packages that you have chosen
      (some may have been added by the installer due to package dependencies):



      Uninstalling the previous version of OFED
      Build ofed-scripts RPM
      Running  rpmbuild --rebuild  --define '_topdir /tmp/mlnx_iso.14232/OFED_topdir' --define 'dist %{nil}' --target x86_64 --define '_prefix /usr' --define '_exec_prefix /usr' --define '_sysconfdir /etc' --define '_usr /usr' /tmp/mlnx_iso.14232/MLNX_OFED_SRC-1.5.3-4.0.42/SRPMS/ofed-scripts-1.5.3-OFED.
      Install ofed-scripts RPM:
      Running rpm -iv  /tmp/mlnx_iso.14232/MLNX_OFED_SRC-1.5.3-4.0.42/RPMS/redhat-release-5Server-
      Build ofa_kernel RPM
      Running rpmbuild --rebuild  --define '_topdir /tmp/mlnx_iso.14232/OFED_topdir' --nodeps --define '_dist .rhel5u10' --define 'configure_options   --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod --with-mthca-mod --with-mlx4-mod --with-nes-mod --with-qib-mod --with-ipoib-mod --with-sdp-mod --with-srp-mod' --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' --define 'KVERSION 2.6.18-371.1.2.el5' --define 'K_SRC /lib/modules/2.6.18-371.1.2.el5/build/' --define 'network_dir /etc/sysconfig/network-scripts' --define '_prefix /usr' --define '__arch_install_post %{nil}' /tmp/mlnx_iso.14232/MLNX_OFED_SRC-1.5.3-4.0.42/SRPMS/ofa_kernel-1.5.3-OFED.
      Failed to build ofa_kernel RPM
      See /tmp/OFED.14279.logs/ofa_kernel.rpmbuild.log


      And the error is:


      In file included from include/linux/inetdevice.h:7,
                       from /tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3/kernel_addons/backport/2.6.18-EL5.7/include/linux/inetdevice.h:4,
                       from /tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3/drivers/infiniband/core/addr.c:37:
      /tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3/kernel_addons/backport/2.6.18-EL5.7/include/linux/netdevice.h:25: error: conflicting types for 'netif_is_bond_slave'
      include/linux/netdevice.h:884: error: previous definition of 'netif_is_bond_slave' was here
      make[4]: *** [/tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3/drivers/infiniband/core/addr.o] Error 1
      make[3]: *** [/tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3/drivers/infiniband/core] Error 2
      make[2]: *** [/tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3/drivers/infiniband] Error 2
      make[1]: *** [_module_/tmp/mlnx_iso.7969/OFED_topdir/BUILD/ofa_kernel-1.5.3] Error 2
      make[1]: Leaving directory `/usr/src/kernels/2.6.18-371.1.2.el5-x86_64'
      make: *** [kernel] Error 2
      error: Bad exit status from /var/tmp/rpm-tmp.35818 (%build)


      What am I doing wrong?




        • Re: el5.10 ofed build problem.

          I'm having the same problem. Did you fix it yet?




          • Re: el5.10 ofed build problem.

            Same problem here, on RHEL 5.10 x86_64.  Looks like the 2.6.18-371 kernel brought some header changes which MLNX_OFED_LINUX-1.5.3-4.0.42-rhel5.10-x86_64 was not prepared for.

            • Re: el5.10 ofed build problem.

              Your logs point to header conflicts between the different kernels installed on the system. Please make sure that all RPMs corresponding to your target kernel (including kernel-devel and kernel-headers) are installed, and resolve any conflicts before running mlnx_add_kernel_support.sh (use rpm with "--replacefiles"). Also, if you run the script under the target kernel (which is what I would do), you can omit "-k 2.6.18-371.1.2.el5" from the command line; the script will use the "uname -r" value itself.
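One way to check the point about matching packages is sketched below in shell. It assumes an RPM-based system; the helper names `kernel_build_dir` and `mismatched_pkgs` are hypothetical and not part of MLNX_OFED. It prints any installed kernel-devel or kernel-headers package whose version-release differs from the target kernel, which defaults to the `uname -r` value.

```shell
#!/bin/sh
# Sketch only: the helper names are made up for illustration.

# Kernel build tree for a given version (the K_SRC layout seen in the rpmbuild line).
kernel_build_dir() {
    printf '/lib/modules/%s/build\n' "$1"
}

# Read "name version-release" pairs on stdin and print those that
# do not match the target version-release given as $1.
mismatched_pkgs() {
    target_vr="$1"
    while read -r name vr; do
        [ "$vr" = "$target_vr" ] || printf '%s %s\n' "$name" "$vr"
    done
}

# Target kernel: first argument, else the running kernel.
target="${1:-$(uname -r)}"

if command -v rpm >/dev/null 2>&1; then
    # Assumes the plain kernel flavour, so that the package
    # version-release equals the `uname -r` string.
    rpm -q --qf '%{NAME} %{VERSION}-%{RELEASE}\n' kernel-devel kernel-headers \
        | grep '^kernel-' \
        | mismatched_pkgs "$target"

    dir=$(kernel_build_dir "$target")
    [ -d "$dir" ] || echo "no build tree at $dir"
fi
```

If this prints nothing, only packages matching the target kernel are installed and its build tree exists, so a mix of kernel versions is unlikely to be the cause.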

                • Re: el5.10 ofed build problem.

                  I only have kernel-devel and kernel-headers for 2.6.18-371.1.2.el5 installed, so I don't think this is a conflict between different installed kernels.


                  The problem appears to be that the 2.6.18-371.1.2.el5 kernel headers now define netif_is_bond_slave themselves:



                  static inline bool netif_is_bond_slave(struct net_device *dev)
                  {
                          return dev->flags & IFF_SLAVE && dev->priv_flags & IFF_BONDING;
                  }



                  This doesn't match the "backported" version in ofa_kernel-1.5.3, which returns int instead of bool, hence the "conflicting types" error:




                  static inline int netif_is_bond_slave(struct net_device *dev)
                  {
                          return dev->flags & IFF_SLAVE && dev->priv_flags & IFF_BONDING;
                  }
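To confirm a mismatch like this on a given system, the two headers can be compared directly. The following shell sketch is an illustration, not part of MLNX_OFED: the function names are hypothetical, and it assumes each definition starts on one line in the form `static inline TYPE name(` with a single-word return type, as in the snippets quoted in this thread.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print the return type of a "static inline TYPE SYMBOL(" line in a header.
# $1 = symbol name, $2 = header file.
ret_type() {
    awk -v sym="$1" \
        '$1 == "static" && $2 == "inline" && index($0, sym "(") { print $3 }' "$2"
}

# Compare the return type of one symbol in two headers.
# $1 = symbol name, $2 and $3 = header files.
compare_ret_types() {
    sym="$1"
    a=$(ret_type "$sym" "$2")
    b=$(ret_type "$sym" "$3")
    if [ "$a" = "$b" ]; then
        echo "same: $a"
    else
        echo "conflict: $a vs $b"
    fi
}

# Run the comparison when called as: script SYMBOL HEADER1 HEADER2
if [ "$#" -eq 3 ]; then
    compare_ret_types "$@"
fi
```

It would be called with netif_is_bond_slave, the kernel's include/linux/netdevice.h, and the backport netdevice.h named in the build log. For the two definitions quoted here it should report a conflict between bool and int.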