2 Replies Latest reply on Jul 29, 2016 3:02 PM by promanov

    ConnectX-4 LX: NFSoRDMA failures in MOFED-3.3

    promanov

      Hi,

       

      On CentOS 7.2, while trying NFS over RDMA with the new MOFED-3.3, I ran into the following kernel messages and an NFS client failure. (I think NFS works one way only: the client is able to read but not write ;)

       

      [ 1100.982756] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6267): Failed to prepare FAST_REG_MR WQE

      [ 1100.982761] svcrdma: Error -12 posting RDMA_READ

      [ 1100.993392] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6268): Failed to prepare FAST_REG_MR WQE

      [ 1100.993396] svcrdma: Error -12 posting RDMA_READ

      [ 1106.022300] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6267): Failed to prepare FAST_REG_MR WQE

      [ 1106.022306] svcrdma: Error -12 posting RDMA_READ

      [ 1116.038002] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6268): Failed to prepare FAST_REG_MR WQE

      [ 1116.038018] svcrdma: Error -12 posting RDMA_READ

      [ 1136.070196] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6267): Failed to prepare FAST_REG_MR WQE

      [ 1136.070202] svcrdma: Error -12 posting RDMA_READ

      [ 1166.085986] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6268): Failed to prepare FAST_REG_MR WQE

      [ 1166.085991] svcrdma: Error -12 posting RDMA_READ
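
For reference, the -12 in the svcrdma lines is a negated kernel errno value, and errno 12 is ENOMEM on Linux. A minimal Python sketch for pulling these lines out of a saved dmesg capture and naming the error might look like the following. The function name and the regular expression are illustrative assumptions, not part of any Mellanox tool, and the sample text is copied from the log above.

```python
import errno
import os
import re

# Matches lines such as "[ 1100.982761] svcrdma: Error -12 posting RDMA_READ".
ERROR_RE = re.compile(
    r"^\s*\[\s*(?P<ts>\d+\.\d+)\]\s+svcrdma: Error (?P<code>-?\d+) posting (?P<op>\w+)"
)


def decode_svcrdma_errors(log_text):
    """Return (timestamp, operation, errno number, errno name) per svcrdma error line."""
    results = []
    for line in log_text.splitlines():
        m = ERROR_RE.match(line)
        if m is None:
            continue  # not an svcrdma error line (e.g. the mlx5_warn lines)
        code = abs(int(m.group("code")))  # the kernel logs the negated errno
        name = errno.errorcode.get(code, "UNKNOWN")
        results.append((float(m.group("ts")), m.group("op"), code, name))
    return results


sample = """\
[ 1100.982756] mlx5_warn:mlx5_0:mlx5_ib_post_send:4163:(pid 6267): Failed to prepare FAST_REG_MR WQE
[ 1100.982761] svcrdma: Error -12 posting RDMA_READ
"""

for ts, op, code, name in decode_svcrdma_errors(sample):
    print(f"{ts:.6f} {op}: {name} ({os.strerror(code)})")
```

The errno lookup uses the table of the machine running the script, so it is best run on a Linux host.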

       

      The same setup works fine with ConnectX-3. Also, I believe I saw ConnectX-4 working OK earlier with either MOFED-3.2 or the kernel.org (~4.6) versions of the drivers.

       

      Any suggestions on it?

       

      Regards,

       

          Philip

        • Re: ConnectX-4 LX: NFSoRDMA failures in MOFED-3.3
          sophie

          Hi Philip,

           

          Did you install Mellanox OFED version 3.3-1.0.4.0 on both the client and the server?

          Which kernel revision is running on the client and the server?

          Could you confirm the firmware (FW) version on the HCAs (on both sides, client and server)?

          Putting NFS aside, are you able to validate an RDMA connection between the client and the server?
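
The details asked for above can be gathered on each host with a small script. This is only a sketch under a few assumptions: it assumes `ofed_info` (shipped with MOFED) and `ibv_devinfo` (from libibverbs) are on the PATH, and that `ibv_devinfo` prints `hca_id:` and `fw_ver:` fields. The helper names `run`, `parse_fw_versions`, and `collect` are hypothetical.

```python
import platform
import re
import shutil
import subprocess


def run(cmd):
    """Run a command and return its stdout, or None if it is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return proc.stdout if proc.returncode == 0 else None


def parse_fw_versions(devinfo_text):
    """Map each hca_id to its fw_ver, assuming ibv_devinfo-style 'key: value' lines."""
    versions = {}
    current = None
    for line in devinfo_text.splitlines():
        m = re.match(r"\s*hca_id:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*fw_ver:\s*(\S+)", line)
        if m and current is not None:
            versions[current] = m.group(1)
    return versions


def collect():
    """Gather kernel release, OFED version string, and per-HCA firmware versions."""
    ofed = run(["ofed_info", "-s"])
    devinfo = run(["ibv_devinfo"])
    return {
        "kernel": platform.release(),
        "ofed": ofed.strip() if ofed else None,
        "firmware": parse_fw_versions(devinfo) if devinfo else {},
    }


if __name__ == "__main__":
    for key, value in collect().items():
        print(f"{key}: {value}")
```

Running it on both the client and the server and comparing the output covers the first three questions. The RDMA connectivity check itself is not covered here; a perftest tool such as `ib_write_bw` is one common way to do that.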

           

          Regards,

          Sophie.

            • Re: ConnectX-4 LX: NFSoRDMA failures in MOFED-3.3
              promanov

              Hi Sophie, thank you for the reply. I noticed that you are referring to MOFED 3.3-1.0.4.0, while the problem was observed on MOFED 3.3-1.0.0.0. I upgraded the OFED distro, and NFSoRDMA now works fine on ConnectX-4 Lx.

               

              The original problem was observed on CentOS 7.2 (kernel 3.10.0-327.el7.x86_64) with OFED 3.3-1.0.0.0 and ConnectX-4 Lx NICs (server and client are identical systems).

               

              Thanks & Regards,

               

                 Philip Romanov