
    Mellanox InfiniBand + mmap() + MPI one-sided communication fails when DAPL UD is enabled

    kjlee

      Hi!

      I used a trick to read a page that resides on a remote machine's disk: each machine mmap()s the whole file, and MPI one-sided communication windows are created over the mapped region.
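
      For concreteness, here is a minimal sketch of that approach (the real code is in the attached file; the file name data.bin and the one-page MPI_Get are only illustrative):

      #include <mpi.h>
      #include <stdio.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <sys/mman.h>
      #include <sys/stat.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* Each rank maps its own copy of the file. */
          int fd = open("data.bin", O_RDWR);   /* illustrative file name */
          struct stat st;
          fstat(fd, &st);
          void *buf = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
          if (buf == MAP_FAILED) { perror("mmap"); MPI_Abort(MPI_COMM_WORLD, 1); }

          /* One-sided window over the whole mapped region; on an RDMA
             fabric this is where the region would get registered. */
          MPI_Win win;
          MPI_Win_create(buf, (MPI_Aint)st.st_size, 1, MPI_INFO_NULL,
                         MPI_COMM_WORLD, &win);

          /* Rank 0 reads one page out of rank 1's mapped file. */
          if (rank == 0) {
              char page[4096];
              MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
              MPI_Get(page, sizeof page, MPI_CHAR, 1, 0,
                      sizeof page, MPI_CHAR, win);
              MPI_Win_unlock(1, win);
          }

          MPI_Win_free(&win);
          munmap(buf, st.st_size);
          close(fd);
          MPI_Finalize();
          return 0;
      }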

      It works fine when DAPL UD is disabled, but it emits the following error messages when I enable DAPL UD by setting 'I_MPI_DAPL_UD=1':

      XXX001:UCM:1d1a:84d2ab40: 271380 us(271380 us):  DAPL ERR reg_mr Cannot allocate memory

      [0:XXX001] rtc_register failed 196608 [0] error(0x30000):  unknown error

      Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0

      internal ABORT - process 0

      XXX002:UCM:31e2:27bacb40: 263683 us(263683 us):  DAPL ERR reg_mr Cannot allocate memory

      [1:XXX002] rtc_register failed 196608 [1] error(0x30000):  unknown error

      Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0

       

      Please refer to the attached file for the code I used.

       

      I ran the above program with the following environment variables set:

       

      export I_MPI_FABRICS=dapl

      export I_MPI_DAPL_UD=1

       

      command: mpiexec.hydra -genvall -machinefile ~/machines -n 2 -ppn 1 ${PWD}/test2

      Here are my general questions:

       

      (1) When the window over the mmap()ed region is created, does the IB driver try to pin the whole memory region to prevent page faults?

       

      (2) Does the IB driver's behavior when registering the memory region differ depending on whether DAPL UD is enabled or disabled?
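
      In case it is relevant to (1) and (2): since reg_mr fails with 'Cannot allocate memory', I suspect the locked-memory limit may matter once the driver tries to pin the mapped region (just my guess, not something from the attached code). A minimal check for each node:

      #include <stdio.h>
      #include <sys/resource.h>

      int main(void)
      {
          /* Print the locked-memory limit; memory registration pins pages,
             so a low RLIMIT_MEMLOCK could make reg_mr fail (my assumption). */
          struct rlimit rl;
          if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
              perror("getrlimit");
              return 1;
          }
          if (rl.rlim_cur == RLIM_INFINITY)
              printf("RLIMIT_MEMLOCK: unlimited\n");
          else
              printf("RLIMIT_MEMLOCK: %llu bytes\n",
                     (unsigned long long)rl.rlim_cur);
          return 0;
      }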


      Experimental Environment:

      OS: CentOS 6.4 (Final)
      CPU: 2 x Intel® Xeon® E5-2450 @ 2.10 GHz (8 physical cores each)
      RAM: 32 GB per node
      Interconnect: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
      Mellanox InfiniBand driver: MLNX_OFED_LINUX-3.1-1.1.0.1 (OFED-3.1-1.1.0): 3.19.0

       

      thanks,