5 Replies Latest reply on Mar 25, 2016 2:35 PM by paklui

    OpenMPI MXM problem

    nsmeds

      I am probably making a stupid error, but I don't really know where I should look.

       

      This is all on RHEL 6.5.

       

      I have previously used both HPC-X and compiled OpenMPI against libmxm (yalla driver).

      HPC-X 1.3.336 works well for me.

       

      Now I am trying to install HPC-X 1.5.370 and also to compile OpenMPI 1.10.2. All efforts have resulted in
      code that hangs shortly after MPI_Init(). I compile the Intel IMB benchmark and run it on 2 tasks using the
      yalla driver, and it hangs in the first MPI_Bcast(), which is the first communicating routine after the initial
      setup (MPI_Init/MPI_Comm_size/MPI_Comm_rank).
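
      For reference, the failing pattern boils down to essentially the following (a minimal sketch of the call
      sequence, not the actual IMB source), built with the mpicc wrapper from the same installation:

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          int size, rank, value = 0;

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* First collective after the setup calls; this is where the hang shows up */
          MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

          printf("rank %d of %d passed MPI_Bcast\n", rank, size);

          MPI_Finalize();
          return 0;
      }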


      If I disable libmxm and use "-mca pml ob1 -mca btl openib,self,sm", the program runs correctly.
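
      That is, a command along these lines (the benchmark path here is just a placeholder for my own IMB build)
      completes without problems:

      mpirun -np 2 -x LD_LIBRARY_PATH -mca pml ob1 -mca btl openib,self,sm ./IMB-MPI1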

       

      I have tried two different versions of libmxm

      HPC-X 1.3.336:  MXM_VERNO_STRING "3.3.3055"

      HPC-X 1.5.370:  MXM_VERNO_STRING "3.4.3079"

       

      If I build OpenMPI 1.10.2 using v 3.3 of mxm I get a working implementation with yalla.

      If I use HPC-X 1.3.336, everything also works fine with yalla.

      If I run HPC-X 1.5.370 or if I build OpenMPI 1.10.2 against the 3.4 version of mxm I get the problem.

       

      The software installed in /opt/mellanox, and related software, is at the same version level as HPC-X 1.5.370.

       

      Does anyone on this list have a suggestion as to what my problem may be and/or how to diagnose it?

        • Re: OpenMPI MXM problem
          paklui

          Hi Nils,

          I think you can try the IMB test that is included with HPC-X, located in $HPCX_MPI_TESTS_DIR:

          mpirun -mca pml yalla -np 2 ${HPCX_MPI_TESTS_DIR}/imb/IMB-MPI1

          Could it possibly be that there are other MPI libraries in your LD_LIBRARY_PATH?

          Have you tried running 2 processes on a single node, and would that work?
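
          For example (the host names below are just placeholders; listing a host twice gives it two slots):

          mpirun -mca pml yalla -np 2 -host node01,node01 ${HPCX_MPI_TESTS_DIR}/imb/IMB-MPI1
          mpirun -mca pml yalla -np 2 -host node01,node02 ${HPCX_MPI_TESTS_DIR}/imb/IMB-MPI1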

            • Re: OpenMPI MXM problem
              nsmeds

              Same behaviour using the version of IMB provided in the gcc version of HPC-X 1.5.370.

              Runs fine on a single node, but hangs if scheduled across two nodes.

               

              The LD_LIBRARY_PATH only contains pointers to HPC-X (line breaks inserted by me below for readability):

              LD_LIBRARY_PATH=/lsf/9.1/linux2.6-glibc2.3-x86_64/lib:
              /hpc/base/ctt/packages/hpcx/1.5.370/gcc/hcoll/lib:
              /hpc/base/ctt/packages/hpcx/1.5.370/gcc/fca/lib:
              /hpc/base/ctt/packages/hpcx/1.5.370/gcc/mxm/lib:
              /hpc/base/ctt/packages/hpcx/1.5.370/gcc/ompi-v1.10/lib
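
              For what it is worth, one way to double-check which MPI/MXM libraries actually get resolved is
              something like:

              ldd ${HPCX_MPI_TESTS_DIR}/imb/IMB-MPI1 | grep -i -E 'mpi|mxm|hcoll'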
                • Re: OpenMPI MXM problem
                  alkx

                  Does the same failure occur if the pre-compiled OpenMPI is used?

                  Did you try to avoid LSF and run mpirun directly? Maybe allocate the nodes using LSF and then run the MPI job from another terminal.

                  Try adding the LD_LIBRARY_PATH to the mpirun command, something like mpirun -x LD_LIBRARY_PATH.

                    • Re: OpenMPI MXM problem
                      nsmeds

                      Does the same failure occur if the pre-compiled OpenMPI is used?

                       

                      Yes, that is what I meant by running HPC-X.

                       

                       

                      Did you try to avoid LSF and run mpirun directly? Maybe allocate the nodes using LSF and
                      then run the MPI job from another terminal.

                       

                      No, I did not try that. What kind of incompatibility between LSF and libmxm are you
                      suggesting would produce the behaviour I see?

                       

                       

                      Try adding the LD_LIBRARY_PATH to the mpirun command, something like mpirun -x
                      LD_LIBRARY_PATH.

                       

                      That is how I always run. It is the only way to be at least somewhat certain of what is
                      going to be used during execution.

                      It is also the only way to have several different runs with different set-ups queued and
                      still know the circumstances under which they were executed.

                • Re: OpenMPI MXM problem
                  paklui

                  Hi Nils,

                  It turns out that you need to add "-x MXM_OOB_FIRST_SL=0" to your mpirun command on your cluster.

                  Otherwise, if you run pstack on a process while it is hung, you will find that it is stuck in some hcoll routine, because hcoll uses the pml for OOB messaging.
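
                  For example, on the node where the hung rank runs (the process name here is just the IMB binary;
                  substitute whatever your benchmark is called):

                  pstack $(pgrep -n IMB-MPI1)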

                  Anyway, this seems to work for me:

                  $ mpirun -np 2 -host nxt0111,nxt0110 -x MXM_OOB_FIRST_SL=0 ${HPCX_MPI_TESTS_DIR}/imb/IMB-MPI1