Bring Up Ceph RDMA - Developer's Guide

Version 50

    This post provides bring-up examples for a Ceph RDMA cluster.





    1. (Optional) Install the latest MLNX_OFED and restart the openibd driver.


    2. Ensure that rping runs between all nodes:

         Server: rping -s -v -a server_ip

         Client: rping -c -v -a server_ip



    1. Get the latest stable Ceph version with RDMA support from the following branch:

    This version is based on luminous 12.1.0 RC.


    2. Compile:


    # ./


    # cd build

    # make -j8

    # sudo make install


    3. Ensure that your version has RDMA support:

    # strings /usr/bin/ceph-osd | grep -i rdma


    4. Kill all Ceph processes on all nodes:

    # sudo systemctl stop

    # sudo systemctl stop


         Or use the "kill" command.


    5. Ensure that all Ceph processes are down on every Ceph node:

    # ps aux |grep ceph

    6. Bring up Ceph in TCP mode (the default async messenger).


    7. Test that Ceph is up and running:

         # ceph -s


    8. Shut down all Ceph processes.


    9. Add the following to your Ceph configuration file under the [global] section:

    # To set both the frontend and the backend to RDMA:

    ms_type = async+rdma


    # To set only the backend to RDMA:

    ms_cluster_type = async+rdma


    # Set the device name according to the IB or RoCE device used, e.g.:

    ms_async_rdma_device_name = mlx5_0


    # For better performance if using the Luminous 12.2.x release:

    ms_async_rdma_polling_us = 0


    # Set the local GID for the RoCEv2 interface used for Ceph.
    # The GID corresponding to the IPv4 or IPv6 network
    # should be taken from the show_gids command output.
    # This parameter must be set uniquely per OSD server/client.
    # Not defining this parameter limits the network to RoCEv1,
    # which means no routing and no congestion control (ECN).

    ms_async_rdma_local_gid = <GID from show_gids output>



    You can get the GID index using the show_gids script; see Understanding the show_gids Script.
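As an illustration of what show_gids does, here is a hypothetical Python sketch of the selection logic: given the GID table entries as read from /sys/class/infiniband/&lt;dev&gt;/ports/&lt;port&gt;/gids and the matching gid_attrs/types files, pick the first non-zero RoCE v2 entry. The function name and the sample data below are invented for illustration only:

```python
ZERO_GID = "0000:0000:0000:0000:0000:0000:0000:0000"

def pick_rocev2_gid(entries):
    """entries: list of (gid, gid_type) strings in GID-index order,
    as read from sysfs. Returns (index, gid) of the first usable
    RoCE v2 entry, or None if the port has none."""
    for idx, (gid, gid_type) in enumerate(entries):
        if "v2" in gid_type.lower() and gid != ZERO_GID:
            return idx, gid
    return None

# Made-up sample: index 0 is the RoCE v1 entry, index 1 is its v2 twin.
entries = [
    ("fe80:0000:0000:0000:0e42:a1ff:fe5e:75d0", "IB/RoCE v1"),
    ("fe80:0000:0000:0000:0e42:a1ff:fe5e:75d0", "RoCE v2"),
    ("0000:0000:0000:0000:0000:ffff:0b00:0001", "RoCE v2"),
]
print(pick_rocev2_gid(entries))  # (1, 'fe80:0000:0000:0000:0e42:a1ff:fe5e:75d0')
```

The chosen GID string would then be used as the value of ms_async_rdma_local_gid on that node.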


    10. Update the configuration file in all Ceph nodes.


    11. If you are using systemd services:

    11.1     Validate that the following parameters are set in relevant systemd files in /usr/lib/systemd/system/:
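The parameter table from the original page did not survive extraction. Based on the locked-memory requirement described in step 12.1, the settings in question are most likely the following; treat both values as assumptions and verify them against your Ceph packaging:

```ini
# Assumed [Service] settings for ceph-mon@.service / ceph-osd@.service /
# ceph-mgr@.service. RDMA pins buffers in physical memory, so the
# locked-memory limit must be lifted, and device access must not be masked.
[Service]
LimitMEMLOCK=infinity
PrivateDevices=no
```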


    Note: if you modify the systemd configuration for ceph-mon/ceph-osd, you may need to run:

    # systemctl daemon-reload


    11.2     Restart all cluster processes on the monitor node:

    # sudo systemctl start     # also starts ceph-mgr

    # sudo systemctl start


    On the OSD nodes:

    # sudo systemctl start


    # for i in `sudo ls /var/lib/ceph/osd/ | cut -d '-' -f 2`; do sudo systemctl start ceph-osd@$i; done
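To make the loop's intent explicit: directory names under /var/lib/ceph/osd/ have the form &lt;cluster&gt;-&lt;id&gt; (e.g. ceph-0), and the loop starts one ceph-osd@&lt;id&gt; unit per directory. A small Python sketch of the same id extraction (the sample directory names are made up):

```python
# Sample directory names as they would appear under /var/lib/ceph/osd/.
dirs = ["ceph-0", "ceph-3", "ceph-12"]

# Equivalent of `cut -d '-' -f 2`: take the part after the first "-".
osd_ids = [d.split("-", 1)[1] for d in dirs]
print(osd_ids)  # ['0', '3', '12']
```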


    12. For manual start-up of Ceph processes:

    12.1     Open /etc/security/limits.conf and add the following lines. RDMA is tightly coupled with physical memory addresses, so locked memory must not be limited.

    * soft memlock unlimited
    * hard memlock unlimited
    root soft memlock unlimited
    root hard memlock unlimited
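After editing limits.conf, log in again and confirm that the limits actually took effect before starting the daemons by hand. A minimal Python check (the helper name is ours, not part of Ceph):

```python
import resource

def memlock_unlimited(soft, hard):
    # RDMA registers buffers against pinned physical memory, so both the
    # soft and hard locked-memory limits should be unlimited.
    return soft == resource.RLIM_INFINITY and hard == resource.RLIM_INFINITY

# Query the current process's RLIMIT_MEMLOCK and report the result.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print(memlock_unlimited(soft, hard))
```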

    12.2     Run the processes

         On the monitor node

    # sudo /usr/bin/ceph-mon --cluster ceph --id clx-ssp-056 --setuser ceph --setgroup ceph

    # sudo /usr/bin/ceph-mgr --cluster ceph --id clx-ssp-056 --setuser ceph --setgroup ceph


         On the OSD nodes

    # for i in `sudo ls /var/lib/ceph/osd/ | cut -d '-' -f 2`; do sudo /usr/bin/ceph-osd --cluster ceph --id $i --setuser ceph --setgroup ceph & done



    1. Check health:

    # ceph -s


    2. Check that RDMA is working as expected.


    The following command shows whether the server hosting osd.0 is configured with the RDMA messenger:

    # ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep ms_type

    "ms_type": "async+rdma"



    To see the actual RDMA traffic counters:

    # ceph daemon osd.0 perf dump AsyncMessenger::RDMAWorker-1


        "AsyncMessenger::RDMAWorker-1": {
            "tx_no_mem": 0,
            "tx_parital_mem": 0,
            "tx_failed_post": 0,
            "rx_no_registered_mem": 0,
            "tx_chunks": 30063062,
            "tx_bytes": 1512924920228,
            "rx_chunks": 23115500,
            "rx_bytes": 480212597532,
            "pending_sent_conns": 0
        }


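A quick way to sanity-check the counters above is to parse the perf dump output and confirm the RDMA workers are actually moving traffic without post failures. The helper below is a hypothetical sketch (not a Ceph API); feed it the parsed JSON from `ceph daemon osd.N perf dump`:

```python
import json

def rdma_traffic_ok(dump):
    """Return True if every RDMAWorker section in the perf dump shows
    nonzero tx/rx byte counters and no failed posts."""
    workers = {k: v for k, v in dump.items() if "RDMAWorker" in k}
    return bool(workers) and all(
        w["tx_bytes"] > 0 and w["rx_bytes"] > 0 and w["tx_failed_post"] == 0
        for w in workers.values()
    )

# Sample taken from the counters shown above.
sample = json.loads('''{
  "AsyncMessenger::RDMAWorker-1": {
    "tx_no_mem": 0,
    "tx_failed_post": 0,
    "tx_bytes": 1512924920228,
    "rx_bytes": 480212597532
  }
}''')
print(rdma_traffic_ok(sample))  # True
```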

    Known issues

    Known issue: the commands

         ceph pg dump

         ceph osd df tree

    fail with the following error:

         Error EACCES: access denied' does your client key have mgr caps?

    Solution/WA: run:

         ceph auth caps client.admin osd 'allow *' mds 'allow *' mon 'allow *' mgr 'allow *'