HowTo Setup RDMA Connection using Inbox Driver (RHEL, Ubuntu)

Version 27
    This post shows how to bring up a two-server setup and enable RDMA using the inbox driver on RHEL 7 and Ubuntu 14.04.

    Setup:

    • Make sure you have two servers equipped with Mellanox ConnectX-3 adapter cards
    • Connect the two servers via an Ethernet switch; an access port can be used (VLAN 1 by default)
    • Install RHEL 7 (upstream kernel) or Ubuntu 14.04 on both servers

     

    RHEL Installation:

    Run the following installation commands on both servers:

    # yum -y groupinstall "InfiniBand Support"

    # yum -y install perftest infiniband-diags

                                         

    Make sure that RDMA is enabled on boot.

    # dracut --add-drivers "mlx4_en mlx4_ib mlx5_ib" -f

    # systemctl enable rdma
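
    Once the servers have been rebooted, it can be useful to confirm that the drivers are actually loaded. The following is a minimal sketch, not part of the inbox packages: missing_modules is a hypothetical helper name, and the module list in the usage comment assumes a ConnectX-3 card (mlx4 family).

```shell
# Hypothetical helper: reads `lsmod` output on stdin and prints each
# expected module (given as arguments) that is NOT listed.
# Prints nothing when all expected modules are present.
missing_modules() {
    lsmod_out=$(cat)
    for mod in "$@"; do
        # lsmod prints the module name in the first column
        if ! printf '%s\n' "$lsmod_out" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
            printf '%s\n' "$mod"
        fi
    done
}

# Usage (assumed module list for ConnectX-3):
#   lsmod | missing_modules mlx4_core mlx4_en mlx4_ib
```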

    Ubuntu Installation:

    Run the following installation commands on both servers:

    # apt-get install libmlx4-1 infiniband-diags ibutils ibverbs-utils rdmacm-utils perftest

      

    For tgt target support install:

    # apt-get install tgt

      

    For LIO target support install:

    # apt-get install targetcli

      

    For iSCSI client (initiator) support install:

    # apt-get install open-iscsi-utils open-iscsi

      

     

     

    Port type configuration:

    Find the PCI address of the adapter card (in this case a ConnectX-3 Pro).

    For example (05:00.0):

     

    # lspci | grep Mellanox

    05:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

                                     

     

    Edit the file /etc/rdma/mlx4.conf.
    Note: This file is read when the mlx4_core module is loaded and is used to set the port types for any hardware found. If a card is not listed in this file, its port types are left unchanged.
    Format:
    <pci_device_of_card> <port1_type> [port2_type]
    port1_type and port2_type: one of "auto", "ib", or "eth". port1_type is always required; port2_type is required for dual-port cards.
    For example:
    0000:05:00.0 eth eth
    Reboot to reload the modules:
    # reboot
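
    Because a malformed entry only shows up after a reboot, it may be worth checking the line before saving the file. The sketch below is an illustration only: valid_mlx4_line is a hypothetical helper, and it assumes the PCI device is written in the full domain:bus:device.function form used in the example above.

```shell
# Hypothetical validator for one /etc/rdma/mlx4.conf entry.
# Returns 0 if the line matches "<pci_device> <port1_type> [port2_type]",
# where each port type is one of auto, ib, eth.
# Assumption: the PCI device uses the form dddd:bb:dd.f (e.g. 0000:05:00.0).
valid_mlx4_line() {
    printf '%s\n' "$1" | grep -Eq \
        '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7][[:space:]]+(auto|ib|eth)([[:space:]]+(auto|ib|eth))?[[:space:]]*$'
}

# Usage:
#   valid_mlx4_line "0000:05:00.0 eth eth" && echo "entry looks well-formed"
```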

    Configure port parameters:

    To find the mapping between an interface name and its physical port, run:

     

    # cat /sys/class/net/<net-device>/dev_id

     

     

    The net-device is ib0/ib1/ib2 for an InfiniBand link, and enp5s0 or similar for an Ethernet link.
    For example:

     

    # cat /sys/class/net/enp5s0/dev_id
    0x0
    0x0 means that the interface is mapped to physical port 1
    0x1 means that the interface is mapped to physical port 2
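
    The mapping above is simply the dev_id value plus one. A small sketch of that conversion follows; dev_id_to_port is a hypothetical helper name, and the rule is only the one stated above for this adapter.

```shell
# Hypothetical helper: convert a dev_id value such as 0x0 or 0x1
# into the 1-based physical port number (0x0 -> 1, 0x1 -> 2).
dev_id_to_port() {
    # Arithmetic expansion accepts the 0x prefix for hexadecimal values
    echo $(( $1 + 1 ))
}

# Usage:
#   dev_id_to_port "$(cat /sys/class/net/enp5s0/dev_id)"
```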

     

    Configure IP Address and enable the port.

    This can be done with tools such as nmtui or ifconfig (not persistent across reboots), or by any other method.

    For example, on the first server:
    # ifconfig enp5s0 12.12.12.1/24 up
    On the second server (make sure that both servers have IPs on the same network):
    # ifconfig enp5s0 12.12.12.2/24 up
    At this point RDMA should be able to run between the two servers.
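
    To double-check that two address/prefix pairs really fall in the same network, the comparison can be sketched in plain shell arithmetic. Both function names below are hypothetical, and the sketch assumes IPv4 addresses written without leading zeros in any octet.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Assumption: octets have no leading zeros (they would be read as octal).
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if two "address/prefix" values have the same prefix length
# and the same network part, non-zero otherwise.
same_network() {
    a_ip=${1%/*}; a_len=${1#*/}
    b_ip=${2%/*}; b_len=${2#*/}
    [ "$a_len" = "$b_len" ] || return 1
    mask=$(( (0xFFFFFFFF << (32 - a_len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$a_ip") & mask )) -eq $(( $(ip_to_int "$b_ip") & mask )) ]
}

# Usage:
#   same_network 12.12.12.1/24 12.12.12.2/24 && echo "same network"
```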

     

    Verification:

    To check basic RDMA CM functionality, you can use any of the following test utilities.
    1. udaddy
    This utility covers RDMA_CM UD connections. It establishes a set of unreliable RDMA datagram communication paths between two nodes using librdmacm, optionally transfers datagrams between the nodes, then tears down the communication.
    Run the following command on one server (act as a server):

     

    # udaddy

     

    Run the following command on the second server (act as a client)

    # udaddy -s 12.12.12.1

    udaddy: starting client

    udaddy: connecting

    initiating data transfers

    receiving data transfers

    data transfers complete

    test complete

    return status 0

                                   

     

    "return status 0" means a clean exit (RDMA is running).
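
    When the check is scripted, the status line can be tested instead of read by eye. A minimal sketch, assuming the client prints a status line in the form shown in the output above; rdma_test_passed is a hypothetical helper name.

```shell
# Hypothetical helper: reads client output on stdin and succeeds only
# if it contains a "return status 0" line.
rdma_test_passed() {
    grep -Eq '^[[:space:]]*return status 0[[:space:]]*$'
}

# Usage:
#   udaddy -s 12.12.12.1 | rdma_test_passed && echo "RDMA CM UD test passed"
```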
    2. rdma_server, rdma_client commands
    Another option is to use the rdma_server and rdma_client commands.
    These commands perform a simple RDMA CM connection and ping-pong test, using synchronous librdmacm calls to establish an RDMA connection between two nodes.
    Run the following command on one server (act as a server):
    # rdma_server

     

    Run the following command on the second server (act as a client)

    # rdma_client -s 12.12.12.1

    rdma_client: start

    rdma_client: end 0

                                 

     

    "rdma_client: end 0" means good exit (RDMA is running).

    3. ib_send_bw (performance test)

    Run a performance test such as ib_send_bw, ib_read_bw or similar.

     

    For example:

    Run the following command on one server (act as a server):

    # ib_send_bw -d mlx4_0 -i 1 -F --report_gbits

     

    Run the following command on the second server (act as a client):

    # ib_send_bw -d mlx4_0 -i 1 -F --report_gbits 12.12.12.1

    ---------------------------------------------------------------------------------------

                        Send BW Test

    Dual-port       : OFF          Device         : mlx4_0

    Number of qps   : 1            Transport type : IB

    Connection type : RC

    RX depth        : 512

    CQ Moderation   : 100

    Mtu             : 1024[B]

    Link type       : Ethernet

    Gid index       : 0

    Max inline data : 0[B]

    rdma_cm QPs     : OFF

    Data ex. method : Ethernet

    ---------------------------------------------------------------------------------------

    local address: LID 0000 QPN 0x0065 PSN 0xc8f367

    GID: 254:128:00:00:00:00:00:00:246:82:20:255:254:23:27:129

    remote address: LID 0000 QPN 0x005d PSN 0x884d7d

    GID: 254:128:00:00:00:00:00:00:246:82:20:255:254:23:31:225

    ---------------------------------------------------------------------------------------

    #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]

    65536      1000           0.00               36.40                0.069428

    ---------------------------------------------------------------------------------------
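
    For automated runs, the average bandwidth can be pulled out of the result row and compared with a threshold. The sketch below assumes the five-column result layout shown in the output above (other perftest versions may print different columns); bw_average and bw_at_least are hypothetical helper names.

```shell
# Reads ib_send_bw output on stdin and prints the "BW average" column
# of the first result row (five fields, first two purely numeric).
# Assumption: the column layout matches the sample output above.
bw_average() {
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { print $4; exit }'
}

# Succeeds if the measured value ($1) is at least the threshold ($2),
# both in Gb/sec. awk is used because sh has no floating-point compare.
bw_at_least() {
    awk -v got="$1" -v want="$2" 'BEGIN { exit !(got + 0 >= want + 0) }'
}

# Usage (the 30 Gb/sec threshold is an arbitrary example value):
#   avg=$(ib_send_bw -d mlx4_0 -i 1 -F --report_gbits 12.12.12.1 | bw_average)
#   bw_at_least "$avg" 30 || echo "bandwidth lower than expected: $avg"
```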

                           

    4. rping

    This script covers RDMA_CM RC connections, but only userspace (It establishes a set of reliable RDMA connections between two nodes using the librdmacm, optionally transfers data between the nodes, then disconnects).

     

    Run the following on one of the servers (act as the rping server):

     

    # rping -s -C 10 -v

    Run the following on the other server (act as the rping client):

     

    # rping -c -a 12.12.12.1 -C 10 -v

    ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr

    ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs

    ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst

    ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu

    ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv

    ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw

    ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx

    ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy

    ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz

    ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA

    client DISCONNECT EVENT...
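
    With -C 10 the client is expected to print ten "ping data" lines, as above. A short sketch of checking that count in a script follows; rping_count_ok is a hypothetical helper name and relies on the verbose (-v) line format shown in this output.

```shell
# Hypothetical helper: reads rping client output on stdin and succeeds
# if the number of "ping data" lines equals the expected count ($1).
rping_count_ok() {
    expected=$1
    # grep -c exits non-zero when it counts 0 lines, hence "|| true"
    seen=$(grep -c 'ping data: rdma-ping-' || true)
    [ "$seen" -eq "$expected" ]
}

# Usage:
#   rping -c -a 12.12.12.1 -C 10 -v | rping_count_ok 10 && echo "rping OK"
```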

             

    5. ucmatose

    This script covers RDMA_CM RC connections, but only userspace (same as rping) (It establishes a set of reliable RDMA connections between two nodes using the librdmacm, optionally transfers data between the nodes, then disconnects).

     

    Run the following on one of the servers (act as a server)

    # ucmatose

     

    Run the following on the other server (act as a client)

    # ucmatose -s 12.12.12.1

    cmatose: starting client

    cmatose: connecting

    receiving data transfers

    sending replies

    data transfers complete

    test complete

    return status 0

            

    6. krping

    The krping module is a kernel loadable module that utilizes the Open Fabrics verbs to implement a client/server ping/pong program.

    The module should be unpacked and compiled on both servers.

    [Note: The package can be downloaded from here]

     

     

     

    # cd /tmp

    # tar xvzf krping.tgz

    ...

    # cd krping

    # make

    ...

    # make install

    ...

    # modinfo rdma_krping

    filename:       /lib/modules/3.10.0-123.el7.x86_64/extra/rdma_krping.ko

    license:        Dual BSD/GPL

    description:    RDMA ping server

    author:         Steve Wise

    srcversion:     C4533E67F73469BA240B78D

    depends:        ib_core,rdma_cm

    vermagic:       3.10.0-123.el7.x86_64 SMP mod_unload modversions

    parm:           debug:Debug level (0=none, 1=all) (int)

    # modprobe rdma_krping debug=1

          

     

    Run the following on one of the servers (act as a server)

    # echo "server,addr=12.12.12.1,port=9999,verbose" > /proc/krping

          

    Run the following on the other server (act as a client)

     

    # echo "client,addr=12.12.12.1,port=9999,count=100,verbose" > /proc/krping

          

    Check dmesg or /var/log/messages for the debug output. Additional command options can be found in the README file within the package.
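
    The server and client strings written to /proc/krping differ only in the role and the optional count, so they can be built from one place. The sketch below is illustrative: krping_cmd is a hypothetical helper, and it uses only the options that appear in the two commands above.

```shell
# Hypothetical helper: build the comma-separated option string that is
# written to /proc/krping.
#   $1 = role (server or client), $2 = address, $3 = port,
#   $4 = optional iteration count
krping_cmd() {
    role=$1; addr=$2; port=$3; count=$4
    cmd="$role,addr=$addr,port=$port"
    if [ -n "$count" ]; then
        cmd="$cmd,count=$count"
    fi
    printf '%s,verbose\n' "$cmd"
}

# Usage (as root, with rdma_krping loaded):
#   krping_cmd server 12.12.12.1 9999     > /proc/krping   # on the server
#   krping_cmd client 12.12.12.1 9999 100 > /proc/krping   # on the client
```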