How To Dump RDMA Traffic Using the Inbox tcpdump Tool (ConnectX-4)

Version 10

    This is a beginner's guide to dumping RDMA/RoCE traffic using tcpdump on ConnectX-4 adapters.

    Because RDMA traffic bypasses the kernel, it cannot normally be monitored on the host with tcpdump, Wireshark, or similar tools. The usual workaround is to monitor a switch port in the network and send the traffic to a designated server.

    One of the important features introduced in MLNX_OFED 3.2 is sniffing support through standard adapter configuration (ethtool), which makes it possible to capture traffic that bypasses the kernel.

    The new mechanism targets offloaded traffic, including RoCE and raw Ethernet.

    It can also be used for traffic that does not bypass the kernel.

     

    Note: If you use ConnectX-3 or ConnectX-3 Pro adapters, use the ibdump tool instead. For more details, refer to the MLNX_OFED User Manual.

     


    Note: Support for tcpdump is available from MLNX_OFED v3.2 onwards.

     

    Setup

    In this example we will use two servers connected back to back using ConnectX-4 adapters.

     

    Configuration

    • Link layer: Ethernet
    • Traffic: RoCE

    1. Configure an IP address on both adapter ports, and verify that the servers can ping each other.
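
    Step 1 can be sketched as follows. This is a minimal sketch rather than part of the original procedure: the helper name configure_port and the addresses 192.0.2.1/24 and 192.0.2.2 are placeholders chosen for illustration, and the commands are assumed to run as root.

```shell
#!/bin/sh
# configure_port IFACE CIDR PEER   (hypothetical helper)
# Assigns CIDR to IFACE, brings the link up, and pings PEER three times.
# Returns non-zero if any step fails; note that "ip addr add" also fails
# when the address is already assigned to the interface.
configure_port() {
    iface=$1 cidr=$2 peer=$3
    ip addr add "$cidr" dev "$iface" || return 1
    ip link set dev "$iface" up || return 1
    ping -c 3 -I "$iface" "$peer"
}

# Example (placeholder addresses; swap them on the second server):
#   configure_port ens785f0 192.0.2.1/24 192.0.2.2
```

    The ping only succeeds once both servers have been configured, so run the helper on the second server last or repeat the ping afterwards.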

     

    2. Enable the sniffer using ethtool.

     

    In this example, the interface name is ens785f0:

    # ethtool --set-priv-flags ens785f0 sniffer on
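
    To confirm that the flag took effect, the private flags can be read back with ethtool --show-priv-flags. The sketch below assumes that ethtool prints one "name : on/off" line per flag and that the driver lists the flag under the name sniffer, as in the command above; the helper name sniffer_state is hypothetical.

```shell
#!/bin/sh
# sniffer_state IFACE   (hypothetical helper)
# Prints the value ("on" or "off") of the "sniffer" private flag as reported
# by `ethtool --show-priv-flags`, or nothing if the driver does not list it.
sniffer_state() {
    ethtool --show-priv-flags "$1" |
        awk -F: '$1 ~ /^[ \t]*sniffer[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example:
#   sniffer_state ens785f0                          # check the flag
#   ethtool --set-priv-flags ens785f0 sniffer off   # disable it after capturing
```

    Turning the flag off again once the capture is done is a reasonable precaution, on the assumption that the sniffer adds overhead while it is enabled.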

    3. Run tcpdump to capture the packets. To save them as text that can be opened in any Linux text editor (e.g. vi), run:

    # tcpdump -i ens785f0 -XXvv > ~/rdma_traffic.txt

     

    To save the capture in a format that Wireshark can parse, run:

    # tcpdump -i ens785f0 -s 65535 -w rdma_traffic.pcap

     

    After the test completes, open the file in Wireshark.

     

    Refer to the tcpdump man page for more examples.
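
    As one such example, the capture can be limited to RoCEv2 packets and to a fixed packet count, then read back on the same host. This is a sketch under assumptions: the helper name capture_roce and the default count of 1000 are made up for illustration, and it presumes that a standard capture filter applies to sniffed offloaded traffic in the same way as to ordinary traffic.

```shell
#!/bin/sh
# capture_roce IFACE FILE [COUNT]   (hypothetical helper)
# Writes up to COUNT (default 1000) full-size packets matching UDP port 4791
# on IFACE to FILE, then prints a one-line-per-packet summary of FILE.
capture_roce() {
    iface=$1 file=$2 count=${3:-1000}
    # If the frames carry an 802.1Q tag, the filter would need to be
    # "vlan and udp port 4791" instead.
    tcpdump -i "$iface" -s 65535 -c "$count" -w "$file" udp port 4791 || return 1
    tcpdump -nn -r "$file"
}

# Example:
#   capture_roce ens785f0 rdma_traffic.pcap 100
```

    With -c, tcpdump exits on its own after the requested number of packets, so no manual interrupt is needed.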

     

    4. Run RDMA traffic:

     

    Run on one server:

    # ib_send_bw

    ************************************
    * Waiting for client to connect... *
    ************************************

     

    Now run on the other server:

    # ib_send_bw 99.99.99.5 --report_gbits -F

    ---------------------------------------------------------------------------------------
                        Send BW Test
    Dual-port       : OFF          Device         : mlx5_1
    Number of qps   : 1            Transport type : IB
    Connection type : RC           Using SRQ      : OFF
    TX depth        : 128
    CQ Moderation   : 100
    Mtu             : 4096[B]
    Link type       : Ethernet
    Gid index       : 0
    Max inline data : 0[B]
    rdma_cm QPs     : OFF
    Data ex. method : Ethernet
    ---------------------------------------------------------------------------------------
    local address: LID 0000 QPN 0x01ae PSN 0x31a206
    GID: 00:00:00:00:00:00:00:00:00:00:255:255:12:12:12:06
    remote address: LID 0000 QPN 0x020a PSN 0xa2824e
    GID: 00:00:00:00:00:00:00:00:00:00:255:255:12:12:12:05
    ---------------------------------------------------------------------------------------
    #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
    65536      1000           95.16              95.16                0.181502
    ---------------------------------------------------------------------------------------
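
    Steps 3 and 4 can also be combined on the client so that the capture starts just before the test and stops right after it. The following is a rough sketch: the helper name capture_during and the one-second startup delay are assumptions, and the capture is stopped with SIGTERM, which tcpdump handles by closing the output file.

```shell
#!/bin/sh
# capture_during IFACE FILE COMMAND [ARGS...]   (hypothetical helper)
# Starts tcpdump on IFACE in the background, runs COMMAND, then stops the
# capture so that FILE is closed. Returns COMMAND's exit status.
capture_during() {
    iface=$1 file=$2
    shift 2
    tcpdump -i "$iface" -s 65535 -w "$file" &
    cap_pid=$!
    sleep 1                          # crude wait for tcpdump to start listening
    "$@"
    status=$?
    kill "$cap_pid" 2>/dev/null      # SIGTERM; ignore "no such process"
    wait "$cap_pid" 2>/dev/null
    return "$status"
}

# Example, on the client, with the server already waiting:
#   capture_during ens785f0 rdma_traffic.pcap ib_send_bw 99.99.99.5 --report_gbits -F
```

    A fixed sleep is the simplest way to let tcpdump attach to the interface; on a slow system a longer delay may be needed to avoid missing the first packets.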

     

    5. Check the output file. RoCEv2 packets are identified by UDP destination port 4791, as in the packet shown in this example:

    # cat ~/rdma_traffic.txt

     

    ...

    14:48:23.007280 IP (tos 0x0, ttl 64, id 5066, offset 0, flags [DF], proto UDP (17), length 308)
        1.1.6.2.49153 > 1.1.5.2.4791: [no cksum] UDP, length 280
            0x0000:  248a 0780 5401 e41d 2df2 a45c 8100 0006  $...T...-..\....
            0x0010:  0800 4500 0134 13ca 4000 4011 18ea 0101  ..E..4..@.@.....
            0x0020:  0602 0101 0502 c001 12b7 0120 0000 6440  ..............d@
            0x0030:  ffff 0000 0001 0000 002a 8001 0000 0000  .........*......
            0x0040:  0001 0107 0203 0000 0000 0000 0011 7a2f  ..............z/
            0x0050:  ac19 0010 0000 0000 0000 19ac 2f7a 0000  ............/z..
            0x0060:  0000 0000 0000 0106 4853 e41d 2d03 00f2  ........HS..-...
            0x0070:  a45c 0000 0000 0000 0000 0001 a400 0000  .\..............
            0x0080:  0000 0000 00b0 28a5 38b7 ffff 37f0 ffff  ......(.8...7...
            0x0090:  ffff 0000 0000 0000 0000 0000 ffff 0101  ................
            0x00a0:  0602 0000 0000 0000 0000 0000 ffff 0101  ................
            0x00b0:  0502 0000 0007 0040 0098 0000 0000 0000  .......@........
            0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
            0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
            0x00e0:  0000 0000 0000 0040 b1ee 0000 0000 0000  .......@........
            0x00f0:  0000 0000 0000 0101 0602 0000 0000 0000  ................
            0x0100:  0000 0000 0000 0101 0502 0000 0000 0000  ................
            0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................
            0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................
            0x0130:  0000 0000 0000 0000 0000 0000 0000 0000  ................
            0x0140:  0000 5870 f2dd                           ..Xp..
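
    To see where the values in the tcpdump summary come from, a few header fields can be decoded by hand from the first four rows of the hex dump above. This is an illustrative sketch: the helper names hex_field and dec_field are made up, and the byte offsets assume a 14-byte Ethernet header, a 4-byte 802.1Q tag (the 8100 0006 at bytes 12-15), a 20-byte IPv4 header, an 8-byte UDP header, and the standard InfiniBand Base Transport Header (BTH) layout after it.

```shell
#!/bin/sh
# First 64 bytes of the frame, copied from rows 0x0000-0x0030 of the dump.
hex="248a07805401e41d2df2a45c81000006\
08004500013413ca4000401118ea0101\
060201010502c00112b7012000006440\
ffff000000010000002a800100000000"

# hex_field OFFSET LENGTH: print LENGTH bytes starting at byte OFFSET as hex.
hex_field() {
    printf '%s\n' "$hex" | cut -c "$((2 * $1 + 1))-$((2 * ($1 + $2)))"
}

# dec_field OFFSET LENGTH: print the same field as an unsigned decimal number.
dec_field() {
    printf '%d\n' "0x$(hex_field "$1" "$2")"
}

echo "IPv4 total length : $(dec_field 20 2)"    # 308
echo "UDP source port   : $(dec_field 38 2)"    # 49153
echo "UDP dest port     : $(dec_field 40 2)"    # 4791 (RoCEv2)
echo "UDP length        : $(dec_field 42 2)"    # 288 = 8-byte header + 280
echo "BTH opcode        : 0x$(hex_field 46 1)"  # 0x64
echo "BTH dest QP       : $(dec_field 51 3)"    # 1
```

    The decoded ports and lengths match the summary lines printed by tcpdump. In the InfiniBand opcode encoding, 0x64 corresponds to a UD SEND Only packet, and destination QP 1 is the general services QP, so this particular frame appears to carry management traffic rather than benchmark payload.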

     

    If you saved the capture for Wireshark, open the pcap file there to view the parsed packets.

     

    Note: Make sure that you are using the latest version of Wireshark, as old versions may not parse InfiniBand well.