RoCE Configuration on Mellanox Adapters (PCP-Based Lossless Traffic)

Version 15

    This post provides a Linux configuration example for enabling lossless RoCE traffic based on L2 priority (Priority Code Point, PCP), for use when the switch is configured with Trust L2. This method is described as Profile 4.


    For other RoCE Profile solutions, see Getting Started with RoCE Configuration.





    This solution involves a simple network setup and basic configuration on the adapter.



    <interface> refers to the parent interface (for example, ens2f0)

    <vlan-interface> refers to the VLAN interface (for example, ens2f0.100)

    <mst-device> refers to the MST device name under /dev/mst (for example, mt4115_pciconf0)

    <mlx-device> refers to the mlx device (for example, mlx5_0)


    Note: Some of the configuration steps below can be done either permanently (non-volatile, kept across reboots) or temporarily (volatile, lost at the next boot).



    1. Enable DCQCN on priority 3 (used for RoCE traffic).

    Firmware configuration (non-volatile):

    # mlxconfig -d /dev/mst/<mst-device> -y s ROCE_CC_PRIO_MASK_P1=8 RPG_THRESHOLD_P1=1 DCE_TCP_G_P1=1019 ROCE_CC_PRIO_MASK_P2=8 RPG_THRESHOLD_P2=1 DCE_TCP_G_P2=1019
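
    The value 8 passed to ROCE_CC_PRIO_MASK_P1 and ROCE_CC_PRIO_MASK_P2 above is consistent with a per-priority bitmask in which bit N selects priority N (1 << 3 = 8). A minimal sketch for deriving the mask when a different set of priorities is needed, assuming that bit layout; prio_mask is a hypothetical helper, not part of any Mellanox tool:

```shell
#!/bin/sh
# prio_mask (hypothetical helper): print the decimal bitmask for a list of
# 802.1p priorities (0-7).
# Assumption: bit N of ROCE_CC_PRIO_MASK_Px selects priority N.
prio_mask() {
    mask=0
    for prio in "$@"; do
        case "$prio" in
            [0-7]) mask=$(( mask | (1 << prio) )) ;;
            *) echo "invalid priority: $prio" >&2; return 1 ;;
        esac
    done
    echo "$mask"
}

prio_mask 3      # priority 3 only; prints 8, the value used above
```

    The printed number can then be substituted for 8 in the mlxconfig command.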


    Driver configuration (volatile):

    # echo 1 > /sys/class/net/<interface>/ecn/roce_np/enable/3

    # echo 1 > /sys/class/net/<interface>/ecn/roce_rp/enable/3

    # echo 1 > /sys/class/net/<interface>/ecn/roce_rp/rpg_threshold

    # echo 1019 > /sys/class/net/<interface>/ecn/roce_rp/dce_tcp_g
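
    Because the driver settings are volatile, they need to be reapplied after each boot. The four writes above can be grouped in a small function, sketched below. The function name apply_dcqcn and the SYSFS_ROOT override (useful for a dry run against a scratch directory) are assumptions made for illustration; the sysfs paths are the ones listed above.

```shell
#!/bin/sh
# apply_dcqcn (hypothetical helper): write the step 1 volatile DCQCN settings
# for one interface and one priority (default 3).
# SYSFS_ROOT is an assumption added for dry runs; it defaults to the real /sys.
apply_dcqcn() {
    iface=$1
    prio=${2:-3}
    base="${SYSFS_ROOT:-/sys}/class/net/$iface/ecn"
    if [ ! -d "$base" ]; then
        echo "no ecn directory found for $iface under $base" >&2
        return 1
    fi
    echo 1    > "$base/roce_np/enable/$prio"  || return 1
    echo 1    > "$base/roce_rp/enable/$prio"  || return 1
    echo 1    > "$base/roce_rp/rpg_threshold" || return 1
    echo 1019 > "$base/roce_rp/dce_tcp_g"     || return 1
}

# Usage (as root), for the example interface ens2f0:
#   apply_dcqcn ens2f0 3
```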


    2. Set the CNP priority to 6.

    Firmware configuration (non-volatile):

    # mlxconfig -d /dev/mst/<mst-device> -y s CNP_802P_PRIO_P1=6 CNP_802P_PRIO_P2=6


    Driver configuration (volatile):

    # echo 6 > /sys/class/net/<interface>/ecn/roce_np/cnp_802p_prio
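
    After the volatile settings in steps 1 and 2 are written, it can be useful to read them back. The sketch below assumes that each attribute reads back as the plain value that was written, which may not hold for every driver version; check_value is a hypothetical helper.

```shell
#!/bin/sh
# check_value (hypothetical helper): compare one attribute file against an
# expected value and report the result.
# Assumption: the file reads back as the plain value that was written.
check_value() {
    file=$1
    want=$2
    if [ ! -r "$file" ]; then
        echo "MISSING $file"
        return 1
    fi
    got=$(cat "$file")
    if [ "$got" = "$want" ]; then
        echo "OK      $file = $got"
    else
        echo "DIFF    $file = $got (expected $want)"
        return 1
    fi
}

# Usage, for the example interface ens2f0:
#   check_value /sys/class/net/ens2f0/ecn/roce_np/cnp_802p_prio 6
#   check_value /sys/class/net/ens2f0/ecn/roce_rp/dce_tcp_g 1019
```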


    3. Configure RDMA-CM (volatile).

    # cma_roce_mode -d <mlx-device> -p 1 -m 2        # Set the RDMA-CM mode on port 1 to RoCEv2

    # cma_roce_tos -d <mlx-device> -t 105            # Set the TOS for RDMA-CM to 105, which maps to sk_prio=2

    # vconfig set_egress_map <vlan-interface> 2 3    # Map sk_prio=2 to SL=3
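
    The chain TOS=105 -> sk_prio=2 -> SL=3 can be checked by hand. The sketch below reproduces the lookup that the Linux kernel is understood to perform (rt_tos2priority with the ip_tos2prio table); the table is written from memory of the kernel source and should be verified against the kernel in use. The function name tos_to_skprio is hypothetical.

```shell
#!/bin/sh
# tos_to_skprio (hypothetical helper): sketch of the kernel's TOS -> sk_prio
# lookup. Assumption: index = (TOS & 0x1E) >> 1 into the 16-entry ip_tos2prio
# table, reproduced here from memory of the Linux source.
tos_to_skprio() {
    tos=$1
    idx=$(( (tos & 30) >> 1 ))     # 30 = 0x1E: the four TOS bits, without precedence
    set -- 0 1 0 0 2 2 2 2 6 6 6 6 4 4 4 4
    shift "$idx"                   # after the shift, $1 is table entry number idx
    echo "$1"
}

tos=105
# DSCP is the upper six bits of the TOS byte, ECN the lower two.
echo "DSCP=$(( tos >> 2 )) ECN=$(( tos & 3 )) sk_prio=$(tos_to_skprio "$tos")"
# prints: DSCP=26 ECN=1 sk_prio=2
```

    The vconfig egress map then carries sk_prio=2 to priority 3 on the VLAN interface.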


    4. [Optional] Enable ECN on TCP traffic and set egress priority for the TCP traffic (volatile).

    # sysctl -w net.ipv4.tcp_ecn=1

    # vconfig set_egress_map <vlan-interface> <sk_prio_number> <priority>
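
    On distributions where vconfig is no longer shipped, the egress maps in steps 3 and 4 can usually be set with iproute2 instead. The sketch below only prints the command rather than running it; egress_map_cmd is a hypothetical helper, and the egress-qos-map syntax should be confirmed against the installed ip-link man page.

```shell
#!/bin/sh
# egress_map_cmd (hypothetical helper): print, but do not run, the iproute2
# counterpart of "vconfig set_egress_map <vlan-interface> <sk_prio> <priority>".
egress_map_cmd() {
    vlan_if=$1
    sk_prio=$2
    pcp=$3
    echo "ip link set dev $vlan_if type vlan egress-qos-map $sk_prio:$pcp"
}

egress_map_cmd ens2f0.100 2 3
# prints: ip link set dev ens2f0.100 type vlan egress-qos-map 2:3
```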


    5. Activate PFC on priority 3.

    Using mlnx_qos tool (non-volatile):

    # mlnx_qos -i <interface> --pfc 0,0,0,1,0,0,0,0
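
    The --pfc argument is a list of eight 0/1 flags, one per priority from 0 to 7, so priority 3 is the fourth entry. A minimal sketch that builds the list from priority numbers; pfc_list is a hypothetical helper, not an mlnx_qos feature:

```shell
#!/bin/sh
# pfc_list (hypothetical helper): build the eight-entry 0/1 list for
# "mlnx_qos --pfc" from the priorities that should have PFC enabled.
pfc_list() {
    out=""
    p=0
    while [ "$p" -le 7 ]; do
        bit=0
        for want in "$@"; do
            if [ "$want" = "$p" ]; then
                bit=1
            fi
        done
        out="${out}${out:+,}${bit}"   # add a comma before every entry except the first
        p=$(( p + 1 ))
    done
    echo "$out"
}

pfc_list 3       # prints 0,0,0,1,0,0,0,0
```

    For example, the command above could be written as mlnx_qos -i <interface> --pfc "$(pfc_list 3)".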

    For more information, see HowTo Configure PFC on ConnectX-4.



    Using LLDP DCBX, with the configuration handled by the adapter firmware:

    # mlxconfig -d /dev/mst/<mst-device> -y s LLDP_NB_DCBX_P1=TRUE LLDP_NB_TX_MODE_P1=2 LLDP_NB_RX_MODE_P1=2 LLDP_NB_DCBX_P2=TRUE LLDP_NB_TX_MODE_P2=2 LLDP_NB_RX_MODE_P2=2

    For more information, see HowTo Auto-Config PFC and ETS on ConnectX-4 via LLDP DCBX.



    Using LLDP DCBX, with the configuration handled by the driver/OS (lldpad):

    # service lldpad start

    # lldptool -T -i <interface> -V PFC enabled=3


    For an end-to-end configuration example and troubleshooting, see How To Configure Lossless RoCE (PFC + ECN) End-to-End Using ConnectX-4 and Spectrum (Trust L2).