RoCE Configuration for Mellanox Adapters (PCP-Based)

Version 20

    This post provides a configuration example for Mellanox devices installed with MLNX_OFED running RoCE over a lossy network, in PCP-based QoS mode.

     

    Notes:

    • Mellanox adapters and switches also support DSCP-based QoS and flow control, which is simpler to configure and does not require VLANs. With DSCP, QoS is maintained across routers.
    • QoS parameters are set at QP creation. When working with RDMA-CM, it is possible to set QoS parameters for QPs created by RDMA-CM.
    • Some of the configuration steps below can be applied either temporarily or permanently (kept across reboots).

           For permanent configuration, a device reset (mlxfwreset) or host reboot is required after running mlxconfig.

     

    Configuration

    Step 1 - set QoS parameters

    Map sk-prio 2 to SL 3 (Note: This command is nonpersistent)

    # vconfig set_egress_map <vlan-interface> 2 3
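
On hosts that no longer ship vconfig, the same mapping can usually be set with iproute2. The sketch below is illustrative only: the helper name egress_map_cmd is hypothetical, and relying on ip link is an assumption about the host tooling rather than something this post prescribes. It prints the equivalent command instead of running it, after checking that the target priority fits the 3-bit VLAN PCP field.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the iproute2 equivalent of
# "vconfig set_egress_map <vlan-interface> <sk_prio_number> <priority>".
egress_map_cmd() {
    vlan_if=$1
    sk_prio=$2
    pcp=$3
    # The VLAN PCP field is 3 bits wide, so only 0-7 are valid.
    case "$pcp" in
        [0-7]) ;;
        *) echo "invalid PCP: $pcp (expected 0-7)" >&2; return 1 ;;
    esac
    echo "ip link set dev $vlan_if type vlan egress-qos-map $sk_prio:$pcp"
}

# Example from the step above: sk-prio 2 mapped to priority 3 on ens2f0.100.
egress_map_cmd ens2f0.100 2 3
```

Like the vconfig command, a mapping set this way is not persistent. Running ip -d link show on the VLAN interface should list the egress map that is currently in effect.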

    [Optional] Set ToS to 106 (DSCP 26) for ALL RoCE traffic (Note: This command is nonpersistent)

    # echo 106 > /sys/class/infiniband/<mlx-device>/tc/1/traffic_class
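
The value 106 follows from the layout of the ToS byte: the upper six bits carry the DSCP and the lower two bits carry the ECN field. A minimal sketch of that arithmetic is shown here; the function names are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Split a ToS byte into its DSCP (upper 6 bits) and ECN (lower 2 bits) parts.
tos_to_dscp() { echo $(( $1 >> 2 )); }
tos_to_ecn()  { echo $(( $1 & 3 )); }

# Build a ToS byte from a DSCP value and an ECN value.
dscp_ecn_to_tos() { echo $(( ($1 << 2) | $2 )); }

tos_to_dscp 106        # prints 26
tos_to_ecn 106         # prints 2, the ECT(0) codepoint
dscp_ecn_to_tos 26 2   # prints 106
```

So a ToS of 106 selects DSCP 26 and also marks packets as ECN-capable.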

    [Optional] Set the RDMA-CM ToS to 106 (DSCP 26) (Note: This command is nonpersistent)

    # cma_roce_tos -d <mlx-device> -t 106

    [Optional] Enable ECN for TCP traffic (Note: This command is nonpersistent)

    # sysctl -w net.ipv4.tcp_ecn=1
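
To confirm what the kernel is currently using, the value can be read back and interpreted. The following is a sketch under the assumption that the standard Linux meanings of tcp_ecn (0, 1, 2) apply; the helper name describe_tcp_ecn is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: describe a net.ipv4.tcp_ecn value.
describe_tcp_ecn() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "requested on outgoing and accepted on incoming connections" ;;
        2) echo "accepted on incoming connections only" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Read the live value when the sysctl tool and the key are available.
if current=$(sysctl -n net.ipv4.tcp_ecn 2>/dev/null); then
    echo "tcp_ecn=$current: $(describe_tcp_ecn "$current")"
fi
```

To keep the setting across reboots, the same key can typically be placed in a file under /etc/sysctl.d/, although the exact location depends on the distribution.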

    The general form of the egress map command is:

    # vconfig set_egress_map <vlan-interface> <sk_prio_number> <priority>

    Notations

    <interface> refers to the parent interface (for example, ens2f0)

    <vlan-interface> refers to the VLAN interface (for example, ens2f0.100)

    <mst-device> refers to the MST device (for example, /dev/mst/mt4115_pciconf0)

    <mlx-device> refers to the mlx device (for example, mlx5_0)
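
When scripting the steps, the parent interface and VLAN ID can be derived from the VLAN interface name. This sketch assumes the common <parent>.<vlan-id> naming convention used in the examples above, which is not guaranteed on every system; the function names are hypothetical.

```shell
#!/bin/sh
# Assumes the VLAN interface is named <parent>.<vlan-id>, e.g. ens2f0.100.
vlan_parent() { echo "${1%.*}"; }    # strip the last ".<vlan-id>" suffix
vlan_id()     { echo "${1##*.}"; }   # keep only the text after the last "."

vlan_parent ens2f0.100   # prints ens2f0
vlan_id ens2f0.100       # prints 100
```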

     

    References