Lossless RoCE Configuration for Spectrum-based Cumulus Switches in DSCP-Based QoS Mode

Version 2

    This post provides a configuration example of lossless RoCE for Spectrum-based Cumulus-OS switches in DSCP-based QoS mode.

    Overview

    This solution offers the following network setup:

     

    Configuration

    1. On the NIC, map DSCP 26 to priority 1 and enable PFC (lossless) on priority 1

    Example for Linux with mlnx_qos:

    mlnx_qos -i <interface> --pfc 0,1,0,0,0,0,0,0 --dscp2prio=set,26,1

    Note: All other settings remain unchanged. All RoCE traffic should be sent with tclass=106, that is, DSCP=26.
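
    The relationship between tclass=106 and DSCP=26 can be checked with a little arithmetic: the traffic-class byte carries the DSCP in its upper six bits and the ECN field in its lower two. The sketch below assumes the ECN bits are set to ECT(0) (binary 10), which is the value that makes 26 come out as 106; the helper names are illustrative and not part of mlnx_qos.

```python
# Sketch: derive the traffic-class byte from a DSCP value and ECN bits, and
# build the per-priority flag list in the form passed to "mlnx_qos --pfc".
# Function names are hypothetical helpers, not part of any Mellanox tool.

ECT0 = 0b10  # ECN-capable transport codepoint ECT(0), assumed here


def traffic_class(dscp: int, ecn: int = ECT0) -> int:
    """DSCP occupies the upper 6 bits of the byte, ECN the lower 2."""
    if not 0 <= dscp <= 63 or not 0 <= ecn <= 3:
        raise ValueError("dscp must be 0..63 and ecn 0..3")
    return (dscp << 2) | ecn


def pfc_flags(lossless_priorities, num_priorities: int = 8) -> str:
    """Comma-separated 0/1 flags, one per priority 0..7."""
    enabled = set(lossless_priorities)
    return ",".join("1" if p in enabled else "0" for p in range(num_priorities))


tclass = traffic_class(26)   # (26 << 2) | 2
pfc = pfc_flags([1])         # PFC on priority 1 only
print(tclass, pfc)
```

    With DSCP 26 and priority 1 this reproduces the values used in the command and the note above.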

     

    2. Enable ECN for priority 1

    ## File: /etc/cumulus/datapath/traffic.conf

    ecn.port_group_list = [ecn_port_group]
    ecn.ecn_port_group.cos_list = [1]
    ecn.ecn_port_group.port_set = swp1-swp32
    ecn.ecn_port_group.min_threshold_bytes = 153600
    ecn.ecn_port_group.max_threshold_bytes = 1536000
    ecn.ecn_port_group.probability = 100
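
    To get a feel for what the two thresholds and the probability mean together, the sketch below models ECN marking as a RED-style linear ramp: no marking below the minimum threshold, a marking probability that rises linearly up to the configured maximum at the maximum threshold, and that maximum beyond it. The linear ramp is an assumption made for illustration; this post does not describe the exact marking curve the switch applies.

```python
# Sketch: RED-style marking probability for the ECN settings in this step.
# Assumption: probability ramps linearly between the two thresholds.

MIN_THRESHOLD_BYTES = 153600
MAX_THRESHOLD_BYTES = 1536000
MAX_PROBABILITY_PERCENT = 100


def ecn_mark_probability(queue_bytes: int) -> float:
    """Return the marking probability (in percent) for a given queue depth."""
    if queue_bytes <= MIN_THRESHOLD_BYTES:
        return 0.0
    if queue_bytes >= MAX_THRESHOLD_BYTES:
        return float(MAX_PROBABILITY_PERCENT)
    span = MAX_THRESHOLD_BYTES - MIN_THRESHOLD_BYTES
    return MAX_PROBABILITY_PERCENT * (queue_bytes - MIN_THRESHOLD_BYTES) / span


for depth in (100_000, 844_800, 2_000_000):
    print(depth, ecn_mark_probability(depth))
```

    Under this model a queue halfway between the thresholds (844800 bytes) would see half of its ECN-capable packets marked.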

     

    3. Set trust mode to DSCP, map DSCP values to COS

    ## File: /etc/cumulus/datapath/traffic.conf

    traffic.packet_priority_source_set = [dscp]
    traffic.cos_0.packet_priorities.dscp = [0,1,…,63]  # for all priorities
    traffic.cos_1.packet_priorities.dscp = [26]  # for RoCE
    traffic.cos_2.packet_priorities.dscp = [48]  # for CNPs
    traffic.cos_3.packet_priorities.dscp = []
    traffic.cos_4.packet_priorities.dscp = []
    traffic.cos_5.packet_priorities.dscp = []
    traffic.cos_6.packet_priorities.dscp = []
    traffic.cos_7.packet_priorities.dscp = []
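
    The mapping above can be pictured as a 64-entry lookup table from DSCP to COS. The sketch below builds such a table under the assumption that DSCP 26 and 48 belong only to COS 1 and COS 2 and that every other DSCP value falls to COS 0; the elided COS 0 list in the config is not expanded here, and the names are illustrative.

```python
# Sketch: DSCP -> COS lookup implied by the traffic.conf lines in this step.
# Assumption: explicit entries (26, 48) win, all other DSCP values use COS 0.

EXPLICIT_COS_TO_DSCP = {
    1: [26],  # RoCE
    2: [48],  # CNPs
}


def build_dscp_to_cos(explicit, default_cos: int = 0) -> dict:
    """Return a dict covering every DSCP value 0..63."""
    table = {dscp: default_cos for dscp in range(64)}
    for cos, dscp_values in explicit.items():
        for dscp in dscp_values:
            table[dscp] = cos
    return table


DSCP_TO_COS = build_dscp_to_cos(EXPLICIT_COS_TO_DSCP)
print(DSCP_TO_COS[26], DSCP_TO_COS[48], DSCP_TO_COS[0])
```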

     

    4. Map switch priority to priority groups

    ## File: /etc/cumulus/datapath/traffic.conf

    traffic.priority_group_list = [service, bulk]
    priority_group.service.cos_list = [1]
    priority_group.bulk.cos_list = [0,2,3,4,5,6,7]
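
    A quick sanity check for this step is that the two COS lists together cover every switch priority exactly once, so the lossless priority 1 sits alone in the service group. The sketch below performs that check; treating a missing or duplicated priority as an error is an assumption made for illustration, not a rule stated in this post.

```python
# Sketch: verify that the priority groups partition switch priorities 0..7.

PRIORITY_GROUPS = {
    "service": [1],
    "bulk": [0, 2, 3, 4, 5, 6, 7],
}


def map_cos_to_group(groups, num_cos: int = 8) -> dict:
    """Return {cos: group}, raising if a COS is duplicated or missing."""
    owner = {}
    for name, cos_list in groups.items():
        for cos in cos_list:
            if cos in owner:
                raise ValueError(f"cos {cos} is in both {owner[cos]} and {name}")
            owner[cos] = name
    missing = sorted(set(range(num_cos)) - set(owner))
    if missing:
        raise ValueError(f"cos values without a group: {missing}")
    return owner


COS_TO_GROUP = map_cos_to_group(PRIORITY_GROUPS)
print(COS_TO_GROUP)
```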

     

    5. Enable PFC for priority 1

    ## File: /etc/cumulus/datapath/traffic.conf

    pfc.port_group_list = [pfc_port_group]
    pfc.pfc_port_group.cos_list = [1]
    pfc.pfc_port_group.port_set = swp1-swp32
    pfc.pfc_port_group.port_buffer_bytes = 40000
    pfc.pfc_port_group.xoff_size = 17000
    pfc.pfc_port_group.xon_delta = 0
    pfc.pfc_port_group.tx_enable = true
    pfc.pfc_port_group.rx_enable = true
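
    The three buffer numbers above are related by simple arithmetic. The sketch below assumes the usual reading of these settings: a pause frame is sent once xoff_size bytes are buffered, transmission resumes at xoff_size minus xon_delta, and whatever remains of port_buffer_bytes above the XOFF point absorbs packets still in flight. That interpretation is an assumption here and is not spelled out in this post.

```python
# Sketch: derive XON threshold and remaining headroom from the PFC settings.
# Assumption: xon = xoff_size - xon_delta, headroom = port_buffer - xoff_size.

PORT_BUFFER_BYTES = 40000
XOFF_SIZE = 17000
XON_DELTA = 0


def pfc_thresholds(port_buffer_bytes: int, xoff_size: int, xon_delta: int) -> dict:
    """Return the XOFF/XON thresholds and the bytes left above XOFF."""
    if not 0 <= xon_delta <= xoff_size <= port_buffer_bytes:
        raise ValueError("expected 0 <= xon_delta <= xoff_size <= port_buffer_bytes")
    return {
        "xoff": xoff_size,
        "xon": xoff_size - xon_delta,
        "headroom": port_buffer_bytes - xoff_size,
    }


THRESHOLDS = pfc_thresholds(PORT_BUFFER_BYTES, XOFF_SIZE, XON_DELTA)
print(THRESHOLDS)
```

    With xon_delta set to 0, the XON and XOFF thresholds coincide under this reading.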

     

    6. Assign group IDs, create buffer pools

    ## File: /usr/lib/python2.7/dist-packages/cumulus/__chip_config/mlx/datapath.conf

    priority_group.service.id = 0
    priority_group.bulk.id = 1

    priority_group.control.service_pool = 0
    priority_group.service.service_pool = 0
    priority_group.bulk.service_pool = 0

    ingress_service_pool.0.percent = 75.0 # all priority groups
    ingress_service_pool.1.percent = 0.0
    ingress_service_pool.2.percent = 0.0
    ingress_service_pool.3.percent = 0.0

    egress_service_pool.0.percent = 100.0 # all priority groups
    egress_service_pool.1.percent = 0.0
    egress_service_pool.2.percent = 0.0
    egress_service_pool.3.percent = 0.0
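
    One way to double-check a hand-edited fragment like the one above is to parse it and confirm that every priority group points at service pool 0 and that the pool percentages add up as intended. The sketch below uses a deliberately minimal key = value parser written for this example; it is not the parser Cumulus itself uses.

```python
# Sketch: parse the pool settings from this step and summarize them.
# The parser is a hypothetical helper that handles "key = value # comment".

FRAGMENT = """
priority_group.control.service_pool = 0
priority_group.service.service_pool = 0
priority_group.bulk.service_pool = 0
ingress_service_pool.0.percent = 75.0 # all priority groups
ingress_service_pool.1.percent = 0.0
ingress_service_pool.2.percent = 0.0
ingress_service_pool.3.percent = 0.0
egress_service_pool.0.percent = 100.0 # all priority groups
egress_service_pool.1.percent = 0.0
egress_service_pool.2.percent = 0.0
egress_service_pool.3.percent = 0.0
"""


def parse_settings(text: str) -> dict:
    """Return {key: value-string}, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    return settings


SETTINGS = parse_settings(FRAGMENT)
pools_in_use = {int(v) for k, v in SETTINGS.items()
                if k.startswith("priority_group.") and k.endswith(".service_pool")}
ingress_total = sum(float(v) for k, v in SETTINGS.items()
                    if k.startswith("ingress_service_pool."))
egress_total = sum(float(v) for k, v in SETTINGS.items()
                   if k.startswith("egress_service_pool."))
print(pools_in_use, ingress_total, egress_total)
```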

     

    7. Configure alpha values

    Note: These values are set to maximize performance and should not be altered without consulting Mellanox.

    ## File: /usr/lib/python2.7/dist-packages/cumulus/__chip_config/mlx/datapath.conf

    priority_group.control.ingress_buffer.dynamic_quota = 11
    priority_group.service.ingress_buffer.dynamic_quota = 11
    priority_group.bulk.ingress_buffer.dynamic_quota = 11

    priority_group.bulk.egress_buffer.uc.sp_dynamic_quota = 11
    priority_group.service.egress_buffer.uc.sp_dynamic_quota = 255
    priority_group.control.egress_buffer.uc.sp_dynamic_quota = 11

    priority_group.bulk.egress_buffer.mc.sp_dynamic_quota = 255
    priority_group.service.egress_buffer.mc.sp_dynamic_quota = 255
    priority_group.control.egress_buffer.mc.sp_dynamic_quota = 255
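
    The dynamic_quota numbers are encoded alpha values rather than byte counts. The sketch below assumes the encoding commonly described for Spectrum shared buffers, where code 8 means alpha = 1, each step up or down doubles or halves it, and 255 means no limit; it also assumes the classic dynamic-threshold model in which a consumer may hold up to alpha times the free pool space. Neither assumption is stated in this post, so treat the output as an illustration only.

```python
# Sketch: decode dynamic_quota codes into alpha and estimate the share of a
# pool one congested consumer could occupy.
# Assumptions: code 0 -> 0, codes 1..14 -> 2**(code - 8), code 255 -> infinity;
# threshold = alpha * free space, hence steady-state share = alpha / (1 + alpha).

import math


def alpha_from_dynamic_quota(code: int) -> float:
    """Translate a dynamic_quota code into its assumed alpha value."""
    if code == 255:
        return math.inf
    if code == 0:
        return 0.0
    if 1 <= code <= 14:
        return 2.0 ** (code - 8)
    raise ValueError(f"unsupported dynamic_quota code: {code}")


def max_pool_share(alpha: float) -> float:
    """Fraction of the pool a single congested consumer can reach."""
    if math.isinf(alpha):
        return 1.0
    return alpha / (1.0 + alpha)


for code in (11, 255):
    alpha = alpha_from_dynamic_quota(code)
    print(code, alpha, round(max_pool_share(alpha), 3))
```

    Under these assumptions, code 11 corresponds to alpha = 8, and code 255 places no dynamic limit on the lossless service group's unicast egress traffic.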