Raw Ethernet Programming: Packet Pacing - Code Example

Version 10

    This post lists the configuration steps for packet pacing (traffic shaping) per flow (send queue) on ConnectX-4 and ConnectX-4 Lx over libibverbs (libibverbs uses libmlx5).

    This feature is supported in MLNX_OFED v3.4 and above. This post does not apply to the upstream driver, which uses a different API.

    The reader is assumed to be a developer who is familiar with packet pacing configuration as described in HowTo Configure Packet Pacing on ConnectX-4.

     


    Overview

    Before starting, make sure you understand packet pacing configuration for ConnectX-4. You can refer to HowTo Configure Packet Pacing on ConnectX-4.

    Packet pacing is per-QP rate limiting and traffic shaping. The rate is set and changed by modifying the QP.

     

    Configuration

    1. Verify that the adapter supports packet pacing:

    # ibv_devinfo -v

     

     

    ...

    packet_pacing_caps:

            qp_rate_limit_min:              0kbps

            qp_rate_limit_max:              100000000kbps

            supported_qp:

                                            SUPPORT_RAW_PACKET

            support_burst_control:          YES

    ...

    • qp_rate_limit_min and qp_rate_limit_max are the minimum and maximum rate limits, in Kb/s, supported by the adapter hardware.
    • supported_qp lists the QP types that support packet pacing operations.
    • SUPPORT_RAW_PACKET means that only Raw Ethernet QPs are supported.
    • support_burst_control must be YES if the user wants to configure max_burst_sz and typical_pkt_sz. Burst control is not required in order to use the rate limit.
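
    The capability fields above can be checked before a rate is requested. The following is a minimal sketch, not part of libibverbs: struct pacing_caps and both helper functions are hypothetical names that simply mirror the fields printed by ibv_devinfo -v, and the caller is assumed to have filled them in from the device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the packet_pacing_caps fields printed by
 * ibv_devinfo -v. This is NOT a libibverbs type. */
struct pacing_caps {
    uint32_t qp_rate_limit_min;      /* Kb/s */
    uint32_t qp_rate_limit_max;      /* Kb/s */
    bool     support_burst_control;  /* YES / NO in the tool output */
};

/* True if rate_kbps lies inside the range reported by the adapter. */
static bool rate_limit_in_range(const struct pacing_caps *caps,
                                uint32_t rate_kbps)
{
    return rate_kbps >= caps->qp_rate_limit_min &&
           rate_kbps <= caps->qp_rate_limit_max;
}

/* Burst settings need burst control support and a nonzero,
 * in-range rate limit. */
static bool burst_info_allowed(const struct pacing_caps *caps,
                               uint32_t rate_kbps)
{
    return caps->support_burst_control &&
           rate_kbps != 0 &&
           rate_limit_in_range(caps, rate_kbps);
}
```

    With the values shown in the output above (0 to 100000000 Kb/s, burst control YES), a request for 1000 Kb/s passes both checks, while a request for 100000001 Kb/s is rejected.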

     

    2. Change the packet pacing rate limit by calling the ibv_exp_modify_qp function (declared in /usr/include/infiniband/verbs_exp.h) after creating the QP.

     

    3. [Optional] To change the rate limit of a QP, set it during one of the following QP state transitions:

    • IBV_QPS_RTR to IBV_QPS_RTS  (RTR2RTS)
    • IBV_QPS_RTS to IBV_QPS_RTS  (RTS2RTS)

     

    The user application should specify rate_limit in struct ibv_exp_qp_attr and set the IBV_EXP_QP_RATE_LIMIT flag in exp_attr_mask (the qp_flags variable in the example below). The rate limit value must be within the range of minimum and maximum values supported by the adapter hardware.

    static inline int ibv_exp_modify_qp (struct ibv_qp *qp, struct ibv_exp_qp_attr *attr, uint64_t exp_attr_mask)

    4. [Optional] If the device supports burst control:

    The application can configure max_burst_sz and typical_pkt_sz through ibv_exp_qp_attr->burst_info when calling ibv_exp_modify_qp. To do so, set IBV_EXP_QP_RATE_LIMIT in the attribute mask together with IBV_EXP_QP_ATTR_BURST_INFO in comp_mask. Note that rate_limit must be nonzero to configure burst info.

     

    max_burst_sz: The device will schedule bursts of packets for a QP connected to this rate, smaller than or equal to this value. Value 0x0 indicates packet bursts will be limited to the device defaults. This field should be used if bursts of packets must be strictly kept under a certain value.

     

    typical_pkt_sz: When the rate limit is intended for a stream of similar packets, stating the typical packet size can improve the accuracy of the rate limiter. The expected packet size will be the same for all QPs associated with the same rate limit index.
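
    To get a feel for how the rate limit and the typical packet size relate, the ideal spacing between packets can be computed by hand. The sketch below is plain arithmetic, not a description of how the hardware schedules packets: packet_interval_us is a hypothetical helper, it assumes 1 Kb/s = 1000 bits per second, and it ignores Ethernet framing overhead.

```c
#include <stdint.h>

/* Ideal time, in microseconds, between the starts of two packets of
 * pkt_bytes each when the stream is paced at rate_kbps.
 * Assumption: 1 Kb/s = 1000 bits per second.
 * Returns 0 when rate_kbps is 0 (no rate limit to compute against). */
static uint64_t packet_interval_us(uint32_t pkt_bytes, uint32_t rate_kbps)
{
    if (rate_kbps == 0)
        return 0;
    /* bits / (Kb/s) gives milliseconds; scale by 1000 for microseconds. */
    return (uint64_t)pkt_bytes * 8u * 1000u / rate_kbps;
}
```

    For example, 1500-byte packets at a rate limit of 1000 Kb/s work out to one packet every 12000 microseconds (12 ms).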

     

    Example:

    The example below is taken from the full example found in Raw Ethernet Programming: Basic Introduction - Code Example, with some modifications to match this feature. See sections 6-8 of the Sender example.

     

    1. Move the QP to "Ready to Receive" (IBV_QPS_RTR).

    2. Move the QP to "Ready to Send" (IBV_QPS_RTS).

    3. Set the rate limit (qp_attr.rate_limit) in Kb/s.

     

    Note: IBV_QP_STATE is a required flag in qp_flags.
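
    Because the rate limit is given in Kb/s, a target expressed in Mb/s has to be converted before it is assigned. A minimal sketch, assuming decimal units (1 Mb/s = 1000 Kb/s); mbps_to_kbps is a hypothetical helper and not part of libibverbs.

```c
#include <stdint.h>

/* Convert a rate in Mb/s to Kb/s.
 * Assumption: decimal units, 1 Mb/s = 1000 Kb/s.
 * Saturates at UINT32_MAX instead of wrapping on overflow. */
static uint32_t mbps_to_kbps(uint32_t mbps)
{
    if (mbps > UINT32_MAX / 1000u)
        return UINT32_MAX;
    return mbps * 1000u;
}
```

    For example, mbps_to_kbps(1) gives the 1000 Kb/s value used in the code below.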

            ...

        /* 6. Create Queue Pair (QP) - Send Ring */

        qp = ibv_create_qp(pd, &qp_init_attr);

        if (!qp)  {

            fprintf(stderr, "Couldn't create QP\n");

            exit(1);

        }

     

        /* 7. Initialize the QP (send ring) and assign a port */

        struct ibv_exp_qp_attr qp_attr;

        uint64_t qp_flags; /* matches the exp_attr_mask type of ibv_exp_modify_qp */

        memset(&qp_attr, 0, sizeof(qp_attr));

     

        qp_flags = IBV_QP_STATE | IBV_QP_PORT;

        qp_attr.qp_state        = IBV_QPS_INIT;

        qp_attr.port_num        = 1;

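        /* qp_attr is a struct ibv_exp_qp_attr, so use the experimental verb */
        ret = ibv_exp_modify_qp(qp, &qp_attr, qp_flags);

        if (ret) { /* nonzero return value means failure */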

            fprintf(stderr, "failed modify qp to init\n");

            exit(1);

        }

        memset(&qp_attr, 0, sizeof(qp_attr));

     

        /* 8. Move the ring to ready to send in two steps (a,b) */

        /*    a. Move ring state to ready to receive, this is needed to be able to move ring to ready to send even if receive queue is not enabled */

        qp_flags = IBV_QP_STATE;

        qp_attr.qp_state = IBV_QPS_RTR;

        ret = ibv_exp_modify_qp(qp, &qp_attr, qp_flags);

        if (ret) { /* nonzero return value means failure */

            fprintf(stderr, "failed modify qp to receive\n");

            exit(1);

        }

     

        /*    b. Move the ring to ready to send and set the rate limit */

        qp_flags = IBV_QP_STATE | IBV_EXP_QP_RATE_LIMIT;

        qp_attr.qp_state = IBV_QPS_RTS;

        qp_attr.rate_limit = 1000; //For example, 1000 Kb/s

        /* Make sure support_burst_control is YES when configuring burst info.

         * Also, qp_attr.rate_limit must be a positive number. */

        qp_attr.burst_info.max_burst_sz = 2; // For example, 2 bytes

        qp_attr.burst_info.typical_pkt_sz = 1500; // For example, 1500 bytes

        qp_attr.comp_mask |= IBV_EXP_QP_ATTR_BURST_INFO;

     

        ret = ibv_exp_modify_qp(qp, &qp_attr, qp_flags);

        if (ret) { /* nonzero return value means failure */

            fprintf(stderr, "failed modify qp to send\n");

            exit(1);

        }