Raw Ethernet Programming: ToS - Code Example

Version 10

    Type of Service (ToS) is an 8-bit field in the IP header; its upper six bits carry the differentiated services code point (DSCP). The field enables different service levels to be assigned to network traffic: each packet on the network is marked with a DSCP code and is given the corresponding level of service. The feature is available starting with MLNX_OFED Rel. 3.4.

     

    References

    IP Packet Formats

    ToS

    DSCP

    Aggregating ToS with DSCP IP Packets

    In the example below, the ToS/DSCP field is a single 8-bit number. To specify both a DSCP value (6 bits) and an ECN value (2 bits), combine them into one 8-bit number, with DSCP in the upper six bits and ECN in the lower two. For example, a DSCP value of 000001b and an ECN value of 11b give a field value of 00000111b = 0x07.
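    The bit arithmetic described above can be sketched in C as follows. This is a minimal illustration, not part of the MLNX_OFED API; the helper names tos_from_dscp_ecn, tos_to_dscp and tos_to_ecn are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of verbs: pack and unpack the 8-bit ToS byte. */

/* DSCP occupies the upper six bits of the ToS byte, ECN the lower two. */
static inline uint8_t tos_from_dscp_ecn(uint8_t dscp, uint8_t ecn)
{
    return (uint8_t)(((dscp & 0x3F) << 2) | (ecn & 0x03));
}

/* Extract the 6-bit DSCP value from a ToS byte. */
static inline uint8_t tos_to_dscp(uint8_t tos)
{
    return (uint8_t)(tos >> 2);
}

/* Extract the 2-bit ECN value from a ToS byte. */
static inline uint8_t tos_to_ecn(uint8_t tos)
{
    return (uint8_t)(tos & 0x03);
}
```

    With these helpers, tos_from_dscp_ecn(0x01, 0x03) yields 0x07, the value from the example above, and the ToS value 0x40 used later in this article decodes to DSCP 16 with ECN 0.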

     

    Before you start, check the header files include/infiniband/verbs_exp.h and include/infiniband/verbs.h to confirm that your installation exposes the ToS field in the IPv4 flow specification (ibv_exp_flow_spec_ipv4_ext), as used in the example below.

     

    Configuration

    On the receiver side, you need to create a rule that will steer packets according to the ToS/DSCP field in the IP header. To do that, define a raw Ethernet flow attribute struct, raw_eth_flow_attr, and fill in the packet header details. In the example below, flow_attr defines a rule at priority 0 that matches a source IPv4 address (0x0B86C806) and a specific DSCP/ToS value (0x40). A hit on this rule means that the received packet has source IP 0x0B86C806 and ToS 0x40, and the packet is steered to the attached QP.

     

    Note: In this example num_of_specs is set to 1, because there is one specification, .spec_ipv4. You can add more specifications, such as Ethernet. If you do, make sure that num_of_specs matches the actual number of specifications.

     

    The example below is based on Raw Ethernet Programming: Basic Introduction - Code Example. Section 12 on the receiver side was changed to create a flow attribute rule that matches IP and ToS-specific values.

    Note that the struct is experimental, as indicated by the exp prefix in its name.

        /* 12. Register steering rule to match 0x0B86C806 source IP and 0x40 ToS and place packet in ring pointed by ->qp */
        struct raw_eth_flow_attr {
                struct ibv_exp_flow_attr            attr;
                struct ibv_exp_flow_spec_ipv4_ext   spec_ipv4;
        } __attribute__((packed));

        struct raw_eth_flow_attr flow_attr = {
                .attr = {
                        .comp_mask      = 0,
                        .type           = IBV_EXP_FLOW_ATTR_NORMAL,
                        .size           = sizeof(flow_attr),
                        .priority       = 0,
                        .num_of_specs   = 1,    /* must equal the number of specs that follow attr */
                        .port           = 1,
                        .flags          = 0,
                },
                .spec_ipv4 = {
                        .type   = IBV_EXP_FLOW_SPEC_IPV4_EXT,
                        .size   = sizeof(struct ibv_exp_flow_spec_ipv4_ext),
                        .val = {
                                .src_ip = 0x0B86C806,   /* source IP to match */
                                .dst_ip = 0,
                                .tos    = 0x40,         /* ToS/DSCP value to match */
                        },
                        .mask = {
                                .src_ip = 0xFFFFFFFF,   /* compare all 32 bits of the source IP */
                                .dst_ip = 0,            /* destination IP is not compared */
                                .tos    = 0xff,         /* compare all 8 bits of the ToS field */
                        }
                }
        };

        struct ibv_exp_flow *flow = ibv_exp_create_flow(qp, &flow_attr.attr);
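    To make the role of the .val and .mask pair in the rule more concrete, the sketch below models the comparison in plain C. It is a simplified illustration of the usual value/mask semantics (a field matches when the packet and the rule agree on every bit set in the mask) and is an assumption about behavior, not driver or firmware code. The struct ipv4_fields type and both functions are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the IPv4 fields that the rule inspects. */
struct ipv4_fields {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint8_t  tos;
};

/* A field matches when packet and rule agree on every bit set in the mask.
 * A mask of 0 therefore means "do not compare this field". */
static inline bool field_matches(uint32_t pkt, uint32_t val, uint32_t mask)
{
    return (pkt & mask) == (val & mask);
}

/* The rule hits only if every field matches under its own mask. */
static inline bool rule_matches(const struct ipv4_fields *pkt,
                                const struct ipv4_fields *val,
                                const struct ipv4_fields *mask)
{
    return field_matches(pkt->src_ip, val->src_ip, mask->src_ip) &&
           field_matches(pkt->dst_ip, val->dst_ip, mask->dst_ip) &&
           field_matches(pkt->tos,    val->tos,    mask->tos);
}
```

    Under this model, a ToS mask of 0xff requires the whole byte, including the two ECN bits, to equal 0x40. A ToS mask of 0xfc would compare only the six DSCP bits, so packets with DSCP 16 would match regardless of their ECN value.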

     

     

    Verification

    Use the perftest package with the --tos flag to test and verify this behavior.

    For example:

    # raw_ethernet_bw --server -d mlx5_0 -E 28:33:44:55:66:77 -B 77:22:33:44:55:88 --dest_ip 1.1.1.2 --tos <VALUE>

    # raw_ethernet_bw --client -d mlx5_0  -B 28:33:44:55:66:77 -E 77:22:33:44:55:88 --dest_ip 1.1.1.2 --tos <VALUE>
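    If you build the test frame by hand instead of using perftest, the ToS value must be written into the second byte of the IPv4 header and the header checksum must be refreshed. The sketch below shows one way to do this. It assumes an untagged Ethernet II frame (14-byte header, no VLAN tag) that carries IPv4; the function names are hypothetical and are not taken from perftest or from the basic example.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14  /* assumption: untagged Ethernet II header */

/* One's-complement checksum over the IPv4 header (hdr_len is a multiple of 4). */
static inline uint16_t ipv4_header_checksum(const uint8_t *ip, size_t hdr_len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < hdr_len; i += 2)
        sum += ((uint32_t)ip[i] << 8) | ip[i + 1];
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Write the ToS byte into the frame and recompute the IPv4 header checksum.
 * Returns 0 on success, -1 if the frame is too short or is not IPv4. */
static inline int set_frame_tos(uint8_t *frame, size_t len, uint8_t tos)
{
    if (len < ETH_HDR_LEN + 20)
        return -1;

    uint8_t *ip = frame + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)
        return -1;

    size_t hdr_len = (size_t)(ip[0] & 0x0F) * 4;
    if (hdr_len < 20 || len < ETH_HDR_LEN + hdr_len)
        return -1;

    ip[1] = tos;            /* ToS/DSCP byte */
    ip[10] = 0;             /* clear the checksum field before summing */
    ip[11] = 0;

    uint16_t csum = ipv4_header_checksum(ip, hdr_len);
    ip[10] = (uint8_t)(csum >> 8);
    ip[11] = (uint8_t)(csum & 0xFF);
    return 0;
}
```

    After a call such as set_frame_tos(frame, frame_len, 0x40), running ipv4_header_checksum over the updated header returns 0, which is a quick sanity check that the checksum field is consistent.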

     
