MTU Considerations for RoCE based Applications

Version 3

    The InfiniBand protocol defines several fixed Maximum Transmission Unit (MTU) sizes: 256, 512, 1024, 2048, or 4096 bytes.

    A RoCE-based application (one that uses RDMA over Ethernet) should take into account that the RoCE MTU is smaller than the Ethernet MTU, which defaults to 1500 bytes.

     

    The driver selects as the "active" MTU the largest value from the list above that is smaller than the Ethernet MTU configured in the system, after accounting for the RoCE transport headers and CRC fields. For example, with the default Ethernet MTU (1500 bytes) RoCE uses 1024, and with an Ethernet MTU of 4200 it uses 4096 as the "active MTU". The "active_mtu" value can be checked with "ibv_devinfo".
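
The selection rule described above can be sketched as a small shell function. This is only an illustration: the function name roce_active_mtu is hypothetical, and HDR_ALLOWANCE (100 bytes) is an assumed placeholder for the RoCE headers and CRC, not a value taken from the driver.

```shell
#!/bin/sh
# Hypothetical helper: estimate the RoCE active MTU for a given Ethernet MTU.
# HDR_ALLOWANCE is an assumed, illustrative allowance for RoCE transport
# headers and CRC fields; the driver's real figure may differ.
HDR_ALLOWANCE=100

roce_active_mtu() {
    eth_mtu=$1
    payload=$((eth_mtu - HDR_ALLOWANCE))
    # Walk the fixed InfiniBand MTU sizes from largest to smallest
    # and return the first one that fits.
    for ib_mtu in 4096 2048 1024 512 256; do
        if [ "$payload" -ge "$ib_mtu" ]; then
            echo "$ib_mtu"
            return 0
        fi
    done
    # The Ethernet MTU is too small for even the smallest InfiniBand MTU.
    return 1
}

roce_active_mtu 1500   # prints 1024
roce_active_mtu 4200   # prints 4096
```

With this assumed allowance the sketch reproduces the two cases given in the text (1500 gives 1024, 4200 gives 4096). The value reported by ibv_devinfo remains the authoritative one.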

    The RoCE protocol exchanges the "active_mtu" values of both ends and negotiates between them; the smaller of the two is used.
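
As a minimal sketch of the outcome of that negotiation, the function below returns the smaller of two active_mtu values. The name negotiated_mtu is hypothetical, and the function only models the result, not the exchange itself.

```shell
#!/bin/sh
# Hypothetical helper: given the active_mtu reported on each end,
# print the MTU the connection would end up using (the minimum).
negotiated_mtu() {
    if [ "$1" -le "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# One end was raised to 4096, the other is still at the default 1024.
negotiated_mtu 4096 1024   # prints 1024
```

In other words, raising the port MTU on only one host does not help: both ends need the larger active MTU.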

     

    Check the port MTU:

    # ifconfig eth2
    eth2      Link encap:Ethernet  HWaddr F4:52:14:17:1B:81
              inet6 addr: fe80::f652:14ff:fe17:1b81/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500 Metric:1
              RX packets:30 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:3230 (3.1 KiB)  TX bytes:492 (492.0 b)

    #

     

    Check the InfiniBand MTU:

    # ibv_devinfo -d mlx4_0
    hca_id: mlx4_0
            transport:                      InfiniBand (0)
            fw_ver:                         2.31.5050
            node_guid:                      f452:1403:0017:1b80
            sys_image_guid:                 f452:1403:0017:1b83
            vendor_id:                      0x02c9
            vendor_part_id:                 4103
            hw_ver:                         0x0
            board_id:                       MT_1090111019
            phys_port_cnt:                  2
                    port:   1
                            state:                  PORT_ACTIVE (4)
                            max_mtu:                4096 (5)
                            active_mtu:             1024 (3)
                            sm_lid:                 0
                            port_lid:               0
                            port_lmc:               0x00
                            link_layer:             Ethernet

                    port:   2
                            state:                  PORT_DOWN (1)
                            max_mtu:                4096 (5)
                            active_mtu:             4096 (5)
                            sm_lid:                 0
                            port_lid:               0
                            port_lmc:               0x00
                            link_layer:             InfiniBand

    #
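
When only the MTU fields are of interest, the ibv_devinfo output can be filtered. The following is a sketch that assumes the field layout shown in the output above (port:, active_mtu:, and link_layer: lines, in that order); the function name parse_active_mtu is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ibv_devinfo output on stdin and print
# "<port> <active_mtu> <link_layer>" for every port found.
# Assumes link_layer: follows active_mtu: within each port section.
parse_active_mtu() {
    awk '
        $1 == "port:"       { port = $2 }
        $1 == "active_mtu:" { mtu = $2 }
        $1 == "link_layer:" { print port, mtu, $2 }
    '
}

# Usage (requires the device from the example above):
#   ibv_devinfo -d mlx4_0 | parse_active_mtu
```

Ports whose link layer is Ethernet are the RoCE ports; their active_mtu is the value that follows the Ethernet port MTU.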

     

    It is recommended to increase the MTU for applications that use large I/Os.

    Note: if you change the port MTU, change it on all network elements along the path as well (switches and routers).

    Once you change the port MTU, the InfiniBand active MTU is automatically aligned to the largest size that fits within that MTU.

     

    For example, after setting the port MTU to 4200, the active_mtu changes to 4096. However, it is better not to configure the port MTU to 9000: the active MTU cannot exceed 4096, so the larger setting only wastes memory. The suggested values for the port MTU are as follows:

    • For active MTU of 4096 - configure the port MTU to 4200
    • For active MTU of 2048 - configure the port MTU to 2200
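
The two suggested values can be captured in a small lookup so that scripts do not hard-code them. This is a sketch only; the function name suggested_port_mtu is hypothetical, and it covers just the two active MTU values listed above.

```shell
#!/bin/sh
# Hypothetical helper: map a desired active MTU to the suggested port MTU.
# Only the two values listed in the text are covered.
suggested_port_mtu() {
    case "$1" in
        4096) echo 4200 ;;
        2048) echo 2200 ;;
        *)    return 1 ;;   # no suggestion given for other sizes
    esac
}

suggested_port_mtu 4096   # prints 4200

# Usage (needs root; eth2 is the interface from the examples here):
#   ifconfig eth2 mtu "$(suggested_port_mtu 4096)"
```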

     


    # ifconfig eth2 mtu 4200
    # ibv_devinfo -d mlx4_0
    hca_id: mlx4_0
            transport:                      InfiniBand (0)
            fw_ver:                         2.31.5050
            node_guid:                      f452:1403:0017:1b80
            sys_image_guid:                 f452:1403:0017:1b83
            vendor_id:                      0x02c9
            vendor_part_id:                 4103
            hw_ver:                         0x0
            board_id:                       MT_1090111019
            phys_port_cnt:                  2
                    port:   1
                            state:                  PORT_ACTIVE (4)
                            max_mtu:                4096 (5)
                            active_mtu:             4096 (5)
                            sm_lid:                 0
                            port_lid:               0
                            port_lmc:               0x00
                            link_layer:             Ethernet

                    port:   2
                            state:                  PORT_DOWN (1)
                            max_mtu:                4096 (5)
                            active_mtu:             4096 (5)
                            sm_lid:                 0
                            port_lid:               0
                            port_lmc:               0x00
                            link_layer:             InfiniBand

    #