HowTo Configure VXLAN for ConnectX-3 Pro (Linux OVS)

    This post explains how to configure VXLAN on Linux over ConnectX®-3 Pro adapter cards using Open vSwitch (OVS).


    Server requirements:

    • ConnectX-3 Pro
    • Operating system and kernel options:
      • upstream Linux 3.14 or later
      • RHEL7 beta snapshot 10 (kernel 3.10.0-105.el7) or later
      • Ubuntu 14.04 (kernel 3.13.0-24-generic) or later
    • openvswitch 2.0
    • KVM Hypervisor using para-virtual NIC (e.g. virtio with vhost backend on the hypervisor)
    • MLNX_OFED (2.2 or later) installation is optional, as driver support is inbox (RHEL 7 or Ubuntu 14.04)
     

    Configuring VXLAN:


    1. Make sure the server is equipped with a Mellanox ConnectX-3 Pro adapter card (device ID 0x1007, name MT27520)
    # lspci | grep Mellanox
    07:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
    # lspci -n | grep 15b3
    07:00.0 0280: 15b3:1007
    2. Enable the VXLAN offloads. Load the mlx4_core driver with Device-Managed Flow Steering (DMFS) enabled. The simplest way is to create the file /etc/modprobe.d/mlx4_core.conf with "log_num_mgm_entry_size=-1" as follows:
    options mlx4_core log_num_mgm_entry_size=-1 debug_level=1

    Note: When DMFS is disabled, VXLAN offloads are disabled as well.

     

    Note: As of MLNX_OFED 2.3, flow steering is enabled by default (-1). To disable flow steering, add "log_num_mgm_entry_size=10" to the /etc/modprobe.d/mlx4.conf file. For additional information on flow steering configuration, refer to the MLNX_OFED User Manual.

     

    Note: The mlx4_core module shipped with Ubuntu 14.04 (kernel 3.13.0-24-generic) is built without the "CONFIG_MLX4_DEBUG" option, so the "debug_level" module parameter does not exist. This will be fixed in later Ubuntu kernels.
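    To confirm that the parameter took effect after reloading the driver, it can be read back through sysfs (a quick sanity check; this assumes the module exposes its parameters under /sys/module, which mlx4_core builds normally do):
    # cat /sys/module/mlx4_core/parameters/log_num_mgm_entry_size
    -1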

    3. Make sure the detected port link type is Ethernet (not InfiniBand) so that Ethernet interfaces are created. For example:
    # lspci | grep Mellanox
    07:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
    # echo eth >  /sys/bus/pci/devices/0000:07:00.0/mlx4_port1
    # echo eth >  /sys/bus/pci/devices/0000:07:00.0/mlx4_port2
    # cat /sys/bus/pci/devices/0000:07:00.0/mlx4_port1
    auto (eth)
    # cat /sys/bus/pci/devices/0000:07:00.0/mlx4_port2
    auto (eth)
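    To make the Ethernet link type persistent across reboots, one option (assuming your mlx4_core build exposes the port_type_array module parameter, where 2 selects Ethernet) is to extend the modprobe options from step 2:
    options mlx4_core log_num_mgm_entry_size=-1 port_type_array=2,2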
    4. Make sure the firmware running on the ConnectX-3 Pro device is version 2.31.5050 or higher. Run "ethtool -i $DEV" on the mlx4_en Ethernet net-device. For example:
    # ethtool -i eth3
    driver: mlx4_en
    version: 2.2-1 (Feb 2014)
    firmware-version: 2.31.5050
    bus-info: 0000:07:00.0
    [...]
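    The firmware version can also be queried directly from the PCI device with mstflint (assuming the mstflint/MFT package is installed; the exact output format varies between versions):
    # mstflint -d 07:00.0 query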
    5. Validate the setup. When DMFS (flow steering) is enabled, VXLAN offload is enabled. If debug_level=1 is set for the mlx4_core module, the following messages appear in the kernel log (dmesg) when the driver loads:
    [   42.997662] mlx4_core 0000:07:00.0:     Device manage flow steering support
    [....]
    [   42.997671] mlx4_core 0000:07:00.0:     TCP/IP offloads/flow-steering for VXLAN support
    [...]
    [   45.434827] mlx4_core 0000:07:00.0: Steering mode is: Device managed flow steering [...]
    [   45.434828] mlx4_core 0000:07:00.0: Tunneling mode is: vxlan
    Once all of the above is properly set, the Ethernet net-device created by the mlx4_en driver advertises the NETIF_F_GSO_UDP_TUNNEL feature, which can be seen with "ethtool -k $DEV | grep udp". For example:
    # ethtool -k eth3 | grep udp_tnl
    tx-udp_tnl-segmentation: on
    To further validate that things are in order, run TCP traffic between VMs or between virtual Ethernet devices (vEth) that passes through OVS VXLAN encapsulation/decapsulation. While it runs, observe the ConnectX uplink NIC traffic to see that LSO and GRO come into play: the host (hypervisor) networking stack should send to and receive from the NIC large, 64KB-sized VXLAN UDP packets. When the offloads are off, no LSO/GRO is applied to VXLAN traffic.
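    To watch the encapsulated traffic on the wire, you can also capture on the uplink and filter on the VXLAN UDP port (eth3 is the example uplink device used above):
    # tcpdump -nn -i eth3 udp port 4789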

    Additional Notes:

     

    1. MTU considerations: VXLAN tunneling adds 50 bytes (14 Ethernet + 20 IP + 8 UDP + 8 VXLAN) to the VM Ethernet frame. Make sure the tunneling overhead is accounted for: either decrement the MTU of the sending NIC (e.g. the VM virtio-net NIC or the host-side vEth device) by 50 bytes (e.g. 1450 instead of 1500), or increment the uplink NIC MTU by 50 bytes (e.g. 1550 instead of 1500, after verifying that your Ethernet switch supports it).
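    For example, either of the following adjusts for the 50-byte overhead (vnet1 and eth3 are the example tenant-side and uplink devices used elsewhere in this post; verify that your switch accepts the larger frame before raising the uplink MTU):
    # ip link set dev vnet1 mtu 1450
    # ip link set dev eth3 mtu 1550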

     

    2. The default UDP port for VXLAN is 4789 (this is also the FW default). When working with external management tools such as OpenStack Neutron, make sure to use this port. For example:
      # ovs-vsctl add-port ovs-vx vxlan0 -- set interface vxlan0 type=vxlan options:dst_port=4789

     

    With upstream kernel 3.15-rc1 or later, or with Ubuntu 14.04, it is possible to use an arbitrary UDP port for VXLAN. To that end, the kernel configuration option CONFIG_MLX4_EN_VXLAN=y must be enabled.
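    For example, an existing VXLAN interface can be moved to a non-default UDP port (4790 below is just an illustration, and the same port must be configured on the remote side):
    # ovs-vsctl set interface vxlan0 options:dst_port=4790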

     

    3. If you build the kernel yourself, the following configuration parameters must be set:
    CONFIG_VXLAN=m
    CONFIG_OPENVSWITCH=m
    CONFIG_OPENVSWITCH_VXLAN=y
    CONFIG_MLX4_EN=m
    CONFIG_MLX4_EN_DCB=y
    CONFIG_MLX4_CORE=m
    CONFIG_MLX4_DEBUG=y
    CONFIG_MLX4_INFINIBAND=m
    # From kernel 3.15 onward, also add:
    CONFIG_MLX4_EN_VXLAN=y
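    On a distribution kernel you can first check whether the running kernel already has these options before deciding to rebuild (the config file location varies by distribution; /boot/config-$(uname -r) is typical on RHEL and Ubuntu):
    # grep -E "CONFIG_VXLAN|CONFIG_OPENVSWITCH|CONFIG_MLX4" /boot/config-$(uname -r)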


    Example:

    This is a configuration example for two hosts. In this example the VXLAN tenant IP addresses are 192.168.52.0/24 and the hypervisor IP network that serves it is 192.168.30.0/24.
    This example uses an OVS instance named ovs-vx. The OVS instance has the following interfaces attached:
    • A VXLAN port named vxlan0 that uses UDP port 4789 and vnid 99
    • A vEth interface named veth1 which mimics the interface with a router
    • A tap interface for the VM named vnet1


    (Topology diagram: 28.png)

     

    Host A configuration:
    Host address of the mlx4_en net-device (eth2): 192.168.30.44/24
    # modprobe openvswitch
    # service openvswitch start
    # ovs-vsctl add-br ovs-vx
    # ovs-vsctl add-port ovs-vx vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=192.168.30.43 options:key=99 options:dst_port=4789
    Either manually attach the host-side VM NICs (tap) or let the virtualization manager do that. To do this manually, run:
    # ovs-vsctl add-port ovs-vx vnet1
    For router functionality, add a vETH NIC pair. Run:
    # ip link add type veth
    # ifconfig veth0 192.168.52.44/24 up
    # ifconfig veth1 up
    # ifconfig veth0 mtu 1450
    # ifconfig veth1 mtu 1450
    # ovs-vsctl add-port ovs-vx veth1
    Host B configuration:
    Host address of the mlx4_en net-device: 192.168.30.43/24
    # modprobe openvswitch
    # service openvswitch start
    # ovs-vsctl add-br ovs-vx
    # ovs-vsctl add-port ovs-vx vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=192.168.30.44 options:key=99 options:dst_port=4789
    Either manually attach the host-side VM NICs (tap) or let the virtualization manager do that. To do this manually, run:
    # ovs-vsctl add-port ovs-vx vnet1
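    Once both hosts are configured, basic reachability through the tunnel can be checked from Host A. The peer address below (192.168.52.43) assumes a veth0 was configured on Host B in the same way as on Host A; the second command verifies that a full 1450-byte frame passes without fragmentation (1422-byte payload + 8-byte ICMP header + 20-byte IP header = 1450):
    # ping -c 3 -I veth0 192.168.52.43
    # ping -c 3 -M do -s 1422 192.168.52.43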
    Output example for Host A:
    # ovs-vsctl show
    Bridge ovs-vx
    Port ovs-vx
        Interface ovs-vx
            type: internal
    Port "vxlan0"
        Interface "vxlan0"
            type: vxlan
            options: {dst_port="4789", key="99", remote_ip="192.168.30.43"}
    Port "vnet1"
        Interface "vnet1"
    Port "veth1"
        Interface "veth1“

    # ip addr show veth0
    22: veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP qlen 1000
        link/ether e6:94:71:e4:6c:81 brd ff:ff:ff:ff:ff:ff
        inet 192.168.52.44/24 brd 192.168.52.255 scope global veth0

    # ovs-dpctl show
    system@ovs-system:
             lookups: hit:227022514 missed:590 lost:0
             flows: 2
             masks: hit:530552752 total:2 hit/pkt:2.34
             port 0: ovs-system (internal)
             port 1: vxlan_sys_4789 (vxlan: df_default=false, ttl=0)
             port 2: ovs-vx (internal)
             port 3:
             port 4:
             port 5: vnet1
             port 6: veth1

    # ovs-dpctl dump-flows
    skb_priority(0),in_port(5),eth(src=52:54:00:64:3e:1a,dst=52:54:00:7e:da:6d),eth_type(0x0800),ipv4(src=192.168.52.145/0.0.0.0,dst=192.168.52.245/0.0.0.0,proto=6/0,tos=0/0x3,ttl=64/0,frag=no/0xff), packets:170358, bytes:11246288, used:0.005s, actions:set(tunnel(tun_id=0x63,src=0.0.0.0,dst=192.168.30.44,tos=0x0,ttl=64,flags(df,key))),1
    skb_priority(0),tunnel(tun_id=0x63,src=192.168.30.44,dst=192.168.30.43,tos=0x0,ttl=64,flags(key)),in_port(1),skb_mark(0),eth(src=52:54:00:7e:da:6d,dst=52:54:00:64:3e:1a),eth_type(0x0800),ipv4(src=192.168.52.145/0.0.0.0,dst=192.168.52.245/0.0.0.0,proto=6/0,tos=0/0,ttl=64/0,frag=no/0xff), packets:213549, bytes:8764612024, used:0.001s, actions:5
    Notes:
    (1) The MTU of the VM sitting on top of vnet1 was reduced to 1450.
    (2) The tun_id in the example (0x63) is the VNID (key configured to 99 in decimal).
    VXLAN and OpenStack Neutron
    If you are running the OpenStack Icehouse release or later, you must use the ML2 plugin. Edit the /etc/neutron/plugins/ml2/ml2_conf.ini file as follows:
    [ovs]
    bridge_mappings = default:br-eth5
    enable_tunneling = True
    local_ip = 192.168.215.1

    [agent]
    vxlan_udp_port = 4789
    tunnel_types = vxlan
    l2_population = True
    root_helper = sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
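    After editing the file, restart the Neutron OVS agent so that the tunnel settings take effect (the service name varies by distribution; neutron-openvswitch-agent is common):
    # service neutron-openvswitch-agent restart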
    Additional information on VXLAN support in OpenStack can be found here: