QinQ Considerations and Configuration on Mellanox Switches

Version 22

    This post shows how to configure QinQ (802.1ad) on Mellanox switches.  QinQ tunnels multiple customer VLANs (C-VLANs) through a single service VLAN (S-VLAN) by adding an outer S-VLAN tag to each Ethernet frame.

    If you need some background, refer to the What is QinQ blog post or others on the web.

    This post is basic and is meant for beginners who wish to understand this feature.

    The reader is assumed to have some basic networking knowledge as well as basic knowledge using Mellanox Ethernet switches.

     

     


    Overview

    The QinQ feature can be used to turn a Mellanox switch into a virtual pass-through device, providing transparent connectivity with port aggregation, so that the rack of servers or blade server chassis appears to the upstream network as a single server.

    In this very simple mode, neither specific VLANs nor a spanning tree mode need to be configured on the switch.  Any VLAN that a server node (or storage device) uses is passed through to the core of the network and vice versa.  This mode is useful for integrated systems or scale-out storage solutions where the network should be invisible and new workloads should be accommodated without reconfiguring VLAN IDs or spanning tree settings.

     

    This mode is particularly useful for clients that measure configuration steps in terms of headcount and want to reduce the steps needed to deploy new workloads.  These are the same clients that are attracted to Cisco’s FEX architecture and use “end-host mode” to simplify management of the data center network edge.  This QinQ mode has an advantage over the FEX architecture: server-to-server traffic is forwarded locally, taking the optimal path, whereas FEX forces local traffic up the uplinks to the controlling bridge, which then sends it back down the uplinks.  Forcing east/west traffic through the uplinks increases latency and reduces the available north/south bandwidth.

     

    For active-active server I/O configurations, this mode will work with MLAG which allows a pair of switches to appear as a server to the upstream network.  For scale out solutions, this feature works with multi-tiered MLAG which extends this configuration-free zone to many hundreds of 10GbE ports.

     

    Where multi-tenant security is important, this configuration can be extended to split a switch into two or more virtual pass-through devices.  In the extreme case, each server has its own virtual pass-through port.  This is useful for turning an embedded 40GbE blade server switch into a virtual 40GbE pass-through module, where each blade server has its own external 40GbE port and no port aggregation takes place.

     

    1. Switchport Mode

    A new switchport mode type has been added to the CLI called dot1q-tunnel. This mode should be applied to the ports that you want to tunnel over the S-VLAN.

    switch (config interface ethernet 1/1) # switchport mode dot1q-tunnel

    The S-VLAN tag, 100 for example, is configured as follows:

    switch (config interface ethernet 1/1) # switchport access vlan 100

    In this case VLAN 100 is added to all ingress traffic to ethernet port 1/1.

     

    Note: There is no option to apply an S-VLAN to a specific group of C-VLANs. All traffic entering the port is tagged with the same S-VLAN.

     

    The uplink port that carries the tunnel should be configured with switchport mode trunk.

    Note: Switchport trunk mode passes packets with one VLAN tag as well as packets with two VLAN tags. The switch uses the outer (upper) VLAN tag for switching.
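
    To make the tagging step concrete, here is a minimal Python sketch of what happens to a frame, conceptually, when it enters a dot1q-tunnel port: a 4-byte tag carrying the S-VLAN is inserted in front of whatever the server sent. The function name, the MAC addresses, and the sample payload are illustrative assumptions, not taken from the switch software.

```python
import struct

TPID_8021Q = 0x8100  # tag Ethertype; this post states it is the S-VLAN default


def push_vlan_tag(frame: bytes, vid: int, pcp: int = 0, tpid: int = TPID_8021Q) -> bytes:
    """Insert a 4-byte VLAN tag right after the destination and source MACs."""
    if not 0 <= vid <= 4095:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must fit in 3 bits")
    tci = (pcp << 13) | vid  # PCP (3 bits), DEI (1 bit, left at 0), VID (12 bits)
    tag = struct.pack("!HH", tpid, tci)
    return frame[:12] + tag + frame[12:]


# Hypothetical frame from a server: an ARP payload (0x0806) tagged with C-VLAN 10.
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("e41d2d263ce1")
untagged = dst + src + struct.pack("!H", 0x0806) + b"\x00" * 28
customer_frame = push_vlan_tag(untagged, vid=10)

# Conceptual ingress behavior of the tunnel port: add S-VLAN 100 on top.
tunneled = push_vlan_tag(customer_frame, vid=100)
print(tunneled[12:20].hex())  # outer tag followed by inner tag
```

    The outer tag is the only one a trunk port looks at for switching, which is why the C-VLAN underneath can be anything.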

     

    2. QoS Consideration

    With tunneling, the switch must decide which priority to assign to the S-VLAN tag.

    There are two options in this case:

    • Take it from the inner VLAN priority (pipe) - default
    • Take it from the port priority (uniform)

     

    For example:

    switch (config) # interface ethernet 1/1

    switch (config interface ethernet 1/1) # switchport mode dot1q-tunnel qos-mode uniform

     

    3. Ethertype Considerations

    The default Ethertype of the S-VLAN tag is 0x8100.

     

    4. Supported Interface Types

    The dot1q-tunnel option is supported on all interface types:

    • Ethernet interfaces
    • LAGs
    • MLAGs

     

    5. MAC Learning

    MAC learning is done on the S-VLAN (see the example below). C-VLANs are not learned on the switch.
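
    The following Python sketch illustrates the learning rule described above: table entries are keyed by S-VLAN and MAC address, and the C-VLAN plays no part. The class and method names are hypothetical and only model the behavior; they do not reflect the switch implementation.

```python
class SVlanMacTable:
    """Toy MAC table that learns on the S-VLAN only."""

    def __init__(self):
        self._entries = {}  # (s_vlan, mac) -> port

    def learn(self, s_vlan, c_vlan, src_mac, port):
        # c_vlan is accepted but deliberately ignored: C-VLANs are not learned.
        self._entries[(s_vlan, src_mac.upper())] = port

    def lookup(self, s_vlan, dst_mac):
        return self._entries.get((s_vlan, dst_mac.upper()))

    def __len__(self):
        return len(self._entries)


table = SVlanMacTable()
# The same server MAC seen on two different C-VLANs inside S-VLAN 100.
table.learn(100, 10, "E4:1D:2D:26:3C:E1", "Eth1/1")
table.learn(100, 20, "E4:1D:2D:26:3C:E1", "Eth1/1")
table.learn(100, 10, "E4:1D:2D:26:3B:C1", "Eth1/2")

print(len(table))                              # one entry per MAC, not per C-VLAN
print(table.lookup(100, "e4:1d:2d:26:3b:c1"))  # port for the second MAC
```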

     

    6. Broadcast

    C-VLAN broadcast is passed via the S-VLAN tunnel, and reaches all tunnel endpoints (i.e. all ports configured with switchport mode dot1q-tunnel for the same S-VLAN tunnel).

     

    In the example below, there are two C-VLANs (10 and 20). If a broadcast is sent on C-VLAN 10, servers S3 and S4 also receive it, tagged with VLAN 10, because they are on the same tunnel.

    QinQ 2 - New Page.jpeg
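
    The broadcast behavior described above can be sketched in Python as follows. The flood set depends only on S-VLAN membership, so the C-VLAN of the broadcast does not appear in the calculation at all. The port names and VLAN assignments are a hypothetical layout, not the one in the figure.

```python
# Hypothetical S-VLAN membership per port. Tunnel ports belong to exactly one
# S-VLAN (their access VLAN); a trunk port may carry several.
MEMBERSHIP = {
    "Eth1/1": {100},            # dot1q-tunnel, S-VLAN 100
    "Eth1/2": {1, 100, 200},    # trunk uplink
    "Eth1/3": {100},            # dot1q-tunnel, S-VLAN 100
    "Eth1/4": {200},            # dot1q-tunnel, S-VLAN 200
}


def flood_ports(ingress, s_vlan, membership):
    """Ports that receive a broadcast arriving on `ingress` within `s_vlan`."""
    return sorted(
        port
        for port, vlans in membership.items()
        if port != ingress and s_vlan in vlans
    )


# A broadcast on any C-VLAN entering Eth1/1 is flooded within S-VLAN 100.
print(flood_ports("Eth1/1", 100, MEMBERSHIP))
```

    Servers that must not see each other's broadcasts therefore need to be placed in different S-VLANs; separating them by C-VLAN alone is not enough.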

     

     

    7. Three VLAN Tag Example

    Although less common, it is possible to build a setup that produces a frame with three VLAN tags by applying the guidelines below across several switches.

    Here is a Wireshark example:

    • Inner VLAN is 10
    • Middle VLAN is 100
    • Upper VLAN is 1000

    3.PNG
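
    As a rough companion to the capture, the Python sketch below walks the stacked tags of a frame and returns the VLAN IDs from the outermost to the innermost. It assumes every tag uses Ethertype 0x8100, which this post gives as the S-VLAN default; the sample frame is constructed by hand for illustration and is not the captured one.

```python
import struct


def vlan_stack(frame: bytes, tpid: int = 0x8100) -> list:
    """Return the VLAN IDs of all stacked tags, outermost first."""
    vids = []
    offset = 12  # skip destination and source MAC addresses
    while len(frame) >= offset + 4:
        ethertype, tci = struct.unpack_from("!HH", frame, offset)
        if ethertype != tpid:
            break  # reached the payload Ethertype
        vids.append(tci & 0x0FFF)  # low 12 bits of the TCI hold the VLAN ID
        offset += 4
    return vids


# Hand-built sample: two MACs, three tags (1000, 100, 10), then an IPv4 payload.
macs = bytes.fromhex("ffffffffffff") + bytes.fromhex("e41d2d263ce1")
tags = b"".join(struct.pack("!HH", 0x8100, vid) for vid in (1000, 100, 10))
frame = macs + tags + struct.pack("!H", 0x0800) + b"\x00" * 20

print(vlan_stack(frame))
```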

     

    8. IP Interface

    It is not possible to create a VLAN interface or router port that is assigned to the VLAN attached to a dot1q-tunnel.

    For example, if VLAN 100 is already configured as the access VLAN for dot1q-tunnel, the creation of the VLAN interface 100 is blocked.

    switch (config) # interface vlan 100

    % Vlan is used as PVID for dot1q-tunnel interface

     

    9. Port-to-Port Tunneling Over One Switch

    It is possible to create a QinQ tunnel within the same switch.

    To do that, configure both ports of the tunnel with switchport mode dot1q-tunnel and the same S-VLAN.

    In the example below any traffic (any VLAN) is able to pass through the switch from server S1 to server S2.

    QinQ 3 - New Page.jpeg

    Example:

    switch (config) # vlan 100

    switch (config vlan 100) # exit

    switch (config) # interface ethernet 1/1 switchport mode dot1q-tunnel

    switch (config) # interface ethernet 1/1 switchport access vlan 100

    switch (config) # interface ethernet 1/2 switchport mode dot1q-tunnel

    switch (config) # interface ethernet 1/2 switchport access vlan 100
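
    When more than two ports join the same tunnel, the command list above grows in a regular pattern. The Python sketch below generates it for any set of ports. The helper name is hypothetical, and it only emits the commands already shown in this post; it does not talk to a switch.

```python
def qinq_tunnel_commands(s_vlan, ports):
    """Build the config-mode commands that place `ports` in one QinQ tunnel."""
    commands = [f"vlan {s_vlan}", "exit"]
    for port in ports:
        commands.append(f"interface ethernet {port} switchport mode dot1q-tunnel")
        commands.append(f"interface ethernet {port} switchport access vlan {s_vlan}")
    return commands


for line in qinq_tunnel_commands(100, ["1/1", "1/2"]):
    print(line)
```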

     

    10. MTU Considerations

    The default MTU is 1500 bytes, excluding the Ethernet header (maximum packet size 1522 bytes with the Ethernet header and one VLAN tag). QinQ adds a second VLAN tag, which increases the frame size by 4 bytes. You must therefore configure all switches in the network to handle the larger frames by increasing the interface MTU so that the maximum packet size is at least 1526 bytes in total.
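
    The arithmetic behind these numbers can be checked with a short Python sketch. It assumes the usual Ethernet accounting (14-byte header, 4-byte FCS, 4 bytes per VLAN tag), which matches the 1500/1522 pair quoted above; whether a given switch release counts overhead in exactly this way should be confirmed on the device.

```python
ETH_HEADER = 14  # destination MAC + source MAC + Ethertype
FCS = 4          # frame check sequence
VLAN_TAG = 4     # each 802.1Q-style tag


def max_frame_size(mtu, vlan_tags):
    """Largest frame on the wire for a given payload MTU and tag count."""
    return mtu + ETH_HEADER + FCS + vlan_tags * VLAN_TAG


def min_mtu_for(frame_size, vlan_tags_counted=1):
    """MTU needed so the reported maximum packet size reaches `frame_size`.

    Assumes the switch reports maximum packet size with one tag included,
    as in the 1500 -> 1522 example in this post.
    """
    return frame_size - ETH_HEADER - FCS - vlan_tags_counted * VLAN_TAG


print(max_frame_size(1500, 1))  # single-tagged frame
print(max_frame_size(1500, 2))  # double-tagged (QinQ) frame
print(min_mtu_for(1526))        # MTU that yields a 1526-byte maximum
```

    Under these assumptions, a 1500-byte payload with two tags needs 1526 bytes on the wire, which corresponds to an interface MTU of 1504.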

     

    Configuration Example

    • Use MLNX-OS 3.4.3002 or later.
    • In the following diagrams, sx01 and sx02 are SX1710 switches, while S1 and S2 are Linux Servers.
    • The server links can be untagged or can carry one or multiple 802.1Q VLANs.
    • S-VLAN 100 is used in this case for the tunneling of the traffic from the servers.
    • Switchport trunk mode is configured on the link (port 1/2) between the switches (while allowing all VLANs - including VLAN 100).

    QinQ - New Page (4).jpeg

     

    Switch Configuration

     

    The configuration is similar for both switches.

     

    1. Create VLAN 100. Run:

    switch (config) # vlan 100

    switch (config vlan 100) # exit

     

    2. Set the switchport mode to dot1q-tunnel for the ports connected to the servers (eth 1/1). Run:

    switch (config) # interface ethernet 1/1
    switch (config interface ethernet 1/1) # switchport mode dot1q-tunnel

     

    3. Set the S-VLAN tag to 100. Run:

    switch (config interface ethernet 1/1) # switchport access vlan 100

     

    4. Set the QinQ link (eth 1/2) to trunk. Run:

    switch (config interface ethernet 1/1) # exit
    switch (config) # interface ethernet 1/2
    switch (config interface ethernet 1/2) # switchport mode trunk

    Note: By default, trunk mode allows all VLANs.

     

    Server Configuration

    Any interface (with or without a VLAN) can be configured on the server.

     

    Monitoring

    1. Check the switchport status. Run:

    switch (config) # show interfaces switchport

    Interface       Mode         Access vlan        Allowed vlans

    --------------------------------------------------------------

    Eth1/1          dot1q-tunnel 100               

    Eth1/2          trunk        N/A                1, 100

    ...
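
    For scripted checks, output like the table above can be parsed with a few lines of Python. This is only a sketch: it assumes the column layout shown in this post, only recognizes interface names starting with "Eth", and uses a hypothetical function name. The sample text is copied from the output above.

```python
SAMPLE = """\
Interface       Mode         Access vlan        Allowed vlans
--------------------------------------------------------------
Eth1/1          dot1q-tunnel 100
Eth1/2          trunk        N/A                1, 100
"""


def parse_switchport(output):
    """Map interface name -> mode, access VLAN, and allowed VLAN list."""
    ports = {}
    for line in output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3 or not fields[0].startswith("Eth"):
            continue  # header, separator, or unrecognized line
        name, mode, access = fields[0], fields[1], fields[2]
        allowed = fields[3] if len(fields) > 3 else ""
        ports[name] = {
            "mode": mode,
            "access_vlan": None if access == "N/A" else int(access),
            "allowed_vlans": [int(v) for v in allowed.replace(",", " ").split()],
        }
    return ports


ports = parse_switchport(SAMPLE)
print(ports["Eth1/1"]["mode"], ports["Eth1/1"]["access_vlan"])
print(ports["Eth1/2"]["mode"], ports["Eth1/2"]["allowed_vlans"])
```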

     

    2. Check VLAN configuration. Run:

    switch (config) # show vlan

     

    VLAN    Name            Ports

    ----    -----------     -------------------------

    ...

     

    100                     Eth1/1, Eth1/2

     

     

     

    3. Check interface configuration. Run:

    switch (config) #  show interfaces ethernet 1/1

     

    Eth1/1

      Admin state: Enabled

      Operational state: Up

      Description: N\A

      Mac address: e4:1d:2d:37:50:f1  

      MTU: 1500 bytes(Maximum packet size 1522 bytes)

      Flow-control: receive off send off

      Actual speed: 40 Gbps            

      Width reduction mode: Unknown

      Switchport mode: dot1q-tunnel

      QoS mode: pipe

      MAC learning mode: Enabled

      Last clearing of "show interface" counters : Never              

      60 seconds ingress rate: 0 bits/sec, 0 bytes/sec, 0 packets/sec

      60 seconds egress rate: 0 bits/sec, 0 bytes/sec, 0 packets/sec

     

    Rx

      81                   packets

      56                   unicast packets

      19                   multicast packets

      6                    broadcast packets

      11221                bytes

      0                    error packets

      0                    discard packets

     

    Tx

      420                  packets

      55                   unicast packets

      364                  multicast packets

      1                    broadcast packets

      33845                bytes

      0                    discard packets

     

    4. Check MAC address table. Run:

    switch (config) # show mac-address-table

     

    Vlan    Mac Address         Type         Port

    ----    -----------         ----         ------------

    100     E4:1D:2D:26:3B:C1   Dynamic      Eth1/2

    100     E4:1D:2D:26:3C:E1   Dynamic      Eth1/1

    Number of unicast:    2

    Number of multicast:    0

    Both server MACs are learned on the S-VLAN 100.

     

    5. Wireshark snapshot.

    Here is an example of an ARP frame sent on port 1/2 of the switch (switchport mode trunk, running QinQ). You can see two VLAN tags in this case:

    • The S-VLAN is 100
    • The C-VLAN is 10

    2.PNG