[Written by Arne Heitmann]
This post describes how to set up MLAG using Mellanox NEO.
Data centers are a key part of any modern business. Be it media and entertainment, database services, storage, web services, interactive communication or artificial intelligence; the applications are countless and we all rely on them. Critical data center applications therefore have to be always available: servers need dual-port NICs (or two NICs), racks require two Top-of-Rack (ToR) switches, and leaf and spine redundancy is a given these days.
The Spanning Tree Protocol (STP) also provides redundant paths at Layer 2. However, because it uses only a single active path at a time, with the redundant links blocked to avoid loops, STP makes inefficient use of the available network bandwidth. On the other hand, rolling out Layer 3 routing all the way to the virtual machine (VM) is either overkill for some environments or not supported by the application. Therefore, we need an alternative.
Multi-Chassis Link Aggregation (MLAG) is such an alternative. MLAG provides redundancy over two switches, either as a technique on its own running link aggregation or by adding more flexibility and efficiency to an STP setup. As such, MLAG offers a resilient network, scaling from a small environment, like a typical Ethernet Storage Fabric or a small private cloud, to large data centers, like public or hybrid clouds. Flexibility and scalability are key here and, of course, efficient use of the bandwidth provided: it is nice to have two redundant 25GbE links for your VMs or two redundant 50GbE links for your storage cluster, but it is even nicer to use the full bandwidth of both links concurrently in an “active/active” fashion through link aggregation. Ideally, the additional bandwidth, which simply doubles the original, is load balanced based on your preference, e.g., per session or per address.
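To make the active/active idea concrete, here is a minimal sketch of how such a dual-homed, LACP-based bond might be configured on each Onyx switch of an MLAG pair. The port, channel and VLAN numbers are purely illustrative, and the exact commands may vary by release:

```
## Sketch only - interface, channel and VLAN IDs are illustrative
## The same MLAG port-channel ID is configured on both peer switches
interface mlag-port-channel 10
interface ethernet 1/1 mlag-channel-group 10 mode active   ## LACP active mode
interface mlag-port-channel 10 switchport mode access
interface mlag-port-channel 10 no shutdown
```

With the server's two NICs bonded via LACP, one link lands on each ToR switch, and traffic is balanced across both links while either switch can fail without losing connectivity.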
Setting Up MLAG
There are several ways to set up an MLAG domain in Mellanox Onyx or in Cumulus Linux (where MLAG is often referred to as “CLAG”). The traditional way is to open the console and configure both switches in the CLI. This requires some planning - it is always recommended to make a drawing and a planning sheet. Everything has to be configured manually, which is time-consuming. Some configuration steps are identical, some are mirrored, but all are prone to human error, as the average user does not configure MLAG on a daily basis. So we need to check, double-check and troubleshoot. Hopefully, at the end of the day, we have a working MLAG domain – fingers crossed.
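For reference, the manual route on a pair of Onyx switches involves roughly the following mirrored steps. The addresses and port numbers are examples only; the second switch uses the same configuration with its own IPL address and the peer-address swapped:

```
## Switch A (switch B mirrors this with ip 10.10.10.2 and peer-address 10.10.10.1)
protocol mlag
interface port-channel 1                                    ## inter-peer link (IPL)
interface ethernet 1/35-1/36 channel-group 1 mode active    ## example IPL member ports
vlan 4094                                                   ## dedicated IPL VLAN
interface vlan 4094 ip address 10.10.10.1 /24
interface port-channel 1 ipl 1
interface vlan 4094 ipl 1 peer-address 10.10.10.2
mlag-vip my-mlag-vip ip 192.168.1.1 /24 force               ## virtual IP for the domain
mlag system-mac 00:00:5E:00:01:00
no mlag shutdown
```

Every one of these lines is a chance for a typo, and the mirrored values (IPL addresses, peer addresses) are exactly where copy-and-paste mistakes creep in - which is the motivation for automating the whole procedure.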
But there is a better way than configuring MLAG on each individual switch with a CLI. Imagine you could sit in your office and remotely, automatically, and visually monitor and configure the entire cluster or cloud for high availability, and watch a heat map showing traffic hot spots. By installing and running Mellanox NEO, Mellanox’s network orchestrator, you can do all that and more. NEO’s automation and visualization simplify network configuration and eliminate manual errors.

NEO is easy to install: just download a virtual machine for your hypervisor, import it, and start discovering the Mellanox infrastructure - ConnectX hosts and the switches running Onyx or Cumulus Linux. Now you have a management suite for automated data center monitoring and maintenance. You will be informed automatically if critical thresholds are met. You can schedule upgrades and configuration tasks for the whole infrastructure (or just parts of it) and leave the rest to NEO. Mellanox NEO automatically saves configurations and handles events based on configurable policies. Furthermore, NEO can provision functions and protocols based on groups of devices or ports.

It is also very easy to roll the MLAG configuration over to another rack - NEO has a convenient feature for service provisioning, and MLAG is one such service. All you need to do is create a new MLAG service using the “add” button. In a new window, you select your MLAG switch pair from a drop-down menu of the managed devices, choose the network operating system (Mellanox Onyx or Cumulus Linux), enter a name and description for your MLAG pair, and optionally choose interfaces.
Your next step will be applying the configuration, which will be done in a few seconds:
Verifying the MLAG Configuration
Now you are done and can verify your configuration – for example in the CLI:
## MLAG configurations
mlag-vip neo-mlag-vip-4094 ip 192.168.1.1 /24 force
no mlag shutdown
mlag system-mac 00:00:5E:00:01:00
interface port-channel 1 ipl 1
interface vlan 4094 ipl 1 peer-address 10.10.10.2
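Beyond inspecting the running configuration, Onyx also provides show commands to confirm that the MLAG domain is actually up (the exact output format may vary by release):

```
## Check MLAG state, peer connectivity and port status
show mlag
## Check the virtual IP and cluster role of each peer
show mlag-vip
```

Both peers should report the IPL as up and the MLAG state as active before you attach server-facing MLAG port-channels.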
Alternatively, you may consider using the WebUI:
Or have a look at the NEO dashboard:
Summary and References
Rather than running two CLI sessions, with the typos and troubleshooting those may involve, you have done the MLAG provisioning with NEO in seconds. In NEO, there is no need to remember the correct syntax. Just choose and set up with a few clicks – five minutes of work, verification included – and time for a cup of coffee.
If you would like to dig a bit into the details, visit:
- Mellanox Community - MLAG: https://community.mellanox.com/docs/DOC-1434
- Mellanox Community - MLAG with MAGP: https://community.mellanox.com/docs/DOC-1476
- Mellanox Community – VXLAN with MLAG using Cumulus Linux: https://community.mellanox.com/docs/DOC-2728
- Mellanox Community – NEO: https://community.mellanox.com/docs/DOC-2348
- Mellanox Community - Mellanox NEO plugin for Nutanix: https://community.mellanox.com/docs/DOC-2784
- Mellanox Ansible Solutions: https://www.ansible.com/integrations/networks/mellanox
- Mellanox Ansible Link Aggregation (LAG and MLAG): https://github.com/ansible/ansible/pull/34204/files/096cc20473c9bc7ba1cfcdc651b01ab634efee41
- Mellanox Ansible MLAG VIP: https://docs.ansible.com/ansible/devel/modules/onyx_mlag_vip_module.html#onyx-mlag-vip-module
- Mellanox Ansible IPL: https://docs.ansible.com/ansible/devel/modules/onyx_mlag_ipl_module.html#onyx-mlag-ipl-module
- Mellanox NEO and Nutanix solution: http://www.mellanox.com/related-docs/solutions/SB_Nutanix_NEO.pdf
- Mellanox Ethernet Storage Fabric: http://www.mellanox.com/ethernet-storage-fabric/