L3 OSPF Network Provisioning at Scale with Ansible over Mellanox Ethernet Switches running Cumulus Linux OS (Draft)

Version 18

    References

    Ansible Documentation

    Configure network interfaces using NCLU — Ansible Documentation

    Automating Cumulus Linux with Ansible

    Getting Started with Ansible Management of Spectrum Switches Installed with Cumulus Linux

    Ansible Playbook Example of Copy/Fetch with Mellanox Spectrum Switch installed with Cumulus Linux

    L3 Network Design with OSPF at Scale with Mellanox NEO (Draft)

    Single Rack HA Layer 2 MAGP network deployment with Ansible (DRAFT)

     

    Introduction

    In addition to Mellanox Onyx software, Mellanox Spectrum based devices offer a native Linux experience by running the Cumulus Linux operating system, installed via ONIE (Open Network Install Environment). This approach brings native Linux tools into the switch operating system, together with a rich set of L3 features that serves a wide range of data center applications.

    Ansible provides a simple automation framework for network automation with modern DevOps functionality built in. Mellanox Spectrum based switches installed with Cumulus Linux support Ansible automation.

    This document demonstrates how to deploy L3 OSPF at scale over Mellanox Spectrum switches running Cumulus Linux OS using Ansible automation.

     

    Overview of Mellanox Components

    • Mellanox Spectrum Switch family provides the most efficient network solutions for the ever-increasing performance demands of data center applications.
    • Mellanox LinkX Cables and Transceivers family provides the industry’s most complete line of 10, 25, 40, 50, 100, 200, and 400Gb interconnect products for Cloud, Web 2.0, Enterprise, telco, and storage data center applications. They are often used to link top-of-rack switches downward to servers, storage and appliances, and upward in switch-to-switch applications.

     

    Data Center Overview

    Data center networks have traditionally been built in a three-layer hierarchical tree consisting of access, aggregation, and core layers.

    Increasing east-west traffic within the data center (server-to-server, server-to-storage, etc.) gave rise to an alternative to the traditional access-aggregation-core network model, one that has become widely used over time.

    This architecture, shown below and known as a Clos or leaf-spine network, is designed to minimize the number of hops between hosts.

     

     

     

    The aggregation and core layers are merged into the spine layer. Every leaf switch connects to every spine switch, so all leaf switches are only one hop away from one another, minimizing latency and the chance of bottlenecks in the network.

     

    Solution Overview

    Solution Diagram



     

    Solution Description

    • Mellanox Spectrum SN2700 100GbE switches run Cumulus Linux OS and are connected in a leaf-spine topology with a single ToR switch per rack
    • Maximal scale is achieved with 8 x SN2700 switches as spines and 32 x SN2700 as leafs, allowing up to 768 nodes with a single host link to each leaf switch: 32 racks, 24 servers per rack, at a 3:1 blocking ratio
    • 8 x 100GbE connections from each leaf to the spine switches, 32 x 100GbE per spine switch, via QSFP28 100GbE passive copper cables
    • A dedicated management port on each Mellanox switch is connected to a switch management network; we strongly recommend using out-of-band management
    • Ansible is used as the configuration automation tool
    • OSPF is used as the L3 routing protocol
      • Single OSPF area
      • Leaf switches redistribute their connected interfaces upstream
    • Software versions:
      • Ansible: 2.7
      • Cumulus Linux: 3.7.1
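The scale figures quoted above follow from simple port arithmetic; a quick sanity check using the stated numbers (32 leafs, 24 servers per rack, 8 uplinks per leaf, 32 ports per SN2700 spine):

```shell
# Sanity-check the scale numbers from the solution description above.
echo "nodes: $((32 * 24))"                         # 32 racks x 24 servers = 768
echo "blocking ratio: $((24 / 8)):1"               # 24 downlinks vs 8 uplinks per leaf
echo "spine ports: $((32 * 8)) used, $((8 * 32)) available"
```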

     

    Notes:

    • It is possible to change the blocking ratio in order to obtain a different capacity
    • The SN2100 switch shares the same feature set as the SN2700 and can be used in this solution when lower capacity is required
    • The SN2410 switch can be used as a leaf when 25GbE host-facing ports are required instead of 100GbE; see L3 Network Design with OSPF at Scale with Mellanox NEO (Draft) for scale details
    • A 2-leaf-per-rack solution for host HA can be used as well, at the expense of node scale (384 nodes instead of 768)

     

    Network Configuration

    Example Configuration

    Our example shows a multi-rack configuration in which two leaf switches connect to two spine switches. Each leaf switch is connected to every spine switch with a 100GbE cable.

     

    The following is a cross-switch port connectivity table of our example:

     

    Interface type   Spine switch / port       Leaf switch / port

    OSPF             Spine1-SWX-009 / swp11    Leaf1-SWX-011 / swp9
    OSPF             Spine1-SWX-009 / swp10    Leaf2-SWX-010 / swp9
    OSPF             Spine2-SWX-008 / swp11    Leaf1-SWX-011 / swp8
    OSPF             Spine2-SWX-008 / swp10    Leaf2-SWX-010 / swp8

     

     

     

    The procedure below describes how to configure an L3 OSPF fabric with Ansible playbooks.

     

    Prerequisites

    • One Ubuntu 16.04 physical/virtual server with a non-root sudo user, acting as the Ansible management server
    • Ansible release 2.3 and up (Cumulus Network Command Line Utility [NCLU] module support)
    • Mellanox Spectrum Ethernet switches with Cumulus Linux OS release 3.4 and up (NCLU support)
    • Default configuration and a valid license on the Cumulus switches

     

    Step 1: Install the Ansible Management Server

    Ansible for Ubuntu is provided through a PPA (Personal Package Archive) repository. Add the Ansible PPA by typing the following command:

    sudo apt-add-repository ppa:ansible/ansible

    Press ENTER to accept the PPA addition. Next, refresh the system's package index so that it is aware of the packages available in the PPA, then install the software:

    sudo apt-get update
    sudo apt-get install ansible

    We now have all of the software required to provision our switches with Ansible, which primarily uses SSH to communicate with the switches.

    Before starting, establish an SSH connection from the Ansible server to each of the switches to ensure they are accessible and to load their host keys into known_hosts:

    ssh cumulus@your_switch_ip

    The default password for the cumulus user is CumulusLinux!
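With many switches, interactive first-time logins become tedious. One common alternative (a sketch, using the management IPs that appear in the inventory in Step 2; substitute your own) is to collect the host keys non-interactively with ssh-keyscan:

```shell
# Pre-load SSH host keys for all switches to avoid interactive
# host-key prompts during playbook runs.
for ip in 10.7.215.55 10.7.215.56 10.7.215.68 10.7.215.85; do
    ssh-keyscan -H "$ip" >> ~/.ssh/known_hosts
done
```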

     

    Step 2: Configuring Ansible Hosts and Variables

    Ansible uses an inventory ("hosts") file to manage and provision devices. It can include explicit hosts, host groups, and groups of groups.

    In our case the file includes 3 groups:

    - a main group named "switches" with all switches as members, each carrying a variable named "ansible_host" that specifies the management IP used for SSH access

    - a group named "ospf-rids" with all switches as members, each carrying a variable named "rid" that specifies the OSPF router-id used during OSPF configuration

    - a group named "leafs" with the two leaf switches as members, used for OSPF configuration that must be applied only to the leaf switches in our topology

     

    Open the file for editing:

    sudo vim /etc/ansible/hosts

    Add the following:

    [switches]
    spine-swx-008 ansible_host=10.7.215.55
    spine-swx-009 ansible_host=10.7.215.56
    leaf-swx-010 ansible_host=10.7.215.68
    leaf-swx-011 ansible_host=10.7.215.85

    [ospf-rids]
    spine-swx-008 rid=2.2.2.8
    spine-swx-009 rid=1.1.1.9
    leaf-swx-010 rid=2.2.2.10
    leaf-swx-011 rid=1.1.1.11

    [leafs]
    leaf-swx-010
    leaf-swx-011
    Now we will create a common group-variables file for the "switches" group, holding the SSH credentials shared by all group members. Ansible will use this information when accessing the devices:

    sudo mkdir -p /etc/ansible/group_vars/switches

    sudo vim /etc/ansible/group_vars/switches/main.yml

    Add the following to the YAML file; note that "---" indicates the start of a YAML-formatted file:

    ---

     

    ansible_ssh_user: cumulus

    ansible_ssh_pass: CumulusLinux!
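With the inventory and group variables in place, reachability can be checked before running any playbooks using an ad-hoc call to Ansible's ping module, which logs in to each member of the "switches" group over SSH and reports "pong" on success:

```shell
# Ad-hoc connectivity check against all switches in the inventory.
ansible switches -m ping
```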

     

     

    Step 3: Using Ansible Playbook for Fabric Configuration

    Creating Roles

    We will create several roles to be used during playbook execution:

    - a role per device in the topology, to apply device-specific configuration

    - a general role to enable OSPF on all devices in the topology

    - a specific OSPF role to apply configuration relevant only to the leaf devices in our topology (redistribution of connected routes)

     

    "ansible-galaxy" tool will be used for initializing the roles and will create the following directory tree per role:

    <Role_Name>
    ├── defaults
    │   └── main.yml
    ├── files
    ├── handlers
    │   └── main.yml
    ├── meta
    │   └── main.yml
    ├── README.md
    ├── tasks
    │   └── main.yml
    ├── templates
    ├── tests
    │   ├── inventory
    │   └── test.yml
    └── vars
        └── main.yml

    Issue the following commands to create the roles directory and initialize the roles:

    sudo mkdir /etc/ansible/roles

    cd /etc/ansible/roles

    ansible-galaxy init spine1_rack1

    ansible-galaxy init spine2_rack2

    ansible-galaxy init leaf1_rack1

    ansible-galaxy init leaf2_rack2

    ansible-galaxy init ospf

    ansible-galaxy init ospf_leafs

     

    Creating Tasks

    We will now define the tasks for each role. The tasks are defined in the main.yml YAML file under the "tasks" directory of each role.

     

    Spine1 tasks:

    Edit /etc/ansible/roles/spine1_rack1/tasks/main.yml with the following tasks to set the hostname and IP interfaces:

    ---

    # Set Hostname
    - name: Set Hostname for spine1_rack1
      nclu:
        commands:
          - add hostname spine-swx-009
        atomic: true
        description: "Set Hostname"

    # Set Interfaces
    - name: Set IP interfaces for spine1_rack1
      nclu:
        commands:
          - add interface {{ item.swp_int }} ip address {{ item.ip_addr }}
        atomic: true
        description: "Add ip interfaces"
      with_items:
        - { swp_int: swp10, ip_addr: 192.168.109.9/24 }
        - { swp_int: swp11, ip_addr: 192.168.119.9/24 }

    Note: Cumulus NCLU commands must be followed by a "commit" statement to actually be applied on the switch as effective configuration. The "commit" statement can be replaced by "atomic", as used above.

    The "atomic" statement flushes anything in the pending commit buffer on the switch before executing the commands, ensuring that no other pending, manual, or misconfigured changes get committed during the playbook run.
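For comparison, the same task written with an explicit commit instead of atomic would look like the sketch below. The atomic form used throughout this guide is safer for unattended runs, since an explicit commit does not discard pre-existing pending changes:

```yaml
# Equivalent task using an explicit commit instead of "atomic: true".
# Unlike atomic, this commits whatever else is pending on the switch too.
- name: Set Hostname for spine1_rack1
  nclu:
    commands:
      - add hostname spine-swx-009
    commit: true
    description: "Set Hostname"
```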

     

    Spine2 tasks:

    Edit /etc/ansible/roles/spine2_rack2/tasks/main.yml with the following tasks to set the hostname and IP interfaces:

    ---

    # Set Hostname
    - name: Set Hostname for spine2_rack2
      nclu:
        commands:
          - add hostname spine-swx-008
        atomic: true
        description: "Set Hostname"

    # Set Interfaces
    - name: Set IP interfaces for spine2_rack2
      nclu:
        commands:
          - add interface {{ item.swp_int }} ip address {{ item.ip_addr }}
        atomic: true
        description: "Add ip interfaces"
      with_items:
        - { swp_int: swp10, ip_addr: 192.168.108.8/24 }
        - { swp_int: swp11, ip_addr: 192.168.118.8/24 }

    Leaf1 tasks:

    Edit /etc/ansible/roles/leaf1_rack1/tasks/main.yml with the following tasks to set the hostname and IP interfaces:

    ---

    # Set Hostname
    - name: Set Hostname for leaf1_rack1
      nclu:
        commands:
          - add hostname leaf-swx-011
        atomic: true
        description: "Set Hostname"

    # Set Interfaces
    - name: Set IP interfaces for leaf1_rack1
      nclu:
        commands:
          - add interface {{ item.swp_int }} ip address {{ item.ip_addr }}
        atomic: true
        description: "Add ip interfaces"
      with_items:
        - { swp_int: swp8, ip_addr: 192.168.118.11/24 }
        - { swp_int: swp9, ip_addr: 192.168.119.11/24 }

    Leaf2 tasks:

    Edit /etc/ansible/roles/leaf2_rack2/tasks/main.yml with the following tasks to set the hostname and IP interfaces:

    ---

    # Set Hostname
    - name: Set Hostname for leaf2_rack2
      nclu:
        commands:
          - add hostname leaf-swx-010
        atomic: true
        description: "Set Hostname"

    # Set Interfaces
    - name: Set IP interfaces for leaf2_rack2
      nclu:
        commands:
          - add interface {{ item.swp_int }} ip address {{ item.ip_addr }}
        atomic: true
        description: "Add ip interfaces"
      with_items:
        - { swp_int: swp8, ip_addr: 192.168.108.10/24 }
        - { swp_int: swp9, ip_addr: 192.168.109.10/24 }

    ospf tasks:

    Edit /etc/ansible/roles/ospf/tasks/main.yml with the following tasks to set the router-id and enable OSPF on the interfaces of all devices.

    Note: the "rid" value per device is taken from the hosts file.

    ---

    - name: Enable OSPF
      nclu:
        commands:
          - add ospf router-id {{ rid }}
          - add ospf network {{ item.prefix }} area {{ item.area }}
        atomic: true
        description: "Enable OSPF"
      with_items:
        - { prefix: 192.168.0.0/16, area: 0.0.0.0 }
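With the "rid" variable substituted from the hosts file, the task above renders, for spine-swx-009 (rid=1.1.1.9) for example, to the equivalent of these NCLU commands on the switch:

```
net add ospf router-id 1.1.1.9
net add ospf network 192.168.0.0/16 area 0.0.0.0
net commit
```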

    ospf_leafs tasks:

    Edit /etc/ansible/roles/ospf_leafs/tasks/main.yml with the following task to redistribute connected routes on the leaf devices:

    ---

    - name: Redistribute Connected Routes on Leaf Switches
      nclu:
        commands:
          - add ospf redistribute connected
        atomic: true
        description: "OSPF redistribute connected"

     

    Creating and Running the Playbook

    We will create the playbook YAML file that executes the tasks defined above.

    The playbook will run the following:

    1. Configure hostname and IP interfaces on every device (all members of "switches" hosts inventory group)

    2. Configure router-id and enable OSPF on all devices (all members of "ospf-rids" hosts inventory group)

    3. Configure redistribute connected routes on Leaf devices (all members of "leafs" hosts inventory group)

     

    Create a file named setup.yml under the /etc/ansible directory and open it for editing:

    sudo vim /etc/ansible/setup.yml

    Add the following:

    ---

    # Configure switches per role
    - name: configure spine1_rack1
      hosts: spine-swx-009
      gather_facts: no
      roles:
        - spine1_rack1

    - name: configure spine2_rack2
      hosts: spine-swx-008
      gather_facts: no
      roles:
        - spine2_rack2

    - name: configure leaf1_rack1
      hosts: leaf-swx-011
      gather_facts: no
      roles:
        - leaf1_rack1

    - name: configure leaf2_rack2
      hosts: leaf-swx-010
      gather_facts: no
      roles:
        - leaf2_rack2

    # Enable OSPF on all nodes
    - name: Enable OSPF
      hosts: ospf-rids
      gather_facts: no
      roles:
        - ospf

    # Redistribute Connected Routes on Leaf Switches
    - name: Redistribute Connected Routes on Leaf Switches
      hosts: leafs
      gather_facts: no
      roles:
        - ospf_leafs
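Before running against live switches, the playbook can optionally be validated first. --syntax-check only parses the YAML; --check performs a dry run (note that dry-run behavior depends on each module's check-mode support):

```shell
# Optional pre-flight validation, run from the /etc/ansible directory.
ansible-playbook -i hosts setup.yml --syntax-check
ansible-playbook -i hosts setup.yml --check
```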

    Issue the following command from the /etc/ansible directory to run the playbook:

    # ansible-playbook -i hosts setup.yml

    This is the playbook output while devices are being configured:


    PLAY [configure spine1_rack1] **************************************************************************************************************************************************************************************************************

     

    TASK [spine1_rack1 : Set Hostname for spine1_rack1] ****************************************************************************************************************************************************************************************

    changed: [spine-swx-009]

     

    TASK [spine1_rack1 : Set IP interfaces for spine1_rack1] ***********************************************************************************************************************************************************************************

    changed: [spine-swx-009] => (item={u'swp_int': u'swp10', u'ip_addr': u'192.168.109.9/24'})

    changed: [spine-swx-009] => (item={u'swp_int': u'swp11', u'ip_addr': u'192.168.119.9/24'})

     

    PLAY [configure spine2_rack2] **************************************************************************************************************************************************************************************************************

     

    TASK [spine2_rack2 : Set Hostname for spine2_rack2] ****************************************************************************************************************************************************************************************

    changed: [spine-swx-008]

     

    TASK [spine2_rack2 : Set IP interfaces for spine2_rack2] ***********************************************************************************************************************************************************************************

    changed: [spine-swx-008] => (item={u'swp_int': u'swp10', u'ip_addr': u'192.168.108.8/24'})

    changed: [spine-swx-008] => (item={u'swp_int': u'swp11', u'ip_addr': u'192.168.118.8/24'})

     

    PLAY [configure leaf1_rack1] ***************************************************************************************************************************************************************************************************************

     

    TASK [leaf1_rack1 : Set Hostname for leaf1_rack1] ******************************************************************************************************************************************************************************************

    changed: [leaf-swx-011]

     

    TASK [leaf1_rack1 : Set IP interfaces for leaf1_rack1] *************************************************************************************************************************************************************************************

    changed: [leaf-swx-011] => (item={u'swp_int': u'swp8', u'ip_addr': u'192.168.118.11/24'})

    changed: [leaf-swx-011] => (item={u'swp_int': u'swp9', u'ip_addr': u'192.168.119.11/24'})

     

    PLAY [configure leaf2_rack2] ***************************************************************************************************************************************************************************************************************

     

    TASK [leaf2_rack2 : Set Hostname for leaf2_rack2] ******************************************************************************************************************************************************************************************

    changed: [leaf-swx-010]

     

    TASK [leaf2_rack2 : Set IP interfaces for leaf2_rack2] *************************************************************************************************************************************************************************************

    changed: [leaf-swx-010] => (item={u'swp_int': u'swp8', u'ip_addr': u'192.168.108.10/24'})

    changed: [leaf-swx-010] => (item={u'swp_int': u'swp9', u'ip_addr': u'192.168.109.10/24'})

     

    PLAY [Enable OSPF] *************************************************************************************************************************************************************************************************************************

     

    TASK [ospf : Enable OSPF] ******************************************************************************************************************************************************************************************************************

    changed: [spine-swx-008] => (item={u'prefix': u'192.168.0.0/16', u'area': u'0.0.0.0'})

    changed: [spine-swx-009] => (item={u'prefix': u'192.168.0.0/16', u'area': u'0.0.0.0'})

    changed: [leaf-swx-011] => (item={u'prefix': u'192.168.0.0/16', u'area': u'0.0.0.0'})

    changed: [leaf-swx-010] => (item={u'prefix': u'192.168.0.0/16', u'area': u'0.0.0.0'})

     

    PLAY [Redistribute Connected Routes on Leaf Switches] **************************************************************************************************************************************************************************************

     

    TASK [ospf_leafs : Redistribute Connected Routes on Leaf Switches] *************************************************************************************************************************************************************************

    changed: [leaf-swx-011]

    changed: [leaf-swx-010]

     

    PLAY RECAP *********************************************************************************************************************************************************************************************************************************

    leaf-swx-010 : ok=4 changed=4 unreachable=0 failed=0

    leaf-swx-011 : ok=4 changed=4 unreachable=0 failed=0

    spine-swx-008 : ok=4 changed=4 unreachable=0 failed=0

    spine-swx-009 : ok=4 changed=4 unreachable=0 failed=0

    Make sure there were no errors during the run and that all tasks were executed with status "ok" or "changed".

    Once done, you can log in to the devices and verify that OSPF neighbors are established as expected:

    cumulus@spine-swx-009:~$ net show ospf neighbor

     

    Neighbor ID  Pri  State    Dead Time  Address         Interface            RXmtL RqstL DBsmL
    2.2.2.10       1  Full/DR  31.957s    192.168.109.10  swp10:192.168.109.9      0     0     0
    1.1.1.11       1  Full/DR  34.345s    192.168.119.11  swp11:192.168.119.9      0     0     0
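The same verification can also be automated across all switches. The sketch below is a hypothetical helper playbook (not part of the original guide) that runs "net show ospf neighbor" on every device and prints the output:

```yaml
# verify_ospf.yml -- collect the OSPF neighbor table from every switch.
---
- name: Verify OSPF neighbors
  hosts: ospf-rids
  gather_facts: no
  tasks:
    - name: Collect OSPF neighbor table
      command: net show ospf neighbor
      register: ospf_out

    - name: Print neighbor table
      debug:
        var: ospf_out.stdout_lines
```

Run it with "ansible-playbook -i hosts verify_ospf.yml" and check that every switch reports its expected neighbors in Full state.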