Understanding Subnet Manager (SM) High Availability (HA) on Mellanox InfiniBand Switches

Version 5

    This post explains the InfiniBand SM High Availability (HA) synchronization functionality on Mellanox InfiniBand switches.

     

    Overview

     

    High Availability in InfiniBand

    In InfiniBand, only one SM manages a subnet at any given time. However, multiple SMs can be enabled on the same subnet. In that case, one of the SMs is elected as the master SM and the rest remain on standby (operationally disabled). If the master SM fails for any reason, another SM is elected to manage the network.

     

    Could there be an issue with this?

    The SM configuration files may not be in sync. For example, assume that two IB nodes (A and B) have the SM enabled, and that the user configures an SM parameter on node A but not on node B. If the SM running on node A fails, the newly elected SM on node B will not have that configuration, and the network may not operate as before.
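
    The drift described above can be illustrated with a minimal sketch that compares the SM parameters of two nodes. The parameter names and values here are hypothetical and are not taken from MLNX-OS; the point is only to show how one missing or differing setting changes behavior after a failover.

```python
def config_drift(node_a: dict, node_b: dict) -> dict:
    """Return {param: (value_on_a, value_on_b)} for every parameter that differs."""
    keys = node_a.keys() | node_b.keys()
    return {
        k: (node_a.get(k), node_b.get(k))
        for k in sorted(keys)
        if node_a.get(k) != node_b.get(k)
    }

# Hypothetical parameter names and values, for illustration only.
sm_on_a = {"sm_priority": 5, "routing_engine": "ftree", "sweep_interval": 10}
sm_on_b = {"sm_priority": 5, "routing_engine": "minhop"}

drift = config_drift(sm_on_a, sm_on_b)
for name, (a, b) in drift.items():
    # A value of None means the parameter was never set on that node.
    print(f"{name}: A={a!r} B={b!r}")
```

    Any entry reported here is a setting that would silently change if the SM on node B took over.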

     

    Mellanox SM HA Solution (Mellanox InfiniBand Switches)

    • When SM HA (configuration synchronization) is enabled on Mellanox IB switches, the SM database is synchronized across all switches on which the SM is enabled.
    • The synchronization is done out-of-band over an Ethernet management network. All switches participating in the SM HA should be connected to the same management subnet (same network), with no router between them. This is because the switches send multicast control frames, which routers normally do not forward.
    • All switches that participate in the Mellanox SM HA join the same InfiniBand subnet ID. Once they have joined, the synchronized SMs are launched. One of the nodes is elected as SM Master and the others are Slaves.
    • The SM HA allows the system administrator to enter and modify the entire InfiniBand SM configuration of the different subnet managers from a single location using a Virtual IP (VIP). All subnet managers can be controlled, started, or stopped from this VIP address. The user is expected to use the VIP address for SM configuration. Configuring SM parameters through the individual IP address of a master or slave is disabled.

     

    Setup

    • InfiniBand network with several switches (at least two). The SM HA will be enabled on the switches. To test the feature, a minimum setup of two switches connected together suffices.
    • All switches participating in the SM HA should have the same CPU type (either all x86 or all PPC).
    • All switches should have the same MLNX-OS version.
    • All switches participating in the SM HA should be connected to the same management subnet (same network) without the need to pass through a router.

     

    For this post, two Mellanox SX6036 FDR switches (36 ports, 56Gb/s each), sx21 and sx22, are used. They are connected to each other on ports 1/1 and 1/2.

     

    Planning

    The plan is to enable SM HA on both switches.

    A Virtual IP address for the SM HA must be allocated from the management network.

    In this example:

    Switch / SM cluster name    mgmt0 IP address
    sx21                        10.20.2.21/16
    sx22                        10.20.2.22/16
    my-sm-cluster               10.20.2.160/16
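
    Before configuring the switches, the addressing plan can be sanity-checked against the requirement that all nodes and the VIP share one management network. The following is a minimal sketch using Python's standard ipaddress module; the function name check_mgmt_plan is an assumption made for this example and is not part of any Mellanox tool.

```python
import ipaddress

def check_mgmt_plan(nodes: dict, vip: str) -> list:
    """Return a list of problems found in the addressing plan (empty if none)."""
    problems = []
    vip_if = ipaddress.ip_interface(vip)
    for name, addr in nodes.items():
        iface = ipaddress.ip_interface(addr)
        # Every switch must sit in the same IP network as the VIP.
        if iface.network != vip_if.network:
            problems.append(f"{name} ({addr}) is not in the VIP network {vip_if.network}")
        # The VIP must not collide with a switch's own mgmt0 address.
        if iface.ip == vip_if.ip:
            problems.append(f"{name} uses the same address as the VIP")
    return problems

# Addresses from the planning table above.
plan = {"sx21": "10.20.2.21/16", "sx22": "10.20.2.22/16"}
problems = check_mgmt_plan(plan, "10.20.2.160/16")
print(problems or "addressing plan looks consistent")
```

    Note that this only checks the IP plan. It cannot detect a router placed between two switches that happen to share a prefix, so the physical topology still needs to be verified separately.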

     

    Configuration

    1. On the first switch (sx21), create the SM HA cluster using the planned cluster name and Virtual IP.

    sx21 [standalone: master] (config) # ib ha my-sm-cluster ip 10.20.2.160 /16

    sx21 [my-sm-cluster: master] (config) #

     

    2. Add the second switch (sx22) to the cluster. Specify only the cluster name, which must match the name used in step 1.

    sx22 [standalone: master] (config) # ib ha my-sm-cluster

    sx22 [my-sm-cluster: standby] (config) #

     

    3. Enable the SM on both switches. These commands can be run only from the master.

    sx21 [my-sm-cluster: master] (config) # ib smnode sx21 enable

    sx21 [my-sm-cluster: master] (config) # ib smnode sx22 enable

     

    4. (Optional) Specify the SM priority (range: 0-15; a higher number means a higher priority) to control the order in which the SMs are elected. These commands can be run only from the master.

    sx21 [my-sm-cluster: master] (config) # ib smnode sx21 sm-priority 1

    sx21 [my-sm-cluster: master] (config) # ib smnode sx22 sm-priority 2
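
    To show how the priority values influence the outcome, here is a minimal sketch of the InfiniBand master SM election rule: the highest priority wins, and a tie is broken in favor of the lowest port GUID. The GUID values are hypothetical. The sketch models only the SM election and says nothing about the SM-HA cluster role reported by "show ib ha".

```python
def elect_master(sms):
    """Pick the SM with the highest priority; break ties with the lowest GUID."""
    return min(sms, key=lambda sm: (-sm["priority"], sm["guid"]))

# Priorities from step 4; the GUIDs are made up for illustration.
candidates = [
    {"name": "sx21", "priority": 1, "guid": 0x0002C90300A1B2C0},
    {"name": "sx22", "priority": 2, "guid": 0x0002C90300A1B2D0},
]

print(elect_master(candidates)["name"])
```

    With the priorities configured above, sx22 would be preferred as master SM. If both switches had the same priority, the one with the lower GUID would be chosen.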

     

     

    Verification

    1. Check the IB HA status.

    sx21 [my-sm-cluster: master] (config) # show ib ha

     

    Global HA state

    ==================

    IB Subnet HA name: my-sm-cluster

    HA IP address:     10.20.2.160/16

    Active HA nodes:   2

     

    HA node local information

      Name:         sx21 (active)  <--- (local node)

      SM-HA state:  master

      IP:           10.20.2.21

      Virtual switch membership:    infiniband-default

     

    HA node local information

      Name:         sx22 (active)

      SM-HA state:  standby

      IP:           10.20.2.22

      Virtual switch membership:    infiniband-default

     

     

    For a brief summary of the HA status:

    sx21 [my-sm-cluster: master] (config) # show ib ha brief

     

    Global HA state

    ==================

    IB Subnet HA name: my-sm-cluster

    HA IP address:     10.20.2.160/16

    Active HA nodes:   2

     

    ID            SM-HA state   IP              Virtual switch membership
    --------------------------------------------------------------------------------
    *sx21         master        10.20.2.21      infiniband-default
    sx22          standby       10.20.2.22      infiniband-default

     

    2. Show IB SM nodes status (per switch).

    sx21 [my-sm-cluster: master] (config) # show ib smnodes

     

    HA state of switch infiniband-default

    ========================================

    IB Subnet HA name: my-sm-cluster

    HA IP address:     10.20.2.160/16

    Active HA nodes:   2

     

    HA node local information

      Name:         sx21 (active)  <--- (local node)

      SM-HA state:  master

      SM Licensed:  yes

      SM Running:   running

      SM Enabled:   enabled

      SM Priority:  1

      IP:           10.20.2.21

     

    HA node local information

      Name:         sx22 (active)

      SM-HA state:  standby

      SM Licensed:  yes

      SM Running:   running

      SM Enabled:   enabled

      SM Priority:  2

      IP:           10.20.2.22
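
    When this check is repeated across many switches, the output can be verified by a script instead of by eye. The following is a minimal sketch that parses text in the layout shown above and flags nodes whose SM is not running or not enabled. It assumes the "key: value" layout of this example and that the text has already been collected from the switch (for instance over SSH, which is not shown); the layout may differ between MLNX-OS versions.

```python
SAMPLE = """
HA state of switch infiniband-default
IB Subnet HA name: my-sm-cluster
HA IP address:     10.20.2.160/16
Active HA nodes:   2

HA node local information
  Name:         sx21 (active)  <--- (local node)
  SM-HA state:  master
  SM Running:   running
  SM Enabled:   enabled
  SM Priority:  1
  IP:           10.20.2.21

HA node local information
  Name:         sx22 (active)
  SM-HA state:  standby
  SM Running:   running
  SM Enabled:   enabled
  SM Priority:  2
  IP:           10.20.2.22
"""

def parse_smnodes(text):
    """Turn the per-node 'key: value' lines into one dict per node."""
    nodes, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank lines, rulers, and section titles
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            # A 'Name' line starts a new node; keep only the hostname.
            current = {"Name": value.split()[0]}
            nodes.append(current)
        elif current is not None:
            current[key] = value
    return nodes

nodes = parse_smnodes(SAMPLE)
not_ready = [
    n["Name"] for n in nodes
    if n.get("SM Running") != "running" or n.get("SM Enabled") != "enabled"
]
print("nodes:", [n["Name"] for n in nodes], "not ready:", not_ready)
```

    Lines that appear before the first "Name" entry, such as the global HA state, are ignored by this sketch.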

     

     

    MLNX-OS WebUI

    In the WebUI, use the VIP address to change the SM configuration.

     

    1. Log in to 10.20.2.160 (the VIP address).

     

    2. Go to System > HA to configure and change the HA cluster name and VIP.

     

     

    3. Go to IB SM Mgmt > Base SM to change the SM nodes parameters.

     
