HowTo Configure InfiniBand Gateway HA (Proxy ARP)

Version 10

    This post shows how to configure two InfiniBand gateways in Proxy ARP High Availability (HA) unicast mode.

     

    Proxy ARP Modes

    The Proxy ARP can operate in two modes:

    • Unicast
    • Multicast

    Unicast mode cannot pass multicast traffic. This post covers High Availability for unicast traffic only.

     

    Scalability

    Proxy ARP distributes traffic across multiple gateways by balancing the load based on source and destination IP addresses (active-active). Optionally, you can change the mode to active-standby, in which one unit carries all traffic while the others sit idle in standby. Up to 16 gateways can be configured together in this manner. One is the group master and the others are slaves. This post shows HA with two gateways only.

     

    References

    • MLNX-OS User Manual

     

    Prerequisites

     

    1. Switch Licenses and System Capabilities: Before you start, make sure that the switches are equipped with a gateway license.

    Note: The SX6036G does not need a gateway license (it is embedded).

    If you wish to run 56GbE on the Ethernet side, you need to install the 56GbE license as well (see "40-56GbE Switch Upgrade License is Now Free").

    Gateway-A [standalone: master] (config) # show licenses

    License 1: <license key>

       Feature:          EFM_SX

       Description:      Generic SX license

       Valid:            yes

       Chassis serial number: <serial> (ok)

       Active:           yes

       Eth port SW speed limit: 56Gb

     

    License 2: <license key>

       Feature:          EFM_SX

       Description:      Generic SX license

       Valid:            yes

       Chassis serial number: <serial> (ok)

       Active:           yes

       Eth enabled:      true

       Full Eth L2 enabled: true

       Eth L3 enabled:   true

       GW ports number:  1

    Gateway-A [standalone: master] (config) #   

     

    Verify system capabilities:  Make sure that GW is supported.

    Gateway-A [standalone: master] (config) # show system capabilities

    IB: Supported

    Ethernet: Supported, Full L2, L3

    GW: Supported

    Max SM nodes: 648

    IB Max licensed speed: FDR

    Ethernet Max licensed speed: 56Gb

    Gateway-A [standalone: master] (config) # 
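    If you collect this output from several switches, the check can be scripted. The following is a minimal Python sketch, assuming the output has already been captured as text (for example over SSH). The helper names parse_capabilities and gateway_supported are hypothetical and not part of MLNX-OS.

```python
# Sample text copied from the "show system capabilities" output above.
SAMPLE = """\
IB: Supported
Ethernet: Supported, Full L2, L3
GW: Supported
Max SM nodes: 648
IB Max licensed speed: FDR
Ethernet Max licensed speed: 56Gb
"""


def parse_capabilities(text):
    """Turn 'key: value' lines into a dict (hypothetical helper)."""
    caps = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no 'key: value' shape
            caps[key.strip()] = value.strip()
    return caps


def gateway_supported(caps):
    """True when the GW capability line reports 'Supported'."""
    return caps.get("GW", "").startswith("Supported")


if __name__ == "__main__":
    caps = parse_capabilities(SAMPLE)
    print("GW supported:", gateway_supported(caps))
```

    The parser only relies on the "key: value" layout shown in this post, so other software versions may need adjustments.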

     

    2. Verify that the switch runs MLNX-OS software version 3.3.4402 or later.

    Gateway-A [standalone: master] (config) # show version

    Product name:      SX_PPC_M460EX

    Product release:   SX_3.4.0000

    Build ID:          #1-dev

    Build date:        2014-10-14 20:26:41

    Target arch:       ppc

    Target hw:         m460ex

    Built by:          jenkins@fit74

    Version summary:   SX_PPC_M460EX SX_3.4.0000 2014-10-14 20:26:41 ppc

     

     

    Product model:     ppc

    Host ID:           0002C9639E7A

     

     

    Uptime:            22h 35m 32.760s

    CPU load averages: 1.06 / 1.18 / 1.22

    Number of CPUs:    1

    System memory:     605 MB used / 1422 MB free / 2027 MB total

    Swap:              0 MB used / 0 MB free / 0 MB total

    Gateway-A [standalone: master] (config) #  

     

    3. Switch Systems CPU type: Both switches in a Proxy-ARP HA group must have the same CPU architecture (PPC or x86). It is recommended to use two switches of the same model.

    Use the show version command (see above) to verify that.

     

    4. Make sure that the system profile of the switch is vpi-single-switch

    Gateway-A [standalone: master] (config) # show system profile

    vpi-single-switch

    If it is not, you can change it with the "system profile" command.

    Note: Changing the system profile deletes all the existing switch configurations and reboots the system. Management interface configuration remains.

     

    5. IP Routing, IB SM and IGMP must be disabled on the switches.

    Gateway-A [standalone: master] (config) # no ip routing

    Gateway-A [standalone: master] (config) # no ip igmp snooping

    Gateway-A [standalone: master] (config) # no ib sm

    Note: These commands cannot be applied if ip proxy-arp is already enabled.

     

    6. Management network: All gateway systems must be on the same management subnet. The synchronization is done out-of-band over the Ethernet management network.

     

    7. Make sure that the switch has both port types (InfiniBand and Ethernet), each connected to its own network.

    For example, if all ports are configured as InfiniBand, you need to change the relevant port to Ethernet. In this example, port 1/1 is set to Ethernet.

    Gateway-A [standalone: master] (config) # show ports type

    InfiniBand: 1/1 1/2 1/3 1/4 1/5 1/6 1/7 1/8 1/9 1/10 1/11 1/12 1/13 1/14 1/15 1/16 1/17 1/18 1/19 1/20 1/21 1/22 1/23 1/24 1/25 1/26 1/27 1/28 1/29 1/30 1/31 1/32 1/33 1/34 1/35 1/36

    Gateway-A [standalone: master] (config) # port 1/1 type ethernet force

    Gateway-A [standalone: master] (config) # show ports type

    Ethernet:   1/1

    InfiniBand: 1/2 1/3 1/4 1/5 1/6 1/7 1/8 1/9 1/10 1/11 1/12 1/13 1/14 1/15 1/16 1/17 1/18 1/19 1/20 1/21 1/22 1/23 1/24 1/25 1/26 1/27 1/28 1/29 1/30 1/31 1/32 1/33 1/34 1/35 1/36

     

     

    High Availability Group and Election Method

    All the switches that participate in the Proxy-ARP group are joined under a Proxy-ARP group name. One of the nodes is elected as master and the others become slaves. Proxy-ARP HA uses a virtual IP address (VIP) that is always directed to the master node. Configuration is applied through the VIP to the master node, which in turn distributes it to the slave nodes. The MLNX-OS centralized VIP is defined when the first Proxy-ARP HA group member is added, using the command proxy-arp ha <group-name> ip <ip_addr> <mask> (shown below).

     

     

    Setup

    This setup has four switches:

    • Two SX6036 switches configured as gateways.
    • One SX6036 configured as an InfiniBand switch.
    • One SX1036 configured as an Ethernet switch.

     


     

     

    Configuration

    Two high availability gateways are shown in this configuration example: Gateway-A and Gateway-B.

     

    Plan your network:

    Management network (1G):

    • Gateway-A management address is: 10.7.13.121/24
    • Gateway-B management address is: 10.7.13.122/24
    • VIP management address is: 10.7.13.123/24

     

    Production network (40GbE + InfiniBand FDR):

    • Subnet: 11.11.11.0/24
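
    Because the VIP must sit in the same management subnet as the gateways, it can be worth checking the address plan before typing it into the switches. The following is a minimal Python sketch using the standard ipaddress module. The addresses are the ones from this post, and the helper names same_subnet and addresses_unique are hypothetical.

```python
import ipaddress

# Management address plan from this post.
MGMT = {
    "Gateway-A": "10.7.13.121/24",
    "Gateway-B": "10.7.13.122/24",
    "VIP": "10.7.13.123/24",
}


def same_subnet(addresses):
    """True when every address/prefix falls in one and the same network."""
    networks = {ipaddress.ip_interface(a).network for a in addresses.values()}
    return len(networks) == 1


def addresses_unique(addresses):
    """True when no two entries share the same host address."""
    hosts = [ipaddress.ip_interface(a).ip for a in addresses.values()]
    return len(hosts) == len(set(hosts))


if __name__ == "__main__":
    print("same subnet:", same_subnet(MGMT))
    print("unique addresses:", addresses_unique(MGMT))
```

    The same two checks can be applied to the production network addresses assigned to the gateways later in this post.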

     

    Configuration on the first gateway (Gateway-A):

    1. Set the hostname:

    switch [standalone: master] (config) # hostname Gateway-A

    Gateway-A [standalone: master] (config) #

     

    2. Create a VLAN and map it to the relevant Ethernet port. In this case, VLAN 10 is used and mapped to port 1/1 as untagged.

    Gateway-A [standalone: master] (config) # vlan 10

    Gateway-A [standalone: master] (config vlan 10)# exit

    Gateway-A [standalone: master] (config)# interface ethernet 1/1 switchport access vlan 10

     

    3. Configure/Verify the required MTU on all ports. Make sure that the InfiniBand MTU and the Ethernet MTU match. By default, the MTU is usually 1500 bytes on Ethernet subnets and 4K on InfiniBand.

    To set the MTU on the Ethernet port, run:

    Gateway-A [standalone: master] (config)# interface ethernet 1/1 mtu 4096 force

    Similarly, to set the MTU on the InfiniBand ports, run:

    Gateway-A [standalone: master] (config)# interface ib 1/2 mtu 4096 force

    Note: The default MTU for InfiniBand ports is 4096. In most cases the default configuration is in place and there is no need to change it.

     

    4. Enable IP Proxy ARP globally.

    Gateway-A [standalone: master]  (config)# ip proxy-arp

     

    5. Enable Proxy ARP HA.

    The configured VIP address needs to be in the management IP subnet (same as mgmt0).

    Gateway-A [standalone: master]  (config)# proxy-arp ha my-ha-group ip 10.7.13.123 /24

     

    Verify that the prompt has changed: the HA group name now appears in the brackets.

    Gateway-A [my-ha-group: master] (config) #

     

    Configuration on the second gateway (Gateway-B):

    1. Set the hostname:

    switch [standalone: master] (config) # hostname Gateway-B

    Gateway-B [standalone: master] (config) #

     

    2. Create a VLAN and map it to the relevant Ethernet port. In this case, VLAN 10 is used and mapped to port 1/1 as untagged.

    Gateway-B [standalone: master] (config) # vlan 10

    Gateway-B [standalone: master] (config vlan 10)# exit

    Gateway-B [standalone: master] (config)# interface ethernet 1/1 switchport access vlan 10

     

    3. Configure/Verify the required MTU on all ports. Make sure that the InfiniBand MTU and the Ethernet MTU match. By default, the MTU is usually 1500 bytes on Ethernet subnets and 4K on InfiniBand.

    To set the MTU on the Ethernet port, run:

    Gateway-B [standalone: master] (config)# interface ethernet 1/1 mtu 4096 force

    Similarly, to set the MTU on the InfiniBand ports, run:

    Gateway-B [standalone: master] (config)# interface ib 1/2 mtu 4096 force

     

    4. Enable IP Proxy ARP globally.

    Gateway-B [standalone: master]  (config)# ip proxy-arp

     

    5. Enable Proxy ARP HA.

    In this case, there is no need to mention the VIP address, but only the group name.

    Note: The group name must be identical to the group name of Gateway-A.

    Gateway-B [standalone: master]  (config)# proxy-arp ha my-ha-group

     

    -----

    Once you reach this point, the gateways should start to synchronize (exchange control frames) and elect a master.

    Once a master is elected, the prompt changes.

     

    For example, Gateway-A is elected as the master.

     

    Gateway-A [my-ha-group: master] (config) #

     

    Gateway-B will be the standby (regarding management)

    Gateway-B [my-ha-group: standby] (config) #

     

    In this case, Gateway-A owns the VIP and responds to SSH connections to it.

     

    Gateway HA VIP Configuration

    Connect with SSH to the VIP (10.7.13.123 in this example) - The master Gateway will respond to this (in our case Gateway-A).

    Note: The commands below work only when you are connected to the VIP address. Do not confuse this VIP address with the VIP that is sometimes configured for SM HA (which would not be configured on a gateway, because the SM must be disabled).

     

    The production network (40GbE + InfiniBand FDR) in this example uses the subnet 11.11.11.0/24.

    This means that hosts on both the InfiniBand side and the Ethernet side of the network have interfaces within this subnet.

     

    The following configuration is done while connected with SSH to the VIP address.

    1. Create a Proxy ARP interface and assign it the proper VLAN and pkey (in this example, VLAN 10 and pkey 0x7fff).

     

    Gateway-A [my-ha-group: master] (config) # interface proxy-arp 1

    Gateway-A [my-ha-group: master]  (config interface proxy-arp 1) # ip vlan 10

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # ip pkey 0x7fff

     

    2. Assign group members (gateways) to this group, and assign each one an IP interface in the production network. Each gateway (member of the HA cluster) should have an IP address that belongs to the production network.

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # ha member Gateway-A ip address 11.11.11.21

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # ha member Gateway-B ip address 11.11.11.22

     

    3. Set the netmask of this network.

     

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # ip netmask /24

     

    4. Set the Proxy ARP MTU. The MTU should be aligned with the Ethernet and IB interfaces. Note that Proxy ARP MTU is limited to 4092 bytes.

     

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # mtu 4092
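
    One way to read "aligned" is to take the smallest of the Ethernet MTU, the InfiniBand MTU and the 4092-byte Proxy ARP limit. The following Python sketch illustrates that reading only. The rule itself is an assumption made for this example, not something stated in the MLNX-OS documentation, and pick_proxy_arp_mtu is a hypothetical helper.

```python
# Proxy ARP MTU limit stated in this post.
PROXY_ARP_MTU_LIMIT = 4092


def pick_proxy_arp_mtu(eth_mtu, ib_mtu, limit=PROXY_ARP_MTU_LIMIT):
    """Assumed rule: use the smallest of the two port MTUs and the limit."""
    if eth_mtu <= 0 or ib_mtu <= 0:
        raise ValueError("MTU values must be positive")
    return min(eth_mtu, ib_mtu, limit)


if __name__ == "__main__":
    # Both ports set to 4096, as configured earlier in this post.
    print(pick_proxy_arp_mtu(4096, 4096))
    # Ethernet port left at a 1500-byte MTU.
    print(pick_proxy_arp_mtu(1500, 4096))
```

    With both ports at 4096 the sketch yields 4092, which matches the value used in this post.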

     

     

    5. (Optional) If several subnets are configured on the network, set the default route. If all hosts are in the same subnet, there is no need to configure a default route on this proxy-arp interface.

     

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # ip route default 11.11.11.254

     

    6. Enable the Proxy ARP interface

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) # no shutdown

     

    7. (Optional) Set the HA load-balancing algorithm.

     

    The options are:

    • Active - Standby (active-standby) = One Proxy-ARP system carries all the traffic, and the other systems are on standby.
    • Active - Active (ib-base-ip) = Traffic is distributed between the active systems. Load is distributed based on source and destination server IP addresses (default).
    Gateway-A [my-ha-group: master] (config) # proxy-arp ha lb-algorithm ib-base-ip
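
    To build intuition for what "distributed based on source and destination IP addresses" means, here is a toy Python sketch of a hash-based mapping from an IP pair to a group member. It is not the actual ib-base-ip algorithm, which this post does not describe. The CRC32 hash and the pick_member helper are assumptions chosen for illustration.

```python
import ipaddress
import zlib


def pick_member(ib_ip, eth_ip, members):
    """Map an (IB host, Ethernet host) pair to one member of the group.

    Toy model only: hashes the two packed addresses with CRC32 and takes
    the result modulo the number of members.
    """
    if not members:
        raise ValueError("at least one member is required")
    key = ipaddress.ip_address(ib_ip).packed + ipaddress.ip_address(eth_ip).packed
    return members[zlib.crc32(key) % len(members)]


if __name__ == "__main__":
    members = ["Gateway-A", "Gateway-B"]
    # The same pair always maps to the same member.
    print(pick_member("11.11.11.1", "11.11.11.2", members))
```

    The point of the sketch is that a given host pair always lands on the same gateway while different pairs spread across the group. On a real system, use the show interfaces proxy-arp 1 ha designated-member command shown later in this post to see which gateway actually carries a given pair.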

     

    Verification

     

    1. Make sure that all relevant ports (Ethernet and InfiniBand) are up and running in the physical layer and L2 (Up state).

    Gateway-A [my-ha-group: master] (config) # show interfaces ethernet status

     

     

    Port                   Operational state           Speed                  Negotiation

    ----                   -----------------           -----                  -----------

    Eth1/1                 Up                          40 Gbps             No-Negotiation

    Gateway-A [my-ha-group: master] (config) #        

    Gateway-A [my-ha-group: master] (config) # show interfaces ib status

     

    Interface      Description                                Speed                   Current line rate   Logical port state   Physical port state

    ---------      -----------                                ---------               -----------------   ------------------   -------------------

    Ib 1/2                                                    14.0 Gbps rate          56.0 Gbps           Active               LinkUp

    ...

    Gateway-A [my-ha-group: master] (config) #

     

     

    You can use other commands such as:

    switch (config) # show interfaces ethernet 1/1

    switch (config) # show interfaces ethernet 1/1 counters

    switch (config) # show interfaces ib 1/2

     

    2. Make sure SM is enabled on the InfiniBand fabric.

     

    To check whether OpenSM is running, run the following on the server where the SM is enabled:

    # /etc/init.d/opensmd status
    OpenSM is running... pid=3258

    To start OpenSM run:

    # /etc/init.d/opensmd start

    Refer to the MLNX_OFED or OpenSM manuals for more SM-related configuration.

     

    3. Show Proxy ARP HA.

     

    Important: Run the following commands while connected to the VIP:

     

    This command shows a summary of the HA status.

    Gateway-A [my-ha-group: master] (config) # show proxy-arp ha

     

    Load balancing algorithm: ib-base-ip

    Number of Proxy-Arp interfaces: 1

     

    Proxy-ARP VIP

    =============

    Pra-group name: my-ha-group

    HA VIP address: 10.7.13.123/24

     

    Active nodes:

    ID                   State                IP

    -----------------------------------------------------

    Gateway-A            master               10.7.13.121

    Gateway-B            standby              10.7.13.122

     

    Gateway-A [my-ha-group: master] (config) # 
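
    For monitoring, the "Active nodes" table can be parsed to confirm that exactly one master exists. The following is a minimal Python sketch, assuming the output above has been captured as text. The parse_active_nodes helper is hypothetical and depends on the three-column layout shown in this post.

```python
# Sample text copied from the "show proxy-arp ha" output above.
SAMPLE = """\
Active nodes:
ID                   State                IP
-----------------------------------------------------
Gateway-A            master               10.7.13.121
Gateway-B            standby              10.7.13.122
"""


def parse_active_nodes(text):
    """Return the rows of the 'Active nodes' table as dicts."""
    nodes = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:1] == ["ID"]:  # header row marks the start of the table
            in_table = True
            continue
        # Skip anything before the table, blank lines and the dashed rule.
        if not in_table or not fields or set(line.strip()) == {"-"}:
            continue
        if len(fields) == 3:
            nodes.append({"id": fields[0], "state": fields[1], "ip": fields[2]})
    return nodes


if __name__ == "__main__":
    nodes = parse_active_nodes(SAMPLE)
    masters = [n["id"] for n in nodes if n["state"] == "master"]
    print("masters:", masters)
```

    A healthy two-gateway group should report one master and one standby node.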

     

    4. Show Proxy ARP interface status:

     

    Gateway-A [my-ha-group: master] (config) # show interfaces proxy-arp 1

    Proxy-arp 1

      Admin state: Enabled

      Operational state: Up

      GUID: 00:02:C9:03:00:AD:31:48

      Internet Address: 11.11.11.21/24

      Broadcast Address: 11.11.11.255

      Description: N/A

      MTU: 4092

      Counters: Disabled

      Bridged interfaces: vlan 10, pkey 0x7fff

     

    Gateway-A [my-ha-group: master] (config) #  

     

    Another way to get the same information is the following command (applicable only while connected to the VIP):

    Gateway-A [my-ha-group: master] (config) # show interfaces proxy-arp 1 ha member Gateway-A

    Proxy-arp 1 member Gateway-A

      Admin state: Enabled

      Operational state: Up

      GUID: 00:02:C9:03:00:AD:31:48

      Internet Address: 11.11.11.21/24

      Broadcast Address: 11.11.11.255

      Description: N/A

      MTU: 4092

      Counters: Disabled

      Bridged interfaces: vlan 10, pkey 0x7fff

     

    Gateway-A [my-ha-group: master] (config) #   

     

    Show specific Proxy ARP member interface status. For example, the interface on Gateway-B  (applicable only while connected to the VIP).

    Gateway-A [my-ha-group: master] (config) # show interfaces proxy-arp 1 ha member Gateway-B

    Proxy-arp 1 member Gateway-B

      Admin state: Enabled

      Operational state: Up

      GUID: 00:02:C9:03:00:7E:55:A8

      Internet Address: 11.11.11.22/24

      Broadcast Address: 11.11.11.255

      Description: N/A

      MTU: 4092

      Counters: Disabled

      Bridged interfaces: vlan 10, pkey 0x7fff

     

    5. Get the detailed configuration and status of the Proxy ARP interface.

     

    Important: Run the following commands while connected to the VIP.

     

    Gateway-A [my-ha-group: master] (config) # show interfaces proxy-arp 1 ha detail

    Proxy-arp 1

      Table update interval: 120 seconds

      PRA table fast learn time: 1 second

      Master election learning interval: 30 seconds

      Advertisement interval: 1 second

      PRA table learn time: 30 seconds

      Keep Alive loss threshold: 10

      Keep Alive loss interval: 120 seconds

      Host list differential update interval: 3 seconds

      Host list update interval: 300 seconds

      Load balancing algorithm: ib-base-ip

      IP masklen: 24

      Admin state: Enabled

      MTU: 4092

      Counters: Disabled

      Bridged interfaces: vlan 10, pkey 0x7fff

      Number of members: 2

     

      Member          Admin State     LB State        Operational State    IP                   Priority

      --------------------------------------------------------------------------------------------------

      Gateway-A       Enabled         Active          Up                   11.11.11.21          100

      Gateway-B*      Enabled         Active          Up                   11.11.11.22          100

    Gateway-A [my-ha-group: master] (config) #                              

     

    6. Verify connectivity on the production network via ICMP (ping) from one host to another.

     

    7. Check the ARP table.

    Note: Run this command separately on each gateway (from the gateway's own IP address and not from the VIP).

    For example, from Gateway-B.

    Gateway-B [my-ha-group: standby] (config) # show ip arp interface proxy-arp 1

     

    Total number of entries: 2

     

     

      Address              Type            Hardware Address          Interface

      ------------------------------------------------------------------------

      11.11.11.1           Dynamic ETH     00:02:C9:45:5F:E0         proxy-arp 1

    Gateway-B [my-ha-group: standby] (config) #

     

    8. Check which Gateway is passing traffic between two hosts (two IP addresses).

    Gateway-A [my-ha-group: master] (config) # show interfaces proxy-arp 1 ha designated-member 11.11.11.1 11.11.11.2

    Proxy-arp 1 Ethernet IP 11.11.11.2 IB IP 11.11.11.1 member Gateway-B

    Gateway-A [my-ha-group: master] (config) # 

     

    9. Counters can be enabled on the proxy-arp interface. They are disabled by default.

    It is recommended to enable counters only when troubleshooting is needed.

     

    Gateway-A [my-ha-group: master] (config) # interface proxy-arp 1 counters

     

    Gateway-A [my-ha-group: master] (config) #  show interfaces proxy-arp 1 ha member Gateway-A

    Proxy-arp 1 member Gateway-A

      Admin state: Enabled

      Operational state: Down

      GUID: 00:02:C9:03:00:AD:31:48

      Internet Address: 11.11.11.21/24

      Broadcast Address: 11.11.11.255

      Description: N/A

      MTU: 4092

      Counters: Enabled

      Bridged interfaces: vlan 10, pkey 0x7fff

     

    Vlan counters

    -------------

    RX

      0                    Unicast packets

      0                    Multicast packets

      0                    Unicast bytes

      0                    Multicast bytes

      0                    Bad packets

      0                    Bad bytes

    TX

      0                    Unicast packets

      0                    Multicast packets

      0                    Unicast bytes

      0                    Multicast bytes

     

    Pkey counters

    -------------

    RX

      0                    Unicast packets

      0                    Multicast packets

      0                    Unicast bytes

      0                    Multicast bytes

      0                    Bad packets

      0                    Bad bytes

    TX

      0                    Unicast packets

      0                    Multicast packets

      0                    Unicast bytes

      0                    Multicast bytes

     

    Gateway-A [my-ha-group: master] (config interface proxy-arp 1) #