HowTo Configure IPoIB Networks with Gateway and Multiple PKEYs

Version 11

    This post explains how to create multiple isolated IPoIB networks that span InfiniBand and Ethernet through an SX6036G gateway. Each network is isolated by an InfiniBand PKEY on the InfiniBand side and a matching VLAN on the Ethernet side.

     

    Setup

    The setup discussed in this post consists of three networks:

    1. Default network (management): PKEY 0x7FFF, VLAN 1. The subnet is 11.11.11.0/24.

    2. Web network: PKEY 0x10, VLAN 10. The subnet is 20.20.1.0/24.

    3. Storage network: PKEY 0x20, VLAN 20. The subnet is 20.20.2.0/24.

     

    In this example, four servers are installed as shown in the figure below:

     

    [Figure: setup diagram]

    In this setup, the default network connects Server 3, Server 4 and both VMs, the web network connects Server 1 and VM 1, and the storage network connects Server 2 and VM 2.

     

    Prerequisites

    1. Install any supported Linux OS on all 4 servers.

    2. Install the latest MLNX_OFED on all servers.

    3. Install KVM Hypervisor on Server 3.

    4. Create two VMs on Server 3: VM1 with any supported Linux OS, and VM2 with Windows OS.

    5. Install the latest MLNX_OFED on VM1 and the latest WinOF on VM2.

    6. Make sure that the Subnet Manager (SM) is running on a Linux server (Server 4 in this example). The SM cannot run on the switch when the switch is configured as a gateway.

     

    Configuration

     

    1. Configure the partitions on Server 4 (the SM server) by creating or editing the file partitions.conf as follows:

     

    # cat /etc/opensm/partitions.conf

    Default=0xffff,ipoib: ALL=full;    # index 0

    Web=0x8010,ipoib: ALL=full;        # index 1

    Storage=0x8020,ipoib: ALL=full;    # index 2

    Note: You may choose to limit the use of a given PKey to specific GUIDs. For example:

     

    Part2=0x8020,ipoib, defmember=full : SELF, 0x0002c90300ea67d1, 0x0002C9030073D2A8;

     

    • Make sure to include only GUIDs of physical ports

    • Be sure to include the GUID of the Proxy-ARP (which will be defined below)

     

    The MLNX_OFED for Linux User Manual further explains the use of the partitions configuration file.
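The PKEY values in partitions.conf differ from the ones listed in the Setup section (for example 0x10 versus 0x8010) because the most significant bit of the 16-bit PKEY is the membership bit, and a set bit means full membership. As a minimal sketch, the shell functions below derive the three partition definitions from the base PKEY values. The names `pkey_full` and `partition_line` are hypothetical helpers introduced here for illustration; they are not part of OpenSM or MLNX_OFED.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OpenSM): build partitions.conf lines.

# Set the full-membership bit (0x8000) on a base PKEY, e.g. 0x10 -> 0x8010.
pkey_full() {
    printf '0x%04x\n' $(( $1 | 0x8000 ))
}

# Print one partition definition in the same form used in this post:
#   <name>=<full-membership pkey>,ipoib: ALL=full;
partition_line() {
    printf '%s=%s,ipoib: ALL=full;\n' "$1" "$(pkey_full "$2")"
}

# The three networks of this setup, in PKEY index order 0, 1, 2.
partition_line Default 0x7fff
partition_line Web 0x10
partition_line Storage 0x20
```

Running the sketch prints the same three definitions as the file above, without the trailing index comments. The order of the lines matters, because later steps refer to each PKEY by its index.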

     

     

    2. Restart OpenSM service:

     

    # service opensmd restart

     

     

    3. Enable the relevant PKEYs for the VMs. On the KVM hypervisor (Server 3) configure the virtual-to-physical PKEY mapping for each VM.

    Note: The value written by each echo command (0, 1 or 2) is the index of the PKEY in the partitions.conf file. The file name under pkey_idx/ (0 or 1) is the PKEY index as seen by the VM.

     

    # cd /sys/class/infiniband/mlx4_0/iov/

    # echo 1 > 0000:02:00.1/ports/1/pkey_idx/0    (VM1 index 0 - using PKey=0x8010, PKey index 1)

    # echo 0 > 0000:02:00.1/ports/1/pkey_idx/1    (VM1 index 1 - using default PKey=0xffff, PKey index 0)

    # echo 2 > 0000:02:00.2/ports/1/pkey_idx/0    (VM2 index 0 - using PKey=0x8020, PKey index 2)

    # echo 0 > 0000:02:00.2/ports/1/pkey_idx/1    (VM2 index 1 - using default PKey=0xffff, PKey index 0)

    IPoIB QPs are created with the PKey found at index 0. As a result, the IPoIB QPs of the hypervisor, VM1 and VM2 all use different PKeys.
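Instead of counting lines in partitions.conf, the index of a PKEY can also be looked up in the port's PKEY table in sysfs on the hypervisor. The following is a minimal sketch, assuming that each file under the pkeys/ directory is named by its index and holds one hexadecimal PKEY value. `find_pkey_index` is a hypothetical helper name introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the index of a PKEY within a pkeys/ directory.
# Assumption: files are named 0, 1, 2, ... and each holds one value such as 0x8010.
find_pkey_index() {
    dir=$1
    want=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        val=$(cat "$f")
        # Compare numerically so that 0x8010 and 0X8010 are treated alike.
        if [ $(( $val )) -eq $(( $want )) ]; then
            basename "$f"
            return 0
        fi
    done
    return 1    # PKEY not present in the table
}
```

On the hypervisor of this setup, `find_pkey_index /sys/class/infiniband/mlx4_0/ports/1/pkeys 0x8010` would be expected to print the index to write for VM1. Treat the result as a cross-check of the order in partitions.conf rather than a replacement for it.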

     

    4. Verify that all configured PKEYs are available. This can be done on each of the servers and VMs, but the results may vary depending on each server's configuration.

    # cat /sys/class/infiniband/mlx4_0/ports/1/pkeys/* |grep -v 0000

    0xffff

    0x8010

    0x8020

     

    5. Create interfaces for all non-default PKEY networks:

     

        a. Add the interface for PKEY 0x10 on VM1 (Linux):

    # echo 0x10 > /sys/class/net/ib0/create_child

              This creates an interface named "ib0.8010".

     

        b. Add the interface for PKEY 0x20 on VM2 (Windows):

    C:\Users\Administrator> part_man.exe add "Ethernet 3" iPart2 8020

    Done...

    C:\Users\Administrator> part_man.exe show

    Ethernet 7 iPart2  8020

    Done...

             "Ethernet 7" interface is added to "Network Connections".

     
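For the Linux case in step 5a, the child interface name is the parent name followed by the full-membership PKEY in four hex digits, which is why writing 0x10 yields "ib0.8010". A minimal sketch of that naming rule follows; `ipoib_child_name` is a hypothetical helper, not an MLNX_OFED tool.

```shell
#!/bin/sh
# Hypothetical helper: predict the IPoIB child interface name for a PKEY.
# The full-membership bit (0x8000) is set before the name is formed.
ipoib_child_name() {
    parent=$1
    pkey=$2
    printf '%s.%04x\n' "$parent" $(( $pkey | 0x8000 ))
}

ipoib_child_name ib0 0x10
```

This is handy in scripts that create the child interface and then need its name for the address assignment in the next step.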

    6. Update the IP address and subnet mask for all the interfaces:

     

         a. For VM1 (Linux):

    # ifconfig ib0 11.11.11.1/24

    # ifconfig ib0.8010 20.20.1.1/24

             Verify the configuration:

    # ifconfig

    ...

    ib0       Link encap:InfiniBand  HWaddr A0:00:0A:19:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00

              inet addr:11.11.11.1  Bcast:11.11.11.255  Mask:255.255.255.0

              inet6 addr: fe80::214:500:0:1/64 Scope:Link

              UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1

              RX packets:1251 errors:0 dropped:0 overruns:0 frame:0

              TX packets:55 errors:0 dropped:0 overruns:0 carrier:0

              collisions:0 txqueuelen:1024

              RX bytes:129040 (126.0 KiB)  TX bytes:3632 (3.5 KiB)

    ib0.8010  Link encap:InfiniBand  HWaddr A0:00:0A:97:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00

              inet addr:20.20.1.1 Bcast:20.20.1.255  Mask:255.255.255.0

              inet6 addr: fe80::214:500:0:1/64 Scope:Link

              UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1

              RX packets:171 errors:0 dropped:0 overruns:0 frame:0

              TX packets:22 errors:0 dropped:0 overruns:0 carrier:0

              collisions:0 txqueuelen:1024

              RX bytes:18078 (17.6 KiB)  TX bytes:1640 (1.6 KiB)

     

         b. For VM2 (Windows):

             Update the properties of interfaces "Ethernet 3" and "Ethernet 7" with the correct IP address and subnet mask.

             Verify the configuration:

     

    C:\Users\Administrator> ipconfig

     

    Windows IP Configuration

     

    Ethernet adapter Ethernet 7:

     

       Connection-specific DNS Suffix  . :

       Link-local IPv6 Address . . . . . : fe80::7dfa:d3f2:9df2:ba75%38

       IPv4 Address. . . . . . . . . . . : 20.20.2.2

       Subnet Mask . . . . . . . . . . . : 255.255.255.0

       Default Gateway . . . . . . . . . :

     

    Ethernet adapter Ethernet 3:

     

       Connection-specific DNS Suffix  . :

       Link-local IPv6 Address . . . . . : fe80::19bd:84a:af39:aa9b%19

       IPv4 Address. . . . . . . . . . . : 11.11.11.2

       Subnet Mask . . . . . . . . . . . : 255.255.255.0

       Default Gateway . . . . . . . . . :
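On Linux distributions where ifconfig is not installed, the addresses of step 6a can be set with the iproute2 `ip` tool instead. The sketch below only prints the commands, so it can be reviewed before anything is changed. It assumes iproute2 is available on VM1, and `ipoib_addr_cmds` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print iproute2 commands that bring an interface up
# and assign it an address in CIDR notation. Nothing is executed here.
ipoib_addr_cmds() {
    dev=$1
    cidr=$2
    printf 'ip link set dev %s up\n' "$dev"
    printf 'ip addr add %s dev %s\n' "$cidr" "$dev"
}

# VM1 addresses from this post.
ipoib_addr_cmds ib0 11.11.11.1/24
ipoib_addr_cmds ib0.8010 20.20.1.1/24
```

To apply the commands, pipe the output to a root shell, for example `ipoib_addr_cmds ib0 11.11.11.1/24 | sh`.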

     

    7. Verify Gateway prerequisites:

     

    switch (config) # show system capabilities

    IB: Supported

    Ethernet: Supported, Full L2, L3

    GW: Supported

    Max SM nodes: 648

    IB Max licensed speed: FDR10

    Ethernet Max licensed speed: 40Gb

     

    switch (config) # show system profile

    vpi-single-switch

     

    switch(config) # show ports type

    Ethernet:   1/9 1/10 1/11 1/12 1/13 1/14 1/15 1/16 1/17 1/18 1/19 1/20 1/21 1/22 1/23 1/24 1/25 1/26 1/27 1/28 1/29 1/30 1/31 1/32 1/33 1/34 1/35 1/36

    InfiniBand: 1/1 1/2 1/3 1/4 1/5 1/6 1/7 1/8

     

    switch(config) # show ip routing

    IP routing: disabled

     

    switch (config) # show ip igmp snooping

     

    IGMP snooping global configuration:

    IGMP snooping globally disabled

    IGMP snooping operationally disabled

    Proxy-reporting globally disabled

    Last member query interval is 1 seconds

    Mrouter timeout is 125 seconds

    Port purge timeout is 260 seconds

    Report suppression interval is 5 seconds

    IGMP snooping unregistered multicast: flood

     

    switch (config) # show ib sm

    disable

     

    8. Create VLANs on the switch and assign an Ethernet port to each VLAN:

     

    switch (config) # vlan 10

    switch (config vlan 10) # exit

    switch (config) # vlan 20

    switch (config vlan 20) # exit

    switch (config) # interface ethernet 1/23 switchport access vlan 10

    switch (config) # interface ethernet 1/25 switchport access vlan 20

    switch (config) # show vlan

     

    VLAN    Name                    Ports

    ----    --------------------    --------------------------------------

    1       default                 Eth1/9, Eth1/10, Eth1/11, Eth1/12, Eth1/13,

                                    Eth1/14, Eth1/15, Eth1/16, Eth1/17, Eth1/18,

                                    Eth1/19, Eth1/20, Eth1/21, Eth1/22, Eth1/24,

                                    Eth1/26, Eth1/27, Eth1/28, Eth1/29, Eth1/30,

                                    Eth1/31, Eth1/32, Eth1/33, Eth1/34, Eth1/35,

                                    Eth1/36

    10                              Eth1/23

    20                              Eth1/25

     

    switch (config) # show interfaces switchport

    Interface       Mode        Access vlan         Allowed vlans

    ---------------------------------------------------------------------------------

    ...

    Eth1/21         access      1

    Eth1/22         access      1

    Eth1/23         access      10

    Eth1/24         access      1

    Eth1/25         access      20

    ...

    9. Configure Proxy-ARP:

     

    switch (config) # show proxy-arp mode

      Proxy-arp mode: unicast

      Resource             Total

      --------------------------

      ETH hosts            512

      IB hosts             3520

      Unicast routes       160

      Multicast routes     0

       

      switch (config) # ip proxy-arp

      switch (config) # interface proxy-arp 1

      switch (config interface proxy-arp 1) # ip address 11.11.11.100

      switch (config interface proxy-arp 1) # ip netmask /24

      switch (config interface proxy-arp 1) # ip vlan 1

      switch (config interface proxy-arp 1) # ip pkey 0x7fff

      switch (config interface proxy-arp 1) # no shutdown

      switch (config interface proxy-arp 1) # exit

       

      switch (config) # interface proxy-arp 2

      switch (config interface proxy-arp 2) # ip address 20.20.1.100

      switch (config interface proxy-arp 2) # ip netmask /24

      switch (config interface proxy-arp 2) # ip vlan 10

      switch (config interface proxy-arp 2) # ip pkey 0x10

      switch (config interface proxy-arp 2) # no shutdown

      switch (config interface proxy-arp 2) # exit

       

      switch (config) # interface proxy-arp 3

      switch (config interface proxy-arp 3) # ip address 20.20.2.100

      switch (config interface proxy-arp 3) # ip netmask /24

      switch (config interface proxy-arp 3) # ip vlan 20

      switch (config interface proxy-arp 3) # ip pkey 0x20

      switch (config interface proxy-arp 3) # no shutdown

      switch (config interface proxy-arp 3) # exit

       

      switch (config) # show interfaces proxy-arp

      Proxy-arp 1

        Admin state: Enabled

        Operational state: Up

        GUID: 00:02:C9:03:00:73:D2:A8

        Internet Address: 11.11.11.100/24

        Broadcast Address: 11.11.11.255

        Description: N/A

        MTU: 1500

        Counters: Disabled

        Bridged interfaces: vlan 1, pkey 0x7fff

       

      Proxy-arp 2

        Admin state: Enabled

        Operational state: Up

        GUID: 00:02:C9:03:00:73:D2:A8

        Internet Address: 20.20.1.100/24

        Broadcast Address: 20.20.1.255

        Description: N/A

        MTU: 1500

        Counters: Disabled

        Bridged interfaces: vlan 10, pkey 0x10

       

      Proxy-arp 3

        Admin state: Enabled

        Operational state: Up

        GUID: 00:02:C9:03:00:73:D2:A8

        Internet Address: 20.20.2.100/24

        Broadcast Address: 20.20.2.255

        Description: N/A

        MTU: 1500

        Counters: Disabled

        Bridged interfaces: vlan 20, pkey 0x20

       

      switch (config) # show interfaces proxy-arp brief

      Interface          Description        State        Bridged interfaces

      ---------------------------------------------------------------------

      proxy-arp 1        N/A                Up           vlan 1, pkey 0x7fff

      proxy-arp 2        N/A                Up           vlan 10, pkey 0x10

      proxy-arp 3        N/A                Up           vlan 20, pkey 0x20
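The three proxy-arp interfaces in step 9 follow one pattern: an interface ID, an IP address with its netmask, a VLAN and a PKEY. As a sketch, the function below prints the switch CLI lines for one interface from those parameters, which helps keep the VLAN and PKEY pairs consistent when more networks are added. `proxy_arp_cmds` is a hypothetical helper; it only emits text using the commands shown in this post and does not talk to the switch.

```shell
#!/bin/sh
# Hypothetical helper: print the switch CLI lines for one proxy-arp interface.
# Arguments: <id> <ip address> <prefix length> <vlan> <pkey>
proxy_arp_cmds() {
    id=$1
    addr=$2
    prefix=$3
    vlan=$4
    pkey=$5
    printf 'interface proxy-arp %s\n' "$id"
    printf 'ip address %s\n' "$addr"
    printf 'ip netmask /%s\n' "$prefix"
    printf 'ip vlan %s\n' "$vlan"
    printf 'ip pkey %s\n' "$pkey"
    printf 'no shutdown\n'
    printf 'exit\n'
}

# The three networks of this setup.
proxy_arp_cmds 1 11.11.11.100 24 1 0x7fff
proxy_arp_cmds 2 20.20.1.100 24 10 0x10
proxy_arp_cmds 3 20.20.2.100 24 20 0x20
```

The output can be pasted into the switch in config mode after `ip proxy-arp` has been enabled.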

       

      10. On each Ethernet server (Server 1 and Server 2) connected to the switch, set the relevant IP address.

            For example, on the storage server (Server 2, VLAN 20):

      # ifconfig eth2 20.20.2.164/24

       

      11. Verify basic network connectivity, for example by pinging VM2 (20.20.2.2) from Server 2:

      # ping 20.20.2.2 -c1

      PING 20.20.2.2 (20.20.2.2) 56(84) bytes of data.

      64 bytes from 20.20.2.2: icmp_seq=1 ttl=63 time=1.80 ms

       

      --- 20.20.2.2 ping statistics ---

      1 packets transmitted, 1 received, 0% packet loss, time 0ms

      rtt min/avg/max/mdev = 1.801/1.801/1.801/0.000 ms
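When several hosts need to be checked, the single ping above can be wrapped in a small loop. The following is a minimal sketch, assuming the Linux iputils `ping` with its `-c` (count) and `-W` (timeout in seconds) options. `check_hosts` and the `PING_CMD` override variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: ping each host once and report the result.
# PING_CMD is an assumed override, useful for dry runs; it defaults to ping.
check_hosts() {
    failed=0
    for host in "$@"; do
        if ${PING_CMD:-ping} -c1 -W2 "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
            failed=1
        fi
    done
    return $failed    # non-zero if any host did not answer
}
```

On Server 2, for example, `check_hosts 20.20.2.2` checks VM2 on the storage network. Hosts on the other networks are expected to be unreachable from there, because each PKEY and VLAN pair is isolated.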