3 Replies Latest reply on Jul 19, 2016 10:56 AM by cmm

    IPoIB interop: VMware ESXi 6 to Windows 10/2012 R2 nodes with an IS5022 InfiniScale unmanaged switch; partitions.conf pkeys configured but nodes unable to reach each other

    ciscokid

      I am pretty new to InfiniBand/VPI configurations, and I currently have 4 nodes connected to an InfiniScale IS5022 unmanaged switch.  One node runs Windows 10 with OpenSM and a ConnectX-3 dual-port HCA.  The other 3 nodes run VMware ESXi 6 with VMware Standard Switches that use ConnectX-3 dual-port HCAs for their uplinks.  I have installed the appropriate OFED driver package, native driver package, and MST package on all 3 ESXi 6 hosts, as well as the Windows OFED driver package on the Windows 10 host running the OpenSM instance.  I know the basic setup works: when I create VLAN/PKey mappings in the partitions.conf file and configure matching VLAN tags on VMware Standard Switch port groups, hosts on those newly created port groups can communicate with each other.
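      For reference, this is roughly how I create each VLAN-tagged port group on the ESXi hosts (a minimal sketch; the vSwitch and port group names here are placeholders rather than my actual ones):

          ~ # esxcli network vswitch standard portgroup add --portgroup-name=IPoIB-VLAN0111 --vswitch-name=vSwitch1
          ~ # esxcli network vswitch standard portgroup set --portgroup-name=IPoIB-VLAN0111 --vlan-id=111

      The VLAN ID set on a port group has to match a VLAN/PKey mapping in partitions.conf before traffic will pass.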

       

      As a test, I created port groups with VLAN tags that had not been defined in partitions.conf and confirmed that the hosts/guests on those port groups could not communicate with each other.  I then created the VLAN/PKey mapping for that port group, and the guests/hosts could communicate without issue.  The problem I am facing is that hosts outside the VMware environment, but connected to the same IB switch, are NOT able to communicate with each other, more specifically the host that is running the subnet manager.  I verified that the Windows node sees the correct mappings using the "mlxtool dbg pkey" command, with the output below.  I have configured the Windows 10 node with a team and 5 VLAN-tagged interfaces in VLANs 111, 112, 211, 212, and 213, and the host was assigned an IP address in each of the given VLANs/partitions using the IB miniport drivers.

       

      C:\Program Files\Mellanox\MLNX_VPI\Tools>mlxtool dbg pkey

             ConnectX IPoIB NIC: Lag1112__IBSwitch1__Port1
                    ---------------- ----------------
                   |   PKEY index   |      PKEY      |
                    ---------------- ----------------
                   |           0    |        ffff    |
                   |           1    |        806f    |
                   |           2    |        8070    |
                   |           3    |        80d3    |
                   |           4    |        80d4    |
                   |           5    |        80d5    |
                    ---------------- ----------------

             ConnectX IPoIB NIC: Lag1112__IBSwitch1__Port2
                    ---------------- ----------------
                   |   PKEY index   |      PKEY      |
                    ---------------- ----------------
                   |           0    |        ffff    |
                   |           1    |        806f    |
                   |           2    |        8070    |
                   |           3    |        80d3    |
                   |           4    |        80d4    |
                   |           5    |        80d5    |
                    ---------------- ----------------
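      A note on reading this output (my interpretation, based on how InfiniBand encodes partition membership): the high bit of a pkey marks full membership, so each value shown is a pkey from partitions.conf with 0x8000 ORed in:

          0x8000 | 0x7fff = 0xffff   (Default)
          0x8000 | 0x006f = 0x806f   (VLAN0111)
          0x8000 | 0x0070 = 0x8070   (VLAN0112)
          0x8000 | 0x00d3 = 0x80d3   (VLAN0211)
          0x8000 | 0x00d4 = 0x80d4   (VLAN0212)
          0x8000 | 0x00d5 = 0x80d5   (VLAN0213)

      So both ports have received exactly the partitions defined in the file below.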

       

      Below is the partitions.conf configured for this environment (see also the note after the file on how I apply changes to it).  I commented out the multicast sections for each partition because the osm.log was filling with errors about invalid multicast group IDs.  From the Windows 10 host, I ran "arp -a" and I do not see ARP entries for any of the 3 VMware ESXi hosts.  What am I missing?

       

      # Rate =
      #   2  =  2.5  GBit/s
      #   3  =  10   GBit/s
      #   4  =  30   GBit/s
      #   5  =  5    GBit/s
      #   6  =  20   GBit/s
      #   7  =  40   GBit/s
      #   8  =  60   GBit/s
      #   9  =  80   GBit/s
      #   10 = 120   GBit/s
      #
      # MTU =
      #   1 = 256
      #   2 = 512
      #   3 = 1024
      #   4 = 2048
      #   5 = 4096
      #
      Default=0x7fff, rate=7, mtu=5, scope=2, defmember=full:
              ALL, ALL_SWITCHES=full;
      Default=0x7fff, ipoib, rate=7, mtu=5, scope=2:
      #        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
      #        mgid=ff12:401b::1               # IPv4 All Hosts group
      #        mgid=ff12:401b::2               # IPv4 All Routers group
      #        mgid=ff12:401b::16              # IPv4 IGMP group
      #        mgid=ff12:401b::fb              # IPv4 mDNS group
      #        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
      #        mgid=ff12:401b::101             # IPv4 NTP group
      #        mgid=ff12:401b::202             # IPv4 Sun RPC
      #        mgid=ff12:601b::1               # IPv6 All Hosts group
      #        mgid=ff12:601b::2               # IPv6 All Routers group
      #        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
      #        mgid=ff12:601b::fb              # IPv6 mDNS group
      #        mgid=ff12:601b::101             # IPv6 NTP group
      #        mgid=ff12:601b::202             # IPv6 Sun RPC group
      #        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
              ALL=full, ALL_SWITCHES=full;
      #
      VLAN0111=0x006f, rate=7, mtu=5, scope=2, defmember=full:
              ALL, ALL_SWITCHES=full;
      VLAN0111=0x006f, ipoib, rate=7, mtu=5, scope=2:
      #        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
      #        mgid=ff12:401b::1               # IPv4 All Hosts group
      #        mgid=ff12:401b::2               # IPv4 All Routers group
      #        mgid=ff12:401b::16              # IPv4 IGMP group
      #        mgid=ff12:401b::fb              # IPv4 mDNS group
      #        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
      #        mgid=ff12:401b::101             # IPv4 NTP group
      #        mgid=ff12:401b::202             # IPv4 Sun RPC
      #        mgid=ff12:601b::1               # IPv6 All Hosts group
      #        mgid=ff12:601b::2               # IPv6 All Routers group
      #        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
      #        mgid=ff12:601b::fb              # IPv6 mDNS group
      #        mgid=ff12:601b::101             # IPv6 NTP group
      #        mgid=ff12:601b::202             # IPv6 Sun RPC group
      #        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
              ALL=full, ALL_SWITCHES=full;
      #
      VLAN0112=0x0070, rate=7, mtu=5, scope=2, defmember=full:
              ALL, ALL_SWITCHES=full;
      VLAN0112=0x0070, ipoib, rate=7, mtu=5, scope=2:
      #        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
      #        mgid=ff12:401b::1               # IPv4 All Hosts group
      #        mgid=ff12:401b::2               # IPv4 All Routers group
      #        mgid=ff12:401b::16              # IPv4 IGMP group
      #        mgid=ff12:401b::fb              # IPv4 mDNS group
      #        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
      #        mgid=ff12:401b::101             # IPv4 NTP group
      #        mgid=ff12:401b::202             # IPv4 Sun RPC
      #        mgid=ff12:601b::1               # IPv6 All Hosts group
      #        mgid=ff12:601b::2               # IPv6 All Routers group
      #        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
      #        mgid=ff12:601b::fb              # IPv6 mDNS group
      #        mgid=ff12:601b::101             # IPv6 NTP group
      #        mgid=ff12:601b::202             # IPv6 Sun RPC group
      #        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
              ALL=full, ALL_SWITCHES=full;
      #
      VLAN0211=0x00d3, rate=7, mtu=5, scope=2, defmember=full:
              ALL, ALL_SWITCHES=full;
      VLAN0211=0x00d3, ipoib, rate=7, mtu=5, scope=2:
      #        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
      #        mgid=ff12:401b::1               # IPv4 All Hosts group
      #        mgid=ff12:401b::2               # IPv4 All Routers group
      #        mgid=ff12:401b::16              # IPv4 IGMP group
      #        mgid=ff12:401b::fb              # IPv4 mDNS group
      #        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
      #        mgid=ff12:401b::101             # IPv4 NTP group
      #        mgid=ff12:401b::202             # IPv4 Sun RPC
      #        mgid=ff12:601b::1               # IPv6 All Hosts group
      #        mgid=ff12:601b::2               # IPv6 All Routers group
      #        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
      #        mgid=ff12:601b::fb              # IPv6 mDNS group
      #        mgid=ff12:601b::101             # IPv6 NTP group
      #        mgid=ff12:601b::202             # IPv6 Sun RPC group
      #        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
              ALL=full, ALL_SWITCHES=full;
      #
      VLAN0212=0x00d4, rate=7, mtu=5, scope=2, defmember=full:
              ALL, ALL_SWITCHES=full;
      VLAN0212=0x00d4, ipoib, rate=7, mtu=5, scope=2:
      #        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
      #        mgid=ff12:401b::1               # IPv4 All Hosts group
      #        mgid=ff12:401b::2               # IPv4 All Routers group
      #        mgid=ff12:401b::16              # IPv4 IGMP group
      #        mgid=ff12:401b::fb              # IPv4 mDNS group
      #        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
      #        mgid=ff12:401b::101             # IPv4 NTP group
      #        mgid=ff12:401b::202             # IPv4 Sun RPC
      #        mgid=ff12:601b::1               # IPv6 All Hosts group
      #        mgid=ff12:601b::2               # IPv6 All Routers group
      #        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
      #        mgid=ff12:601b::fb              # IPv6 mDNS group
      #        mgid=ff12:601b::101             # IPv6 NTP group
      #        mgid=ff12:601b::202             # IPv6 Sun RPC group
      #        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
              ALL=full, ALL_SWITCHES=full;
      #
      VLAN0213=0x00d5, rate=7, mtu=5, scope=2, defmember=full:
              ALL, ALL_SWITCHES=full;
      VLAN0213=0x00d5, ipoib, rate=7, mtu=5, scope=2:
      #        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
      #        mgid=ff12:401b::1               # IPv4 All Hosts group
      #        mgid=ff12:401b::2               # IPv4 All Routers group
      #        mgid=ff12:401b::16              # IPv4 IGMP group
      #        mgid=ff12:401b::fb              # IPv4 mDNS group
      #        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
      #        mgid=ff12:401b::101             # IPv4 NTP group
      #        mgid=ff12:401b::202             # IPv4 Sun RPC
      #        mgid=ff12:601b::1               # IPv6 All Hosts group
      #        mgid=ff12:601b::2               # IPv6 All Routers group
      #        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
      #        mgid=ff12:601b::fb              # IPv6 mDNS group
      #        mgid=ff12:601b::101             # IPv6 NTP group
      #        mgid=ff12:601b::202             # IPv6 Sun RPC group
      #        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
              ALL=full, ALL_SWITCHES=full;
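      For completeness: OpenSM only re-reads partitions.conf when it is restarted, so after each edit I bounce the OpenSM service on the Windows 10 host, roughly like this (assuming OpenSM was installed as a service named "OpenSM"; the actual service name may differ by installer):

          C:\> net stop opensm
          C:\> net start opensm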

        • Re: IPoIB interop: VMware ESXi 6 to Windows 10/2012 R2 nodes with an IS5022 InfiniScale unmanaged switch; partitions.conf pkeys configured but nodes unable to reach each other
          cmm

          Hello,

          If I am understanding correctly, you have some servers with InfiniBand interfaces, and also Ethernet virtual hosts connected to the same unmanaged IS50xx switch.

           

          InfiniBand is an alternative to Ethernet; they are different L2 protocols. In order for IB hosts to communicate with Ethernet hosts, you need a protocol gateway switch. Mellanox offers the SX6036G gateway device for Eth<>IB protocol mapping. In other words, the gateway maps pkeys (InfiniBand) to VLANs (Ethernet), and each VLAN maps to exactly one pkey.
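          To make that mapping concrete using the pkey values from your own partitions.conf (a sketch of the concept only, not an actual gateway configuration):

              VLAN 111 (Ethernet)  <->  pkey 0x006f (InfiniBand)
              VLAN 112 (Ethernet)  <->  pkey 0x0070 (InfiniBand)
              VLAN 211 (Ethernet)  <->  pkey 0x00d3 (InfiniBand)

          Note that your pkeys are simply the VLAN IDs written in hex: 111 = 0x6f, 112 = 0x70, 211 = 0xd3, and so on.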

           

          In the InfiniBand world, there is no awareness of "VLAN". In the Ethernet world, there is no awareness of "pkey".
          Only the SX6036G protocol gateway is capable of mapping VLANs to pkeys (using a "proxy-arp interface" configuration within its running configuration), allowing end-to-end IB<>Eth communication.

           

          Typically, the partitions.conf file is rarely edited; the Subnet Manager process is simply enabled, after which all InfiniBand ports get a LID and their links come up. The Subnet Manager has no awareness of Ethernet at all.
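          For reference, a minimal partitions.conf for such an untouched setup contains only the default partition, roughly like this (a sketch of the common default, not the exact file any given release ships):

              Default=0x7fff, ipoib, defmember=full : ALL, ALL_SWITCHES=full;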

           

          All IS5xxx switches are InfiniBand only.

          Mellanox SX1xxx switches are Ethernet only.
          The SX6036G can run some ports as Ethernet while others run InfiniBand, and it can additionally map traffic between those ports.

           

          Each adapter port runs either Ethernet or InfiniBand, not both.

          If your virtual hosts are, and must remain, Ethernet, then they can only communicate with an IB host if a gateway lies in between.

          In summary, either all interfaces must be Ethernet, or all must be InfiniBand, unless an SX6036G replaces the IS5022 switch.

            • Re: IPoIB interop: VMware ESXi 6 to Windows 10/2012 R2 nodes with an IS5022 InfiniScale unmanaged switch; partitions.conf pkeys configured but nodes unable to reach each other
              ciscokid

              Thanks for your reply.  However, all hosts, both Windows and VMware ESXi, are using IPoIB connectivity to the IS5022 switch.  It turns out the issue wasn't with the infrastructure components at all once they were configured correctly.  I rebuilt the Windows 10 host as Windows 2012 R2 with the WinOF 5.22 driver package, and everything worked as it should: I was able to ping the ESXi 6 hosts as well as the VM guests on port groups backed by the IPoIB uplinks.  This is probably the same class of problem seen with other networking products, such as Intel PROSet, when creating VLANs and teams on Windows 10.  Microsoft has drastically changed the kernel driver architecture used for networking, and Intel has been working with Microsoft since November of last year to resolve issues seen with Windows 10 build 1511.  It looks like Mellanox should join the list of vendors who also have issues with Microsoft's flagship OS.

            • Re: IPoIB interop: VMware ESXi 6 to Windows 10/2012 R2 nodes with an IS5022 InfiniScale unmanaged switch; partitions.conf pkeys configured but nodes unable to reach each other
              cmm

              If you were using an inbox InfiniBand driver on Windows 10 (or any OS), be aware that the inbox driver is very limited; installing the Mellanox OFED (WinOF) driver is best practice.

              WinOF 5.22 does support Windows 10; support for the Windows 10 client (64-bit only) was first added in WinOF 5.10.
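              One quick way to confirm which driver an adapter is actually bound to (a sketch using the standard PowerShell NetAdapter cmdlet; the provider string will vary by system):

                  PS C:\> Get-NetAdapter | Format-List Name, InterfaceDescription, DriverProvider, DriverVersionString

              An inbox driver typically reports Microsoft as the provider; after installing WinOF it should report Mellanox.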

              Glad to hear you got it working on Windows 2012 R2.