This post compares Ethernet VLANs and InfiniBand PKEYs. It also shows how to configure PKEYs within the InfiniBand fabric.
- InfiniBand Spec
- Using InfiniBand Partitions in Exalogic Physical Environments
What exactly is a PKEY?
PKEY stands for partition key. It is a 16-bit field within the InfiniBand header called BTH (Base Transport Header).
A collection of endnodes with the same PKey in their PKey Tables are referred to as being members of a partition.
A P_Key Table can specify one of two types of partition membership:
- Limited (MSB=0)
- Full (MSB=1)
The high-order bit (MSB) of the partition key is used to record the type of membership in a partition table: 0 for Limited, and 1 for Full.
Limited members cannot accept information from other Limited members, but communication is allowed between every other combination of membership types.
A PKey field value of 0xFFFF (PKEY number 0x7FFF with the membership bit set) represents the default partition key. The default partition key provides Full membership in the default partition.
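The membership bit and the 15-bit PKEY number can be separated with plain bit masks. The following is a minimal Python sketch of that split and of the membership rule; the function names `split_pkey` and `can_communicate` are illustrative and are not part of any InfiniBand tool.

```python
FULL_MEMBER_BIT = 0x8000   # MSB of the 16-bit PKey field
PKEY_NUMBER_MASK = 0x7FFF  # low 15 bits: the PKEY number


def split_pkey(pkey):
    """Return (pkey_number, is_full_member) for a 16-bit PKey field value."""
    if not 0 <= pkey <= 0xFFFF:
        raise ValueError("PKey must fit in 16 bits")
    return pkey & PKEY_NUMBER_MASK, bool(pkey & FULL_MEMBER_BIT)


def can_communicate(pkey_a, pkey_b):
    """Same partition, and at least one side is a Full member."""
    number_a, full_a = split_pkey(pkey_a)
    number_b, full_b = split_pkey(pkey_b)
    return number_a == number_b and (full_a or full_b)
```

For example, `split_pkey(0xFFFF)` gives the PKEY number 0x7FFF with Full membership, and `can_communicate(0x0001, 0x0001)` is false because both sides are Limited members.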
What exactly is a VLAN tag?
The VLAN tag is an optional 16-bit field in the Ethernet frame. It is split into three fields:
- VLAN ID - 12 bits
- CFI - 1 bit
- Priority - 3 bits
The VLAN tag allows the Ethernet LAN to be split logically (or virtually) into virtual LANs.
The VLAN tag carries the frame priority as well.
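As a rough illustration, the three fields can be unpacked with shifts and masks. This sketch assumes the IEEE 802.1Q bit order (priority in the top three bits, then CFI, then the VLAN ID in the low twelve bits); the function names are hypothetical.

```python
def split_vlan_tag(tag):
    """Split the 16-bit VLAN field into (priority, cfi, vlan_id)."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("VLAN tag field must fit in 16 bits")
    priority = (tag >> 13) & 0x7   # 3 bits
    cfi = (tag >> 12) & 0x1        # 1 bit
    vlan_id = tag & 0x0FFF         # 12 bits
    return priority, cfi, vlan_id


def build_vlan_tag(priority, cfi, vlan_id):
    """Pack the three fields back into one 16-bit value."""
    if not (0 <= priority <= 7 and cfi in (0, 1) and 0 <= vlan_id <= 0x0FFF):
        raise ValueError("field out of range")
    return (priority << 13) | (cfi << 12) | vlan_id
```

With these definitions, VLAN ID 100 at priority 5 packs to 0xA064, and `split_vlan_tag(0xA064)` returns `(5, 0, 100)`.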
What is the difference between an InfiniBand PKEY and an Ethernet VLAN?
1. Both the VLAN tag and the PKEY are 16-bit fields.
2. The VLAN tag carries the priority while the PKEY field doesn't. In InfiniBand, the priority is carried in the SL (Service Level) field of the LRH (Local Route Header). The SL is four bits wide, compared with the 3-bit priority within the VLAN tag.
3. Membership type (Full or Limited) is only applicable in PKEYs.
4. Counters - VLAN counters (for traffic that passes via the kernel) can be seen in the file /proc/net/vlan/<ethX.vlan>. There are no InfiniBand counters per PKEY interface.
# cat /proc/net/vlan/eth1.100
eth1.100 VID: 100 REORDER_HDR: 1 dev->priv_flags: 1
total frames received 28373
total bytes received 1191666
Broadcast/Multicast Rcvd 0
total frames transmitted 1870
total bytes transmitted 78756
INGRESS priority mappings: 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0
EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
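If these counters need to be collected by a script, the file can be read as simple "name value" lines. The sketch below is one hedged way to do that in Python; `parse_vlan_counters` is a hypothetical helper, and it assumes the counter lines look like the sample above (a label followed by an integer, with no colon in the line).

```python
def parse_vlan_counters(text):
    """Return {counter name: value} from /proc/net/vlan/<ethX.vlan> text."""
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the header line and the priority mapping lines.
        if not line or ":" in line:
            continue
        name, _, value = line.rpartition(" ")
        if value.isdigit():
            # Collapse any padding between the words of the label.
            counters[" ".join(name.split())] = int(value)
    return counters


def read_vlan_counters(interface):
    """Read the counters for an interface such as 'eth1.100'."""
    with open(f"/proc/net/vlan/{interface}") as handle:
        return parse_vlan_counters(handle.read())
```

Calling `read_vlan_counters("eth1.100")` on a host with that VLAN interface would return a dictionary keyed by labels such as "total frames received".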
How do I configure partitions in an InfiniBand fabric?
Let's say you wish to add partition 0x8001 in the fabric, and to have two endnodes as members in this partition.
1. The partition should be defined for the SM (subnet manager) in the partitions.conf file.
(The default location of the partitions.conf file is /etc/opensm/partitions.conf)
- If you run UFM in your fabric, you can do that via the UFM.
- If the SM is running on the InfiniBand switch, you need to do it via the switch CLI. Refer to the MLNX-OS User Manual for details.
- Otherwise, you need to modify the partitions.conf file to add the pkey manually.
Here is an example of the partitions.conf file with two Full members.
Default=0xffff , ipoib : ALL, SELF=full ;
MyPartition=0x8001, ipoib : 0x0002c9030009db3f=full, 0x0002c90200262841=full;
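When many port GUIDs have to be added, a partition line in this format can be generated rather than typed by hand. The following Python sketch only reproduces the layout of the example above; `format_partition` and its parameters are hypothetical names, and the resulting line should still be checked against the OpenSM documentation before use.

```python
def format_partition(name, pkey, port_guids, membership="full", ipoib=True):
    """Build one partitions.conf line in the style of the example above."""
    if membership not in ("full", "limited"):
        raise ValueError("membership must be 'full' or 'limited'")
    if not 0 <= pkey <= 0xFFFF:
        raise ValueError("PKEY must fit in 16 bits")
    flags = ", ipoib" if ipoib else ""
    # Port GUIDs are written as 64-bit hexadecimal values.
    members = ", ".join(f"0x{guid:016x}={membership}" for guid in port_guids)
    return f"{name}=0x{pkey:04x}{flags} : {members};"


if __name__ == "__main__":
    print(format_partition(
        "MyPartition", 0x8001,
        [0x0002C9030009DB3F, 0x0002C90200262841],
    ))
```

Run as a script, this prints the MyPartition line shown above.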
2. In most cases, you might need to define an IPoIB interface with that PKEY on each node. For example, define ib0.8001 as an interface on the two hosts, and give each one an IP address within the same subnet.
ib0.8001 Link encap:InfiniBand HWaddr A0:00:02:00:FE:80:00:00:00:00:00:00:00:00:00
inet addr:172.16.0.1 Bcast:172.16.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:4092 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
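A quick sanity check for this step is to confirm the interface name derived from the PKEY and to verify that the two addresses really share a subnet. The sketch below uses Python's standard `ipaddress` module; the helper names are hypothetical, and the second host address (172.16.0.2) is an assumed value for illustration only.

```python
import ipaddress


def child_interface_name(parent, pkey):
    """Name the PKEY interface as in the example: parent.<4 hex digits>."""
    return f"{parent}.{pkey:04x}"


def same_subnet(interface_a, interface_b):
    """True if two 'address/mask' strings fall in the same IP network."""
    network_a = ipaddress.ip_interface(interface_a).network
    network_b = ipaddress.ip_interface(interface_b).network
    return network_a == network_b


if __name__ == "__main__":
    print(child_interface_name("ib0", 0x8001))
    # 172.16.0.2 is an assumed address for the second host.
    print(same_subnet("172.16.0.1/255.255.0.0", "172.16.0.2/255.255.0.0"))
```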
3. To see the list of PKEYs configured via OpenSM, run:
# smpquery PkeyTable -D 0
0: 0xffff 0x8001 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
8: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
16: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
24: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
You can see that the PKEY at index 1 is enabled (0x8001).
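For a table with many entries, the same check can be scripted. This is a minimal Python sketch that parses output in the layout shown above and lists the populated slots; `parse_pkey_table` is a hypothetical helper, and it assumes that 0x0000 marks an unused entry, as the sample output suggests.

```python
import re

# A row looks like "8: 0x0000 0x0000 ...": a starting index, then PKEY values.
ROW = re.compile(r"^\s*(\d+):((?:\s+0x[0-9a-fA-F]{4})+)\s*$")


def parse_pkey_table(text):
    """Return [(table_index, pkey)] for every non-zero entry in the output."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip the command line and any other text
        base_index = int(match.group(1))
        for offset, value in enumerate(match.group(2).split()):
            pkey = int(value, 16)
            if pkey != 0:
                entries.append((base_index + offset, pkey))
    return entries
```

Applied to the output above, this yields two entries: index 0 with 0xffff (the default partition) and index 1 with 0x8001.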