HowTo Locate the Physical Interface on a multi-host InfiniBand Environment

Version 2

    This post shows a simple way to locate the physical mlx5 device in a multi-host InfiniBand environment.

    This feature is available starting with MLNX_OFED 4.2.

     

    Overview

    When an mlx5 InfiniBand network device is virtualized with SR-IOV, the lspci output normally shows which PCI devices are virtual functions (VFs) and which is the physical device. In some multi-host environments, however, that notation is missing.

     

    For example, with an mlx5 InfiniBand port enabled with SR-IOV, the output marks which PCI devices are virtual:

    $ lspci | grep Mel

    05:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]

    05:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]

    05:00.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]

    05:00.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]

    05:00.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]

    05:00.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]

     

    In this second example, from a multi-host InfiniBand environment, the devices in PCI domain 0003 are the physical devices and those in domain 0033 are the VFs, but the output carries no notation to tell them apart:

    # lspci | grep Mel

    0003:01:00.0 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]

    0003:01:00.1 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]

    0033:01:00.0 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]

    0033:01:00.1 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]

     

    To find out which device is physical, and can therefore run OpenSM, print the has_smi attribute at the following path:

    # cat /sys/class/infiniband/mlx5_0/ports/1/has_smi

    1

    # cat /sys/class/infiniband/mlx5_1/ports/1/has_smi

    1

    # cat /sys/class/infiniband/mlx5_2/ports/1/has_smi

    0

    # cat /sys/class/infiniband/mlx5_3/ports/1/has_smi

    0

     

    As you can see, the mlx5_0 and mlx5_1 devices are physical (has_smi=1). In this example, you could run OpenSM on the ports of those two devices.
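
The per-device checks above can be wrapped in a small loop so that every device and port is reported in one pass. The following is only a sketch: the function name list_has_smi and its optional root-directory argument are illustrative and not part of MLNX_OFED. It assumes the has_smi attribute sits at the same sysfs path used in the commands above.

```shell
#!/bin/sh
# Print the has_smi value for every port of every InfiniBand device.
# list_has_smi is a hypothetical helper name; the optional first argument
# overrides the sysfs root (useful for testing) and defaults to the real path.
list_has_smi() {
    root="${1:-/sys/class/infiniband}"
    for f in "$root"/*/ports/*/has_smi; do
        # Skip when the glob matched nothing or the attribute is unreadable
        [ -r "$f" ] || continue
        rel=${f#"$root"/}        # e.g. mlx5_0/ports/1/has_smi
        dev=${rel%%/*}           # e.g. mlx5_0
        port=${rel#*/ports/}     # e.g. 1/has_smi
        port=${port%/has_smi}    # e.g. 1
        printf '%s port %s has_smi=%s\n' "$dev" "$port" "$(cat "$f")"
    done
}

list_has_smi
```

Ports reported with has_smi=1 belong to physical devices. On a host with no InfiniBand devices, or on a driver version that does not expose the attribute, the loop prints nothing.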