HowTo Configure Adaptive Routing and SHIELD

Version 10

    This post describes how to configure and validate the Adaptive Routing (AR) and SHIELD (Self-Healing Interconnect Enhancement for InteLligent Datacenters) mechanisms in an InfiniBand fabric that is equipped with Mellanox switch systems (Switch-IB™/Switch-IB™ 2 and above) and HCAs (ConnectX®-5 and above).

     

    1. Adaptive Routing and Adaptive Routing Notification

    Adaptive Routing (AR) enables the switch to select the output port based on the port's load.

    AR supports two routing modes:

    • Free AR: No constraints on output port selection.
    • Bounded AR: The switch does not change the output port during the same transmission burst. This mode minimizes the appearance of out-of-order packets.
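The difference between the two modes can be illustrated with a toy model. This is only a sketch, not the switch firmware logic: the class name `ArPortSelector`, the per-port load dictionary, and the `idle_gap_usec` threshold used to decide that a burst has ended are all hypothetical names introduced for illustration.

```python
from typing import Dict, Optional


class ArPortSelector:
    """Toy model of free vs. bounded AR output port selection."""

    def __init__(self, mode: str, idle_gap_usec: int = 30):
        if mode not in ("free", "bounded"):
            raise ValueError("mode must be 'free' or 'bounded'")
        self.mode = mode
        self.idle_gap_usec = idle_gap_usec  # hypothetical burst-end threshold
        self.current_port: Optional[int] = None
        self.last_packet_usec: Optional[int] = None

    def pick_port(self, now_usec: int, load: Dict[int, int]) -> int:
        """Return the output port for a packet arriving at now_usec."""
        # Least-loaded port, ties broken by the lower port number.
        least_loaded = min(load, key=lambda port: (load[port], port))
        if self.mode == "free" or self.current_port is None:
            # Free mode: no constraint, always follow the load.
            self.current_port = least_loaded
        else:
            # Bounded mode: keep the port for the whole burst and only
            # re-select after the flow has been idle long enough.
            idle = now_usec - self.last_packet_usec
            if idle >= self.idle_gap_usec:
                self.current_port = least_loaded
        self.last_packet_usec = now_usec
        return self.current_port
```

In this model a bounded selector keeps sending on the same port while packets arrive back to back, which is why bounded mode reduces out-of-order delivery at the cost of reacting more slowly to load changes.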

    The Adaptive Routing Manager enables and configures the Adaptive Routing mechanism on fabric switches.

    It scans all the fabric switches, identifies which switches support Adaptive Routing, and configures the AR functionality on these switches.

    The Adaptive Routing Manager supports three algorithms: LAG, TREE, and DF_PLUS.

    It configures AR groups and AR LFT tables to allow switches to select an output port out of an AR group for a specific destination LID.

    The configuration of the AR groups depends on the selected algorithm:

    • LAG: All ports that are linked to the same remote switch are in the same AR group. This algorithm suits any topology with multiple links between switches, and especially Hypercube/3D torus/mesh, where there are several links in each direction of the X/Y/Z axis.

    • TREE: All ports with minimal hops to destination are in the same AR group. This algorithm suits tree topologies such as fat tree, quasi fat tree, parallel links fat tree etc.
    • DF_PLUS: Algorithm designed for Dragonfly plus topology.
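The LAG and TREE grouping rules above can be sketched as follows. This is a simplified illustration, not the AR Manager implementation: the function names and the input dictionaries (port to neighbor switch, port to hop count) are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List


def lag_groups(port_to_neighbor: Dict[int, str]) -> Dict[str, List[int]]:
    """LAG: ports linked to the same remote switch share one AR group."""
    groups = defaultdict(list)
    for port, neighbor in sorted(port_to_neighbor.items()):
        groups[neighbor].append(port)
    return dict(groups)


def tree_group(port_hops_to_dest: Dict[int, int]) -> List[int]:
    """TREE: all ports with the minimal hop count to the destination
    form the AR group for that destination."""
    best = min(port_hops_to_dest.values())
    return sorted(p for p, hops in port_hops_to_dest.items() if hops == best)
```

For example, a switch with ports 1 and 2 cabled to the same neighbor gets one LAG group containing both ports, while a TREE group for a destination contains every port that reaches it in the fewest hops.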

     

    2. SHIELD

    SHIELD (Self-Healing Interconnect Enhancement for InteLligent Datacenters), referred to as Fast Link Fault Recovery (FLFR) throughout this document, enables the switch to select an alternative output port if the output port provided in the Linear Forwarding Table is not in the Armed/Active state.

    This mode allows the fastest traffic recovery in case of switch-to-switch port failures due to link flaps or neighbor switch reboots, without intervention of the Subnet Manager.

    Fast Link Fault Notification (FLFN) enables the switch to report to neighbor switches that an alternative output port should be selected for traffic to a specific destination LID, to avoid sending that traffic to the reporting switch.

    This is required when the FLFR on the switch has no alternative port to select for the destination LID.
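The interaction between FLFR and FLFN described above can be sketched as a small decision function. It is a conceptual model only; `flfr_select`, its arguments, and the returned tuple are hypothetical and do not correspond to a real switch or OpenSM API.

```python
from typing import Dict, List, Optional, Tuple

USABLE_STATES = ("Armed", "Active")


def flfr_select(lft_port: int, alternatives: List[int],
                port_state: Dict[int, str]) -> Tuple[Optional[int], bool]:
    """Return (output_port, notify_neighbors) for one destination LID."""
    # Normal case: the port from the Linear Forwarding Table is usable.
    if port_state.get(lft_port) in USABLE_STATES:
        return lft_port, False
    # FLFR: fall back to the first usable alternative port.
    for port in alternatives:
        if port != lft_port and port_state.get(port) in USABLE_STATES:
            return port, False
    # FLFN: no local alternative, so neighbors must route around this switch.
    return None, True
```

The second element of the result models the FLFN case: it is true only when the switch itself has no usable port left for the destination.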

    Adaptive Routing Notification (ARN) enables the switch to send a report to the neighbor switches if its ports are congested above the threshold. This report causes the neighbor switches to select another output port to deliver the traffic to the destination LID.

    NOTE: Fast Link Fault Notification (FLFN) is supported for fat tree and quasi fat tree topologies only.

    3. Adaptive Routing Manager

    The Adaptive Routing Manager, implemented as a Subnet Manager plug-in, supports the Adaptive Routing, Adaptive Routing Notification, Fast Link Fault Recovery and Fast Link Fault Notification features.

    3.1 Prerequisites

    3.1.1 Minimum AR/FLFR Requirements

    • Switch-IB™
      • OpenSM: UFM® 6.0/MLNX_OFED 4.4 or later
      • MLNX-OS® 3.6.6162 or later (EDR)
      • Switch-IB™ firmware v11.1630.0216 for externally managed systems or later
    • Switch-IB™ 2
      • OpenSM: UFM® 6.0/MLNX_OFED 4.4 or later
      • MLNX-OS® 3.6.6162 or later (EDR)
      • Switch-IB™ 2 firmware v15.1630.0216 for externally managed systems or later

     

    3.1.2 DF_PLUS Requirements

      • OpenSM: UFM® 5.9.8/MLNX_OFED 4.3 or later
      • MLNX-OS® 3.6.6162 or later (EDR)
      • Switch-IB™ 2 firmware v15.1630.0216 for externally managed systems or later
      • MLNX_OFED 4.3 or later

     

    3.2 Installing the Adaptive Routing Plug-in

    The Adaptive Routing Manager is a Subnet Manager plug-in, i.e. it is a shared library (libarmgr.so) that is dynamically loaded by the Subnet Manager. The Adaptive Routing Manager is installed as a part of the Mellanox UFM or MLNX OFED installation.

     

    3.3 Running Subnet Manager with an Adaptive Routing Manager

    3.3.1 OpenSM Routing Settings

    Before enabling the Adaptive Routing Manager, set the routing engine in opensm.conf according to the fabric topology:

    • fat tree or quasi fat tree topology:

              routing_engine updn

              or

              routing_engine ftree

    • hypercube/enhanced hypercube topology:

              routing_engine dor

    • torus or mesh topology:

              routing_engine torus-2QoS

    • dragonfly plus topology:

              routing_engine minhop
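The topology-to-engine mapping above can be captured in a small lookup helper, for example when generating opensm.conf from a template. This is a minimal sketch; the dictionary, the function name, and the lowercase topology keys are assumptions made for illustration.

```python
# Topology -> acceptable OpenSM routing engines, taken from the list above.
# Note: OpenSM spells the torus engine with a hyphen ("torus-2QoS").
ROUTING_ENGINES = {
    "fat tree": ["updn", "ftree"],
    "quasi fat tree": ["updn", "ftree"],
    "hypercube": ["dor"],
    "enhanced hypercube": ["dor"],
    "torus": ["torus-2QoS"],
    "mesh": ["torus-2QoS"],
    "dragonfly plus": ["minhop"],
}


def routing_engine_line(topology: str, choice: int = 0) -> str:
    """Return the opensm.conf line for the given topology name."""
    engines = ROUTING_ENGINES[topology.strip().lower()]
    return f"routing_engine {engines[choice]}"
```

Calling `routing_engine_line("fat tree", 1)` selects the second listed engine (ftree) instead of the first (updn).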

     

    The Adaptive Routing (AR) Manager can be loaded/unloaded through UFM gv.cfg or through the MLNX_OFED OpenSM configuration file, and enabled/disabled through the AR manager configuration file.

     

    3.3.1 MLNX OFED

    3.3.1.1 Enabling Adaptive Routing

    In order to load the AR manager via MLNX_OFED, include the following settings in the OpenSM configuration file:

    #event plugin name(s)

    event_plugin_name armgr

     

    #options string that would be passed to the plugin

    event_plugin_options armgr --conf_file /etc/opensm/ar_mgr.conf

     

    qos FALSE // for DF_PLUS only
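Before restarting OpenSM, it can be useful to confirm that the plug-in settings above are actually present in the configuration file. The following is a minimal sketch of such a check; it assumes the simple "key value" line layout shown above, and the function names are hypothetical.

```python
from typing import Dict, List


def parse_opensm_conf(text: str) -> Dict[str, str]:
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def armgr_problems(settings: Dict[str, str]) -> List[str]:
    """List reasons why the AR Manager plug-in would not be loaded."""
    problems = []
    if "armgr" not in settings.get("event_plugin_name", "").split():
        problems.append("event_plugin_name does not list armgr")
    if "--conf_file" not in settings.get("event_plugin_options", "").split():
        problems.append("event_plugin_options has no --conf_file")
    return problems
```

An empty list from `armgr_problems` means both plug-in settings were found; it does not validate the AR Manager configuration file itself.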

     

    3.3.1.2 Disabling Adaptive Routing

    There are two ways to disable the Adaptive Routing Manager:

    1. Disable it explicitly in the Adaptive Routing configuration file. An HUP signal to OpenSM is required to apply the configuration.

        OR

    1. Change opensm.conf as follows:

    #event plugin name(s)

    event_plugin_name (null)

     

    #options string that would be passed to the plugin

    event_plugin_options (null)

    2. Restart OpenSM.

     

    3.3.2 Mellanox UFM

    3.3.2.1 Enabling Adaptive Routing

    To enable Adaptive Routing:

    1. Add 'armgr' to the gv.cfg file of UFM.

    event_plugin_name = osmufmpi armgr

    event_plugin_options = armgr --conf_file /opt/ufm/conf/opensm/ar_mgr.conf

    2. Disable QoS options in /opt/ufm/conf/opensm/opensm.conf.

    qos FALSE // for DF_PLUS only

    3. Restart UFM.

     

    3.3.2.2 Disabling Adaptive Routing

    There are two ways to disable the Adaptive Routing Manager:

     

    1. Disable it explicitly in the Adaptive Routing configuration file. An HUP signal to OpenSM is required to apply the configuration.

     

    2. Restart UFM.

        OR

    1. Remove the Adaptive Routing definitions from the gv.cfg file of UFM, so that the lines will appear as follows.

    event_plugin_name = osmufmpi

    event_plugin_options = (null)

    2. Restart UFM.

    NOTE: The Adaptive Routing mechanism is automatically disabled once the switch receives the setting of the usual linear routing table (LFT).

    Therefore, if you do not wish to use Adaptive Routing, no action is required to clear the Adaptive Routing configuration on the switches.

    3.4 Adaptive Routing Manager Options File

    The default AR Manager configuration file is located at /opt/ufm/files/conf/opensm/ar_mgr.conf.

    The AR Manager configuration file contains two types of parameters:

    1. General options - options which describe the AR Manager behavior and the AR parameters that will be applied to all the switches in the fabric.
    2. Per-switch options - options which describe specific switch behavior.

    Note the following:

      • The Adaptive Routing configuration file is case sensitive.
      • You can specify options for a nonexistent switch GUID. These options are ignored until a switch with a matching GUID is added to the fabric.
      • The Adaptive Routing configuration file is parsed every AR Manager cycle, which in turn is executed at every heavy sweep of the Subnet Manager.
      • If the AR Manager fails to parse the configuration file, default settings will be used for all the options.

    3.4.1 General AR Manager Options

    3.4.1.1 MLNX OFED 3.2/UFM 5.6 and Beyond

     

    Each option is listed with its description and values.

    ENABLE: <true|false>
        Enable/disable the AR plug-in.
        Default: true

    AR_ENABLE: <true|false>
        Enable/disable Adaptive Routing on fabric switches.
        Note that if a switch was identified by the AR Manager as a device that does not support AR, the AR Manager will not try to enable AR on this switch. If the firmware of this switch was updated to support AR, the AR Manager must be restarted (by restarting the Subnet Manager) to allow it to configure AR on this switch.
        This option can be changed on-the-fly.
        Default: true

    ARN_ENABLE: <true|false>
        Enable/disable the Adaptive Routing Notification feature on fabric switches.
        If AR_ENABLE is set to FALSE, the ARN feature is automatically disabled.
        This option can be changed on-the-fly.
        Default: false

    FLFR_ENABLE: <true|false>
        Enable/disable Fast Link Fault Recovery (FLFR) on fabric switches.
        This option can be changed on-the-fly.
        Default: false

    FLFR_REMOTE_DISABLE: <true|false>
        Avoid sending link fault notifications to remote switches.
        If FLFR_ENABLE is set to FALSE, FLFR_REMOTE is automatically disabled.
        This option can be changed on-the-fly.
        Default: false (send link fault notifications to remote switches)

    EN_SL_MASK
        Bitmask of SLs on which AR will be enabled (VL if configured VL as SL).
        Values: <0x0000 - 0xFFFF>
        Default: 0xFFFE

    DISABLE_TR_MASK (experimental)
        Bitmask of disabled transport types:
            Bit 0 = UD
            Bit 1 = RC
            Bit 2 = UC
            Bit 3 = DCT
        Values: <0x0 - 0xF>
        Default: 0x0

    AR_ALGORITHM: <LAG | TREE | DF_PLUS>
        Adaptive Routing algorithm:
        • LAG: Port groups are created out of "parallel" links, that is, links that connect the same pair of switches.
        • TREE: All the ports with minimal hops to the destination are in the same group. Must run together with the updn or ftree routing engine.
        • DF_PLUS: An algorithm designed for the Dragonfly plus topology.

    AR_MODE: <bounded|free>
        Adaptive Routing mode:
        • free: no constraints on output port selection.
        • bounded: the switch does not change the output port during the same transmission burst. This mode minimizes the appearance of out-of-order packets.
        This option can be changed on-the-fly.
        Default: bounded

    AGEING_TIME: <usec>
        Applicable to bounded AR mode only. Specifies the amount of time that must pass without traffic before the switch may declare a transmission burst as finished (32-bit value).
        This option can be changed on-the-fly.
        Default: 30

    MAX_CAS_ON_SPINE[1]
        Applicable to the DF_PLUS algorithm only.
        Specifies the maximal number of hosts attached to the spine switches in DragonFly islands, including Virtual TCAs (SHARP Aggregation nodes and GWs).
        Values: 2

    OP_MODE
        Applicable to the DF_PLUS algorithm only:
            0 - spine-to-spine non-DragonFly routes not allowed
            1 - spine-to-spine non-DragonFly routes allowed
        Values: 0

    LOG_FILE: <full path>
        AR Manager log file: /opt/ufm/files/log/ar_mgr.log
        This option cannot be changed on-the-fly.
        Default: /var/log/armgr.log

    LOG_SIZE: <size in MB>
        This option defines the maximal AR Manager log file size in MB. The log file is truncated and restarted upon reaching this limit.
        This option cannot be changed on-the-fly.
        0: unlimited log file size.
        Default: 5

     


    [1] This option is supported in MLNX_OFED and Mellanox UFM starting from v4.3 and v5.9.8, respectively.

     

    NOTE: Adaptive Routing Notification, Fast Link Fault Recovery and Link Fault notification features are applicable only to Fat tree or Quasi Fat Tree topologies in current release.
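EN_SL_MASK and DISABLE_TR_MASK are plain bitmasks, so a configured value can be decoded to check what it enables or disables. The sketch below assumes that bit N of EN_SL_MASK corresponds to SL N (an assumption, although it matches the default 0xFFFE leaving only SL 0 out); the transport bit names are taken from the DISABLE_TR_MASK description. The function names are hypothetical.

```python
from typing import List

# Bit positions as listed for DISABLE_TR_MASK.
TRANSPORT_BITS = {0: "UD", 1: "RC", 2: "UC", 3: "DCT"}


def enabled_sls(en_sl_mask: int) -> List[int]:
    """Return the SLs whose bit is set in EN_SL_MASK (assumes bit N = SL N)."""
    return [sl for sl in range(16) if en_sl_mask & (1 << sl)]


def disabled_transports(disable_tr_mask: int) -> List[str]:
    """Return the transport types whose bit is set in DISABLE_TR_MASK."""
    return [name for bit, name in TRANSPORT_BITS.items()
            if disable_tr_mask & (1 << bit)]
```

Under this reading, the default EN_SL_MASK of 0xFFFE enables AR on SLs 1 through 15, and a DISABLE_TR_MASK of 0x1 disables AR for UD traffic only.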

    3.4.1.2 Adaptive Routing Option File Example - Tree

    # -----

    # Adaptive Routing configuration file

    # -----

     

    # General configuration options

    # Enable / Disable the plugin (AR and RN)

    ENABLE: TRUE;             # Values: < TRUE | FALSE>

     

    # Enable / Disable Adaptive Routing on fabric switches

    AR_ENABLE: TRUE;    # Values: < TRUE | FALSE>

     

    # Enable / Disable Adaptive Routing Notifications on fabric switches

    ARN_ENABLE: FALSE;   # Values: < TRUE | FALSE>

     

    # Enable / Disable Fast Link Fault Recovery (FLFR) on fabric switches

    FLFR_ENABLE: TRUE;  # Values: < TRUE | FALSE>

    # Avoid sending link fault notifications to remote switches

    FLFR_REMOTE_DISABLE: FALSE; # Values: < TRUE | FALSE>

     

    # Adaptive Routing mode

    AR_MODE: FREE;       #BOUNDED / FREE

    # Bitmask of enabled SLs (VL if configured VL as SL)

    EN_SL_MASK: 0xFFFE;     # Value: < 0x0000 - 0xFFFF >

     

    # Bitmask of disabled transport types

    #   Bit 0= UD

    #   Bit 1= RC

    #   Bit 2= UC

    #   Bit 3= DCT

    DISABLE_TR_MASK: 0x0;   # Value: <0x0 - 0xF> Default: 0x0

     

    # Transmission burst ageing time (usec)

    AGEING_TIME: 55;

     

    # Specify log file name - This option can not be changed on the fly

    LOG_FILE: /opt/ufm/files/log/ar_mgr.log;

     

    # Specify log size - This option can not be changed on the fly

    LOG_SIZE: 70;             # In MB

     

    AR_ALGORITHM: TREE;    # Values: < TREE | DF_PLUS | LAG >
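An options file in the format shown above can be read back with a few lines of Python, for example to audit settings across several Subnet Manager hosts. This is a minimal sketch and not the AR Manager's own parser: it assumes one "KEY: VALUE;" pair per line with "#" starting a comment, and it does not treat per-switch SWITCH blocks specially. The function name is hypothetical.

```python
import re
from typing import Dict

# KEY: VALUE;  with optional surrounding whitespace.
OPTION_RE = re.compile(r"^\s*([A-Z_]+)\s*:\s*([^;]*?)\s*;")


def parse_ar_options(text: str) -> Dict[str, str]:
    """Return the general options of an ar_mgr.conf-style file as strings."""
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing or full-line comments
        match = OPTION_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
    return options
```

Values are returned as strings exactly as written in the file, so numeric options such as AGEING_TIME still need an explicit `int()` conversion by the caller.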

    3.4.1.3 Adaptive Routing Option File Example – DF_PLUS

    # -----

    # Adaptive Routing configuration file

    # -----

     

    # General configuration options

    # Enable / Disable the plugin (AR and RN)

    ENABLE: TRUE;             # Values: < TRUE | FALSE>

     

    # Enable / Disable Adaptive Routing on fabric switches

    AR_ENABLE: TRUE;    # Values: < TRUE | FALSE>

     

    # Enable / Disable Adaptive Routing Notifications on fabric switches

    ARN_ENABLE: FALSE;   # Values: < TRUE | FALSE>

     

    # Enable / Disable Fast Link Fault Recovery (FLFR) on fabric switches

    FLFR_ENABLE: FALSE;  # Values: < TRUE | FALSE>

     

    # Avoid sending link fault notifications to remote switches

    FLFR_REMOTE_DISABLE: FALSE; # Values: < TRUE | FALSE>

     

    # Adaptive Routing mode

    AR_MODE: FREE;       #BOUNDED / FREE

     

    # Bitmask of enabled SLs (VL if configured VL as SL)

    EN_SL_MASK: 0xFFFE;     # Value: < 0x0000 - 0xFFFF >

     

    # Bitmask of disabled transport types

    #   Bit 0= UD

    #   Bit 1= RC

    #   Bit 2= UC

    #   Bit 3= DCT

    DISABLE_TR_MASK: 0x0;   # Value: <0x0 - 0xF> Default: 0x0

     

    # Transmission burst ageing time (usec)

    AGEING_TIME: 55;

     

    # Specify log file name - This option can’t be changed on the fly

    LOG_FILE: /opt/ufm/files/log/ar_mgr.log;

     

    # Specify log size - This option can’t be changed on the fly

    LOG_SIZE: 70;             # In MB

     

    AR_ALGORITHM: DF_PLUS;    # Values: < TREE | DF_PLUS | LAG >

     

    OP_MODE: 0x1; # enable spine-to-spine non-DF routing

     

    MAX_CAS_ON_SPINE: 1;

    3.4.1.4 AR Configuration Files (Before MLNX OFED 3.2/UFM 5.6)

     

    Each option is listed with its description and values.

    ENABLE: <true|false>
        Enable/disable Adaptive Routing on fabric switches.
        Note that if a switch was identified by the AR Manager as a device that does not support AR, the AR Manager will not try to enable AR on this switch. If the firmware of this switch was updated to support AR, the AR Manager must be restarted (by restarting the Subnet Manager) to allow it to configure AR on this switch.
        This option can be changed on-the-fly.
        Default: true

    EN_SL_MASK
        Bitmask of SLs on which AR will be enabled (VL if configured VL as SL).
        Values: <0x0000 - 0xFFFF>
        Default: 0xFFFE

    DISABLE_TR_MASK (experimental)
        Bitmask of disabled transport types:
            Bit 0 = UD
            Bit 1 = RC
            Bit 2 = UC
            Bit 3 = DCT
        Values: <0x0 - 0xF>
        Default: 0x0

    AR_ALGORITHM: <LAG | TREE | DF_PLUS>
        Adaptive Routing algorithm:
        • LAG: Port groups are created out of "parallel" links, that is, links that connect the same pair of switches.
        • TREE: All the ports with minimal hops to the destination are in the same group. Must run together with the UPDN routing engine.
        • DF_PLUS: Experimental algorithm designed for the Dragonfly plus topology.

    OP_MODE
        Operation mode bitmask that controls the algorithm behavior.
            Bit 0 - enable spine-to-spine non-DF routing (applicable only if the DF_PLUS algorithm is selected).
        Default: 0

    AR_MODE: <bounded|free>
        Adaptive Routing mode:
        • free: no constraints on output port selection.
        • bounded: the switch does not change the output port during the same transmission burst. This mode minimizes the appearance of out-of-order packets.
        This option can be changed on-the-fly.
        Default: bounded

    AGEING_TIME: <usec>
        Applicable to bounded AR mode only. Specifies the amount of time that must pass without traffic before the switch may declare a transmission burst as finished (32-bit value).
        This option can be changed on-the-fly.
        Default: 30

    MAX_ERRORS: <N>, ERROR_WINDOW: <N>
        Deprecated.

    LOG_FILE: <full path>
        AR Manager log file: /opt/ufm/files/log/ar_mgr.log
        This option cannot be changed on-the-fly.
        Default: /var/log/armgr.log

    LOG_SIZE: <size in MB>
        This option defines the maximal AR Manager log file size in MB. The log file is truncated and restarted upon reaching this limit.
        This option cannot be changed on-the-fly.
        0: unlimited log file size.
        Default: 5

    3.4.1.5 Adaptive Routing Manager Options File Example (before MLNX OFED 3.2/UFM 5.6)

    # -----

    # Adaptive Routing configuration file

    # -----

     

    # General configuration options

    # Enable / Disable Adaptive Routing on fabric switches

    ENABLE: TRUE;             # Values: < TRUE | FALSE>

     

    # Adaptive Routing mode

    AR_MODE: FREE;       #BOUNDED / FREE

     

    # Bitmask of enabled SLs (VL if configured VL as SL)

    EN_SL_MASK: 0xFFFE;     # Value: < 0x0000 - 0xFFFF >

     

    # Bitmask of disabled transport types

    #   Bit 0= UD

    #   Bit 1= RC

    #   Bit 2= UC

    #   Bit 3= DCT

    DISABLE_TR_MASK: 0x0;   # Value: <0x0 - 0xF> Default: 0x0

     

    # Transmission burst ageing time (usec)

    AGEING_TIME: 55;

     

    # Specify log file name - This option can not be changed on the fly

    LOG_FILE: /opt/ufm/files/log/ar_mgr.log;

     

    # Specify log size - This option can not be changed on the fly

    LOG_SIZE: 70;             # In MB

     

    AR_ALGORITHM: TREE;    # Values: < TREE | DF_PLUS | LAG >

     

    #OP_MODE: 0x1; # enable spine-to-spine non-DF routing

     

    # Switch specific configuration options

    #SWITCH 0x11111 {

    #   ENABLE: false;

    #   AGEING_TIME: 77;

    #}

     

    #SWITCH 0x22222 {

    #   ENABLE: false;

    #}

    3.4.2 Per-switch AR/FLFR Options

    A user can provide per-switch configuration options with the following syntax:

    SWITCH <GUID> {
        <switch option 1>;
        <switch option 2>;
        ...
    }

    The following are the per-switch options:

                        

    Each option is listed with its description and values.

    ENABLE: <true|false>
        Allows you to enable/disable AR/FLFR on this switch. If the general ENABLE option value is set to 'false', this per-switch option is ignored.
        This option can be changed on-the-fly.
        Default: true

    AGEING_TIME: <usec>
        Applicable to bounded AR mode only. Specifies the amount of time that must pass without traffic before the switch may declare a transmission burst as finished (32-bit value).
        In the per-switch configuration, this option refers to the particular switch only.
        This option can be changed on-the-fly.
        Default: 30
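The per-switch syntax can be read back in the same way as the general options. The sketch below extracts each SWITCH block and its options into a dictionary keyed by GUID. It is an illustration only, assuming GUIDs are written in hexadecimal with a 0x prefix and that "#" starts a comment; the function name is hypothetical and this is not the AR Manager's parser.

```python
import re
from typing import Dict

SWITCH_RE = re.compile(r"SWITCH\s+(0x[0-9a-fA-F]+)\s*\{([^}]*)\}")
OPTION_RE = re.compile(r"([A-Z_]+)\s*:\s*([^;]+?)\s*;")


def parse_switch_blocks(text: str) -> Dict[str, Dict[str, str]]:
    """Return {guid: {option: value}} for every SWITCH block in the text."""
    # Strip '#' comments first, so commented-out blocks are ignored.
    cleaned = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    return {
        guid: dict(OPTION_RE.findall(body))
        for guid, body in SWITCH_RE.findall(cleaned)
    }
```

Because comments are removed before matching, SWITCH blocks that are commented out in the file do not appear in the result.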

     

    3.5 MLNX OFED configuration for Adaptive Routing

    For user space applications:

    Relaxed Packet Ordering (Out-of-Order/OOO) can be enabled by setting environment variables.

    It takes effect when the QP supports OOO and the QP state moves to RTR (by calling ibv_modify_qp or ibv_modify_qp_ex).

    Enable all mlx5 devices:

    $ export MLX5_RELAXED_PACKET_ORDERING_ON="all"

    Enable mlx5_0 only:

    $ export MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0"

    Enable mlx5_0, mlx5_3 and mlx5_4:

    $ export MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0 mlx5_3 mlx5_4"

                  

    For persistence, add the above environment variables to the application launch scripts.
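When the application is started from a Python launcher rather than a shell script, the same variable can be placed in the child process environment. The helper below is a minimal sketch; `relaxed_ordering_env` is a hypothetical name, and only the MLX5_RELAXED_PACKET_ORDERING_ON variable and its "all" or space-separated device list syntax come from the text above.

```python
import os
from typing import Dict, Iterable, Optional


def relaxed_ordering_env(devices: Optional[Iterable[str]] = None,
                         base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with relaxed packet ordering enabled.

    devices=None enables all mlx5 devices; otherwise pass device names
    such as ["mlx5_0", "mlx5_3"].
    """
    env = dict(os.environ if base is None else base)
    value = "all" if devices is None else " ".join(devices)
    env["MLX5_RELAXED_PACKET_ORDERING_ON"] = value
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the application, so the setting applies to that process only and not to the whole login session.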

    For Kernel ULP QPs:

    An administrator can enable Relaxed Packet Ordering (Out-of-Order/OOO) for kernel QPs by modifying the driver's debugfs entry.

    It takes effect when the QP supports OOO and the QP state moves to RTR during mlx5_ib_modify_qp().

                  

    $ echo <0|1> > /sys/kernel/debug/mlx5/<pci-bus>/ooo/enable

                  

    For persistence, administrators should update the file /etc/infiniband/mlx5.conf:

    # Enable all mlx5 devices
    MLX5_RELAXED_PACKET_ORDERING_ON="all"

    # Enable mlx5_0 only
    MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0"

    # Enable mlx5_0, mlx5_3 and mlx5_4
    MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0 mlx5_3 mlx5_4"

    # To disable the feature, use the MLX5_RELAXED_PACKET_ORDERING_OFF variable,
    # which supports the same syntax as MLX5_RELAXED_PACKET_ORDERING_ON:
    MLX5_RELAXED_PACKET_ORDERING_OFF="mlx5_1"

     

    3.6 Adaptive Routing and FLFR Validation with ibdiagnet

    3.6.1 Adaptive Routing and FLFR Dump Files

    Following a successful deployment of adaptive routing and FLFR configuration by OpenSM to the IB fabric, it is possible to run the following to validate adaptive routing correctness:

    Mellanox UFM:

    #/opt/ufm/opensm/bin/ibdiagnet -r --r_opt vs,far --skip pm,pkey,links,temp_sensing,aguid,speed_width_check,nodes_info,sm,dup_guids,dup_node_desc,vs_cap_gmp,lids

     

    Summary

    -I- Stage                     Warnings   Errors Comment  

    -I- Discovery                 0          0             

    -I- Routing                   0          0

    Mellanox OFED:

    #/usr/bin/ibdiagnet -r --r_opt=vs,far --skip pm,pkey,links,temp_sensing,aguid,speed_width_check,nodes_info,sm,dup_guids,dup_node_desc,vs_cap_gmp,lids

     

    Summary

    -I- Stage                     Warnings   Errors Comment  

    -I- Discovery                 0          0        

    -I- Routing                   0          0

    The following dump files will be generated:

    • /var/tmp/ibdiagnet2/ibdiagnet2.ar
    • /var/tmp/ibdiagnet2/ibdiagnet2.plft
    • /var/tmp/ibdiagnet2/ibdiagnet2.vl2vl
    • /var/tmp/ibdiagnet2/ibdiagnet2.far
    • /var/tmp/ibdiagnet2/ibdiagnet2.fdbs

    FAR file example:

    #cat /var/tmp/ibdiagnet2/ibdiagnet2.far

     

    dump_ar: Switch 0xec0d9a030027dbd0 fr_en: 1 en_sl: 2, 8

     

     

    ByTransportDisable: (0x1). UD,

     

     

    Groups Definition:

     

     

    Group     Sub Group      Ports

    -------------------------------------

    0         0             

    1         0              15,16,17,18,19,20,21,22,

    2         0             

    3         0              15,16,17,18,19,20,21,22,

    4         0              15,16,17,18,19,20,21,22,

    5         0              15,16,17,18,19,20,21,22,

    6         0              15,16,17,18,19,20,21,22,

    7         0              15,16,17,18,19,20,21,22,

    8         0              15,16,17,18,19,20,21,22,

    9         0              15,16,17,18,19,20,21,22,

     

     

    LFT Definition:

     

     

    PLFT_NUM: 0

     

     

    LID       Static Port    Lid State      Group

    -----------------------------------------------

    0x0000    UNREACHABLE    Static         0

    0x0001    15             Free           6

    0x0002    15             Free           1

    0x0003    15             Free           5

    0x0004    17             Free           1

    0x0005    UNREACHABLE    Static         0

    0x0006    7              Static         0

    0x0007    UNREACHABLE    Static         0

    0x0008    9              Static         0

    0x0009    15             Free           5
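The LFT section of the FAR dump is easy to post-process, for example to count how many LIDs are routed adaptively. The sketch below is based only on the column layout visible in the example above (LID, static port, LID state, group); the real file format may contain other sections or columns, so treat the parsing rule and the function names as assumptions.

```python
from collections import Counter
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def parse_far_lft(text: str) -> List[Row]:
    """Extract LFT rows: LID, static port, LID state, AR group."""
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        # LFT rows in the example have four columns and start with a hex LID.
        if len(fields) == 4 and fields[0].lower().startswith("0x"):
            lid, static_port, state, group = fields
            rows.append({"lid": lid, "static_port": static_port,
                         "state": state, "group": int(group)})
    return rows


def lids_per_state(rows: List[Row]) -> Counter:
    """Count LIDs by their state column (for example Static or Free)."""
    return Counter(row["state"] for row in rows)
```

In the example dump, LIDs in the Free state point to a non-empty AR group, while Static LIDs use only the static port.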

    3.6.2 Adaptive Routing Credit-Loop Testing

    Credit-loop validation on a particular service level (SL) used for Adaptive Routing traffic can be performed as follows:

    Mellanox UFM:

    #/opt/ufm/opensm/bin/ibdiagnet -r --r_opt=vs,sl=2 --skip pm,pkey,links,temp_sensing,aguid,speed_width_check,nodes_info,sm,dup_guids,dup_node_desc,vs_cap_gmp,lids

    MLNX_OFED:

    #/usr/bin/ibdiagnet -r --r_opt=vs,sl=2 --skip pm,pkey,links,temp_sensing,aguid,speed_width_check,nodes_info,sm,dup_guids,dup_node_desc,vs_cap_gmp,lids
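To repeat the credit-loop check for several SLs, the command line can be assembled programmatically. The helper below is a sketch that only builds the argument list, using exactly the options shown above; `credit_loop_cmd` is a hypothetical name, and the 0-15 range check reflects the 16-bit EN_SL_MASK rather than anything ibdiagnet-specific.

```python
from typing import List

# Stages skipped in the commands above.
SKIPPED_STAGES = [
    "pm", "pkey", "links", "temp_sensing", "aguid", "speed_width_check",
    "nodes_info", "sm", "dup_guids", "dup_node_desc", "vs_cap_gmp", "lids",
]


def credit_loop_cmd(sl: int, ibdiagnet: str = "/usr/bin/ibdiagnet") -> List[str]:
    """Build the ibdiagnet argument list for a credit-loop check on one SL."""
    if not 0 <= sl <= 15:
        raise ValueError("SL must be in the range 0-15")
    return [ibdiagnet, "-r", f"--r_opt=vs,sl={sl}",
            "--skip", ",".join(SKIPPED_STAGES)]
```

On a host with ibdiagnet installed, the returned list can be passed to `subprocess.run(cmd, check=True)`; for UFM, pass `/opt/ufm/opensm/bin/ibdiagnet` as the second argument.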