HowTo Dump Driver Configuration (via ethtool)

Version 10

    The ability to dump the driver configuration helps with debugging and understanding problems on site. This post is intended for software developers and advanced users, and applies only to ConnectX-4 adapters (mlx5 driver). The feature is available in MLNX_OFED 3.3.2 and later, for Ethernet only. MLNX_OFED 3.4 updated the output of the parser; see below.

     

    References

    • MLNX_OFED User Manual.

     

    Setup

    A server installed with MLNX_OFED 3.4 or later.

     

    Dump Parameter (Bitmap Flag)

    This bitmap parameter is used to set the type of dump.

     

    Value   Description
    1       MST dump
    2       Ring dump (software context information for SQs, EQs, RQs, CQs)
    3       MST dump + Ring dump (1+2)
    4       Clear this parameter
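
Because the parameter is a bitmap, values 1 and 2 combine with a bitwise OR to give 3. The sketch below illustrates this; describe_dump_flag is a hypothetical helper written for this post, not part of ethtool or MLNX_OFED, and it only covers the values listed in the table.

```shell
#!/bin/sh
# describe_dump_flag VALUE
# Hypothetical helper: maps a dump flag value to the description
# given in the table above.
describe_dump_flag() {
    case "$1" in
        1) echo "MST dump" ;;
        2) echo "Ring dump" ;;
        3) echo "MST dump + Ring dump" ;;
        4) echo "Clear this parameter" ;;
        *) echo "unknown flag value: $1" >&2; return 1 ;;
    esac
}

# MST dump (1) OR Ring dump (2) gives the combined value 3.
both=$(( 1 | 2 ))
describe_dump_flag "$both"
```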

     

     

    Configuration

    1. Set the dump bitmap flag with -W (uppercase). This example uses the value 3 (both MST and ring dump).

    # ethtool -W ens1f0 3

     

    2. Write the dump to a file with -w (lowercase), followed by the data keyword and the filename for the dump.

    # ethtool -w ens1f0 data /tmp/dump.bin

     

    3. (Optional) To get the flag value, version and size of the dump, run the command without the data keyword and filename.

    # ethtool -w ens1f0

    flag: 3, version: 1, length: 4312
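
If a script needs these values, the summary line can be split into KEY=VALUE pairs. This is a minimal sketch that assumes the line has exactly the form shown above; parse_dump_info is a hypothetical helper name, and the sample line is fed in with printf so the example runs without a device.

```shell
#!/bin/sh
# parse_dump_info: read a "flag: N, version: N, length: N" line on stdin
# and print one KEY=VALUE pair per line.
parse_dump_info() {
    awk '$1 == "flag:" {
        gsub(/,/, "")            # drop the commas, awk re-splits the fields
        print "flag=" $2
        print "version=" $4
        print "length=" $6
    }'
}

# Sample line copied from the output above.
printf '%s\n' 'flag: 3, version: 1, length: 4312' | parse_dump_info
```

On a live system the same function could be fed from the command in this step, for example: ethtool -w ens1f0 | parse_dump_info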

     

    4. To parse the dump file, run mlnx_dump_parser with the following arguments:

    • -f for the file to be parsed (the file we just created).
    • -m for the mst dump file.
    • -r for the ring dump file.

     

    # mlnx_dump_parser -f /tmp/dump.bin -m mst_dump_demo.txt -r ring_dump_demo.txt

    Version: 1 Flag: 3 Number of blocks: 123 Length 327584

    MCION module number: 0 status: | present |

    DRIVER VERSION: 1-23 (03 Mar 2015)

    DEVICE NAME 0000:81:00.0:ens1f0

    Parsing Complete!

     

    Note: MLNX_OFED 3.4 updated the output by adding the MCION (Management Cable IO and Notifications Register) firmware register. The status can be "present"/"not-present" and/or "rx los" / "tx fault".

    For example, when no cable is inserted, the MCION line reports that the module is not present:

    MCION module number: 0 status: | non-present |
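
Steps 1, 2 and 4 can be chained in a small wrapper. The sketch below is an illustration only: collect_dump and run_or_print are hypothetical names, and the output filenames simply reuse the ones from this post. By default it prints the commands instead of running them; setting RUN=1 would execute them, which assumes root privileges and an mlx5 interface.

```shell
#!/bin/sh
# run_or_print CMD...: execute the command when RUN=1, otherwise print it.
run_or_print() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# collect_dump IFACE FLAG OUTDIR
# Sets the dump flag, writes the binary dump, then parses it.
collect_dump() {
    iface=$1
    flag=$2
    outdir=$3
    run_or_print ethtool -W "$iface" "$flag" &&
    run_or_print ethtool -w "$iface" data "$outdir/dump.bin" &&
    run_or_print mlnx_dump_parser -f "$outdir/dump.bin" \
        -m "$outdir/mst_dump_demo.txt" -r "$outdir/ring_dump_demo.txt"
}

# Dry run: prints the three commands for interface ens1f0 and flag 3.
collect_dump ens1f0 3 /tmp
```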

     

    5. Open the files

     

    The MST dump can be opened, but to interpret it you should consult the Mellanox support team.

    For example:

    # cat mst_dump_demo.txt

    0x00000000 0x01002000

    0x00000004 0x00000000

    0x00000008 0x00000000

    0x0000000c 0x00000000

    0x00000010 0x00000000

    0x00000014 0x00000000

    0x00000018 0x00000000

    ...
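
Before sending the file to support, it can be handy to see which words are non-zero. The sketch below assumes every data line holds two hexadecimal columns, as in the sample above (read here as an offset followed by a value, which is an assumption); nonzero_words is a hypothetical helper and does not interpret the contents.

```shell
#!/bin/sh
# nonzero_words [FILE...]: print only the two-column hex lines whose
# second column is not all zeros. Other lines (such as "...") are skipped.
nonzero_words() {
    awk 'NF == 2 &&
         $1 ~ /^0x[0-9a-fA-F]+$/ &&
         $2 ~ /^0x[0-9a-fA-F]+$/ &&
         $2 !~ /^0x0+$/' "$@"
}

# Sample lines copied from the MST dump above.
nonzero_words <<'EOF'
0x00000000 0x01002000
0x00000004 0x00000000
0x00000008 0x00000000
...
EOF
```

With a real dump, the file is passed as an argument: nonzero_words mst_dump_demo.txt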

     

    The Ring dump file can help developers debug ring-related issues.

    Output example:

    # cat ring_dump_demo.txt

    SQ TYPE: 3, WQN: 102, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 65536, GROUP_IP: 0

    CQ TYPE: 5, WQN: 20, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 1024, GROUP_IP: 0

    RQ TYPE: 4, WQN: 103, PI: 15, CI: 0, STRIDE: 5, SIZE: 16, WQE_NUM: 512, GROUP_IP: 0

    CQ TYPE: 5, WQN: 21, PI: 0, CI: 0, STRIDE: 6, SIZE: 16384, WQE_NUM: 16384, GROUP_IP: 0

    EQ TYPE: 6, CI: 1, SIZE: 0, IRQN: 109, EQN: 19, NENT: 2048, MASK: 0, INDEX: 0, GROUP_ID: 0

    SQ TYPE: 3, WQN: 106, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 65536, GROUP_IP: 1

    CQ TYPE: 5, WQN: 23, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 1024, GROUP_IP: 1

    RQ TYPE: 4, WQN: 107, PI: 15, CI: 0, STRIDE: 5, SIZE: 16, WQE_NUM: 512, GROUP_IP: 1

    CQ TYPE: 5, WQN: 24, PI: 0, CI: 0, STRIDE: 6, SIZE: 16384, WQE_NUM: 16384, GROUP_IP: 1

    EQ TYPE: 6, CI: 1, SIZE: 0, IRQN: 110, EQN: 20, NENT: 2048, MASK: 0, INDEX: 1, GROUP_ID: 1

    SQ TYPE: 3, WQN: 110, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 65536, GROUP_IP: 2

    CQ TYPE: 5, WQN: 26, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 1024, GROUP_IP: 2

    RQ TYPE: 4, WQN: 111, PI: 15, CI: 0, STRIDE: 5, SIZE: 16, WQE_NUM: 512, GROUP_IP: 2

    CQ TYPE: 5, WQN: 27, PI: 0, CI: 0, STRIDE: 6, SIZE: 16384, WQE_NUM: 16384, GROUP_IP: 2

    ...
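
A long ring dump is easier to scan once it is summarized. The sketch below counts the entries per queue type and lists the queues whose PI and CI values differ. It assumes the "KEY: VALUE," layout shown above and reads PI and CI as producer and consumer index, which is an interpretation rather than something stated in the parser output; summarize_rings is a hypothetical helper.

```shell
#!/bin/sh
# summarize_rings [FILE...]: list queues where PI differs from CI, then
# print the number of entries per queue type.
summarize_rings() {
    awk '
    /^[ \t]*(SQ|RQ|CQ|EQ) TYPE:/ {
        gsub(/,/, "")                   # drop commas, fields are re-split
        count[$1]++
        pi = ""; ci = ""; wqn = ""
        for (i = 2; i < NF; i += 2) {   # "KEY:" at $i, value at $(i + 1)
            if ($i == "PI:")  pi  = $(i + 1)
            if ($i == "CI:")  ci  = $(i + 1)
            if ($i == "WQN:") wqn = $(i + 1)
        }
        if (pi != "" && ci != "" && pi != ci)
            printf "%s WQN %s: PI=%s CI=%s\n", $1, wqn, pi, ci
    }
    END {
        n = split("SQ RQ CQ EQ", order, " ")
        for (k = 1; k <= n; k++)
            if (order[k] in count)
                printf "%s entries: %d\n", order[k], count[order[k]]
    }' "$@"
}

# Sample lines copied from the ring dump above (first group only).
summarize_rings <<'EOF'
SQ TYPE: 3, WQN: 102, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 65536, GROUP_IP: 0
CQ TYPE: 5, WQN: 20, PI: 0, CI: 0, STRIDE: 6, SIZE: 1024, WQE_NUM: 1024, GROUP_IP: 0
RQ TYPE: 4, WQN: 103, PI: 15, CI: 0, STRIDE: 5, SIZE: 16, WQE_NUM: 512, GROUP_IP: 0
CQ TYPE: 5, WQN: 21, PI: 0, CI: 0, STRIDE: 6, SIZE: 16384, WQE_NUM: 16384, GROUP_IP: 0
EQ TYPE: 6, CI: 1, SIZE: 0, IRQN: 109, EQN: 19, NENT: 2048, MASK: 0, INDEX: 0, GROUP_ID: 0
EOF
```

With a real dump, the file is passed as an argument: summarize_rings ring_dump_demo.txt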