8 Replies Latest reply on May 3, 2017 11:00 AM by acastano

    command 0x54 failed: fw status = 0x2

    acastano

      Hi,

      I installed two MT26448 on different servers.

      Both are working fine except for the "kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2" error that floods the system log.

      I updated the NIC drivers and tried to flash the firmware with no luck.

       

      Can you help me with this, please?

       

      You can see below all the troubleshooting steps:

      # tail /var/log/messages
      Mar 29 12:21:18 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:18 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:18 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:18 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:18 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:18 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:19 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:19 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:19 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      Mar 29 12:21:19 CDN2 kernel: mlx4_core 0000:04:00.0: command 0x54 failed: fw status = 0x2
      
      
      # uname -a
      Linux CDN2 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
      
      
      # cat /etc/redhat-release
      CentOS Linux release 7.3.1611 (Core)
      
      
      # lspci | grep Mellanox
      04:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev a0)
      
      
      # ethtool -i eth2
      driver: mlx4_en
      version: 2.2-1 (Feb 2014)
      firmware-version: 2.7.0
      expansion-rom-version:
      bus-info: 0000:04:00.0
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: yes
      
      
      # lspci -vv -s 04:00.0
      04:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev a0)
              Subsystem: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]
              Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0, Cache Line Size: 64 bytes
              Interrupt: pin A routed to IRQ 16
              Region 0: Memory at df500000 (64-bit, non-prefetchable) [size=1M]
              Region 2: Memory at de800000 (64-bit, prefetchable) [size=8M]
              Expansion ROM at df400000 [disabled] [size=1M]
              Capabilities: [40] Power Management version 3
                      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                      Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
              Capabilities: [48] Vital Product Data
                      Product Name: Hawk Dual Port
                      Read-only fields:
                              [PN] Part number: 59Y1905
                              [EC] Engineering changes: A1
                              [SN] Serial number: YK502000004T
                              [V0] Vendor specific: PCIe Gen2 x8
                              [RV] Reserved: checksum good, 0 byte(s) reserved
                      Read/write fields:
                              [V1] Vendor specific: N/A
                              [YA] Asset tag: N/A
                              [RW] Read-write area: 106 byte(s) free
                      End
              Capabilities: [9c] MSI-X: Enable+ Count=256 Masked-
                      Vector table: BAR=0 offset=0007c000
                      PBA: BAR=0 offset=0007d000
              Capabilities: [60] Express (v2) Endpoint, MSI 00
                      DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                              ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 25.000W
                      DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                              RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                              MaxPayload 128 bytes, MaxReadReq 512 bytes
                      DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                      LnkCap: Port #8, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited
                              ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                      LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                              ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                      LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                      DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                      DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                      LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                               Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                               Compliance De-emphasis: -6dB
                      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                               EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
              Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                      ARICap: MFVC- ACS-, Next Function: 1
                      ARICtl: MFVC- ACS-, Function Group: 0
              Kernel driver in use: mlx4_core
              Kernel modules: mlx4_core
      
      
      # mst status
      MST modules:
      ------------
          MST PCI module loaded
          MST PCI configuration module loaded
      
      
      MST devices:
      ------------
      /dev/mst/mt26448_pciconf0        - PCI configuration cycles access.
                                         domain:bus:dev.fn=0000:04:00.0 addr.reg=88 data.reg=92
                                         Chip revision is: A0
      /dev/mst/mt26448_pci_cr0         - PCI direct access.
                                         domain:bus:dev.fn=0000:04:00.0 bar=0xdf500000 size=0x100000
                                         Chip revision is: A0
      
      
      # flint --device /dev/mst/mt26448_pci_cr0 q
      -E- Cannot open Device: /dev/mst/mt26448_pci_cr0. No such file or directory. MFE_UNSUPPORTED_DEVICE
      
      
      # yum install mlnx-en-eth-only
      Loaded plugins: fastestmirror
      Loading mirror speeds from cached hostfile
      Package mlnx-en-eth-only-3.4-2.0.0.0.noarch already installed and latest version
      Nothing to do
      
      
      # yum install mlnx-fw-updater
      Downloading packages:
      Running transaction check
      Running transaction test
      Transaction test succeeded
      Running transaction
        Installing : mlnx-fw-updater-3.4-2.0.0.0.x86_64                                                                                                                                                1/1
      Attempting to perform Firmware update...
      Querying Mellanox devices firmware ...
      
      
      Device #1:
      ----------
      
      
        Device Type:      N/A
        Part Number:      --
        Description:
        PSID:
        PCI Device Name:  04:00.0
        Port1 MAC:        N/A
        Port1 GUID:       N/A
        Port2 MAC:        N/A
        Port2 GUID:       N/A
        Versions:         Current        Available
           FW             --
      
      
        Status:           Failed to open device
      
      
      ---------
      -E- Failed to query 04:00.0 device, error : No such file or directory. MFE_UNSUPPORTED_DEVICE
      
      
      Log File: /tmp/mlnx_fw_update.log
      Failed to update Firmware.
      See /tmp/mlnx_fw_update.log
        Verifying  : mlnx-fw-updater-3.4-2.0.0.0.x86_64                                                                                                                                                1/1
      
      
      Installed:
        mlnx-fw-updater.x86_64 0:3.4-2.0.0.0
      
      
      Complete!
      
      
      # mlxfwmanager_pci | grep PSID
      ---------
      -E- Failed to query 0000:04:00.0 device, error : No such file or directory. MFE_UNSUPPORTED_DEVICE
        PSID:
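
Because the same message repeats many times per second, it can help to collapse the flood into counts before comparing servers. The following is only a sketch: the function name `summarize_mlx4_failures` is made up for illustration, and it assumes the message layout shown in the `tail` output above.

```shell
# Summarize mlx4_core command failures read from stdin.
# Prints "<count> <pci address> cmd=<opcode> status=<fw status>" per distinct
# combination, most frequent first.
summarize_mlx4_failures() {
    awk '
        /mlx4_core/ && /command 0x[0-9a-fA-F]+ failed: fw status = 0x[0-9a-fA-F]+/ {
            # Locate fields relative to fixed words so the syslog
            # prefix (date, host) can vary.
            for (i = 1; i <= NF; i++) {
                if ($i == "mlx4_core") dev = $(i + 1)
                if ($i == "command")   cmd = $(i + 1)
                if ($i == "status")    st  = $(i + 2)
            }
            sub(/:$/, "", dev)   # drop the colon after the PCI address
            count[dev " cmd=" cmd " status=" st]++
        }
        END {
            for (k in count) print count[k], k
        }
    ' | sort -rn
}
```

It could be run as `summarize_mlx4_failures < /var/log/messages`; lines that do not match the pattern are ignored.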
      
        • Re: command 0x54 failed: fw status = 0x2
          alkx

          Hi Agustin,

          Have you tried the latest version of the MFT tool from the Mellanox web site?

           

          http://www.mellanox.com/page/mlxup_firmware_tool

          http://www.mellanox.com/page/management_tools

            • Re: command 0x54 failed: fw status = 0x2
              acastano

              Thanks alkx, I tried the following:

               

              # mst start
              Starting MST (Mellanox Software Tools) driver set
              Loading MST PCI module - Success
              Loading MST PCI configuration module - Success
              Create devices
              
              
              
              
              # mlxfwmanager
              Querying Mellanox devices firmware ...
              
              
              Device #1:
              ----------
              
              
                Device Type:      N/A
                Part Number:      --
                Description:
                PSID:
                PCI Device Name:  /dev/mst/mt26448_pci_cr0
                Port1 MAC:        N/A
                Port1 GUID:       N/A
                Port2 MAC:        N/A
                Port2 GUID:       N/A
                Versions:         Current        Available
                   FW             --
              
              
                Status:           Failed to open device
              
              
              ---------
              -E- Failed to query /dev/mst/mt26448_pci_cr0 device, error : No such file or directory. MFE_UNSUPPORTED_DEVICE
              

               

              What am I missing?

               


              Regards.

                • Re: command 0x54 failed: fw status = 0x2
                  alkx

                  I wasn't able to trace the serial number from the 'lspci' output ("[SN] Serial number: YK502000004T"), but the device appears to be a ConnectX-2, which needs firmware 2.9.XXXX from here: http://www.mellanox.com/page/firmware_table_ConnectXEN . You can use the 'ibv_devinfo' command to get the board_id and then use it to find the matching firmware image to download.
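
A minimal sketch of pulling the board_id out of the `ibv_devinfo` output, assuming the usual `name: value` layout with `hca_id`, `fw_ver` and `board_id` fields in that order. The function name `devinfo_summary` is hypothetical.

```shell
# Read `ibv_devinfo` output on stdin and print one line per adapter:
# "<hca_id> <fw_ver> <board_id>".
devinfo_summary() {
    awk '
        $1 == "hca_id:"   { hca = $2 }
        $1 == "fw_ver:"   { fw = $2 }
        # board_id follows hca_id and fw_ver, so print once it is seen.
        $1 == "board_id:" { print hca, fw, $2 }
    '
}
```

Usage would be `ibv_devinfo | devinfo_summary`; the last column is the value to look up in the firmware table.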

                  I would suggest trying an older version of the Mellanox Firmware Tools, 3.8.X or maybe earlier, from here: http://www.mellanox.com/page/management_tools , in order to upgrade the firmware.

                  Once you have downloaded the image, you can use the 'flint' command for the upgrade:

                  # flint -d 04:00.0 -i <FW IMAGE> b
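
Before burning, it may be worth checking that the downloaded image carries the same PSID as the card. The helpers below are a sketch only: `psid_from_query` and `same_psid` are made-up names, and they assume the query output contains a line of the form `PSID: <value>`.

```shell
# Extract the PSID field from `flint ... query` output on stdin.
psid_from_query() {
    awk '$1 == "PSID:" { print $2; exit }'
}

# Succeed only if both query outputs report the same, non-empty PSID.
# $1: query output for the installed device
# $2: query output for the downloaded image
same_psid() {
    dev_psid=$(printf '%s\n' "$1" | psid_from_query)
    img_psid=$(printf '%s\n' "$2" | psid_from_query)
    [ -n "$dev_psid" ] && [ "$dev_psid" = "$img_psid" ]
}
```

The first argument would be the output of `flint -d 04:00.0 query` and the second the output of flint's query run against the image file with `-i`; the burn command would only be issued when `same_psid` succeeds.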


                    • Re: command 0x54 failed: fw status = 0x2
                      acastano

                      Using an older version gave better results, but I could not find the PSID in the firmware list:

                       

                      # mst status
                      MST modules:
                      ------------
                          MST PCI module loaded
                          MST PCI configuration module loaded
                      
                      
                      MST devices:
                      ------------
                      /dev/mst/mt26448_pciconf0        - PCI configuration cycles access.
                                                         domain:bus:dev.fn=0000:04:00.0 addr.reg=88 data.reg=92
                                                         Chip revision is: A0
                      /dev/mst/mt26448_pci_cr0         - PCI direct access.
                                                         domain:bus:dev.fn=0000:04:00.0 bar=0xdf500000 size=0x100000
                                                         Chip revision is: A0
                      
                      
                          
                      # flint -d /dev/mst/mt26448_pci_cr0 query
                      Image type:      FS2
                      FW Version:      2.7.0
                      Rom Info:        type=PXE version=1.5.5 devid=26448 proto=ETH
                      Device ID:       26448
                      Description:     Port1            Port2
                      MACs:                0002c907766c     0002c907766d
                      VSD:
                      PSID:            IBM0050000010
                      
                      
                      
                      
                      # ibv_devinfo
                      hca_id: mlx4_0
                              transport:                      InfiniBand (0)
                              fw_ver:                         2.7.000
                              node_guid:                      ffff:ffff:ffff:ffff
                              sys_image_guid:                 ffff:ffff:ffff:ffff
                              vendor_id:                      0x02c9
                              vendor_part_id:                 26448
                              hw_ver:                         0xA0
                              board_id:                       IBM0050000010
                              phys_port_cnt:                  2
                              Device ports:
                                      port:   1
                                              state:                  PORT_ACTIVE (4)
                                              max_mtu:                4096 (5)
                                              active_mtu:             1024 (3)
                                              sm_lid:                 0
                                              port_lid:               0
                                              port_lmc:               0x00
                                              link_layer:             Ethernet
                      
                      
                                      port:   2
                                              state:                  PORT_ACTIVE (4)
                                              max_mtu:                4096 (5)
                                              active_mtu:             1024 (3)
                                              sm_lid:                 0
                                              port_lid:               0
                                              port_lmc:               0x00
                                              link_layer:             Ethernet
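
To confirm whether the installed firmware is older than a target release, the version strings from the query output can be compared field by field. This is a sketch under assumptions: the function names are hypothetical, and the target version passed as the second argument is whatever release the firmware table lists for the card.

```shell
# Succeed if dotted version $1 is strictly older than dotted version $2.
# Fields are compared numerically, so 2.7.000 and 2.7.0 count as equal.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "[.]"); nb = split(b, y, "[.]")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi < yi) exit 0
            if (xi > yi) exit 1
        }
        exit 1   # versions are equal, so not strictly older
    }'
}

# Extract the firmware version from `flint ... query` output on stdin.
fw_version_from_query() {
    awk '$1 == "FW" && $2 == "Version:" { print $3; exit }'
}
```

For example, `version_lt "$(flint -d /dev/mst/mt26448_pci_cr0 query | fw_version_from_query)" 2.9.0` would succeed for a card still on a 2.7 release, taking 2.9.0 as an assumed lower bound for the 2.9 series.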