1 Reply Latest reply on Oct 29, 2018 8:55 AM by march

    CentOS 7.5 -E- Cannot open Device

    ebezrukova

      On CentOS 7.5 with a Mellanox ConnectX-4 Lx, I'm having trouble configuring the interface.

       

      I have a Mellanox ConnectX-4 Lx Ethernet adapter:

      ~]# lspci | grep -i Mella

      10003:01:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

      10003:01:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

       

      But when the mlx5_core module loads, it reports these errors:

      mlx5_core 10003:01:00.0: PCI INT A: no GSI

      mlx5_core 10003:01:00.0: Missing registers BAR, aborting

      mlx5_core 10003:01:00.0: error requesting BARs, aborting

      mlx5_core 10003:01:00.0: mlx5_pci_init failed with error code -19

      mlx5_core 10003:01:00.1: PCI INT B: no GSI

      mlx5_core 10003:01:00.1: Missing registers BAR, aborting

      mlx5_core 10003:01:00.1: error requesting BARs, aborting

      mlx5_core 10003:01:00.1: mlx5_pci_init failed with error code -19

       

      I always get an error when trying to read the device configuration or open the device. The tools try to open the wrong device (0003:01:00.0 and 0003:01:00.1 instead of 10003:01:00.0 and 10003:01:00.1):

       

      ~]# mstflint -d 10003:01:00.0 q

      -E- Cannot open Device: 10003:01:00.0. No such file or directory. MFE_CR_ERROR

       

      ~]# mst status

      MST modules:

      ------------

          MST PCI module loaded

          MST PCI configuration module loaded

       

      MST devices:

      ------------

      /dev/mst/mt4117_pciconf0         - PCI configuration cycles access.

                                         domain:bus:dev.fn=10003:01:00.0 addr.reg=88 data.reg=92

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

                                         Chip revision is: 00

      /dev/mst/mt4117_pci_cr0          - PCI direct access.

                                         domain:bus:dev.fn=10003:01:00.0 bar=0x00000000 size=0x0

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

      mopen: Invalid argument

                                         Chip revision is:

       

      ~]# flint -d /dev/mst/mt4117_pciconf0 query

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.1/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

      Failed to open (/sys/bus/pci/devices/0003:01:00.0/config) for reading: No such file or directory

       

       

      Image type:            FS3

      FW Version:            14.21.1000

      FW Release Date:       29.10.2017

      Product Version:       rel-14_21_1000

      Rom Info:              type=UEFI version=14.14.22 cpu=AMD64

                             type=PXE version=3.5.305 cpu=AMD64

      Description:           UID                GuidsNumber

      Base GUID:             ac1f6b15d92d2f68        4

      Base MAC:              ac1f6b2d2f68            4

      Image VSD:             N/A

      Device VSD:            N/A

      PSID:                  SM_2001000001034

      Security Attributes:   N/A

       

      The paths /sys/bus/pci/devices/10003:01:00.0/config and /sys/bus/pci/devices/10003:01:00.1/config exist, but the Mellanox tools try to open the wrong ones. The error appears everywhere (querying the device, updating firmware, resetting the device): the tools simply drop the first digit of the PCI domain number.
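      To illustrate what I suspect is happening, here is a minimal Python sketch. It is not the tools' actual code: the helper names (parse_bdf, sysfs_config_path, truncated_path) are hypothetical, and the 16-bit mask is only my assumption about how the leading digit could be lost. It shows that keeping just the low 16 bits of domain 0x10003 yields exactly the 0003:01:00.0 path from the error messages.

```python
import re

# Linux sysfs names PCI functions as domain:bus:device.function, all in hex.
BDF_RE = re.compile(
    r"^(?P<domain>[0-9a-fA-F]+):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<dev>[0-9a-fA-F]{2})\.(?P<fn>[0-7])$"
)


def parse_bdf(addr):
    """Split 'domain:bus:dev.fn' into four integers (hypothetical helper)."""
    m = BDF_RE.match(addr)
    if m is None:
        raise ValueError(f"not a PCI address: {addr!r}")
    return tuple(int(m.group(k), 16) for k in ("domain", "bus", "dev", "fn"))


def sysfs_config_path(domain, bus, dev, fn):
    """Build the sysfs config path; the domain is padded to at least 4 hex digits."""
    return f"/sys/bus/pci/devices/{domain:04x}:{bus:02x}:{dev:02x}.{fn:x}/config"


def truncated_path(addr):
    """Path a tool would open IF it kept only 16 bits of the domain.

    Assumption for illustration only: the real cause inside the
    Mellanox tools is not confirmed by anything in this post.
    """
    domain, bus, dev, fn = parse_bdf(addr)
    return sysfs_config_path(domain & 0xFFFF, bus, dev, fn)


if __name__ == "__main__":
    addr = "10003:01:00.0"
    print("correct:  ", sysfs_config_path(*parse_bdf(addr)))
    print("truncated:", truncated_path(addr))
```

      For a device in domain 0000 (as on the Debian machine) both paths are identical, which would explain why the problem only shows up with a domain wider than four hex digits.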

      The same setup works fine on Debian, but there the PCI device was in domain 0000 (0000:86:00.0 and 0000:86:00.1). The same error occurs on RHEL 7.

      It looks like a problem with the 5-digit PCI domain. Has anyone run into a similar problem and solved it?
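      For anyone who wants to check whether their system is in the same situation, the following sketch lists Mellanox PCI functions whose domain number does not fit in 16 bits. The function name wide_domain_devices and its root parameter are hypothetical; it only reads the standard sysfs vendor file and relies on 0x15b3 being the Mellanox PCI vendor ID.

```python
import os

MELLANOX_VENDOR_ID = "0x15b3"  # PCI vendor ID of Mellanox Technologies


def wide_domain_devices(root="/sys/bus/pci/devices", vendor=MELLANOX_VENDOR_ID):
    """Return sysfs names of matching devices whose PCI domain exceeds 0xFFFF."""
    found = []
    if not os.path.isdir(root):
        return found
    for name in sorted(os.listdir(root)):
        # Entry names look like 'domain:bus:dev.fn'; the domain is the first field.
        try:
            domain = int(name.split(":", 1)[0], 16)
        except ValueError:
            continue
        try:
            with open(os.path.join(root, name, "vendor")) as fh:
                dev_vendor = fh.read().strip()
        except OSError:
            continue
        if dev_vendor == vendor and domain > 0xFFFF:
            found.append(name)
    return found


if __name__ == "__main__":
    for dev in wide_domain_devices():
        print(dev)
```

      On the machine described here I would expect it to print 10003:01:00.0 and 10003:01:00.1; on a system where all devices sit in domain 0000 it prints nothing.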

      Thanks!