4 Replies Latest reply on Jul 11, 2017 4:48 AM by march

    link down and ip address lost with mellanox ofed 100G card




      I am using the Mellanox OFED stack 3.2 on RHEL 7.2. Before the discovery process, the link sometimes goes down and the interface loses its IP address. I have to reassign the IP address a couple of times.

      The card is connected to a Mellanox switch. Is this a known issue, and is a fix available?

      You can reproduce the issue very easily by enabling the sniffer with ethtool, using the following command:

      ethtool --set-priv-flags enp9s0f0 sniffer on
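
      To pin down when the link drops and when the address disappears, it may help to log both once per second while reproducing the problem. The following is only a minimal sketch: the helper names `link_detected`, `first_ipv4`, and `watch_iface` are hypothetical, and the interface name is taken from the command above. It relies on the standard `ethtool <iface>` and `ip -4 -o addr show dev <iface>` output formats.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OFED or ethtool). The two parsers read
# command output on stdin so they can be checked without the hardware.

# Prints "yes" or "no" from `ethtool <iface>` output.
link_detected() {
    awk -F': ' '/Link detected/ { print $2 }'
}

# Prints the first IPv4 address (with prefix length) found in
# `ip -4 -o addr show dev <iface>` output; prints nothing if there is none.
first_ipv4() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "inet") { print $(i + 1); exit } }'
}

# Logs one timestamped line per second until interrupted with Ctrl-C.
watch_iface() {
    iface="$1"
    while :; do
        link=$(ethtool "$iface" 2>/dev/null | link_detected)
        addr=$(ip -4 -o addr show dev "$iface" 2>/dev/null | first_ipv4)
        printf '%s link=%s ipv4=%s\n' "$(date '+%F %T')" "${link:-unknown}" "${addr:-none}"
        sleep 1
    done
}

# Usage (assumed interface name):
#   watch_iface enp9s0f0
```

      Running `ethtool --show-priv-flags enp9s0f0` before and after the test shows whether the sniffer flag is actually set, and the timestamps from the log can be compared against `dmesg` entries from mlx5_core.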

        • Re: link down and ip address lost with mellanox ofed 100G card



          What kind of adapter do you have: ConnectX-3, ConnectX-4, or something else?

          Running lspci -xxxvvv | grep Mellanox will give that information.


          Which discovery do you mean? DHCP over IPv4 or IPv6? Something else?

          Can you send me your ifcfg-enp9s0f0 file?
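
          For the adapter question, note that piping through grep Mellanox keeps only the lines containing that word, so the VPD "Product Name" line that names the ConnectX generation can be filtered out. A rough alternative, assuming the card exposes VPD and lspci is run as root, is to select the device by Mellanox's PCI vendor ID (15b3) and pull out the product name. The helper name `product_name` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the VPD product name(s) found in
# `lspci -vvv` output supplied on stdin.
product_name() {
    sed -n 's/^[[:space:]]*Product Name:[[:space:]]*//p'
}

# Usage (15b3 is the Mellanox PCI vendor ID; VPD usually needs root):
#   sudo lspci -d 15b3: -vvv | product_name
```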



            • Re: link down and ip address lost with mellanox ofed 100G card


              I am using a ConnectX-4 100G card.


              I mean the NVMe discovery command from nvme-cli:

              nvme discover -t rdma -a


              09:00.0 Ethernet controller: Mellanox Technologies MT27620 Family
                      Subsystem: Mellanox Technologies Device 0007
                      Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
                      Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
                      Latency: 0, Cache Line Size: 32 bytes
                      Interrupt: pin A routed to IRQ 27
                      Region 0: Memory at d0000000 (64-bit, prefetchable) [size=32M]
                      Expansion ROM at fb400000 [disabled] [size=1M]
                      Capabilities: [60] Express (v2) Endpoint, MSI 00
                              DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                                      ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                              DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                                      RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                                      MaxPayload 256 bytes, MaxReadReq 512 bytes
                              DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                              LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
                                      ClockPM- Surprise- LLActRep- BwNot-
                              LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                                      ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                              LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                              DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                              DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                              LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                                       Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                                       Compliance De-emphasis: -6dB
                              LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
                                       EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
                      Capabilities: [48] Vital Product Data
                              Product Name: CX415A - ConnectX-4 QSFP28
                              Read-only fields:
                                      [PN] Part number: MCX415A-CCAT
                                      [EC] Engineering changes: A6
                                      [SN] Serial number: MT1608X07436
                                      [V0] Vendor specific: PCIeGen3 x16
                                      [RV] Reserved: checksum good, 0 byte(s) reserved
                      Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
                              Vector table: BAR=0 offset=00002000
                              PBA: BAR=0 offset=00003000
                      Capabilities: [c0] Vendor Specific Information: Len=18 <?>
                      Capabilities: [40] Power Management version 3
                              Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
                              Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
                      Capabilities: [100 v1] Advanced Error Reporting
                              UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                              UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                              UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                              CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                              CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                              AERCap: First Error Pointer: 04, GenCap+ CGenEn- ChkCap+ ChkEn-
                      Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                              ARICap: MFVC- ACS-, Next Function: 0
                              ARICtl: MFVC- ACS-, Function Group: 0
                      Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
                              IOVCap: Migration-, Interrupt Message Number: 000
                              IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                              IOVSta: Migration-
                              Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 00
                              VF offset: 1, stride: 1, Device ID: 1014
                              Supported Page Size: 000007ff, System Page Size: 00000001
                              Region 0: Memory at 0000000000000000 (64-bit, prefetchable)
                              VF Migration: offset: 00000000, BIR: 0
                      Capabilities: [1c0 v1] #19
                      Capabilities: [230 v1] Access Control Services
                              ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-

                    Kernel driver in use: mlx5_core