Understanding mlx5 ethtool Counters

Version 23

    This post lists the ethtool counters applicable to ConnectX-4 (mlx5 driver). All counters listed here are available via ethtool starting with MLNX_OFED 4.0.

    The post also provides a cross-reference to ConnectX-3 (mlx4 driver): for counters that also exist on ConnectX-3, look for the "ConnectX3 naming" remark.

     


    Release Notes

    • MLNX_OFED 4.0 adds the following counters:
      • rx_pci_signal_integrity
      • tx_pci_signal_integrity

     

    Counters Overview

    There are several counter groups, depending on where the counter is counted. In addition, each group of counters may contain different counter types.


    Counter Groups

    • Ring – software ring counters.
    • Software Port – an aggregation of the software ring counters.
    • vPort Counters – traffic counters and drops due to steering or lack of buffers. They may indicate NIC issues. These counters include Ethernet traffic counters (including Raw Ethernet) and RDMA/RoCE traffic counters.
    • Physical Port Counters – counters of the physical port connecting the NIC to the network. They may indicate NIC, link, or network issues. This measuring point holds information on standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635, and additional counters like flow control, FEC, and more. Physical port counters are not exposed to virtual machines.
    • Priority Port Counters – a set of the physical port counters, per priority per port.

     

     

    Counter Types

    Counters are divided into three types:

    • Traffic Informative Counters – counters that count traffic. These counters can be used for load estimation or for general debugging.
    • Traffic Acceleration Counters – counters that count traffic accelerated by hardware. These counters are an additional layer on top of the informative counter set, so the same traffic is counted in both the informative and the acceleration counters. Acceleration counters are marked with [A].
    • Error Counters – an increment of these counters might indicate a problem. Each of these counters has an explanation and a corrective action.

     

    Statistics can be fetched via the ip link or ethtool commands. ethtool provides more detailed information.

    ip -s link show <if-name>

    ethtool -S <if-name>
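
    For scripting, the ethtool output can be loaded into a dictionary. The following is a minimal sketch, assuming Python 3.7+ and that ethtool -S prints one "name: value" pair per line after a header line. The function names parse_ethtool_stats and read_stats are illustrative and not part of any Mellanox tool.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse the text printed by `ethtool -S` into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the last ':' so the counter name stays intact.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # blank line or line without a separator
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            # Header line ("NIC statistics:") or a non-numeric value: skip it.
            continue
    return stats


def read_stats(ifname):
    """Run `ethtool -S <ifname>` and return the parsed counters (needs ethtool installed)."""
    result = subprocess.run(
        ["ethtool", "-S", ifname], capture_output=True, text=True, check=True
    )
    return parse_ethtool_stats(result.stdout)
```

    Calling read_stats with an interface name returns all counters of that interface as integers, which makes it straightforward to compare two samples taken at different times.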

     

     

    Acceleration Mechanism

    The following acceleration mechanisms have dedicated counters:

    • TSO (TCP Segmentation Offload) – increases outbound throughput and reduces CPU utilization by allowing the kernel to buffer multiple packets in a single large buffer. The NIC splits the buffer into packets and transmits them.
    • LRO (Large Receive Offload) – increases inbound throughput and reduces CPU utilization by aggregating multiple incoming packets of a single stream into a single buffer.
    • CHECKSUM (Checksum) – calculation of the TCP checksum by the NIC. The following CSUM offloads are available (refer to skbuff.h for a detailed explanation):
      • CHECKSUM_UNNECESSARY
      • CHECKSUM_NONE – no CSUM acceleration was used
      • CHECKSUM_COMPLETE – device provided CSUM on the entire packet
      • CHECKSUM_PARTIAL – device provided CSUM
    • CQE Compress – compression of Completion Queue Events (CQE), used to save PCIe bandwidth and hence achieve better performance.

     

    Counters Description

     

    Ring Counters

    The following counters are available per ring. Acceleration mechanisms have dedicated counters, which report the amount of traffic that was accelerated by the NIC. These counters count the accelerated traffic in addition to the standard counters that count it (i.e. accelerated traffic is counted twice).

     

    Ring Counter Table

    Counter | Description | Type
    rx[i]_packets | The number of packets received on ring i. ConnectX3 naming: rx[i]_packets | Informative
    rx[i]_bytes | The number of bytes received on ring i. ConnectX3 naming: rx[i]_bytes | Informative
    tx[i]_packets | The number of packets transmitted on ring i. ConnectX3 naming: tx[i]_packets | Informative
    tx[i]_bytes | The number of bytes transmitted on ring i. ConnectX3 naming: tx[i]_bytes | Informative
    tx[i]_tso_packets | The number of TSO packets transmitted on ring i [A]. | Acceleration
    tx[i]_tso_bytes | The number of TSO bytes transmitted on ring i [A]. | Acceleration
    tx[i]_tso_inner_packets | The number of TSO packets indicated as carrying inner encapsulation that were transmitted on ring i [A]. | Acceleration
    tx[i]_tso_inner_bytes | The number of TSO bytes indicated as carrying inner encapsulation that were transmitted on ring i [A]. | Acceleration
    rx[i]_lro_packets | The number of LRO packets received on ring i [A]. | Acceleration
    rx[i]_lro_bytes | The number of LRO bytes received on ring i [A]. | Acceleration
    rx[i]_csum_unnecessary | Packets received with CHECKSUM_UNNECESSARY on ring i [A]. | Acceleration
    rx[i]_csum_none | Packets received with CHECKSUM_NONE on ring i [A]. | Acceleration
    rx[i]_csum_complete | Packets received with CHECKSUM_COMPLETE on ring i [A]. | Acceleration
    rx[i]_csum_unnecessary_inner | Packets received with inner encapsulation with CHECKSUM_UNNECESSARY on ring i [A]. | Acceleration
    tx[i]_csum_partial | Packets transmitted with CHECKSUM_PARTIAL on ring i [A]. | Acceleration
    tx[i]_csum_partial_inner | Packets transmitted with inner encapsulation with CHECKSUM_PARTIAL on ring i [A]. | Acceleration
    tx[i]_csum_none | Packets transmitted with no hardware checksum acceleration on ring i. | Informative
    tx[i]_queue_stopped | Events where the SQ was full on ring i. If this counter is increasing, check the number of buffers allocated for transmission. | Error
    tx[i]_queue_wake | Events where the SQ was full and has become not full on ring i. | Error
    tx[i]_dropped | Packets that were dropped on transmit due to DMA mapping failure on ring i. If this counter is increasing, check the number of buffers allocated for transmission. | Error
    rx[i]_wqe_err | The number of wrong opcodes received on ring i. | Error
    tx[i]_nop | The number of NOP WQEs (empty WQEs) inserted into the SQ (related to ring i) upon reaching the end of the cyclic buffer. When nearing the end of the cyclic buffer, the driver may add these empty WQEs to avoid handling a state where a WQE starts at the end of the queue and ends at its beginning. This is a normal condition. | Informative
    rx[i]_mpwqe_frag | The number of WQEs for which allocation of a compound page failed and hence fragmented MPWQEs (Multi-Packet WQEs) were used on ring i. If this counter rises, it may suggest that there is not enough memory for large pages, so the driver allocated fragmented pages. This is not an abnormal condition. | Informative
    rx[i]_mpwqe_filler | The number of filler CQE events that were issued on ring i. | Informative
    rx[i]_cqe_compress_blks | The number of receive blocks with CQE compression on ring i [A]. | Acceleration
    rx[i]_cqe_compress_pkts | The number of receive packets with CQE compression on ring i [A]. | Acceleration
    rx[i]_cache_reuse | The number of events of successful reuse of a page from the driver's internal page cache. Supported from kernel 4.9. | Acceleration
    rx[i]_cache_full | The number of events where the internal page cache was full and the driver could not put a page back into the cache for recycling (the page is freed). Supported from kernel 4.9. | Acceleration
    rx[i]_cache_empty | The number of events where the cache was empty and had no page to give, so the driver had to allocate a new page. Supported from kernel 4.9. | Acceleration
    rx[i]_cache_busy | The number of events where the cache head was busy and could not be recycled, so the driver allocated a new page. Supported from kernel 4.9. | Acceleration
    rx[i]_xdp_drop | The number of packets dropped due to the XDP program XDP_DROP action. These packets are not counted by other software counters, but they are counted by the physical port and vPort counters. Supported from kernel 4.9. | Informative
    rx[i]_xdp_tx | The number of packets forwarded back to the port due to the XDP program XDP_TX action (bouncing). These packets are not counted by other software counters, but they are counted by the physical port and vPort counters. Supported from kernel 4.9. | Informative
    rx[i]_xdp_tx_full | The number of packets that should have been forwarded back to the port due to the XDP_TX action but were dropped because the TX queue was full. These packets are not counted by other software counters, but they are counted by the physical port and vPort counters. You may open more RX queues and spread RX traffic over all queues and/or increase the RX ring size. Supported from kernel 4.9. | Error
    tx[i]_xmit_more | The number of packets sent with the xmit_more indication set on the skbuff (no doorbell). Supported from kernel 4.8. | Acceleration

     

    For example: The full list of counters for ring 0:

    rx0_packets: 0

    rx0_bytes: 0

    rx0_csum_complete: 0

    rx0_csum_unnecessary_inner: 0

    rx0_csum_none: 0

    rx0_csum_unnecessary: 0

    rx0_lro_packets: 0

    rx0_lro_bytes: 0

    rx0_wqe_err: 0

    rx0_cqe_compress_pkts: 0

    rx0_cqe_compress_blks: 0

    rx0_mpwqe_filler: 0

    rx0_mpwqe_frag: 0

     

    tx0_packets: 0

    tx0_bytes: 0

    tx0_tso_packets: 0

    tx0_tso_bytes: 0

    tx0_tso_inner_packets: 0

    tx0_tso_inner_bytes: 0

    tx0_csum_none: 0

    tx0_csum_partial : 0

    tx0_csum_partial_inner: 0

    tx0_nop: 0

    tx0_queue_stopped: 0

    tx0_queue_wake: 0

    tx0_queue_dropped: 0
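
    When an interface has many rings, it can help to regroup the flat counter list by ring index. The following is a minimal sketch, assuming the counters were already loaded into a Python dict of name to integer value and that per-ring names follow the rx[i]_/tx[i]_ pattern shown above. The helper name group_by_ring is hypothetical.

```python
import re
from collections import defaultdict

# Per-ring counters look like "rx0_packets" or "tx12_tso_bytes":
# direction, ring index, then the counter name.
RING_RE = re.compile(r"^(rx|tx)(\d+)_(.+)$")


def group_by_ring(stats):
    """Return {ring_index: {"rx_<name>" or "tx_<name>": value}} for per-ring counters only."""
    rings = defaultdict(dict)
    for name, value in stats.items():
        match = RING_RE.match(name)
        if match:
            direction, index, counter = match.groups()
            rings[int(index)][f"{direction}_{counter}"] = value
    return dict(rings)
```

    Counters without a ring index, such as rx_packets or rx_64_bytes_phy, do not match the pattern and are left out, so the result can be used to spot a single ring that carries most of the traffic or most of the errors.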

     

     

    Software Port Counters

    The following counters are available per port. The acceleration mechanisms discussed above are relevant here as well. These counters are an aggregation of the software ring counters.

     

    Software Counter Table

    Counter | Description | Type
    rx_packets | The number of packets received on a port. ConnectX3 naming: pf_rx_packets on PF, rx_packets on VF | Informative
    rx_bytes | The number of bytes received on a port. ConnectX3 naming: pf_rx_bytes on PF, rx_bytes on VF | Informative
    tx_packets | The number of packets transmitted on a port. ConnectX3 naming: pf_tx_packets on PF, tx_packets on VF | Informative
    tx_bytes | The number of bytes transmitted on a port. ConnectX3 naming: pf_tx_bytes on PF, tx_bytes on VF | Informative
    tx_tso_packets | The number of TSO packets transmitted on a port [A]. ConnectX3 naming: tso_packets | Acceleration
    tx_tso_bytes | The number of TSO bytes transmitted on a port [A]. | Acceleration
    tx_tso_inner_packets | The number of TSO packets indicated as carrying inner encapsulation that were transmitted on a port [A]. | Acceleration
    tx_tso_inner_bytes | The number of TSO bytes indicated as carrying inner encapsulation that were transmitted on a port [A]. | Acceleration
    rx_lro_packets | The number of LRO packets received on a port [A]. | Acceleration
    rx_lro_bytes | The number of LRO bytes received on a port [A]. | Acceleration
    rx_csum_unnecessary | Packets received with CHECKSUM_UNNECESSARY on a port [A]. | Acceleration
    rx_csum_none | Packets received with CHECKSUM_NONE on a port [A]. ConnectX3 naming: rx_csum_none | Acceleration
    rx_csum_complete | Packets received with CHECKSUM_COMPLETE on a port [A]. ConnectX3 naming: rx_csum_complete | Acceleration
    rx_csum_unnecessary_inner | Packets received with inner encapsulation with CHECKSUM_UNNECESSARY on a port [A]. | Acceleration
    tx_csum_partial | Packets transmitted with CHECKSUM_PARTIAL on a port [A]. ConnectX3 naming: rx_csum_offload | Acceleration
    tx_csum_partial_inner | Packets transmitted with inner encapsulation with CHECKSUM_PARTIAL on a port [A]. | Acceleration
    tx_csum_none | Packets transmitted with no hardware checksum acceleration on a port. | Informative
    tx_queue_stopped | Events where the SQ was full on a port. If this counter is increasing, check the number of buffers allocated for transmission. ConnectX3 naming: queue_stopped | Error
    tx_queue_wake | Events where the SQ was full and has become not full on a port. ConnectX3 naming: wake_queue | Error
    tx_queue_dropped | Packets that were dropped on transmit due to DMA mapping failure on a port. If this counter is increasing, check the number of buffers allocated for transmission. | Error
    rx_wqe_err | Packets received that were dropped due to a wrong opcode received on a port. | Error
    rx_mpwqe_frag | The number of WQEs for which allocation of a compound page failed and hence fragmented MPWQEs (Multi-Packet WQEs) were used on all rings. If this counter rises, it may suggest that there is not enough memory for large pages, so the driver allocated fragmented pages. This is not an abnormal condition. | Informative
    rx_mpwqe_filler | The number of filler CQE events that were issued on all rings. | Informative
    rx_cqe_compress_blks | The number of receive blocks with CQE compression on a port [A]. | Acceleration
    rx_cqe_compress_pkts | The number of receive packets with CQE compression on a port [A]. | Acceleration
    rx_cache_reuse | The number of events of successful reuse of a page from the driver's internal page cache. Supported from kernel 4.9. | Acceleration
    rx_cache_full | The number of events where the internal page cache was full and the driver could not put a page back into the cache for recycling (the page is freed). Supported from kernel 4.9. | Acceleration
    rx_cache_empty | The number of events where the cache was empty and had no page to give, so the driver had to allocate a new page. Supported from kernel 4.9. | Acceleration
    rx_cache_busy | The number of events where the cache head was busy and could not be recycled, so the driver allocated a new page. Supported from kernel 4.9. | Acceleration
    rx_xdp_drop | The number of packets dropped due to the XDP program XDP_DROP action. These packets are not counted by other software counters, but they are counted by the physical port and vPort counters. Supported from kernel 4.9. ConnectX3 naming: rx_xdp_drop | Informative
    rx_xdp_tx | The number of packets forwarded back to the port due to the XDP program XDP_TX action (bouncing). These packets are not counted by other software counters, but they are counted by the physical port and vPort counters. Supported from kernel 4.9. ConnectX3 naming: rx_xdp_tx | Informative
    rx_xdp_tx_full | The number of packets that should have been forwarded back to the port due to the XDP_TX action but were dropped because the TX queue was full. These packets are not counted by other software counters, but they are counted by the physical port and vPort counters. You may open more RX queues and spread RX traffic over all queues and/or increase the RX ring size. Supported from kernel 4.9. ConnectX3 naming: rx_xdp_tx_full | Error
    tx_xmit_more | The number of packets sent with the xmit_more indication set on the skbuff (no doorbell). Supported from kernel 4.8. ConnectX3 naming: xmit_more | Acceleration
    N/A | The number of page allocation events. ConnectX3 naming: rx_alloc_pages | Informative

     

    For example: The full list of counters for a port:

    rx_packets: 0

    rx_bytes: 0

    tx_packets: 6

    tx_bytes: 468

    tx_tso_packets: 0

    tx_tso_bytes: 0

    tx_tso_inner_packets: 0

    tx_tso_inner_bytes: 0

    rx_lro_packets: 0

    rx_lro_bytes: 0

    rx_csum_unnecessary: 0

    rx_csum_none: 0

    rx_csum_complete: 0

    rx_csum_unnecessary_inner: 0

    tx_csum_partial: 0

    tx_csum_partial_inner: 0

    tx_csum_none: 6

    tx_queue_stopped: 0

    tx_queue_wake: 0

    tx_queue_dropped: 0

    rx_sw_lro_aggregated: 0

    rx_sw_lro_flushed: 0

    rx_sw_lro_no_desc: 0

    rx_wqe_err: 0

    rx_cqe_compress_pkts: 0

    rx_cqe_compress_blks: 0

    rx_mpwqe_filler: 0

    rx_mpwqe_frag: 0
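
    Since the software port counters are described as an aggregation of the ring counters, the relation can be checked from a single counter snapshot. The following is a minimal sketch, assuming the counters are already available as a Python dict of name to integer value. The helper names are hypothetical, and only counters whose port name equals the ring name without the index are compared (so, for example, tx[i]_dropped is not matched against tx_queue_dropped).

```python
import re
from collections import Counter

# "rx3_packets" -> direction "rx", counter "packets"; the ring index is discarded.
RING_RE = re.compile(r"^(rx|tx)\d+_(.+)$")


def sum_ring_counters(stats):
    """Sum every per-ring counter over all rings, keyed by its port-level name."""
    totals = Counter()
    for name, value in stats.items():
        match = RING_RE.match(name)
        if match:
            totals[f"{match.group(1)}_{match.group(2)}"] += value
    return dict(totals)


def ring_vs_port(stats):
    """Return {counter: (port_value, ring_sum)} for counters where the two differ."""
    sums = sum_ring_counters(stats)
    return {
        name: (stats[name], ring_sum)
        for name, ring_sum in sums.items()
        if name in stats and stats[name] != ring_sum
    }
```

    On a live interface the counters are not read atomically, so small differences under traffic are to be expected. A large or persistent difference is the case worth investigating.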

     

     

    vPort Counters

    Counters on the eswitch port that is connected to the VNIC.

    vPort Counter Table

    Counter | Description | Type
    rx_vport_unicast_packets | Unicast packets received, steered to a port, including Raw Ethernet QP/DPDK traffic. | Informative
    rx_vport_unicast_bytes | Unicast bytes received, steered to a port, including Raw Ethernet QP/DPDK traffic. | Informative
    tx_vport_unicast_packets | Unicast packets transmitted, steered from a port, including Raw Ethernet QP/DPDK traffic. | Informative
    tx_vport_unicast_bytes | Unicast bytes transmitted, steered from a port, including Raw Ethernet QP/DPDK traffic. | Informative
    rx_vport_multicast_packets | Multicast packets received, steered to a port, including Raw Ethernet QP/DPDK traffic. | Informative
    rx_vport_multicast_bytes | Multicast bytes received, steered to a port, including Raw Ethernet QP/DPDK traffic. | Informative
    tx_vport_multicast_packets | Multicast packets transmitted, steered from a port, including Raw Ethernet QP/DPDK traffic. | Informative
    tx_vport_multicast_bytes | Multicast bytes transmitted, steered from a port, including Raw Ethernet QP/DPDK traffic. | Informative
    rx_vport_broadcast_packets | Broadcast packets received, steered to a port, including Raw Ethernet QP/DPDK traffic. | Informative
    rx_vport_broadcast_bytes | Broadcast bytes received, steered to a port, including Raw Ethernet QP/DPDK traffic. | Informative
    tx_vport_broadcast_packets | Broadcast packets transmitted, steered from a port, including Raw Ethernet QP/DPDK traffic. | Informative
    tx_vport_broadcast_bytes | Broadcast bytes transmitted, steered from a port, including Raw Ethernet QP/DPDK traffic. | Informative
    rx_vport_rdma_unicast_packets | RDMA unicast packets received, steered to a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    rx_vport_rdma_unicast_bytes | RDMA unicast bytes received, steered to a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    tx_vport_rdma_unicast_packets | RDMA unicast packets transmitted, steered from a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    tx_vport_rdma_unicast_bytes | RDMA unicast bytes transmitted, steered from a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    rx_vport_rdma_multicast_packets | RDMA multicast packets received, steered to a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    rx_vport_rdma_multicast_bytes | RDMA multicast bytes received, steered to a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    tx_vport_rdma_multicast_packets | RDMA multicast packets transmitted, steered from a port (counts RoCE/UD/RC traffic) [A]. | Acceleration
    tx_vport_rdma_multicast_bytes | RDMA multicast bytes transmitted, steered from a port (counts RoCE/UD/RC traffic) [A]. | Acceleration

     

     

    For example: The full list of vPort counters for a port:

    rx_vport_unicast_packets: 0

    rx_vport_unicast_bytes: 0

    tx_vport_unicast_packets: 0

    tx_vport_unicast_bytes: 0

    rx_vport_multicast_packets: 0

    rx_vport_multicast_bytes: 0

    tx_vport_multicast_packets: 12

    tx_vport_multicast_bytes: 936

    rx_vport_broadcast_packets: 0

    rx_vport_broadcast_bytes: 0

    tx_vport_broadcast_packets: 0

    tx_vport_broadcast_bytes: 0

    rx_vport_rdma_unicast_packets: 0

    rx_vport_rdma_unicast_bytes: 0

    tx_vport_rdma_unicast_packets: 0

    tx_vport_rdma_unicast_bytes: 0

    rx_vport_rdma_multicast_packets: 0

    rx_vport_rdma_multicast_bytes: 0

    tx_vport_rdma_multicast_packets: 0

    tx_vport_rdma_multicast_bytes: 0
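
    The vPort counters are split by cast type, so a per-direction total has to be summed by hand. The following is a minimal sketch, assuming the counters are already available as a Python dict of name to integer value. The function name vport_totals is hypothetical, and the Ethernet and RDMA counters are kept apart because this post does not state whether they overlap.

```python
ETH_CASTS = ("unicast", "multicast", "broadcast")
RDMA_CASTS = ("unicast", "multicast")


def vport_totals(stats):
    """Sum the vPort counters per direction and unit, Ethernet and RDMA separately."""
    totals = {}
    for direction in ("rx", "tx"):
        for unit in ("packets", "bytes"):
            ethernet = sum(
                stats.get(f"{direction}_vport_{cast}_{unit}", 0) for cast in ETH_CASTS
            )
            rdma = sum(
                stats.get(f"{direction}_vport_rdma_{cast}_{unit}", 0) for cast in RDMA_CASTS
            )
            totals[f"{direction}_{unit}"] = {"ethernet": ethernet, "rdma": rdma}
    return totals
```

    Missing counters are treated as zero, which keeps the helper usable on driver versions that expose only part of the list.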

     

     

    Physical Port Counters

    The physical port counters are the counters on the external port connecting adapter to the network. This measuring point holds information on standardized counters like IEEE 802.3, RFC2863, RFC 2819, RFC 3635 and additional counters like flow control, FEC and more.

     

    Physical Port Counter Table

    Counter | Description | Type
    rx_packets_phy | The number of packets received on the physical port. This counter doesn’t include packets that were discarded due to FCS, frame size, and similar errors. ConnectX3 naming: rx_packets | Informative
    tx_packets_phy | The number of packets transmitted on the physical port. ConnectX3 naming: tx_packets | Informative
    rx_bytes_phy | The number of bytes received on the physical port, including Ethernet header and FCS. ConnectX3 naming: rx_bytes | Informative
    tx_bytes_phy | The number of bytes transmitted on the physical port. ConnectX3 naming: tx_bytes | Informative
    rx_multicast_phy | The number of multicast packets received on the physical port. ConnectX3 naming: rx_multicast_packets | Informative
    tx_multicast_phy | The number of multicast packets transmitted on the physical port. ConnectX3 naming: tx_multicast_packets | Informative
    rx_broadcast_phy | The number of broadcast packets received on the physical port. ConnectX3 naming: rx_broadcast_packets | Informative
    tx_broadcast_phy | The number of broadcast packets transmitted on the physical port. ConnectX3 naming: tx_broadcast_packets | Informative
    rx_crc_errors_phy | The number of received packets dropped due to FCS (Frame Check Sequence) errors on the physical port. If this counter increases at a high rate, check the link quality using the rx_symbol_err_phy and rx_corrected_bits_phy counters. ConnectX3 naming: rx_crc_errors | Error
    rx_in_range_len_errors_phy | The number of received packets dropped due to length/type errors on a physical port. ConnectX3 naming: rx_in_range_length_error | Error
    rx_out_of_range_len_phy | The number of received packets dropped due to a length greater than allowed on a physical port. If this counter is increasing, it implies that the peer connected to the adapter has a larger MTU configured. Using the same MTU configuration on both sides should resolve this issue. ConnectX3 naming: rx_out_range_length_error | Error
    rx_oversize_pkts_phy | The number of received packets dropped due to a length that exceeds the MTU size on a physical port. If this counter is increasing, it implies that the peer connected to the adapter has a larger MTU configured. Using the same MTU configuration on both sides should resolve this issue. ConnectX3 naming: rx_frame_errors | Error
    rx_symbol_err_phy | The number of received packets dropped due to physical coding errors (symbol errors) on a physical port. | Error
    rx_mac_control_phy | The number of MAC control packets received on the physical port. | Informative
    tx_mac_control_phy | The number of MAC control packets transmitted on the physical port. | Informative
    rx_pause_ctrl_phy | The number of link layer pause packets received on a physical port. If this counter is increasing, it implies that the network is congested and cannot absorb the traffic coming from the adapter. | Informative
    tx_pause_ctrl_phy | The number of link layer pause packets transmitted on a physical port. If this counter is increasing, it implies that the NIC is congested and cannot absorb the traffic coming from the network. | Informative
    rx_unsupported_op_phy | The number of MAC control packets received with an unsupported opcode on a physical port. | Error
    rx_discards_phy | The number of received packets dropped due to lack of buffers on a physical port. If this counter is increasing, it implies that the adapter is congested and cannot absorb the traffic coming from the network. ConnectX3 naming: rx_fifo_errors | Error
    tx_errors_phy | The number of transmitted packets dropped due to a length that exceeds the MTU size on a physical port. | Error
    rx_undersize_pkts_phy | The number of received packets dropped due to a length shorter than 64 bytes on a physical port. If this counter is increasing, it implies that the peer connected to the adapter has a non-standard MTU configured or that a malformed packet has arrived. | Error
    rx_fragments_phy | The number of received packets dropped due to a length shorter than 64 bytes together with an FCS error on a physical port. If this counter is increasing, it implies that the peer connected to the adapter has a non-standard MTU configured. | Error
    rx_jabbers_phy | The number of received packets dropped due to a length longer than 64 bytes together with an FCS error on a physical port. | Error
    rx_64_bytes_phy | The number of packets received on the physical port with a size of 64 bytes. | Informative
    rx_65_to_127_bytes_phy | The number of packets received on the physical port with a size of 65 to 127 bytes. | Informative
    rx_128_to_255_bytes_phy | The number of packets received on the physical port with a size of 128 to 255 bytes. | Informative
    rx_256_to_511_bytes_phy | The number of packets received on the physical port with a size of 256 to 511 bytes. | Informative
    rx_512_to_1023_bytes_phy | The number of packets received on the physical port with a size of 512 to 1023 bytes. | Informative
    rx_1024_to_1518_bytes_phy | The number of packets received on the physical port with a size of 1024 to 1518 bytes. | Informative
    rx_1519_to_2047_bytes_phy | The number of packets received on the physical port with a size of 1519 to 2047 bytes. | Informative
    rx_2048_to_4095_bytes_phy | The number of packets received on the physical port with a size of 2048 to 4095 bytes. | Informative
    rx_4096_to_8191_bytes_phy | The number of packets received on the physical port with a size of 4096 to 8191 bytes. | Informative
    rx_8192_to_10239_bytes_phy | The number of packets received on the physical port with a size of 8192 to 10239 bytes. | Informative
    link_down_events_phy | The number of times the link operative state changed to down. If this counter is increasing, it may indicate port flapping. You may need to replace the cable/transceiver. | Error
    rx_out_of_buffer | The number of times the receive queue had no software buffers allocated for the adapter's incoming traffic. | Error
    module_bus_stuck | The number of times a short-wire on the module's I2C bus (data or clock) was detected. You may need to replace the cable/transceiver. Supported from kernel 4.10. | Error
    module_high_temp | The number of times the module temperature was too high. If this issue persists, you may need to check the ambient temperature or replace the cable/transceiver module. Supported from kernel 4.10. | Error
    module_bad_shorted | The number of times the module cables were shorted. You may need to replace the cable/transceiver module. Supported from kernel 4.10. | Error
    module_unplug | The number of times the module was ejected. Supported from kernel 4.10. | Informative
    rx_buffer_passed_thres_phy | The number of events where the port receive buffer was over 85% full. Supported from kernel 4.14. | Informative
    tx_pause_storm_warning_events | The number of times the device was sending pauses for a long period of time. Supported from kernel 4.15. | Informative
    tx_pause_storm_error_events | The number of times the device was sending pauses for a long period of time, reaching the timeout and disabling transmission of pause frames. During the period in which pause frames were disabled, drops could have occurred. Supported from kernel 4.15. | Error

     

     

    For example: The full list of Physical port counters:

    link_down_events_phy: 10

    tx_packets_phy: 334887427

    rx_packets_phy: 856031408

    rx_crc_errors_phy: 0

    tx_bytes_phy: 472228062310

    rx_bytes_phy: 1285813020323

    tx_multicast_phy: 179

    tx_broadcast_phy: 42

    rx_multicast_phy: 181

    rx_broadcast_phy: 952

    rx_in_range_len_errors_phy: 0

    rx_out_of_range_len_phy: 0

    rx_oversize_pkts_phy: 0

    rx_symbol_err_phy: 0

    tx_mac_control_phy: 0

    rx_mac_control_phy: 0

    rx_unsupported_op_phy: 0

    rx_pause_ctrl_phy: 0

    tx_pause_ctrl_phy: 0

    rx_discards_phy: 0

    tx_discards_phy: 0

    tx_errors_phy: 0

    rx_undersize_pkts_phy: 0

    rx_fragments_phy: 0

    rx_jabbers_phy: 0

    rx_64_bytes_phy: 641

    rx_65_to_127_bytes_phy: 10257829

    rx_128_to_255_bytes_phy: 31912

    rx_256_to_511_bytes_phy: 58138

    rx_512_to_1023_bytes_phy: 115665

    rx_1024_to_1518_bytes_phy: 499555011

    rx_1519_to_2047_bytes_phy: 346012212

    rx_2048_to_4095_bytes_phy: 0

    rx_4096_to_8191_bytes_phy: 0

    rx_8192_to_10239_bytes_phy: 0

    time_since_last_clear_phy: 3703953   (Obsolete)

    symbol_errors_phy: 0                 (Obsolete)

    sync_headers_errors_phy: 0           (Obsolete)

    edpl/bip_errors_lane0_phy: 0         (Obsolete)

    edpl/bip_errors_lane1_phy: 0         (Obsolete)

    edpl/bip_errors_lane2_phy: 0         (Obsolete)

    edpl/bip_errors_lane3_phy: 0         (Obsolete)

    fc_corrected_blocks_lane0_phy: 0     (Obsolete)

    fc_corrected_blocks_lane1_phy: 0     (Obsolete)

    fc_corrected_blocks_lane2_phy: 0     (Obsolete)

    fc_corrected_blocks_lane3_phy: 0     (Obsolete)

    fc_uncorrectable_lane0_phy: 0        (Obsolete)

    fc_uncorrectable_lane1_phy: 0        (Obsolete)

    fc_uncorrectable_lane2_phy: 0        (Obsolete)

    fc_uncorrectable_lane3_phy: 0        (Obsolete)

    rs_corrected_blocks_phy: 0           (Obsolete)

    rs_uncorrectable_blocks_phy: 0       (Obsolete)

    rs_no_errors_blocks_phy: 0           (Obsolete)

    rs_single_error_blocks_phy: 0        (Obsolete)

    rs_corrected_symbols_total_phy: 0    (Obsolete)

    rs_corrected_symbols_lane0_phy: 0    (Obsolete)

    rs_corrected_symbols_lane1_phy: 0    (Obsolete)

    rs_corrected_symbols_lane2_phy: 0    (Obsolete)

    rs_corrected_symbols_lane3_phy: 0    (Obsolete)
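
    Most of the guidance for error counters is phrased as "if this counter is increasing", which means comparing two snapshots rather than reading absolute values. The following is a minimal sketch, assuming two counter snapshots held as Python dicts of name to integer value. The tuple PHY_ERROR_COUNTERS lists the physical port counters marked Error above that carry no kernel-version note, and the function name increased_errors is hypothetical.

```python
# Physical port counters classified as "Error" in the table above
# (counters with a minimum-kernel note are left out of this sketch).
PHY_ERROR_COUNTERS = (
    "rx_crc_errors_phy",
    "rx_in_range_len_errors_phy",
    "rx_out_of_range_len_phy",
    "rx_oversize_pkts_phy",
    "rx_symbol_err_phy",
    "rx_unsupported_op_phy",
    "rx_discards_phy",
    "tx_errors_phy",
    "rx_undersize_pkts_phy",
    "rx_fragments_phy",
    "rx_jabbers_phy",
    "link_down_events_phy",
    "rx_out_of_buffer",
)


def increased_errors(before, after, names=PHY_ERROR_COUNTERS):
    """Return {counter: delta} for error counters that grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in names
        if name in before and name in after and after[name] > before[name]
    }
```

    An empty result means none of the listed error counters moved during the sampling interval. Counters missing from either snapshot are ignored rather than reported.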

     

     

    Priority Port Counters

    The following counters are physical port counters that are counted per L2 priority (0-7).

    Note: 'p' in the counter name represents the priority.

     

    Priority Port Counter Table

    CounterDescriptionGroup
    rx_prio[p]_bytes

    The number of bytes received with priority p on the physical port.

    ConnectX3  naming :rx_prio_[p]_bytes. this counter also counts packets with no vlan

    Informative
    rx_prio[p]_packets

    The number of packets received with priority p on the physical port.

    ConnectX3  naming : rx_prio_[p]_packets. this counter also counts packets with no vlan

    Informative
    tx_prio[p]_bytes

    The number of bytes transmitted on priority p on the physical port.

    ConnectX3  naming :tx_prio_[p]_bytes.

    Informative
    tx_prio[p]_packets

    The number of packets transmitted on priority p on the physical port.

    ConnectX3  naming : tx_prio_[p]_packets.

    Informative
    rx_prio[p]_pause

    The number of pause packets received with priority p on a physical port. If this counter is increasing, it implies that the network is congested and cannot absorb the traffic coming from the adapter.

    Note: This counter is available only if PFC was enabled on priority p. Refer to HowTo Configure PFC on ConnectX-4 .

    ConnectX3  naming : rx_pause_prio_p

    Informative
    rx_prio[p]_pause_duration

    The duration of pause received (in microSec) on priority p on the physical port. The counter represents the time the port did not send any traffic on this priority. If this counter is increasing, it implies that the network is congested and cannot absorb the traffic coming from the adapter.

    Note: This counter is available only if PFC was enabled on priority p. Refer to HowTo Configure PFC on ConnectX-4 .

    ConnectX3  naming : rx_pause_duration_prio_p

    Informative
    rx_prio[p]_pause_transition

    The number of times a transition from Xoff to Xon on priority p on the physical port has occurred.

    Note: This counter is available only if PFC was enabled on priority p. Refer to HowTo Configure PFC on ConnectX-4 .

    ConnectX3  naming : rx_pause_transition_prio_p

    Informative
    tx_prio[p]_pause

    The number of pause packets transmitted on priority p on a physical port. If this counter is increasing, it implies that the adapter is congested and cannot absorb the traffic coming from the network.

    Note: This counter is available only if PFC was enabled on priority p. Refer to HowTo Configure PFC on ConnectX-4 .

    ConnectX3  naming : tx_pause_prio_p

    Informative
    tx_prio[p]_pause_duration

    The duration of pause transmitted (in microseconds) on priority p on the physical port.

    Note: This counter is available only if PFC was enabled on priority p. Refer to HowTo Configure PFC on ConnectX-4 .

    ConnectX3  naming : tx_pause_duration_prio_p

    Informative

     

     

    For example, the full list of priority port counters for priority 4:

    rx_prio4_bytes: 640

    rx_prio4_packets: 10

    tx_prio4_bytes: 0

    tx_prio4_packets: 0

    rx_prio4_pause: 0

    rx_prio4_pause_duration: 0

    tx_prio4_pause: 26832

    tx_prio4_pause_duration: 14508

    rx_prio4_pause_transition: 0
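
    The per-priority naming scheme makes these counters easy to group in a script. The following is a minimal sketch, not a vendor tool: it parses the priority 4 example above and applies the interpretations given in the table. The function names group_by_priority and pause_summary are hypothetical. A nonzero value only shows that pause frames were seen at some point; to tell whether a counter is increasing, compare two snapshots taken some time apart.

```python
import re
from collections import defaultdict

# Text copied from the priority 4 example above.
SAMPLE = """\
rx_prio4_bytes: 640
rx_prio4_packets: 10
tx_prio4_bytes: 0
tx_prio4_packets: 0
rx_prio4_pause: 0
rx_prio4_pause_duration: 0
tx_prio4_pause: 26832
tx_prio4_pause_duration: 14508
rx_prio4_pause_transition: 0
"""

# Matches lines such as "rx_prio4_pause_duration: 0".
PRIO_RE = re.compile(r"^\s*(rx|tx)_prio(\d)_(\w+):\s*(\d+)\s*$")


def group_by_priority(text):
    """Return {priority: {"rx_bytes": value, ...}} for every rx/tx_prio[p]_* line."""
    groups = defaultdict(dict)
    for line in text.splitlines():
        m = PRIO_RE.match(line)
        if m:
            direction, prio, name, value = m.groups()
            groups[int(prio)][f"{direction}_{name}"] = int(value)
    return dict(groups)


def pause_summary(counters):
    """Apply the table's reading of the pause counters to one priority (hypothetical helper)."""
    notes = []
    if counters.get("rx_pause", 0) > 0:
        notes.append("network congested (pause frames received)")
    if counters.get("tx_pause", 0) > 0:
        notes.append("adapter congested (pause frames transmitted)")
    return notes


groups = group_by_priority(SAMPLE)
for prio, counters in sorted(groups.items()):
    print(prio, pause_summary(counters))
```

    For the sample above, only the transmit side reports pause frames, which per the table points at the adapter rather than the network.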

    PCIe Counters

    Counter | Description | Group
    rx_pci_signal_integrity

    Counts physical layer PCIe signal integrity errors: the number of transitions to recovery due to framing errors and CRC errors (dlp and tlp).

    If this counter is rising, try moving the adapter card to a different slot to rule out a bad PCI slot. Validate that you are running the latest available firmware and the latest server BIOS version.

    Error
    tx_pci_signal_integrity

    Counts physical layer PCIe signal integrity errors: the number of transitions to recovery initiated by the other side (moving to recovery due to getting TS/EIEOS).

    If this counter is rising, try moving the adapter card to a different slot to rule out a bad PCI slot. Validate that you are running the latest available firmware and the latest server BIOS version.

    Error
    outbound_pci_buffer_overflow

    The number of packets dropped due to PCI buffer overflow. If this counter is rising at a high rate, it might indicate that the receive traffic rate for a host is higher than the PCIe bus can sustain, so congestion occurs. Supported from kernel 4.14.

    Informative
    outbound_pci_stalled_rd

    The percentage (in the range 0...100) of time within the last second that the NIC had outbound non-posted read requests but could not perform the operation due to insufficient posted credits. Supported from kernel 4.14.

    Informative
    outbound_pci_stalled_wr

    The percentage (in the range 0...100) of time within the last second that the NIC had outbound posted write requests but could not perform the operation due to insufficient posted credits. Supported from kernel 4.14.

    Informative
    outbound_pci_stalled_rd_events

    The number of seconds in which outbound_pci_stalled_rd was above 30%. Supported from kernel 4.14.

    Informative
    outbound_pci_stalled_wr_events

    The number of seconds in which outbound_pci_stalled_wr was above 30%. Supported from kernel 4.14.

    Informative

     

     

    For example, the PCIe signal integrity counters:

    rx_pci_signal_integrity: 0
    tx_pci_signal_integrity: 0
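
    The advice above is conditional on a counter rising, so a single reading is not enough; two snapshots have to be compared. The following is a minimal sketch under stated assumptions: the first snapshot is the example above, the second snapshot is invented for illustration, and the helper names parse_counters and rising are hypothetical.

```python
def parse_counters(text):
    """Parse 'name: value' lines, as printed by ethtool -S, into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip headers such as "NIC statistics:" and anything non-numeric.
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def rising(before, after, names):
    """Return {name: delta} for every listed counter that grew between snapshots."""
    return {
        n: after[n] - before[n]
        for n in names
        if n in before and n in after and after[n] > before[n]
    }


PCI_ERROR_COUNTERS = ["rx_pci_signal_integrity", "tx_pci_signal_integrity"]

# First snapshot: the example above.
first = parse_counters("""
NIC statistics:
     rx_pci_signal_integrity: 0
     tx_pci_signal_integrity: 0
""")

# Second snapshot: hypothetical values, invented for illustration only.
second = parse_counters("""
NIC statistics:
     rx_pci_signal_integrity: 3
     tx_pci_signal_integrity: 0
""")

print(rising(first, second, PCI_ERROR_COUNTERS))
```

    On a live system the text could be captured with subprocess.run(["ethtool", "-S", "eth5"], capture_output=True, text=True).stdout, with the interface name adjusted to match the host.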

     

     

    Full List of Counters

    # ibdev2netdev

    mlx5_0 port 1 ==> eth5 (Up)

     

    # ethtool -S eth5

    NIC statistics:

         rx_packets: 0

         rx_bytes: 0

         tx_packets: 6

         tx_bytes: 468

         tx_tso_packets: 0

         tx_tso_bytes: 0

         tx_tso_inner_packets: 0

         tx_tso_inner_bytes: 0

         rx_lro_packets: 0

         rx_lro_bytes: 0

         rx_csum_unnecessary: 0

         rx_csum_none: 0

         rx_csum_complete: 0

         rx_csum_unnecessary_inner: 0

         tx_csum_partial: 0

         tx_csum_partial_inner: 0

         tx_csum_none: 6

         tx_queue_stopped: 0

         tx_queue_wake: 0

         tx_queue_dropped: 0

         rx_sw_lro_aggregated: 0

         rx_sw_lro_flushed: 0

         rx_sw_lro_no_desc: 0

         rx_wqe_err: 0

         rx_cqe_compress_pkts: 0

         rx_cqe_compress_blks: 0

         rx_mpwqe_filler: 0

         rx_mpwqe_frag: 0

         link_down_events_phy: 10

         rx_out_of_buffer: 0

         rx_vport_unicast_packets: 0

         rx_vport_unicast_bytes: 0

         tx_vport_unicast_packets: 0

         tx_vport_unicast_bytes: 0

         rx_vport_multicast_packets: 0

         rx_vport_multicast_bytes: 0

         tx_vport_multicast_packets: 12

         tx_vport_multicast_bytes: 936

         rx_vport_broadcast_packets: 0

         rx_vport_broadcast_bytes: 0

         tx_vport_broadcast_packets: 0

         tx_vport_broadcast_bytes: 0

         rx_vport_rdma_unicast_packets: 0

         rx_vport_rdma_unicast_bytes: 0

         tx_vport_rdma_unicast_packets: 0

         tx_vport_rdma_unicast_bytes: 0

         rx_vport_rdma_multicast_packets: 0

         rx_vport_rdma_multicast_bytes: 0

         tx_vport_rdma_multicast_packets: 0

         tx_vport_rdma_multicast_bytes: 0

         tx_packets_phy: 334887427

         rx_packets_phy: 856031408

         rx_crc_errors_phy: 0

         tx_bytes_phy: 472228062310

         rx_bytes_phy: 1285813020323

         tx_multicast_phy: 179

         tx_broadcast_phy: 42

         rx_multicast_phy: 181

         rx_broadcast_phy: 952

         rx_in_range_len_errors_phy: 0

         rx_out_of_range_len_phy: 0

         rx_oversize_pkts_phy: 0

         rx_symbol_err_phy: 0

         tx_mac_control_phy: 0

         rx_mac_control_phy: 0

         rx_unsupported_op_phy: 0

         rx_pause_ctrl_phy: 0

         tx_pause_ctrl_phy: 0

         rx_discards_phy: 0

         tx_discards_phy: 0

         tx_errors_phy: 0

         rx_undersize_pkts_phy: 0

         rx_fragments_phy: 0

         rx_jabbers_phy: 0

         rx_64_bytes_phy: 641

         rx_65_to_127_bytes_phy: 10257829

         rx_128_to_255_bytes_phy: 31912

         rx_256_to_511_bytes_phy: 58138

         rx_512_to_1023_bytes_phy: 115665

         rx_1024_to_1518_bytes_phy: 499555011

         rx_1519_to_2047_bytes_phy: 346012212

         rx_2048_to_4095_bytes_phy: 0

         rx_4096_to_8191_bytes_phy: 0

         rx_8192_to_10239_bytes_phy: 0

         rx_pci_signal_integrity: 0
         tx_pci_signal_integrity: 0

         rx_prio0_bytes: 1285813018979

         rx_prio0_packets: 856031387

         tx_prio0_bytes: 472228055654

         tx_prio0_packets: 334887323

         rx_prio1_bytes: 0

         rx_prio1_packets: 0

         tx_prio1_bytes: 0

         tx_prio1_packets: 0

         rx_prio2_bytes: 0

         rx_prio2_packets: 0

         tx_prio2_bytes: 2752

         tx_prio2_packets: 43

         rx_prio3_bytes: 704

         rx_prio3_packets: 11

         tx_prio3_bytes: 2496

         tx_prio3_packets: 39

         rx_prio4_bytes: 640

         rx_prio4_packets: 10

         tx_prio4_bytes: 0

         tx_prio4_packets: 0

         rx_prio5_bytes: 0

         rx_prio5_packets: 0

         tx_prio5_bytes: 1408

         tx_prio5_packets: 22

         rx_prio6_bytes: 0

         rx_prio6_packets: 0

         tx_prio6_bytes: 0

         tx_prio6_packets: 0

         rx_prio7_bytes: 0

         rx_prio7_packets: 0

         tx_prio7_bytes: 0

         tx_prio7_packets: 0

         rx_global_pause: 0

         rx_global_pause_duration: 0

         tx_global_pause: 0

         tx_global_pause_duration: 0

         rx_global_pause_transition: 0

         rx0_packets: 0

         rx0_bytes: 0

         rx0_csum_complete: 0

         rx0_csum_unnecessary_inner: 0

         rx0_csum_none: 0

         rx0_csum_unnecessary: 0

         rx0_lro_packets: 0

         rx0_lro_bytes: 0

         rx0_wqe_err: 0

         rx0_cqe_compress_pkts: 0

         rx0_cqe_compress_blks: 0

         rx0_mpwqe_filler: 0

         rx0_mpwqe_frag: 0

         rx1_packets: 0

         rx1_bytes: 0

         rx1_csum_complete: 0

         rx1_csum_unnecessary_inner: 0

         rx1_csum_none: 0

         rx1_csum_unnecessary: 0

         rx1_lro_packets: 0

         rx1_lro_bytes: 0

         rx1_wqe_err: 0

         rx1_cqe_compress_pkts: 0

         rx1_cqe_compress_blks: 0

         rx1_mpwqe_filler: 0

         rx1_mpwqe_frag: 0

         rx2_packets: 0

         rx2_bytes: 0

         rx2_csum_complete: 0

         rx2_csum_unnecessary_inner: 0

         rx2_csum_none: 0

         rx2_csum_unnecessary: 0

         rx2_lro_packets: 0

         rx2_lro_bytes: 0

         rx2_wqe_err: 0

         rx2_cqe_compress_pkts: 0

         rx2_cqe_compress_blks: 0

         rx2_mpwqe_filler: 0

         rx2_mpwqe_frag: 0

         ...

         rx15_packets: 0

         rx15_bytes: 0

         rx15_csum_complete: 0

         rx15_csum_unnecessary_inner: 0

         rx15_csum_none: 0

         rx15_csum_unnecessary: 0

         rx15_lro_packets: 0

         rx15_lro_bytes: 0

         rx15_wqe_err: 0

         rx15_cqe_compress_pkts: 0

         rx15_cqe_compress_blks: 0

         rx15_mpwqe_filler: 0

         rx15_mpwqe_frag: 0

       

         tx0_packets: 0

         tx0_bytes: 0

         tx0_tso_packets: 0

         tx0_tso_bytes: 0

         tx0_tso_inner_packets: 0

         tx0_tso_inner_bytes: 0

         tx0_csum_none: 0

         tx0_csum_partial: 0

         tx0_csum_partial_inner: 0

         tx0_nop: 0

         tx0_queue_stopped: 0

         tx0_queue_wake: 0

         tx0_queue_dropped: 0

         tx1_packets: 0

         tx1_bytes: 0

         tx1_tso_packets: 0

         tx1_tso_bytes: 0

         tx1_tso_inner_packets: 0

         tx1_tso_inner_bytes: 0

         tx1_csum_none: 0

         tx1_csum_partial: 0

         tx1_csum_partial_inner: 0

         tx1_nop: 0

         tx1_queue_stopped: 0

         tx1_queue_wake: 0

         tx1_queue_dropped: 0

         tx2_packets: 0

         tx2_bytes: 0

         tx2_tso_packets: 0

         tx2_tso_bytes: 0

         tx2_tso_inner_packets: 0

         tx2_tso_inner_bytes: 0

         tx2_csum_none: 0

         tx2_csum_partial: 0

         tx2_csum_partial_inner: 0

         tx2_nop: 0

         tx2_queue_stopped: 0

         tx2_queue_wake: 0

         tx2_queue_dropped: 0

         ...

     

         tx15_bytes: 0

         tx15_tso_packets: 0

         tx15_tso_bytes: 0

         tx15_tso_inner_packets: 0

         tx15_tso_inner_bytes: 0

         tx15_csum_none: 0

         tx15_csum_partial: 0

         tx15_csum_partial_inner: 0

         tx15_nop: 0

         tx15_queue_stopped: 0

         tx15_queue_wake: 0

         tx15_queue_dropped: 0
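
    Since the software port counters are described as an aggregation of the software ring counters, the per-ring lines (rx[i]_*, tx[i]_*) in a listing like the one above can be summed and compared with the port-level values. The following is a minimal sketch: the port totals tx_packets: 6 and tx_bytes: 468 come from the listing, but the split across rings is hypothetical because the listing elides most rings, and sum_rings is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches per-ring lines such as "tx0_packets: 4"; port-level names such as
# "tx_packets" or "rx_64_bytes_phy" do not match because no digit follows rx/tx.
RING_RE = re.compile(r"^\s*(rx|tx)(\d+)_(\w+):\s*(\d+)\s*$")


def sum_rings(text):
    """Sum every per-ring counter across rings, keyed by its port-level name."""
    totals = Counter()
    for line in text.splitlines():
        m = RING_RE.match(line)
        if m:
            direction, _ring, name, value = m.groups()
            totals[f"{direction}_{name}"] += int(value)
    return dict(totals)


# Port totals are taken from the listing; the per-ring split is hypothetical.
SAMPLE = """
     tx_packets: 6
     tx_bytes: 468
     tx0_packets: 4
     tx0_bytes: 312
     tx1_packets: 2
     tx1_bytes: 156
"""

totals = sum_rings(SAMPLE)
print(totals)
```

    A mismatch between a summed ring counter and its port-level counterpart in a complete listing would be worth a closer look, although the two may be sampled at slightly different moments.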

     

     

    Upstream Counters

    The following counters are available upstream, but not in the current MLNX_OFED.

    Counter | Description | Group | Comments
    rx[i]_buff_alloc_err / rx_buff_alloc_err | Failed to allocate a buffer for a received packet (or SKB) on the port (or per ring). | Error | Already in upstream
    rx_bits_phy | This counter provides information on the total amount of traffic that could have been received and can be used as a guideline to measure the ratio of errored traffic in rx_pcs_symbol_err_phy and rx_corrected_bits_phy. | Informative | Planned for kernel 4.8
    rx_pcs_symbol_err_phy | The number of symbol errors that were not corrected by the FEC correction algorithm, or that occurred while no FEC algorithm was active on this interface. If this counter is increasing, it implies that the link between the NIC and the network is suffering from high BER, and that traffic is lost. You may need to replace the cable/transceiver. The error rate is the number of rx_pcs_symbol_err_phy divided by the number of rx_bits_phy over a specific time frame. | Error | Planned for kernel 4.8
    rx_corrected_bits_phy | The number of corrected bits on this port according to the active FEC (RS/FC). If this counter is increasing, it implies that the link between the NIC and the network is suffering from high BER. The corrected bit rate is the number of rx_corrected_bits_phy divided by the number of rx_bits_phy over a specific time frame. | Error | Planned for kernel 4.8
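
    The two rates described above are each an error counter divided by the received bits (rx_bits_phy) over the same time frame, so they are computed from the difference between two snapshots. The following is a minimal sketch: all snapshot values are hypothetical and error_rates is a hypothetical helper name.

```python
def error_rates(before, after):
    """Return the symbol error rate and corrected bit rate between two snapshots.

    Both rates are the counter delta divided by the rx_bits_phy delta.
    Returns None when no bits were counted in the interval.
    """
    bits = after["rx_bits_phy"] - before["rx_bits_phy"]
    if bits <= 0:
        return None
    return {
        "symbol_error_rate": (
            after["rx_pcs_symbol_err_phy"] - before["rx_pcs_symbol_err_phy"]
        ) / bits,
        "corrected_bit_rate": (
            after["rx_corrected_bits_phy"] - before["rx_corrected_bits_phy"]
        ) / bits,
    }


# Hypothetical snapshots, invented for illustration only.
before = {
    "rx_bits_phy": 1_000_000_000,
    "rx_pcs_symbol_err_phy": 0,
    "rx_corrected_bits_phy": 100,
}
after = {
    "rx_bits_phy": 6_000_000_000,
    "rx_pcs_symbol_err_phy": 5,
    "rx_corrected_bits_phy": 600,
}

rates = error_rates(before, after)
print(rates)
```

    With these made-up numbers, 5 uncorrected symbol errors over 5,000,000,000 received bits give a symbol error rate of 1e-9, and 500 corrected bits over the same interval give a corrected bit rate of 1e-7.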