Understanding Telemetry Sampling on Mellanox Spectrum Switches

Version 3

    This post discusses Telemetry sampling on Mellanox Spectrum™ switches.

     

    Overview

    As networks become more complex to manage, administrators need tools that report basic performance information, identify bottlenecks, and support optimization and capacity planning. In practice, this means continuously monitoring port behavior: recording port buffer consumption, identifying shortages in buffer resources, and recording the flows that lead to excessive buffer consumption.

    Enabling Histogram Sampling of the port buffer occupancy lets you:

    • Record occupancy changes over time
    • See how the samples are distributed across different levels of buffer occupancy
    • Determine how long the buffer was occupied during the observation period
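
    To make the idea of histogram sampling concrete, here is a minimal Python sketch of the concept. It is not the switch's implementation: the class name, the bin layout (equal-width bins plus one underflow and one overflow bin), and the parameter names are all assumptions made for illustration.

```python
class OccupancyHistogram:
    """Toy model of buffer-occupancy histogram sampling (illustrative only)."""

    def __init__(self, min_bytes, max_bytes, num_bins, interval_ns):
        # All four parameters are hypothetical names, not switch settings.
        self.min_bytes = min_bytes      # lower edge of the first regular bin
        self.max_bytes = max_bytes      # upper edge of the last regular bin
        self.num_bins = num_bins        # equal-width bins between min and max
        self.interval_ns = interval_ns  # time between two samples
        # Index 0 counts samples below min_bytes, the last index counts
        # samples at or above max_bytes.
        self.counts = [0] * (num_bins + 2)

    def bin_index(self, occupancy):
        if occupancy < self.min_bytes:
            return 0
        if occupancy >= self.max_bytes:
            return self.num_bins + 1
        span = self.max_bytes - self.min_bytes
        return 1 + (occupancy - self.min_bytes) * self.num_bins // span

    def sample(self, occupancy):
        """Record one occupancy reading taken at a sampling tick."""
        self.counts[self.bin_index(occupancy)] += 1

    def time_in_bin_ns(self):
        """Approximate time spent at each occupancy level."""
        return [count * self.interval_ns for count in self.counts]


hist = OccupancyHistogram(min_bytes=100, max_bytes=500, num_bins=4, interval_ns=128)
for reading in [0, 50, 150, 250, 250, 450, 600]:
    hist.sample(reading)

print(hist.counts)            # samples per occupancy level
print(hist.time_in_bin_ns())  # samples scaled by the sampling interval
```

    Each sample increments the counter of the bin that matches the current occupancy, so multiplying a bin's count by the sampling interval approximates how long the buffer stayed at that level during the observation period.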

     

    Configuration

    Telemetry can be configured via the CLI or the WebUI. The WebUI also displays the Telemetry graphs.

    This example shows the configuration and the graphs in the WebUI. For the CLI configuration, refer to the MLNX-OS User Manual.

     

    1. Open the WebUI and click Ports > Telemetry.

     

    2. Enable Telemetry globally and set the global parameters, such as the sampling interval.

     

    3. Set the sampling configuration on the desired port per TC (Traffic Class) and traffic type.

     

    4. See the Telemetry summary for the traffic pattern you just configured. Open the details to see the graph.

     

    5. Run some traffic on that port and check the graph behavior.

    The graph interval may be set to 5 minutes, 1 hour, or 1 day.

     

    6. You can download the raw data (see attached) for analysis.
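
    As a starting point for such an analysis, the Python sketch below finds the time intervals in which the buffer reached its highest occupancy level. The layout of the raw data is not described in this post, so the sketch assumes a hypothetical CSV export with a `timestamp` column followed by one `bin_<n>` column per histogram bin. Adjust the column names to match the file you actually download.

```python
import csv
import io


def parse_rows(csv_text):
    """Yield (timestamp, [count per bin]) for each row of the assumed CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Sort bin columns numerically so that bin_10 comes after bin_9.
    bin_cols = sorted(
        (c for c in reader.fieldnames if c.startswith("bin_")),
        key=lambda c: int(c[4:]),
    )
    for row in reader:
        yield row["timestamp"], [int(row[c]) for c in bin_cols]


def highest_occupied_bin(counts):
    """Index of the highest bin with a nonzero count, or -1 if all are empty."""
    for i in range(len(counts) - 1, -1, -1):
        if counts[i]:
            return i
    return -1


def peak_intervals(csv_text):
    """Return (peak bin index, timestamps at which that bin was reached)."""
    rows = [(ts, highest_occupied_bin(c)) for ts, c in parse_rows(csv_text)]
    peak = max((b for _, b in rows), default=-1)
    return peak, [ts for ts, b in rows if b == peak]


# Hypothetical sample data in the assumed format.
sample = (
    "timestamp,bin_0,bin_1,bin_2\n"
    "10:00,5,0,0\n"
    "10:05,3,2,0\n"
    "10:10,1,1,3\n"
    "10:15,0,0,4\n"
)
peak, timestamps = peak_intervals(sample)
print(peak, timestamps)
```

    The timestamps returned are the intervals worth correlating with the traffic you ran in step 5.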