The temperature threshold for Mellanox HCAs is 105 degrees Celsius. A temperature measured on the HCA that is at or below this threshold is considered normal.
You can also measure it with the mget_temp tool, which comes with MFT (Mellanox Firmware Tools) and can be downloaded from the Mellanox website:
Where can I get a Mellanox document including this thermal specification?
I had the same question for the ConnectX-4 EDR (MCX455A-ECAT) cards. Any idea if the temperature threshold is 120 degrees C for that card?
Please note that I corrected my answer above. The maximum temperature is 105 C. This is the junction temperature (chip temperature).
If you use the mget_temp utility from the Mellanox Firmware Tools (MFT), it reports the junction temperature, so a reading of 105 C or lower is OK.
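If you want to automate this check, here is a rough Python sketch. It assumes that mget_temp accepts the device with -d and prints the temperature as a plain number in degrees C, which is how it behaves on the MFT versions I have seen; verify against your own install. The device path is passed on the command line and the helper names (parse_temp, is_within_limit, read_junction_temp) are just mine for illustration, not part of MFT.

```python
import re
import subprocess
import sys

# Junction temperature limit quoted in this thread (degrees C).
JUNCTION_LIMIT_C = 105


def parse_temp(output: str) -> int:
    """Extract the first integer from the tool's output.

    Assumption: mget_temp prints the temperature as a number in degrees C.
    """
    match = re.search(r"-?\d+", output)
    if match is None:
        raise ValueError(f"no temperature found in output: {output!r}")
    return int(match.group())


def is_within_limit(temp_c: int, limit_c: int = JUNCTION_LIMIT_C) -> bool:
    """A reading at or below the limit is considered OK."""
    return temp_c <= limit_c


def read_junction_temp(device: str) -> int:
    """Run mget_temp for one device (assumed invocation: mget_temp -d <device>)."""
    result = subprocess.run(
        ["mget_temp", "-d", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_temp(result.stdout)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_hca_temp.py <mst-device>")
    temp = read_junction_temp(sys.argv[1])
    status = "OK" if is_within_limit(temp) else "OVER LIMIT"
    print(f"junction temperature: {temp} C ({status}, limit {JUNCTION_LIMIT_C} C)")
```

Run it with the MST device name of your adapter as the only argument (mst status lists the devices once the MST service is started). Accessing the device usually requires root.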
This is relevant for both ConnectX-3 and ConnectX-4.
This information can be found in the adapter's datasheet. That document can be provided by Mellanox support only if an NDA has been signed.
The 55 C temperature you see in the adapter hardware user manual is the ambient temperature. This temperature cannot be measured by Mellanox tools.
For example, see page 68: