1 Reply Latest reply on Nov 19, 2018 1:40 AM by samerka

    Inconsistent hardware timestamping? ConnectX-5 EN & tcpdump

    jillisn

      Hi all,

       

      We recently purchased an MCX516A-CCAT from the Mellanox webstore, but ran into the following issue when trying to do a simple latency measurement using hardware timestamping.
      We use the following command to retrieve system timestamps:

      ip netns exec ns_m0 tcpdump --time-stamp-type=host --time-stamp-precision=nano
      

      This gives the following results (for example):

       

      master/server               slave/client
      19:36:03.883442258 IP15     19:36:03.883379688 IP15
      19:36:03.883524725 IP15     19:36:03.883590507 IP15
      19:36:03.883678497 IP15     19:36:03.883659403 IP15
      19:36:03.883703809 IP15     19:36:03.883716669 IP15
      19:36:03.883924377 IP15     19:36:03.883914510 IP15
      19:36:03.883939231 IP15     19:36:03.883947770 IP15
      19:36:03.883971437 IP15     19:36:03.883961851 IP15
      19:36:03.883985143 IP15     19:36:03.883994953 IP15
      19:36:03.884010765 IP15     19:36:03.884005137 IP15
      19:36:03.884021139 IP15     19:36:03.884030823 IP15
      19:36:03.884051422 IP15     19:36:03.884046094 IP15
      19:36:03.884062029 IP15     19:36:03.884068390 IP15
      19:36:03.884083780 IP15     19:36:03.884078674 IP15
      19:36:03.884091661 IP15     19:36:03.884100314 IP15
      19:36:03.884127283 IP15     19:36:03.884119333 IP15
      19:36:03.884135654 IP15     19:36:03.884141135 IP15
      19:36:03.884159177 IP15     19:36:03.884152060 IP15
      19:36:03.884167900 IP15     19:36:03.884173955 IP15
      19:36:03.884187810 IP15     19:36:03.884182438 IP15
      19:36:03.884197308 IP15     19:36:03.884203057 IP15
      

      This is as expected: the timestamps are in chronological order. About the traffic: small, equal-sized packets are bounced back and forth, and the client initiates traffic generation. So for the client the odd-numbered timestamps are outgoing and the even-numbered ones incoming, and vice versa for the server.
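      To illustrate how the latency would be read off these host timestamps, here is a minimal Python sketch using only the first two timestamps of each side from the listing above. The helper name `to_ns` is made up for this example, and since these are host (software) timestamps the results include kernel stack delays, not just wire time.

```python
def to_ns(stamp: str) -> int:
    """Convert an HH:MM:SS.nnnnnnnnn tcpdump timestamp to nanoseconds since midnight."""
    clock, frac = stamp.split(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    whole_seconds = hours * 3600 + minutes * 60 + seconds
    return whole_seconds * 1_000_000_000 + int(frac.ljust(9, "0"))


# First exchange from the host-timestamp listing: the client sends (1st client
# timestamp), the server receives and replies (1st and 2nd server timestamps),
# and the client receives the reply (2nd client timestamp).
c_tx = to_ns("19:36:03.883379688")
s_rx = to_ns("19:36:03.883442258")
s_tx = to_ns("19:36:03.883524725")
c_rx = to_ns("19:36:03.883590507")

client_to_server = s_rx - c_tx   # one-way delay, client -> server
server_turnaround = s_tx - s_rx  # time the server needs to bounce the packet
server_to_client = c_rx - s_tx   # one-way delay, server -> client
round_trip = c_rx - c_tx         # full round trip as seen by the client

print(client_to_server, server_turnaround, server_to_client, round_trip)
```

      Subtracting timestamps taken in different namespaces only makes sense here because both captures use the same system clock.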

      But now, when using hardware timestamping, we get the following (for example):

      ip netns exec ns_m0 tcpdump --time-stamp-type=adapter_unsynced --time-stamp-precision=nano
      

       

      master/server               slave/client
      14:44:04.710315788 IP15     14:44:04.758411898 IP15
      14:44:04.758545873 IP15     14:44:04.710425435 IP15
      14:44:04.710567282 IP15     14:44:04.758680982 IP15
      14:44:04.758799830 IP15     14:44:04.710662581 IP15
      14:44:04.710849394 IP15     14:44:04.758963612 IP15
      14:44:04.759069396 IP15     14:44:04.710928565 IP15
      14:44:04.711042879 IP15     14:44:04.759157087 IP15
      14:44:04.759236686 IP15     14:44:04.711098779 IP15
      14:44:04.711141554 IP15     14:44:04.759261251 IP15
      14:44:04.759281897 IP15     14:44:04.711140994 IP15
      14:44:04.711184281 IP15     14:44:04.759302503 IP15
      14:44:04.759324535 IP15     14:44:04.711182978 IP15
      14:44:04.711224345 IP15     14:44:04.759344893 IP15
      14:44:04.759364437 IP15     14:44:04.711223669 IP15
      14:44:04.711266610 IP15     14:44:04.759384802 IP15
      14:44:04.759406555 IP15     14:44:04.711267547 IP15
      14:44:04.711310310 IP15     14:44:04.759428520 IP15
      14:44:04.759449711 IP15     14:44:04.711308661 IP15
      14:44:04.711349465 IP15     14:44:04.759469128 IP15
      14:44:04.759488431 IP15     14:44:04.711351810 IP15
      

      Now the timestamps are no longer in chronological order (compare the fractional-second portions), which is unexpected and makes the latency measurement impossible, as far as I can see. I expected each port to have its own clock, but that does not appear to be the case: there seems to be one clock for RX and one for TX instead of one clock per port. Is there a solution to this? Must I use socket ancillary data in a custom C application to receive the correct timestamps? I'll put information on the setup below. Please let me know if more information is needed. Note: applications like linuxptp do seem to work fine with hardware timestamping and give a path delay in the sub-microsecond range.
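      The "one clock for RX, one for TX" impression can be checked with a small Python sketch. It uses the sub-second parts of the first six master/server hardware timestamps listed above and assumes, as described earlier, that the server's odd-numbered packets are incoming and the even-numbered ones outgoing. The helper `is_increasing` is a name made up for this example.

```python
# Sub-second nanosecond parts of the first six master/server hardware
# timestamps (all of them fall within 14:44:04).
server_ns = [710315788, 758545873, 710567282, 758799830, 710849394, 759069396]


def is_increasing(values):
    """Return True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


rx = server_ns[0::2]  # 1st, 3rd, 5th packet: received by the server
tx = server_ns[1::2]  # 2nd, 4th, 6th packet: sent by the server

# Gap between each received packet and the reply sent right after it.
gaps = [t - r for r, t in zip(rx, tx)]

print("whole capture increasing:", is_increasing(server_ns))
print("RX only increasing:", is_increasing(rx))
print("TX only increasing:", is_increasing(tx))
print("RX -> TX gaps in ns:", gaps)
```

      In this sample each direction is ordered on its own, while the RX and TX timestamps sit roughly 48 ms apart. That is much more than the turnaround seen with host timestamps, which fits the two-clock impression but does not prove it.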

      The setup:

      meas_E2E_exp (1).png

      CentOS Linux 7.

      Kernel 3.10.0-862.14.4.el7.x86_64 (default kernel for CentOS 7.5 installation).

      Mellanox OFED, latest firmware & drivers.

      Using network namespaces ns_m0 with ens6f0 and ns_m1 with ens6f1 to prevent kernel loopback.