1 Reply Latest reply on Aug 27, 2018 7:30 AM by alkx

    Is a p2p (dedicated link, without switch) fibre connection totally lossless?

    raph38130

      Dear all,

       

      I am doing RDMA transfers with ConnectX-5 100GbE over fibre, using RoCEv2 (UD, unreliable datagram, Send) between two servers (<10 meters apart).

       

      The data size is around 8 GBytes or 80 GBytes during tests.

      Sometimes everything is fine and no packets are dropped.

      But I also frequently see a small fraction (around 0.01%) of packets silently lost (nothing is visible through the verbs API, dmesg, or sysfs).
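
      One way to double-check that no counter moves during a run is to snapshot a counter directory before and after the transfer and compare. The sketch below is only an illustration: the helper names are hypothetical, and the directory to pass in is an assumption (on many Linux RDMA setups the per-port counters live under /sys/class/infiniband/<device>/ports/<port>/counters and hw_counters, but the layout depends on driver and kernel).

```python
from pathlib import Path


def read_counters(directory):
    """Read every file in `directory` that holds a single integer.

    `directory` is supplied by the caller; nothing here is specific
    to a particular driver.
    """
    counters = {}
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue
        try:
            counters[entry.name] = int(entry.read_text().strip())
        except (ValueError, OSError):
            continue  # skip non-numeric or unreadable entries
    return counters


def changed_counters(before, after):
    """Return {name: delta} for counters whose value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }
```

      Usage would be: call read_counters() on both hosts before the test, run the transfer, call it again, and print changed_counters(before, after). An empty result means none of the counters in that directory moved.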

       

      I am sure that some packets are dropped because I use the RDMA_SEND_WITH_IMM verb with a packet number that is checked while polling the RWQ on the destination host. The application is a loop that continuously posts 15360 work requests (3072 bytes each) at once.
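
      The packet-number check described above can be sketched as follows. This is a simplified illustration in Python, not the actual application code: the function name is hypothetical, and it assumes the packet numbers arrive in order (reasonable on a direct link) and that the counter wraps at 2**32, since immediate data is a 32-bit field.

```python
IMM_MODULUS = 1 << 32  # immediate data carries 32 bits


def count_missing(received, first_expected=0):
    """Return (lost, gaps) for packet numbers taken from immediate data.

    `received` is an iterable of packet numbers in arrival order.
    Each gap is reported as (first_missing_number, how_many_missing).
    Assumes in-order delivery and a counter that wraps at 2**32.
    """
    expected = first_expected
    lost = 0
    gaps = []
    for seq in received:
        # Distance from the expected number, tolerant of wrap-around.
        delta = (seq - expected) % IMM_MODULUS
        if delta:
            gaps.append((expected, delta))
            lost += delta
        expected = (seq + 1) % IMM_MODULUS
    return lost, gaps
```

      For example, count_missing([0, 1, 2, 5, 6, 8]) reports 3 lost packets in two gaps, starting at packet numbers 3 and 7.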

       

      This is not related to completion queue overrun (the queues are polled).

       

      I pay attention to CPU affinity, I tried adding some nanosleep delay on the source host between ibv_post_send calls, and I also set the ring parameters to the maximum (8192). I suspected a transceiver temperature issue and tried another 40GbE copper link, but I have the same issue.

       

      My question: is some packet loss (a very small amount, but not zero) unavoidable?
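
      To put the figures in this post in scale, the short calculation below converts the transfer size and loss rate into packet counts. It is a back-of-the-envelope sketch: the function name is hypothetical, 1 GByte is assumed to be 10**9 bytes, and header overhead is ignored.

```python
def loss_scale(total_bytes, payload_bytes=3072, loss_rate=1e-4, batch=15360):
    """Return (packets, expected_lost, batches) for one transfer.

    Defaults come from the figures in the post: 3072-byte sends,
    about 0.01% loss, 15360 work requests posted per batch.
    """
    packets = total_bytes // payload_bytes
    expected_lost = packets * loss_rate
    batches = packets / batch
    return packets, expected_lost, batches


# Assumption: 1 GByte = 10**9 bytes.
for size in (8 * 10**9, 80 * 10**9):
    packets, lost, batches = loss_scale(size)
    print(f"{size} bytes: {packets} packets, "
          f"about {lost:.0f} lost, about {batches:.0f} batches")
```

      Under these assumptions an 8 GByte transfer is about 2.6 million packets, so a 0.01% loss rate corresponds to roughly 260 missing packets per run, or between one and two per batch of 15360 on average.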

       

       

      cheers