I am doing RDMA transfers with ConnectX-5 100GbE fibre adapters using RoCEv2 (UD, unreliable datagram, Send) between two servers (<10 meters apart).
The data size is around 8 GB, or 80 GB during tests.
Sometimes everything is fine and I don't have any packet drops,
but I also frequently see a small fraction (around 0.01%) of packets silently lost (nothing is visible through the verbs API, dmesg, or sysfs).
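For the sysfs side, one way to double-check is to snapshot a counter before and after a transfer and compare the two values. The following is only a minimal sketch: `read_counter` is a hypothetical helper, and the path in the comment (device `mlx5_0`, port `1`, counter `out_of_buffer`) is an assumption that may differ on your system and driver version.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style text file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 *
 * Assumed example path (not taken from the original post):
 *   /sys/class/infiniband/mlx5_0/ports/1/hw_counters/out_of_buffer
 */
static int read_counter(const char *path, unsigned long long *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* sysfs counters are a single decimal number followed by a newline */
    int parsed = fscanf(f, "%llu", value);
    fclose(f);

    return (parsed == 1) ? 0 : -1;
}
```

Calling `read_counter` once before the transfer and once after, then subtracting, shows whether that particular counter moved during the run. A difference of zero only rules out the counter that was sampled, not every possible drop location.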
I am sure that some packets are dropped because I use Send with immediate (IBV_WR_SEND_WITH_IMM) with a packet number that is checked when polling receive completions on the destination host. The application is a loop that continuously posts 15360 work requests (3072 bytes each) at once.
This is not related to completion queue overrun (the queues are polled).
I pay attention to CPU affinity, I tried adding some nanosleep delay on the source host between ibv_post_send calls, and I also set the ring parameters to the maximum (8192). I suspected a transceiver temperature issue and tried another 40GbE copper link, and I have the same issue.
My question: is some packet loss (a very small amount, but not zero) unavoidable?
RoCE v2 is a UDP-based protocol, and UDP, unfortunately, does not guarantee delivery, ordering, or duplicate protection of packets.
If you have additional programming questions, I would suggest asking on the linux-rdma mailing list.