I am trying to port OFED to the Barrelfish OS, and I am testing my code on a Mellanox ConnectX-3 dual-port NIC (15b3:1003). I started by bringing up both ports in Ethernet mode and, as a first step, I am trying to print the broadcast packets that cross the Ethernet network. The network is quiet, so I generate broadcast ping packets from another endpoint. I reuse the FreeBSD OFED code, which creates an indirection QP (attached to the broadcast address) that receives all Ethernet packets and steers them to four RX RSS QPs. Each of the four RSS QPs posts events to a different CQ. In the current state of the code I do not get any completion queue entry on any of the CQs, but after around 80 ping packets I get a CQE on the first CQ that corresponds to the following error message:
CQE completed in error - vendor syndrom:216 syndrom:2
I found that syndrome 2 corresponds to LOCAL QP OPERATION ERROR, but its definition in the IBTA Architecture Specification is pretty generic. Could someone give me a hint about what may cause this error? What does the vendor syndrome mean? It is strange that the error is always triggered after around 80 packets have been received by the HCA.
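For reference, here is a minimal sketch of how the syndrome byte from the message above could be turned into a readable name when logging. The numeric values are assumed to match the syndrome constants in the Linux mlx4 header (include/linux/mlx4/cq.h); the helper names cqe_syndrome_name and format_cqe_error are hypothetical and not part of OFED. The vendor syndrome is vendor-specific, so the sketch only prints it in hex and does not try to interpret it.

```c
#include <stdint.h>
#include <stdio.h>

/* Map the syndrome byte of an mlx4 error CQE to a readable name.
 * Assumption: values follow the Linux mlx4 header (include/linux/mlx4/cq.h). */
static const char *cqe_syndrome_name(uint8_t syndrome)
{
    switch (syndrome) {
    case 0x01: return "LOCAL_LENGTH_ERR";
    case 0x02: return "LOCAL_QP_OP_ERR";
    case 0x04: return "LOCAL_PROT_ERR";
    case 0x05: return "WR_FLUSH_ERR";
    case 0x06: return "MW_BIND_ERR";
    case 0x10: return "BAD_RESP_ERR";
    case 0x11: return "LOCAL_ACCESS_ERR";
    case 0x12: return "REMOTE_INVAL_REQ_ERR";
    case 0x13: return "REMOTE_ACCESS_ERR";
    case 0x14: return "REMOTE_OP_ERR";
    case 0x15: return "TRANSPORT_RETRY_EXC_ERR";
    case 0x16: return "RNR_RETRY_EXC_ERR";
    case 0x22: return "REMOTE_ABORTED_ERR";
    default:   return "UNKNOWN";
    }
}

/* Hypothetical helper: format both bytes in hex so they are easy to
 * search for in headers and documentation. Returns snprintf's result. */
static int format_cqe_error(char *buf, size_t len,
                            uint8_t vendor_syndrome, uint8_t syndrome)
{
    return snprintf(buf, len,
                    "CQE error: syndrome 0x%02x (%s), vendor syndrome 0x%02x",
                    (unsigned)syndrome, cqe_syndrome_name(syndrome),
                    (unsigned)vendor_syndrome);
}
```

Calling format_cqe_error(buf, sizeof buf, 216, 2) with the values from my log gives the name LOCAL_QP_OP_ERR and the vendor syndrome as 0xd8.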
The issue might be related to the PCI bus, but it is difficult to say without a deeper analysis of the whole source code, including the OS. I recommend adding more prints to both the FreeBSD code and your code, then following them to see whether you get the same flow on both systems. Hopefully you will see some differences in the logic.
Also, can you check whether updating to the latest firmware helps?
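To make the print-and-compare suggestion concrete, here is a rough sketch of a trace helper that could be dropped into both code bases. The names trace_step and TRACE are made up for illustration, and the sketch assumes a printf-style function and variadic macros are available in both environments (in the FreeBSD kernel the includes would differ). Each line carries a sequence number and the function name but no timestamps or pointers, so the two logs can be compared with a plain diff.

```c
#include <stdarg.h>
#include <stdio.h>

/* Number of trace lines emitted so far; printed as a sequence number
 * so the first diverging step is easy to find when diffing two logs. */
static unsigned long trace_seq;

/* Hypothetical helper: print "[seq] function: message" on one line. */
static void trace_step(const char *func, const char *fmt, ...)
{
    va_list ap;

    printf("[%04lu] %s: ", ++trace_seq, func);
    va_start(ap, fmt);
    vprintf(fmt, ap);
    va_end(ap);
    putchar('\n');
}

/* The first macro argument is the format string, followed by its values. */
#define TRACE(...) trace_step(__func__, __VA_ARGS__)
```

For example, a call such as TRACE("qpn=0x%x state=%d", qpn, state) at each QP state change and each receive buffer post would show whether both systems go through the same steps with the same values.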