I have the following equipment:
- 2x NICs: MCX455A-FCAT, PCIe Gen3 x16 (one installed in each node)
- Switch: MSX610F-BS
Both the (VPI) NICs and the switch are configured in InfiniBand mode.
And I am running the following experiment:
I have two servers, and I send small messages (~30 B) over UD QPs between them, in both directions.
I have highly optimized the code (e.g. batching to the NIC, inlining, selective signalling, multiple QPs, etc.).
I run this experiment in two different configurations:
1. The servers are connected through the switch.
2. The NICs are connected directly, back-to-back (via a single cable).
The strange thing is that I get different results. More precisely, reading the Mellanox counters, I see different performance in terms of packets/sec:
1. With the switch, I get around 63 Mpps (in each direction).
2. Without the switch, I get up to 80 Mpps (per direction), at which point I am highly confident that I am bottlenecked by PCIe.
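For anyone who wants to reproduce the measurement, here is a minimal sketch of how a packets/sec figure can be derived from port counters. It assumes the counters are exposed through the Linux sysfs path `/sys/class/infiniband/<device>/ports/<port>/counters/`; the device name `mlx5_0` and port `1` are placeholders, and the helper names are purely illustrative. It does not handle counter saturation or wraparound.

```python
import time
from pathlib import Path

# Assumed sysfs location of the per-port counters. The device name (mlx5_0)
# and port number (1) are placeholders; adapt them to the actual system.
COUNTER_DIR = Path("/sys/class/infiniband/mlx5_0/ports/1/counters")


def read_counter(name: str, counter_dir: Path = COUNTER_DIR) -> int:
    """Read one port counter (e.g. port_xmit_packets) as an integer."""
    return int((counter_dir / name).read_text().strip())


def rate_mpps(first: int, second: int, interval_s: float) -> float:
    """Convert two counter samples taken interval_s seconds apart to Mpps."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (second - first) / interval_s / 1e6


def sample_mpps(name: str = "port_xmit_packets", interval_s: float = 1.0) -> float:
    """Sample a counter twice and return the average packet rate in Mpps."""
    first = read_counter(name)
    t0 = time.monotonic()
    time.sleep(interval_s)
    second = read_counter(name)
    t1 = time.monotonic()
    # Use the measured elapsed time rather than the requested sleep time.
    return rate_mpps(first, second, t1 - t0)


if __name__ == "__main__":
    if COUNTER_DIR.exists():
        print(f"TX: {sample_mpps('port_xmit_packets'):.1f} Mpps")
        print(f"RX: {sample_mpps('port_rcv_packets'):.1f} Mpps")
    else:
        print(f"{COUNTER_DIR} not found; adjust the device name and port")
```

Sampling both `port_xmit_packets` and `port_rcv_packets` on each node makes it easier to tell whether packets are being sent at the lower rate or are sent at the higher rate and then lost along the path.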
So my question is the following:
- Do I have a defective/misconfigured switch, or is it common for a switch not to operate at line rate for small packets (i.e. to have a lower forwarding rate)?
P.S. Since only these 2 servers are connected to the switch, I don't think congestion or anything similar could explain the degradation. Am I missing something?
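As a rough sanity check on what "line rate" means for messages this small, the sketch below estimates an upper bound on the packet rate of a single link. It rests on several assumptions that are not confirmed above: the link runs at 4x FDR (14.0625 Gb/s per lane with 64b/66b encoding), each UD packet carries no GRH, and the payload is exactly 30 B padded to a 4-byte boundary. Link-level framing and flow-control packets are ignored, so the real limit is somewhat lower.

```python
# Per-packet InfiniBand header/trailer sizes in bytes for a UD send
# without a GRH: LRH + BTH + DETH in front, ICRC + VCRC behind the payload.
LRH, BTH, DETH, ICRC, VCRC = 8, 12, 8, 4, 2

# Assumption: 4x FDR link, 14.0625 Gb/s per lane, 64b/66b line encoding.
FDR_4X_DATA_GBPS = 14.0625 * 4 * 64 / 66


def ud_packet_bytes(payload: int) -> int:
    """Bytes on the wire for one UD packet; payload is padded to 4 bytes."""
    padded = (payload + 3) // 4 * 4
    return LRH + BTH + DETH + padded + ICRC + VCRC


def max_mpps(link_gbps: float, payload: int) -> float:
    """Upper bound on packets/sec (in Mpps), ignoring link-level framing."""
    return link_gbps * 1e9 / (ud_packet_bytes(payload) * 8) / 1e6


if __name__ == "__main__":
    size = ud_packet_bytes(30)
    bound = max_mpps(FDR_4X_DATA_GBPS, 30)
    print(f"{size} B per packet, at most {bound:.1f} Mpps per direction")
```

Under these assumptions the bound comes out at a little over 100 Mpps per direction, so both the 63 Mpps and the 80 Mpps figures sit below the raw link capacity.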