I'm no expert on this, but as far as I know the way to go today is the ROCm driver, which supports peer-to-peer GPU access via RDMA and has a peer plugin.
It's not exactly DirectGMA (which never worked well beyond SDI I/O functionality).
Hope it helps.
Excellent suggestion, thank you. That does indeed look like the way to go.