0 Replies Latest reply on Sep 22, 2015 8:57 AM by pmutyala

    Having trouble doing RDMA connect when using routes on a RoCE network

      Hi There,

       

      My setup is as follows:

      1. Host server: RHEL 7.0 with virtualization (QEMU). Installed mlnx_fw_nic_3.0-1.0.1.2_rhel7_x86-64.bin with SR-IOV on a Mellanox CX-3 card.

      2. Created KVM guests and passed VFs to each guest. Installed MLNX_OFED_LINUX-3.0-2.0.1-rhel7.1-x86_64.iso in the guests.

       

      When I try an RDMA connect without any routes added, dtest works fine. But when I add a route to the network via an L3 IP I created on the switch, the connect times out. I didn't set any roce_mode, so it should still be RoCEv1. Why does the connect fail when the routes are added?

       

      # dtest -P roce1

      2230 Running as server - roce1 v2

      2230 Local Address AF_INET - 10.1.1.121 port 45248

      2230 Server is waiting for client connection to send server info

      2230 Server waiting for connect request on port b0c0

      2230 Waiting for connect response

       

       

      2230 CONNECTED!

       

       

      2230 Send RMR msg to remote: r_key_ctx=0x60020b0d,va=0x7fb52c84b000,len=0x400000

      2230 remote RMR data arrived!

      2230 Received RMR from remote: r_iov: r_key_ctx=30020b0f,va=7f10bf6eb000,len=0x400000

       

       

      2230 Query EP: LOCAL addr 10.1.1.121 port b0c0

      2230 Query EP: REMOTE addr 10.1.1.122 port ae6b

       

       

      2230 RDMA WRITE DATA with SEND MSG

       

       

      2230 Sending RDMA WRITE completion message

      2230 inbound rdma_write; send message arrived!

      2230 Received RMR from remote: r_iov: r_key_ctx=30020b0f,va=7f10bf6eb000,len=0x400000

      2230 SERVER: RDMA write buffer contains: client RDMA write data...

       

       

      2230 RDMA READ DATA with SEND MSG

       

       

      2230 Sending RDMA read completion message

      2230 Waiting for inbound message....

      2230 inbound rdma_read; send message arrived!

      2230 Received RMR from remote: r_iov: r_key_ctx=30020b0f,va=7f10bf6eb000,len=0x400000

      2230 SERVER: RCV RDMA read buffer contains: client RDMA read data...

       

       

      2230 PING DATA with SEND MSG

       

       

      2230: Message RTT: Total=1095998.05 usec, 100 bursts, itime=10959.98 usec, pc=0

       

       

      2230: RDMA write (bi-direction): Total=367049.93 usec, itime=1835.25 us, poll=0, 200 x 4194304, 2285.41 MB/sec

       

       

      2230: DAPL Test Complete. PASSED

       

       

       

       

      [root@ps1vm2 ~]# dtest -P roce1 -h 10.1.1.121

      12870 Running as client - waiting for server input

      12870 Running as roce1 client v2

      12870 Local Address AF_INET - 10.1.1.122 port 45248

      12870 Server Name: 10.1.1.121

      12870 Server Net Address: 10.1.1.121 port b0c0

      12870 Waiting for connect response

       

       

      12870 CONNECTED!

       

       

      12870 Send RMR msg to remote: r_key_ctx=0x30020b0f,va=0x7f10bf6eb000,len=0x400000

      12870 remote RMR data arrived!

      12870 Received RMR from remote: r_iov: r_key_ctx=60020b0d,va=7fb52c84b000,len=0x400000

       

       

      12870 Query EP: LOCAL addr 10.1.1.122 port ae6b

      12870 Query EP: REMOTE addr 10.1.1.121 port b0c0

       

       

      12870 RDMA WRITE DATA with SEND MSG

       

       

      12870 Sending RDMA WRITE completion message

      12870 inbound rdma_write; send message arrived!

      12870 Received RMR from remote: r_iov: r_key_ctx=60020b0d,va=7fb52c84b000,len=0x400000

      12870 CLIENT: RDMA write buffer contains: server RDMA write data...

       

       

      12870 RDMA READ DATA with SEND MSG

       

       

      12870 Sending RDMA read completion message

      12870 Waiting for inbound message....

      12870 inbound rdma_read; send message arrived!

      12870 Received RMR from remote: r_iov: r_key_ctx=60020b0d,va=7fb52c84b000,len=0x400000

      12870 CLIENT: RCV RDMA read buffer contains: server RDMA read data...

       

       

      12870 PING DATA with SEND MSG

       

       

      12870: Message RTT: Total=1092453.00 usec, 100 bursts, itime=10924.53 usec, pc=0

       

       

      12870: RDMA write (bi-direction): Total=367846.97 usec, itime=1839.23 us, poll=0, 200 x 4194304, 2280.46 MB/sec

       

       

      12870: DAPL Test Complete. PASSED

       

       

      [root@ps1vm2 ~]#

       

       

       

       

      # route -n

      Kernel IP routing table

      Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

      10.1.1.0        10.1.1.3        255.255.255.0   UG    0      0        0 ens10

      10.1.1.0        0.0.0.0         255.255.255.0   U     100    0        0 ens10

      10.1.2.0        0.0.0.0         255.255.255.0   U     100    0        0 ens12

      # dtest -P roce1

      13902 Running as server - roce1 v2

      13902 Local Address AF_INET - 10.1.1.121 port 45248

      13902 Server is waiting for client connection to send server info

      13902 Server waiting for connect request on port b0c0

       

       

       

       

      # dtest -P roce1 -h 10.1.1.121

      28726 Running as client - waiting for server input

      28726 Running as roce1 client v2

      28726 Local Address AF_INET - 10.1.1.122 port 45248

      28726 Server Name: 10.1.1.121

      28726 Server Net Address: 10.1.1.121 port b0c0

      28726 Waiting for connect response

      ps1vm2.torolab.ibm.com:CMA:7036:8065e700: 98893068 us(98893068 us!!!): dapl_cma_active: CONN_ERR event=0x7 status=-110 TIMEOUT DST 10.1.1.121, 45248

      28726 Error unexpected conn event : 0x4008 DAT_CONNECTION_EVENT_UNREACHABLE

      28726 Error connect_ep: DAT_ABORT

      28726 ERR: Checking ASYNC EVD...

      28726 ERR: Checking RECEIVE EVD...

      28726 ERR: Checking REQUEST EVD...

       

       

      28726: DAPL Test Complete. FAILED

       

       

      [root@ps1vm2 ~]# route -n

      Kernel IP routing table

      Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

      10.1.1.0        10.1.1.3        255.255.255.0   UG    0      0        0 ens9

      10.1.1.0        0.0.0.0         255.255.255.0   U     100    0        0 ens9

      10.1.2.0        0.0.0.0         255.255.255.0   U     100    0        0 ens12
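
      Both routing tables above carry two entries for 10.1.1.0/24: a gateway route via 10.1.1.3 (metric 0) and the on-link route (metric 100). RoCEv1 is a link-layer protocol and is not IP-routable, so a gateway route that takes precedence over the on-link route for the same subnet may be worth checking first. The filter below is only a sketch for spotting such overlaps in `route -n` output; `find_gateway_overrides` is a hypothetical helper name, not part of dtest or MLNX_OFED.

```shell
#!/bin/sh
# Hypothetical helper: reads `route -n` output on stdin and prints every
# destination/netmask/interface that has BOTH a gateway route (flags contain G)
# and an on-link route (gateway 0.0.0.0).
find_gateway_overrides() {
    awk '
        # Data rows start with a dotted-quad destination; header lines are skipped.
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            key = $1 " " $3 " " $8            # destination, genmask, interface
            if ($4 ~ /G/) {
                gw[key] = $2                  # gateway address
                gw_metric[key] = $5
            } else if ($2 == "0.0.0.0") {
                direct[key] = 1
                direct_metric[key] = $5
            }
        }
        END {
            for (key in gw)
                if (key in direct)
                    print key " via " gw[key] " metric " gw_metric[key] \
                          " (on-link metric " direct_metric[key] ")"
        }
    '
}
```

      It could be run on each guest as `route -n | find_gateway_overrides`. An empty result means no subnet has both kinds of route on the same interface.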

       

       

       

       

      parm:           roce_mode:Set RoCE modes supported by the port

              A single value (e.g. 0) to define uniform preferred RoCE_mode value for all devices

                      or a string to map device function numbers to their RoCE mode value (e.g. '0000:04:00.0-0,002b:1c:0b.a-0').

                      Allowed values are 0: RoCEv1 (default), 1: RoCEv1.5, 2: RoCEv2, 3: RoCEv1.5+2 and 4: RoCEv1+2)
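
      The parameter description above allows either a single value or a per-function map. As a rough sketch of how such a value reads, the functions below translate it into the mode names listed in the description. `roce_mode_name` and `decode_roce_mode` are hypothetical helper names; only the value-to-name table is taken from the text above.

```shell
#!/bin/sh
# Map a numeric roce_mode value to the name given in the parameter description.
roce_mode_name() {
    case "$1" in
        0) echo "RoCEv1" ;;
        1) echo "RoCEv1.5" ;;
        2) echo "RoCEv2" ;;
        3) echo "RoCEv1.5+2" ;;
        4) echo "RoCEv1+2" ;;
        *) echo "unknown($1)" ;;
    esac
}

# Decode either a single value ("0") or a per-function map
# ("0000:04:00.0-0,002b:1c:0b.a-0") into one "<device> <mode name>" line each.
decode_roce_mode() {
    value=$1
    case "$value" in
        *-*)
            old_ifs=$IFS
            IFS=,
            for entry in $value; do
                dev=${entry%-*}       # text before the last "-"
                mode=${entry##*-}     # text after the last "-"
                echo "$dev $(roce_mode_name "$mode")"
            done
            IFS=$old_ifs
            ;;
        *)
            echo "all $(roce_mode_name "$value")"
            ;;
    esac
}
```

      For example, `decode_roce_mode '0000:04:00.0-0,002b:1c:0b.a-0'` prints one line per device function, each reporting RoCEv1. Where the value in effect can be read from on a running guest depends on the driver; that step is left out here because it is an assumption about the installation.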