11 Replies Latest reply on Dec 6, 2017 10:47 PM by blackwall

    Where's the procedure of packing network protocol header in RoCE v2?

    blackwall

      Hi, I recently began studying the driver source code of MLNX_OFED_LINUX-4.0-2.0.0.1-rhel7.3-x86_64.

      When I got to the network protocol stack, I ran into a problem, and I am hoping for some guidance from the community.

      Here is my question:

        I am using the verbs API (the black line in picture 2 below) and am familiar with its whole procedure from reading the source code and the manual, but I am not familiar with the abstraction layer below it.

        So I want to know:

      How does RoCE v2 pack the UDP and IP headers into the packet? (I mean where it is done: I have not found the relevant code, but I do have some clues, see below.) I am also not sure whether this procedure is done by this driver or by the system network stack. Does anybody know? I would be very glad to learn from you!

      1. Source code from MLNX_OFED_LINUX-4.0-2.0.0.1-rhel7.3-x86_64/MLNX_OFED_SRC-4.0-2.0.0.1/SRPMS/libmlx5-1.2.1mlnx1/src/mlx5.c:

          

          

      2. Some explanation of RoCEv2

      RoCE+Protocol+Stack.jpg

      3. RoCEv2 packet format

      RoCE+frame+Format.png

        • Re: Where's the procedure of packing network protocol header in RoCE v2?
          licq

          The IP and UDP headers are encapsulated by hardware.

          Each QP has its own context. After the QP is created, software sets the attributes of the QP context with ibv_modify_qp. The hardware then uses these attributes to encapsulate the headers.
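
To make the layering concrete, here is a minimal Python sketch of the two headers that sit directly above IP in a RoCEv2 packet: the UDP header (destination port 4791, the IANA-assigned RoCEv2 port) and the 12-byte InfiniBand Base Transport Header (BTH). In practice the adapter builds these in hardware from the QP context. The function names `pack_udp_header` and `pack_bth`, the zero UDP checksum, the zeroed BTH flag bits, and all example values are assumptions made for illustration only; this is not driver code.

```python
import struct

ROCE_V2_UDP_DPORT = 4791  # IANA-assigned UDP destination port for RoCEv2


def pack_udp_header(src_port, payload_len, checksum=0):
    """Pack an 8-byte UDP header; the length field covers header plus payload."""
    return struct.pack("!HHHH", src_port, ROCE_V2_UDP_DPORT, 8 + payload_len, checksum)


def pack_bth(opcode, dest_qp, psn, pkey=0xFFFF):
    """Pack a 12-byte BTH with the flag bits (SE, M, PadCnt, TVer, AckReq) left at zero."""
    # Layout: opcode(8) | flags(8) | P_Key(16) | reserved(8)+DestQP(24) | A(1)+reserved(7)+PSN(24)
    return struct.pack("!BBHII", opcode, 0, pkey, dest_qp & 0xFFFFFF, psn & 0xFFFFFF)


# Hypothetical values: source port 49152, destination QP 0x12, PSN 0,
# opcode 0x04 (RC SEND Only). The UDP payload here is only the BTH; a real
# packet also carries the message payload and a 4-byte ICRC.
bth = pack_bth(opcode=0x04, dest_qp=0x12, psn=0)
udp = pack_udp_header(src_port=49152, payload_len=len(bth))
print(len(udp), len(bth))
```

The only per-flow field in the UDP header is the source port, which is why the rest of this thread concentrates on how that value is chosen per QP.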

            • Re: Where's the procedure of packing network protocol header in RoCE v2?
              blackwall

              Thanks for helping.

              Following your pointer, I located the kernel-layer counterpart of ibv_modify_qp. It does set some attributes of the QP context, and it stores the UDP source port in the QP's path attributes.

              But some further questions confuse me. May I ask for your advice?

              They are about how the source UDP port is selected for a QP, because the documentation says that with Reliable Connection (RC) RDMA the source UDP port is scrambled per QP.

              In practice, though, it is not so clear what "scrambled" means.

              I did a simple experiment to verify the theory.

              exp 1. Create 1 QP to transfer data and capture the network packets.

              exp 2. Create 10 QPs to transfer data and capture the network packets.

              exp 3. Create 100 QPs to transfer data and capture the network packets.

              Result:

              exp 1: All packets have the same source UDP port.

              exp 2: All packets have the same source UDP port.

              exp 3: There are 3 distinct source UDP ports.

              Guess: RoCEv2 load balancing influences the port selection.

              Question: How is the port selection done?

              The following references are what I got from the driver source code; they show the procedure for setting the QP attributes.

              The procedure that sets the UDP source port looks like a hint-based method (that is my impression). It really puzzles me. Thanks for taking the time to look.

              深度截图_选择区域_20170731134037.png 

              深度截图_选择区域_20170731134108.png

              深度截图_选择区域_20170731133948.png

              深度截图_选择区域_20170731133757.png

              深度截图_选择区域_20170731134325.png

              深度截图_选择区域_20170731133549.png

              深度截图_选择区域_20170731134539.png

              深度截图_选择区域_20170731134616.png

                • Re: Where's the procedure of packing network protocol header in RoCE v2?
                  licq

                  There was a bug in how the driver/firmware generated the UDP source port. Could you please try the latest mlnx_ofed-4.1?

                    • Re: Where's the procedure of packing network protocol header in RoCE v2?
                      blackwall

                      Thanks for replying.

                      After updating to the latest mlnx_ofed-4.1, it seems to work normally.

                      But I did more experiments.

                      exp 1. Create 1 QP to transfer data and capture the network packets.

                      exp 2. Create 2 QPs to transfer data and capture the network packets.

                      exp 3. Create 5 QPs to transfer data and capture the network packets.

                      exp 4. Create 10 QPs to transfer data and capture the network packets.
                      exp 5. Create 20 QPs to transfer data and capture the network packets.

                      Result:

                      exp 1: All packets have the same source UDP port.

                      exp 2: There are 2 distinct source UDP ports.

                      exp 3: There are 5 distinct source UDP ports.

                      exp 4: There are more than 5 distinct source UDP ports. (The capture file grew too large, so I could not capture all the packets.)
                      exp 5: There are more than 8 distinct source UDP ports. (The capture file grew too large, so I could not capture all the packets.)

                      It seems that as the number of QPs increases, the number of UDP source ports does not always increase with it.

                      Or is the scramble mechanism not a one-to-one mapping but rather a form of multiplexing?

                      ---------------------------------------------------------------------------------------------------------------------------------

                      Update:

                      In exp 4 and exp 5, the number of distinct source UDP ports never equals the number of queue pairs, no matter how long the sniffer captures the network packets.
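
One way to reason about the multiplexing hypothesis: if each QP's source port were picked independently and uniformly from a pool of m ports, the expected number of distinct ports for n QPs would be m(1 - (1 - 1/m)^n), which drops below n once collisions become likely. This uniform-random model is purely hypothetical and is not the documented driver or firmware algorithm; the pool size of 16 in the sketch below is an arbitrary value chosen only to show the trend.

```python
def expected_distinct_ports(num_qps, pool_size):
    """Expected number of distinct ports when each QP independently picks one of
    pool_size ports uniformly at random (a hypothetical model, not the driver's)."""
    return pool_size * (1.0 - (1.0 - 1.0 / pool_size) ** num_qps)


# pool_size=16 is an arbitrary assumption used only for illustration.
for num_qps in (1, 2, 5, 10, 20):
    print(num_qps, round(expected_distinct_ports(num_qps, 16), 2))
```

Under any such many-to-one mapping, the count of distinct source ports grows more slowly than the QP count, which is at least qualitatively consistent with the captures above.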

                • Re: Where's the procedure of packing network protocol header in RoCE v2?
                  haiyingc

                  Hey haonan,

                  I am having a similar issue. Do you have any advice? On my side, I tried assigning a number to the -q parameter; however, I was not able to get different UDP source ports, only one (49152). I am using the newest MLNX-OFED 4.2.

                  As for that source file mlx5.c, do I need to compile or edit it to make this work?

                   

                  --haiying

                    • Re: Where's the procedure of packing network protocol header in RoCE v2?
                      blackwall

                      No need to modify that source file.

                      A few things need to be clarified first.

                      Which device do you use? ConnectX-5?

                      Do you use the RoCEv2 protocol?

                      How did you determine that only port 49152 is in use?

                       

                      I tried assigning several different numbers of QPs to generate packets for about 30 s, while capturing and saving them to a file with tcpdump at the same time.

                      Then I analyzed them in Wireshark by listing the unique UDP source ports, and found that there are several different source ports.

                      Hope this is useful to you.
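
The capture-and-count step described above can also be scripted. The following is a minimal sketch, assuming a classic pcap file (the format tcpdump -w normally writes, not pcapng) with Ethernet framing, untagged IPv4, and RoCEv2 traffic on UDP destination port 4791. The function name `count_roce_src_ports` and the file name in the usage note are hypothetical.

```python
import struct
from collections import Counter

ROCE_V2_UDP_DPORT = 4791  # IANA-assigned UDP destination port for RoCEv2


def count_roce_src_ports(pcap_bytes):
    """Return a Counter mapping UDP source port -> packet count for RoCEv2 frames."""
    magic = pcap_bytes[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian file (microsecond or nanosecond timestamps)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a classic pcap file")
    # The link type is the last 32-bit field of the 24-byte global header.
    if struct.unpack_from(endian + "I", pcap_bytes, 20)[0] != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    ports = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_frac, captured length, original length.
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16 : offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 14 + 20 + 8:
            continue  # too short for Ethernet + IPv4 + UDP
        if struct.unpack_from("!H", frame, 12)[0] != 0x0800:
            continue  # not untagged IPv4
        if frame[14 + 9] != 17:
            continue  # IP protocol is not UDP
        udp_off = 14 + (frame[14] & 0x0F) * 4  # skip IPv4 header (IHL in 32-bit words)
        if len(frame) < udp_off + 8:
            continue
        sport, dport = struct.unpack_from("!HH", frame, udp_off)
        if dport == ROCE_V2_UDP_DPORT:
            ports[sport] += 1
    return ports
```

For example, `count_roce_src_ports(open("roce.pcap", "rb").read())` returns a Counter keyed by source port; the number of keys is the number of distinct source UDP ports seen in the capture.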

                        • Re: Where's the procedure of packing network protocol header in RoCE v2?
                          haiyingc

                          Thanks for your response!

                           

                          I am using MLNX_OFED_Linux 4.2 and, yes, RoCEv2. I use ib_write_bw to test RDMA performance; I assigned -q a number (20, for example) and captured and saved the packets to a file with tcpdump. In Wireshark I see that the UDP source ports are all the same, 49152, so I want to know how I could get different source ports. Do I need to modify any files, such as that mlx5.c, or make any other additional configuration?

                           

                          I really appreciate it!

                           

                          --Haiying

                            • Re: Where's the procedure of packing network protocol header in RoCE v2?
                              blackwall

                              Hi Haiying.

                              It seems that nothing is wrong.

                              I ran several simple experiments and found something very strange.

                              First, I tested 20 QPs with ib_write_bw -d mlx5_0 -D 10 -q 20 and checked the packets in Wireshark. All packets had the same UDP source port: 53248.

                              Then I tested 2 QPs with the same tool and checked the packets in Wireshark. Again, all packets had the same UDP source port: 53248.

                              Next, I tested 2 QPs with ib_send_bw -d mlx5_0 -D 2 -q 2 and checked the packets in Wireshark. It gave me the same result...

                              Things seemed confusing.

                              But then I ran one last test, ib_send_bw -d mlx5_0 -q 10 -b, and things finally seemed normal.

                              Result: there are 4 different UDP source ports (this is a section of the total result).

                              To exclude the effect of bidirectional traffic, I ran another test without -b.

                              Result: there are still four different UDP source ports.

                              To exclude the difference between SEND and WRITE, I ran another experiment...

                              Result as follows:

                              Finally, it seems that there is a QP-to-port mapping algorithm that allocates ports to QPs according to the system load (this is my guess).

                              At the beginning there is only one port for network communication, but as the number of tests increases, it gradually allocates more ports (something like a warm-up procedure). So you may need to try more experiments...