4 Replies Latest reply on Mar 3, 2016 2:46 AM by harold.demure

    MLX4 custom packet steering

    harold.demure

      Hello everybody,

        I am approaching the lower-level networking world, made of the Linux kernel, Mellanox drivers and DPDK libraries.

       

      I have read some documentation on those topics, trying to understand how to leverage the multiple rx/tx queues of my NIC for my purposes and, if appropriate, also leverage the capabilities of the DPDK libraries.

       

      Despite this, I have not properly grasped how these technologies can be used to fulfill my goal (or if it is even possible).

      My goal is the following: I would like my application to be able to redirect incoming packets to a specific queue depending on information in the header, according to a function that I specify (not a built-in hash function).

       

      In particular, I would like to steer a UDP packet depending on the content of the header and the entries of a lookup table that the top-level application supplies (and can manipulate at runtime). This steering operation must be able to rely on a pointer to an application-defined function to identify the proper entry in the table.
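A minimal sketch of the steering described above, written as plain C that would run in software on a receiving core. Everything here is hypothetical and for illustration only: the names (`steer_table`, `steer_lookup`, `low_bits_index`, `udp_dst_port_key`), the table size, and the choice of the UDP destination port as the key are assumptions, not part of DPDK or the mlx4 driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STEER_TABLE_SIZE 256 /* hypothetical size, must be a power of two */

/* One table entry: a key taken from the header and its target queue. */
struct steer_entry {
    uint32_t key;
    uint16_t queue;
    uint8_t in_use;
};

/* Application-defined function that maps a key to a table slot. */
typedef size_t (*steer_index_fn)(uint32_t key);

struct steer_table {
    struct steer_entry entries[STEER_TABLE_SIZE];
    steer_index_fn index_of; /* supplied by the application */
    uint16_t default_queue;  /* used when no entry matches */
};

/* Example index function (an assumption): the low bits of the key. */
size_t low_bits_index(uint32_t key)
{
    return key & (STEER_TABLE_SIZE - 1);
}

void steer_table_init(struct steer_table *t, steer_index_fn fn,
                      uint16_t default_queue)
{
    for (size_t i = 0; i < STEER_TABLE_SIZE; i++)
        t->entries[i].in_use = 0;
    t->index_of = fn;
    t->default_queue = default_queue;
}

/* Insert or overwrite a key -> queue mapping; callable at runtime. */
void steer_table_set(struct steer_table *t, uint32_t key, uint16_t queue)
{
    size_t slot = t->index_of(key);
    assert(slot < STEER_TABLE_SIZE);
    t->entries[slot].key = key;
    t->entries[slot].queue = queue;
    t->entries[slot].in_use = 1;
}

/* Return the queue for a key, or the default queue if the slot is empty
 * or holds a different key. */
uint16_t steer_lookup(const struct steer_table *t, uint32_t key)
{
    size_t slot = t->index_of(key);
    assert(slot < STEER_TABLE_SIZE);
    const struct steer_entry *e = &t->entries[slot];
    return (e->in_use && e->key == key) ? e->queue : t->default_queue;
}

/* Extract the destination port (bytes 2..3, network byte order) from a
 * raw 8-byte UDP header and use it as the steering key. */
uint32_t udp_dst_port_key(const uint8_t *udp_hdr)
{
    return ((uint32_t)udp_hdr[2] << 8) | udp_hdr[3];
}
```

The application can call `steer_table_set` at any time to change where a given key goes, and can swap `index_of` for a different function without touching the lookup path. Colliding keys simply overwrite each other in this sketch; a real table would need a collision policy.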

       

       

      Here are my questions:

      0) Is this possible in any way?

      1) As far as I understand, this should be possible with the mlx4 driver (which I use), using RPS, but I have not properly understood how to use RPS. In addition, it is not clear to me whether RPS is compatible at all with the DPDK libraries.

      2) In my understanding, RPS is "expensive" when compared to other techniques, like RSS or other hardware-assisted techniques. I think that I could implement what I want if I could hack the mlx4 driver and the linux kernel to force RSS to use my "hash function" (that is actually a lookup function). I could insert my lookup table, provide pointers to functions and also expose some hooks to the application to modify the header-to-queue mapping.  Note that the hacking does not need to be "clean"; e.g., I could live even with having to recompile together app and driver etc.

      Can you confirm this would be possible? If so, could you give me some pointers to the Linux kernel code and mlx4 driver code to start my hacking journey? I have gone through the mlx4 driver code, but I only found the functions to set the TOP vs XOR hash function, and I did not find the portion of code to modify to implement what I want. I did not find the relevant portions of the Linux kernel networking code either.

      3) If what I wrote before is possible, then what would be the actual advantage of using the DPDK libraries? Or is it even possible to leverage them to implement what I want?

       

      ------ Specs

      I am using DPDK 2.2

      I have followed the instructions in [1] and installed MLNX_OFED_LINUX-3.1-1.0.3-ubuntu14.04-x86_64 (3.2 is out, but the instructions in [1] refer to 3.1)

       

      My NIC :

      05:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

      I have a single port and 2 NUMA domains

       

      My machine runs kernel 3.19.0-47-generic on Ubuntu 14.04.

      CPU is Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz  2x8 cores

      ------

      Thank you for any help/pointer/resource/code snippet you will be able to give me.

        Harold

       

      [1] http://dpdk.org/doc/guides/nics/mlx4.html

        • Re: MLX4 custom packet steering
          giladber

          Harold,

           

          First I would like to ask:

          1. What type of information in the header would you like to base your steering on?

          2. What is the reason to go for DPDK? Do you need higher performance?

           

          What you're probably looking for in DPDK is the Flow Director feature, which will be supported in the next MLNX_DPDK release (end of February), but only for ConnectX-4 and not ConnectX-3. In general, I would advise moving to ConnectX-4 if possible (it will certainly be much easier than hacking the code).

          In DPDK, you can also change the RSS RETA table (i.e., manually configure the hash result -> queue mapping).
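To illustrate the hash result -> queue mapping, here is a small sketch of an RSS redirection table (RETA) modelled in plain C. It is a simulation under stated assumptions, not driver or DPDK code: the names (`reta`, `reta_select_queue`, `reta_move_queue`) and the table size of 128 are hypothetical. In DPDK the real table is pushed to the device with `rte_eth_dev_rss_reta_update`, which is not called here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RETA_SIZE 128 /* hypothetical; the real size depends on the NIC */

/* Redirection table: each bucket holds the index of an rx queue. */
struct reta {
    uint16_t queue[RETA_SIZE];
};

/* Spread the buckets round-robin over nb_queues queues. */
void reta_fill_round_robin(struct reta *r, uint16_t nb_queues)
{
    assert(nb_queues > 0);
    for (size_t i = 0; i < RETA_SIZE; i++)
        r->queue[i] = (uint16_t)(i % nb_queues);
}

/* Repoint every bucket that targets from_queue to to_queue.
 * Returns the number of buckets that were changed. */
size_t reta_move_queue(struct reta *r, uint16_t from_queue, uint16_t to_queue)
{
    size_t moved = 0;
    for (size_t i = 0; i < RETA_SIZE; i++) {
        if (r->queue[i] == from_queue) {
            r->queue[i] = to_queue;
            moved++;
        }
    }
    return moved;
}

/* The hash picks a bucket; the bucket picks the queue. */
uint16_t reta_select_queue(const struct reta *r, uint32_t rss_hash)
{
    return r->queue[rss_hash % RETA_SIZE];
}
```

The sketch shows the limit of this approach: the table controls which queue a bucket maps to, but which bucket a packet falls into is still decided by the hash, so two flows that hash to the same bucket cannot be separated by editing the table.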

           

          The same thing (steering, not changing the RETA table) can be accomplished with ethtool if you are using the standard driver.

           

          Note that not all HW supports all fields, hence my first two questions.

           

          Hope this helps and let me know if you need more info.

            • Re: MLX4 custom packet steering
              harold.demure

              Hello,

              Thank you very much for your reply.

               

              1) Theoretically, I would like to steer the packet depending on information encoded in the packet, not the header. In any case, this information can be  shrunk to 32 bits so as to be placed in the header (e.g., in the destination port field). Doing a simple hash, however, is not enough, as the mapping field_in_the_header -> queue could change over time. 

              2) Yes I have read much about DPDK and it seems I could leverage it for higher performance.

               

              About other things you have said.

               

              - As far as I have understood, flow director uses information on the header, but then the steering is done in hardware either according to a hard-wired hash function or using a 1:1 perfect filter.  Therefore, I am not sure I can do what I want since I cannot afford informing the sender about the "header--> core" mapping (to implement 1:1), if this changes often over time, or if this map is huge. This is why I actually would like to change the hash function, to make it a "non-hash" function, but a custom steering one. I did not expect this to be doable in hardware, but I was wondering whether it could be done even before the packet is read, in software, by the dpdk library. Looking around, I guess this is not possible.

               

              - Similarly, changing the RSS table would not be enough, as I would not have control over the hashing function.

               

              - For now, I am committed to using DPDK to bypass the kernel, so using ethtool would not be useful for my purpose, as far as I have understood

               

              - I would love to switch to ConnectX-4, but I cannot afford changing hardware right now.

               

              My curiosity, though, still stands: if I were to give up on using DPDK, where should I put my hand in the mlx4 driver/linux kernel to implement a custom steering (either in RSS or RPS)?

              Thank you again

              Harold

                • Re: MLX4 custom packet steering
                  giladber

                  Sorry for the late reply! I was out of office. See my comments below.

                   

                  1) Theoretically, I would like to steer the packet depending on information encoded in the packet, not the header. In any case, this information can be  shrunk to 32 bits so as to be placed in the header (e.g., in the destination port field). Doing a simple hash, however, is not enough, as the mapping field_in_the_header -> queue could change over time.

                  [GB] Unfortunately, steering and RSS based on the payload are not supported on ConnectX-3.

                  For DPDK you can change the RSS key and the RETA table, and in a future release (probably DPDK 16.07, or MLNX_DPDK 16.04_X, after the OFED May release) the hash result will be available as well.

                  2) Yes I have read much about DPDK and it seems I could leverage it for higher performance.

                   

                  About other things you have said.

                   

                  - As far as I have understood, flow director uses information on the header, but then the steering is done in hardware either according to a hard-wired hash function or using a 1:1 perfect filter.  Therefore, I am not sure I can do what I want since I cannot afford informing the sender about the "header--> core" mapping (to implement 1:1), if this changes often over time, or if this map is huge. This is why I actually would like to change the hash function, to make it a "non-hash" function, but a custom steering one. I did not expect this to be doable in hardware, but I was wondering whether it could be done even before the packet is read, in software, by the dpdk library. Looking around, I guess this is not possible.

                  [GB] This is something I don't feel I can answer; it probably needs a better DPDK expert.

                   

                  - Similarly, changing the RSS table would not be enough, as I would not have control over the hashing function.

                   

                  - For now, I am committed to using DPDK to bypass the kernel, so using ethtool would not be useful for my purpose, as far as I have understood

                  [GB] Indeed, I was suggesting one or the other. Actually, in our model, ethtool still works when using DPDK, but in practice you can't configure steering to the DPDK rx queues with ethtool.

                   

                  - I would love to switch to ConnectX-4, but I cannot afford changing hardware right now.

                   

                  My curiosity, though, still stands: if I were to give up on using DPDK, where should I put my hand in the mlx4 driver/linux kernel to implement a custom steering (either in RSS or RPS)?

                  [GB] First, our driver is open source and you are more than welcome to have a look at the code; I can try to help you with some pointers if needed. You cannot change the HW to do this for you, and I don't think the payload will be available for the generic steering you want, so it will come down to the software layer.

                   

                  In general, it would be great if you could contact me directly, because I am very interested in your use case and in whether we can find some workaround with our newer products (most likely not ConnectX-3).

                    • Re: MLX4 custom packet steering
                      harold.demure

                      Hello Gilad,

                      Thank you very much for your reply. As a matter of fact, for the moment, I am resorting to a full software solution to fulfill my needs. I will contact you once I have rigorously identified my use case (for now, this is a fuzzy research project) and possibly have an outline of the solution with a sketchy prototype. I am pretty sure this will take some time, but I will certainly take your invitation into consideration.

                      Thank you very much again for your support.

                      Regards,

                       

                        Harold