1 Reply Latest reply on Jun 16, 2016 12:11 AM by olgas

    RSS not working on Mellanox ConnectX-3 NIC

    jblazquez

      Hi,

       

      I asked this on the DPDK users mailing list too but this may be a better forum for it.

       

      I have a pair of Mellanox MCX354A-FCBT NICs and I'm having trouble scaling up RX performance. It appears that RSS is not working and RX speed is limited by a single queue.

       

      According to the documentation, RSS is supported by the mlx4 driver, and when I step through the ethdev initialization code I can see the driver set up RSS, apparently successfully. I can generate 34 Mpps from one NIC using 8 queues, but I can only ever receive 20 Mpps on the other NIC, no matter how many queues I use.
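
      For scale, here is a rough sketch of the line-rate arithmetic behind those two figures. It assumes 64-byte frames (the size set in the pktgen configuration further down), the 56 Gb/s link rate reported by ibstat, and the standard 20 bytes of per-frame Ethernet overhead on the wire (preamble, start-of-frame delimiter and inter-frame gap). The function and variable names are illustrative only.

```python
# Theoretical packet rate of an Ethernet link for a fixed frame size.
# Assumption: each frame costs 20 extra bytes on the wire
# (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def max_pps(link_bps: float, frame_bytes: int) -> float:
    """Return the maximum packets per second the link can carry."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


LINK_BPS = 56e9    # link rate reported by ibstat ("Rate: 56")
FRAME_BYTES = 64   # frame size used in the pktgen run

line_rate = max_pps(LINK_BPS, FRAME_BYTES)
tx_pps = 34e6      # approximate TX rate quoted in the post
rx_pps = 20e6      # approximate RX rate quoted in the post

print(f"theoretical max: {line_rate / 1e6:.1f} Mpps")
print(f"TX share of line rate: {tx_pps / line_rate:.1%}")
print(f"RX share of line rate: {rx_pps / line_rate:.1%}")
```

      Under these assumptions the link itself could carry roughly 83 Mpps of 64-byte frames, so both the TX and RX figures sit well below line rate.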

       

      The generated packets have randomized source/destination IP addresses and source/destination UDP ports, so they should hash to different RX queues.
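
      To illustrate why varying the 4-tuple should spread packets across queues, here is a minimal sketch of a Toeplitz-style RSS hash. It assumes the widely published 40-byte default RSS key and a plain `hash % n_queues` queue choice (real NICs usually go through an indirection table). Whether the mlx4 PMD uses this key or this hash function is an assumption, not something confirmed here, and the helper names are hypothetical. The address and port ranges are taken from the pktgen range settings shown later in this post.

```python
import ipaddress
import random
import struct
from collections import Counter

# Widely published default RSS key (40 bytes). Assumed for illustration,
# not read back from the NIC or the mlx4 driver.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: for every set input bit, XOR in the 32-bit
    window of the key that starts at that bit position."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_tuple(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack the IPv4 4-tuple in the usual RSS input order."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )


def rx_queue(src_ip, dst_ip, src_port, dst_port, n_queues=8) -> int:
    """Simplified queue selection: hash modulo the number of RX queues."""
    return toeplitz_hash(RSS_KEY, flow_tuple(src_ip, dst_ip, src_port, dst_port)) % n_queues


# Draw flows from the same ranges as the pktgen range configuration.
rng = random.Random(0)
counts = Counter(
    rx_queue(
        f"10.1.72.{rng.randint(154, 254)}",
        "10.1.72.17",
        rng.randint(1025, 65512),
        rng.randint(0, 254),
    )
    for _ in range(10_000)
)
for queue in sorted(counts):
    print(f"queue {queue}: {counts[queue]} flows")
```

      If the hardware hash behaves like this sketch, flows with varying source addresses and ports land on all eight queues rather than on a single one.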

       

      The NICs are connected directly to each other with a DAC cable. They are on different NUMA nodes and I'm placing TX/RX lcores on the appropriate socket for each NIC. It doesn't matter which NIC I use as the sender; the results are exactly the same. I have tried both pktgen and my own code and saw no difference.

       

      The server has two 12-core Intel E5-2680 v3 CPUs at 2.5 GHz. The Mellanox NICs are flashed with the latest firmware and I'm using MLNX_OFED 3.3. I'm using the MLNX_DPDK 2.2 distribution, but I also tried standard DPDK v16.04 and the result was the same.

       

      Here's the output of ibstat:

       

      CA 'mlx4_0'
        CA type: MT4099
        Number of ports: 2
        Firmware version: 2.36.5000
        Hardware version: 1
        Node GUID: 0x0002c90300310c30
        System image GUID: 0x0002c90300310c33
        Port 1:
          State: Active
          Physical state: LinkUp
          Rate: 56
          Base lid: 0
          LMC: 0
          SM lid: 0
          Capability mask: 0x0c010000
          Port GUID: 0x0202c9fffe310c30
          Link layer: Ethernet
        Port 2:
          State: Active
          Physical state: LinkUp
          Rate: 56
          Base lid: 0
          LMC: 0
          SM lid: 0
          Capability mask: 0x0c010000
          Port GUID: 0x0202c9fffe310c31
          Link layer: Ethernet
      CA 'mlx4_1'
        CA type: MT4099
        Number of ports: 2
        Firmware version: 2.36.5000
        Hardware version: 1
        Node GUID: 0x0002c90300318200
        System image GUID: 0x0002c90300318203
        Port 1:
          State: Active
          Physical state: LinkUp
          Rate: 56
          Base lid: 0
          LMC: 0
          SM lid: 0
          Capability mask: 0x0c010000
          Port GUID: 0x0202c9fffe318200
          Link layer: Ethernet
        Port 2:
          State: Active
          Physical state: LinkUp
          Rate: 56
          Base lid: 0
          LMC: 0
          SM lid: 0
          Capability mask: 0x0c010000
          Port GUID: 0x0202c9fffe318201
          Link layer: Ethernet
      

       

      Below are the pktgen results. Note that the first NIC is 0000:03:00.0 and is assigned ports 0-1, and the second NIC is 0000:a1:00.0 and is assigned ports 2-3. I'm testing TX on port 0 and RX on port 2, which are connected directly. Random packets are generated by using the pktgen script found here.

       

      $ app/pktgen -c ffffff -n 4 -w 0000:03:00.0 -w 0000:a1:00.0 --socket-mem=1024,1024 -- -N -T -P -m "[0-7].0,[12-19].2"
      
      Copyright (c) <2010-2016>, Intel Corporation. All rights reserved. Powered by Intel® DPDK
      EAL: Detected lcore 0 as core 0 on socket 0
      EAL: Detected lcore 1 as core 1 on socket 0
      EAL: Detected lcore 2 as core 2 on socket 0
      EAL: Detected lcore 3 as core 3 on socket 0
      EAL: Detected lcore 4 as core 4 on socket 0
      EAL: Detected lcore 5 as core 5 on socket 0
      EAL: Detected lcore 6 as core 8 on socket 0
      EAL: Detected lcore 7 as core 9 on socket 0
      EAL: Detected lcore 8 as core 10 on socket 0
      EAL: Detected lcore 9 as core 11 on socket 0
      EAL: Detected lcore 10 as core 12 on socket 0
      EAL: Detected lcore 11 as core 13 on socket 0
      EAL: Detected lcore 12 as core 0 on socket 1
      EAL: Detected lcore 13 as core 1 on socket 1
      EAL: Detected lcore 14 as core 2 on socket 1
      EAL: Detected lcore 15 as core 3 on socket 1
      EAL: Detected lcore 16 as core 4 on socket 1
      EAL: Detected lcore 17 as core 5 on socket 1
      EAL: Detected lcore 18 as core 8 on socket 1
      EAL: Detected lcore 19 as core 9 on socket 1
      EAL: Detected lcore 20 as core 10 on socket 1
      EAL: Detected lcore 21 as core 11 on socket 1
      EAL: Detected lcore 22 as core 12 on socket 1
      EAL: Detected lcore 23 as core 13 on socket 1
      EAL: Detected lcore 24 as core 0 on socket 0
      EAL: Detected lcore 25 as core 1 on socket 0
      EAL: Detected lcore 26 as core 2 on socket 0
      EAL: Detected lcore 27 as core 3 on socket 0
      EAL: Detected lcore 28 as core 4 on socket 0
      EAL: Detected lcore 29 as core 5 on socket 0
      EAL: Detected lcore 30 as core 8 on socket 0
      EAL: Detected lcore 31 as core 9 on socket 0
      EAL: Detected lcore 32 as core 10 on socket 0
      EAL: Detected lcore 33 as core 11 on socket 0
      EAL: Detected lcore 34 as core 12 on socket 0
      EAL: Detected lcore 35 as core 13 on socket 0
      EAL: Detected lcore 36 as core 0 on socket 1
      EAL: Detected lcore 37 as core 1 on socket 1
      EAL: Detected lcore 38 as core 2 on socket 1
      EAL: Detected lcore 39 as core 3 on socket 1
      EAL: Detected lcore 40 as core 4 on socket 1
      EAL: Detected lcore 41 as core 5 on socket 1
      EAL: Detected lcore 42 as core 8 on socket 1
      EAL: Detected lcore 43 as core 9 on socket 1
      EAL: Detected lcore 44 as core 10 on socket 1
      EAL: Detected lcore 45 as core 11 on socket 1
      EAL: Detected lcore 46 as core 12 on socket 1
      EAL: Detected lcore 47 as core 13 on socket 1
      EAL: Support maximum 128 logical core(s) by configuration.
      EAL: Detected 48 lcore(s)
      EAL: Setting up physically contiguous memory...
      EAL: Ask a virtual area of 0x80000000 bytes
      EAL: Virtual area found at 0x7f38c0000000 (size = 0x80000000)
      EAL: Ask a virtual area of 0x80000000 bytes
      EAL: Virtual area found at 0x7f3800000000 (size = 0x80000000)
      EAL: Requesting 1 pages of size 1024MB from socket 0
      EAL: Requesting 1 pages of size 1024MB from socket 1
      EAL: TSC frequency is ~2494222 KHz
      EAL: Master lcore 0 is ready (tid=eca398c0;cpuset=[0])
      EAL: lcore 6 is ready (tid=e7833700;cpuset=[6])
      EAL: lcore 7 is ready (tid=e7032700;cpuset=[7])
      EAL: lcore 8 is ready (tid=e6831700;cpuset=[8])
      EAL: lcore 4 is ready (tid=e8835700;cpuset=[4])
      EAL: lcore 1 is ready (tid=ea038700;cpuset=[1])
      EAL: lcore 9 is ready (tid=e6030700;cpuset=[9])
      EAL: lcore 3 is ready (tid=e9036700;cpuset=[3])
      EAL: lcore 2 is ready (tid=e9837700;cpuset=[2])
      EAL: lcore 13 is ready (tid=e402c700;cpuset=[13])
      EAL: lcore 10 is ready (tid=e582f700;cpuset=[10])
      EAL: lcore 12 is ready (tid=e482d700;cpuset=[12])
      EAL: lcore 11 is ready (tid=e502e700;cpuset=[11])
      EAL: lcore 5 is ready (tid=e8034700;cpuset=[5])
      EAL: lcore 20 is ready (tid=e0825700;cpuset=[20])
      EAL: lcore 19 is ready (tid=e1026700;cpuset=[19])
      EAL: lcore 18 is ready (tid=e1827700;cpuset=[18])
      EAL: lcore 21 is ready (tid=bbfff700;cpuset=[21])
      EAL: lcore 22 is ready (tid=bb7fe700;cpuset=[22])
      EAL: lcore 14 is ready (tid=e382b700;cpuset=[14])
      EAL: lcore 17 is ready (tid=e2028700;cpuset=[17])
      EAL: lcore 23 is ready (tid=baffd700;cpuset=[23])
      EAL: lcore 15 is ready (tid=e302a700;cpuset=[15])
      EAL: lcore 16 is ready (tid=e2829700;cpuset=[16])
      EAL: PCI device 0000:03:00.0 on NUMA socket 0
      EAL:   probe driver: 15b3:1003 librte_pmd_mlx4
      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_0" (VF: false)
      PMD: librte_pmd_mlx4: 2 port(s) detected
      PMD: librte_pmd_mlx4: port 1 MAC address is 00:02:c9:31:0c:30
      PMD: librte_pmd_mlx4: port 2 MAC address is 00:02:c9:31:0c:31
      EAL: PCI device 0000:a1:00.0 on NUMA socket 1
      EAL:   probe driver: 15b3:1003 librte_pmd_mlx4
      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_1" (VF: false)
      PMD: librte_pmd_mlx4: 2 port(s) detected
      PMD: librte_pmd_mlx4: port 1 MAC address is 00:02:c9:31:82:00
      PMD: librte_pmd_mlx4: port 2 MAC address is 00:02:c9:31:82:01
      [0-7].0          lcores: RX( 0 1 2 3 4 5 6 7 )TX( 0 1 2 3 4 5 6 7 ) ports: RX( 0 )TX( 0 )
      [12-19].2        lcores: RX( 12 13 14 15 16 17 18 19 )TX( 12 13 14 15 16 17 18 19 ) ports: RX( 2 )TX( 2 )
         Copyright (c) <2010-2016>, Intel Corporation. All rights reserved.
         Pktgen created by: Keith Wiles -- >>> Powered by Intel® DPDK <<<
      Lua 5.3.2  Copyright (C) 1994-2015 Lua.org, PUC-Rio
      >>> Packet Burst 32, RX Desc 512, TX Desc 512, mbufs/port 4096, mbuf cache 512
      === port to lcore mapping table (# lcores 24) ===
         lcore:     0     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23 
      port   0:  D: T  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0 =  8: 8
      port   2:  D: T  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  0: 0  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  0: 0  0: 0  0: 0  0: 0 =  8: 8
      Total   :  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  0: 0  0: 0  0: 0  0: 0  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  1: 1  0: 0  0: 0  0: 0  0: 0
          Display and Timer on lcore 0, rx:tx counts per port/lcore
      Configuring 4 ports, MBUF Size 1920, MBUF Cache Size 512
      Lcore:
          0, RX-TX  
                      RX( 1): ( 0: 0) 
                      TX( 1): ( 0: 0) 
          1, RX-TX  
                      RX( 1): ( 0: 1) 
                      TX( 1): ( 0: 1) 
          2, RX-TX  
                      RX( 1): ( 0: 2) 
                      TX( 1): ( 0: 2) 
          3, RX-TX  
                      RX( 1): ( 0: 3) 
                      TX( 1): ( 0: 3) 
          4, RX-TX  
                      RX( 1): ( 0: 4) 
                      TX( 1): ( 0: 4) 
          5, RX-TX  
                      RX( 1): ( 0: 5) 
                      TX( 1): ( 0: 5) 
          6, RX-TX  
                      RX( 1): ( 0: 6) 
                      TX( 1): ( 0: 6) 
          7, RX-TX  
                      RX( 1): ( 0: 7) 
                      TX( 1): ( 0: 7) 
         12, RX-TX  
                      RX( 1): ( 2: 0) 
                      TX( 1): ( 2: 0) 
         13, RX-TX  
                      RX( 1): ( 2: 1) 
                      TX( 1): ( 2: 1) 
         14, RX-TX  
                      RX( 1): ( 2: 2) 
                      TX( 1): ( 2: 2) 
         15, RX-TX  
                      RX( 1): ( 2: 3) 
                      TX( 1): ( 2: 3) 
         16, RX-TX  
                      RX( 1): ( 2: 4) 
                      TX( 1): ( 2: 4) 
         17, RX-TX  
                      RX( 1): ( 2: 5) 
                      TX( 1): ( 2: 5) 
         18, RX-TX  
                      RX( 1): ( 2: 6) 
                      TX( 1): ( 2: 6) 
         19, RX-TX  
                      RX( 1): ( 2: 7) 
                      TX( 1): ( 2: 7) 
      Port :
          0, nb_lcores  8, private 0x8f09f0, lcores:  0  1  2  3  4  5  6  7 
          2, nb_lcores  8, private 0x8f5270, lcores: 12 13 14 15 16 17 18 19 
      ** Dev Info (librte_pmd_mlx4:17) **
         max_vfs        :   0 min_rx_bufsize    :  32 max_rx_pktlen : 65536 max_rx_queues         :65408 max_tx_queues:65408
         max_mac_addrs  : 127 max_hash_mac_addrs:   0 max_vmdq_pools:     0
         rx_offload_capa:   0 tx_offload_capa   :   0 reta_size     :     0 flow_type_rss_offloads:0000000000000000
         vmdq_queue_base:   0 vmdq_queue_num    :   0 vmdq_pool_base:     0
      ** RX Conf **
         pthreash       :   0 hthresh          :   0 wthresh        :     0
         Free Thresh    :   0 Drop Enable      :   0 Deferred Start :     0
      ** TX Conf **
         pthreash       :   0 hthresh          :   0 wthresh        :     0
         Free Thresh    :   0 RS Thresh        :   0 Deferred Start :     0 TXQ Flags:00000000
      PMD: librte_pmd_mlx4: 0x94b7e0: TX queues number update: 0 -> 8
      PMD: librte_pmd_mlx4: 0x94b7e0: RX queues number update: 0 -> 8
      Initialize Port 0 -- TxQ 8, RxQ 8,  Src MAC 00:02:c9:31:0c:30
          Create: Default RX  0:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  0:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default TX  0:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:0  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:1  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:2  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:3  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:4  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:5  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:6  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  0:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    0:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 0:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  0:7  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
                                                                             Port memory used = 324936 KB
      ** Dev Info (librte_pmd_mlx4:19) **
         max_vfs        :   0 min_rx_bufsize    :  32 max_rx_pktlen : 65536 max_rx_queues         :65408 max_tx_queues:65408
         max_mac_addrs  : 127 max_hash_mac_addrs:   0 max_vmdq_pools:     0
         rx_offload_capa:   0 tx_offload_capa   :   0 reta_size     :     0 flow_type_rss_offloads:0000000000000000
         vmdq_queue_base:   0 vmdq_queue_num    :   0 vmdq_pool_base:     0
      ** RX Conf **
         pthreash       :   0 hthresh          :   0 wthresh        :     0
         Free Thresh    :   0 Drop Enable      :   0 Deferred Start :     0
      ** TX Conf **
         pthreash       :   0 hthresh          :   0 wthresh        :     0
         Free Thresh    :   0 RS Thresh        :   0 Deferred Start :     0 TXQ Flags:00000000
      PMD: librte_pmd_mlx4: 0x953870: TX queues number update: 0 -> 8
      PMD: librte_pmd_mlx4: 0x953870: RX queues number update: 0 -> 8
      Initialize Port 2 -- TxQ 8, RxQ 8,  Src MAC 00:02:c9:31:82:00
          Create: Default RX  2:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default RX  2:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Default TX  2:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:0  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:0  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:1  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:1  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:2  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:2  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:3  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:3  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:4  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:4  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:5  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:5  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:6  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:6  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
          Create: Default TX  2:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Range TX    2:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Sequence TX 2:7  - Memory used (MBUFs 4096 x (size 1920 + Hdr 128)) + 1581248 =   9737 KB headroom 128 2176
          Create: Special TX  2:7  - Memory used (MBUFs   64 x (size 1920 + Hdr 128)) + 1581248 =   1673 KB headroom 128 2176
                                                                             Port memory used = 324936 KB
                                                                            Total memory used = 649871 KB
      Port  0: Link Up - speed 56000 Mbps - full-duplex <Enable promiscuous mode>
      Port  2: Link Up - speed 56000 Mbps - full-duplex <Enable promiscuous mode>
      === Display processing on lcore 0
      WARNING: Nothing to do on lcore 8: exiting
      WARNING: Nothing to do on lcore 9: exiting
      WARNING: Nothing to do on lcore 10: exiting
      WARNING: Nothing to do on lcore 11: exiting
      WARNING: Nothing to do on lcore 20: exiting
      WARNING: Nothing to do on lcore 21: exiting
      WARNING: Nothing to do on lcore 22: exiting
      WARNING: Nothing to do on lcore 23: exiting
      === RX/TX processing lcore  1 rxcnt 1 txcnt 1 port/qid, 0/1
      === RX/TX processing lcore  2 rxcnt 1 txcnt 1 port/qid, 0/2
      === RX/TX processing lcore  3 rxcnt 1 txcnt 1 port/qid, 0/3
      === RX/TX processing lcore  4 rxcnt 1 txcnt 1 port/qid, 0/4
      === RX/TX processing lcore  5 rxcnt 1 txcnt 1 port/qid, 0/5
      === RX/TX processing lcore  6 rxcnt 1 txcnt 1 port/qid, 0/6
      === RX/TX processing lcore  7 rxcnt 1 txcnt 1 port/qid, 0/7
      === RX/TX processing lcore 12 rxcnt 1 txcnt 1 port/qid, 2/0
      === RX/TX processing lcore 13 rxcnt 1 txcnt 1 port/qid, 2/1
      === RX/TX processing lcore 14 rxcnt 1 txcnt 1 port/qid, 2/2
      === RX/TX processing lcore 15 rxcnt 1 txcnt 1 port/qid, 2/3
      === RX/TX processing lcore 16 rxcnt 1 txcnt 1 port/qid, 2/4
      === RX/TX processing lcore 17 rxcnt 1 txcnt 1 port/qid, 2/5
      === RX/TX processing lcore 18 rxcnt 1 txcnt 1 port/qid, 2/6
      === RX/TX processing lcore 19 rxcnt 1 txcnt 1 port/qid, 2/7
      Pktgen > load random.txt
      geometry 132x44
      mac_from_arp disable
      set 0 count 0
      set 0 size 64
      set 0 rate 100
      set 0 burst 32
      set 0 sport 1234
      set 0 dport 5678
      set 0 prime 1
      type ipv4 0
      range.proto 0 udp
      proto udp 0
      set ip dst 0 10.1.72.17
      set ip src 0 10.1.72.154/24
      set mac 0 00:23:e9:64:c0:03
      vlanid 0 1
      pattern 0 abc
      latency 0 disable
      mpls 0 disable
      mpls_entry 0 0
      qinq 0 disable
      qinqids 0 0 0
      gre 0 disable
      gre_eth 0 disable
      gre_key 0 0
      icmp.echo 0 disable
      pcap 0 disable
      range 0 enable
      process 0 disable
      capture 0 disable
      rxtap 0 disable
      txtap 0 disable
      vlan 0 disable
      src.mac start 0 00:50:56:86:10:76
      src.mac min 0 00:00:00:00:00:00
      src.mac max 0 00:00:00:00:00:00
      src.mac inc 0 00:00:00:00:00:00
      dst.mac start 0 00:23:e9:64:c0:03
      dst.mac min 0 00:00:00:00:00:00
      dst.mac max 0 00:00:00:00:00:00
      dst.mac inc 0 00:00:00:00:00:00
      src.ip start 0 10.1.72.154
      src.ip min 0 10.1.72.154
      src.ip max 0 10.1.72.254
      src.ip inc 0 0.0.0.1
      dst.ip start 0 10.1.72.17
      dst.ip min 0 10.1.72.17
      dst.ip max 0 10.1.72.17
      dst.ip inc 0 0
      src.port start 0 1025
      src.port min 0 1025
      src.port max 0 65512
      src.port inc 0 1
      dst.port start 0 0
      dst.port min 0 0
      dst.port max 0 254
      dst.port inc 0 1
      vlan.id start 0 1
      vlan.id min 0 1
      vlan.id max 0 4095
      vlan.id inc 0 0
      pkt.size start 0 64
      pkt.size min 0 64
      pkt.size max 0 1518
      pkt.size inc 0 0
      set 0 seqCnt 0
      Pktgen > start 0
        Flags:Port    :   P-----R--------:0                       P--------------:2
      Link State      :       <UP-56000-FD>                           <UP-56000-FD>----
      Pkts/s Max/Rx   :                 0/0                           19839945/19839945
             Max/Tx   :   34199936/34135552                           34199936/34135552
      MBits/s Rx/Tx   :             0/21846                                 12697/21846
      Broadcast       :                   0                                       0
      Multicast       :                   0                                       0
        64 Bytes      :                   0                                78156990
        65-127        :                   0                                       0
        128-255       :                   0                                       0
        256-511       :                   0                                       0
        512-1023      :                   0                                       0
        1024-1518     :                   0                                       0
      Runts/Jumbos    :                 0/0                                     0/0
      Errors Rx/Tx    :                 0/0                                     0/0
      Total Rx Pkts   :                   0                               368764259
            Tx Pkts   :           669245499                                       0
            Rx MBs    :                   0                                  236009
            Tx MBs    :              428317                                       0
      ARP/ICMP Pkts   :                 0/0                                     0/0
                      :
      Pattern Type    :             abcd...                                 abcd...
      Tx Count/% Rate :      Forever / 100%                          Forever / 100%
      PktSize/Tx Burst:           64 /   32                               64 /   32
      Src/Dest Port   :         1234 / 5678                             1234 / 5678
      Pkt Type:VLAN ID:     IPv4 / UDP:0001                         IPv4 / TCP:0001
      Dst  IP Address :          10.1.72.17                             192.168.3.1
      Src  IP Address :      10.1.72.154/24                          192.168.2.1/24
      Dst MAC Address :   00:23:e9:64:c0:03                       00:00:00:00:00:00
      Src MAC Address     00:02:c9:31:0c:30                       00:02:c9:31:82:00
      

       

      Have I hit a hardware limitation?

       

      Any pointers would be appreciated.

        • Re: RSS not working on Mellanox ConnectX-3 NIC
          olgas

          Hi,

           

          I already sent my answer to the DPDK mailing list, but I'm adding it here too in case anyone else needs it.

           

          RSS on ConnectX-3 cards does work, but it doesn't increase the maximum packet rate of the NIC. What it does is help a real application spread traffic across different cores.

          So with a benchmark application you will see a degradation with RSS, but with a real application performance should be better with RSS than without.

           

          ConnectX-4 doesn't have this limitation, and we suggest using it instead of ConnectX-3.

           

          Best Regards,

          Olga