10 Replies Latest reply on Jan 26, 2016 11:09 AM by unxited

    DPDK-OVS using connectx3-Pro binding issues

    motmot

      Hi,

       

      We have been trying to install DPDK-OVS on a DL360 G7 (HP server) host running Fedora 21 with a Mellanox ConnectX-3 Pro NIC.

       

      We followed the tutorials Gilad / Olga have posted here and the installation seemed to be working (including testpmd running - see output below).

       

      We ran dpdk_nic_bind and didn't see any user-space driver we could bind the Mellanox device to:

       

      0000:06:00.0 'MT27520 Family [ConnectX-3 Pro]' if=ens1d1,ens1 drv=mlx4_core unused=ib_ipoib *Active*

      1. We need to somehow bind this device to a DPDK-compatible driver. Can you think of a way to do so?


      2. Can you please take a look at the versions we use (Fedora, OFED, DPDK, OVS, QEMU) and let us know (from your experience) if we should upgrade/downgrade any of them?

       

      3. Do you have a more up-to-date tutorial for our specific HW?

       

      4. Let us know if you need additional details.



      Thanks a lot!!!!



      ===============

      SYSTEM DETAILS:

      ===============

      [root@localhost ~]# uname -a
      Linux localhost.localdomain 3.17.4-301.fc21.x86_64 #1 SMP Thu Nov 27 19:09:10 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

      Using Mellanox Technologies MT27520 Family [ConnectX-3 Pro] NIC


      ### Mellanox OFED version
      MLNX_OFED_LINUX-3.1-1.0.3 (OFED-3.1-1.0.3):

      OVS version: openvswitch-2.4.0

      DPDK version: dpdk-2.1.0

      QEMU version: qemu-2.2.1





      ##ethtool output

      [root@localhost ~]# ethtool -i ens1
      driver: mlx4_en
      version: 3.1-1.0.3 (29 Sep 2015)
      firmware-version: 2.35.5100
      bus-info: 0000:06:00.0
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: yes



      ### testpmd output
      [root@localhost dpdk-2.1.0]# ./x86_64-ivshmem-linuxapp-gcc/build/app/test-pmd/testpmd -c 0xff00 -n 4 -w 0000:06:00.0 -- --rxq=2 --txq=2 -i
      EAL: Detected lcore 0 as core 0 on socket 0
      EAL: Detected lcore 1 as core 0 on socket 1
      EAL: Detected lcore 2 as core 8 on socket 0
      EAL: Detected lcore 3 as core 8 on socket 1
      EAL: Detected lcore 4 as core 2 on socket 0
      EAL: Detected lcore 5 as core 2 on socket 1
      EAL: Detected lcore 6 as core 10 on socket 0
      EAL: Detected lcore 7 as core 10 on socket 1
      EAL: Detected lcore 8 as core 1 on socket 0
      EAL: Detected lcore 9 as core 1 on socket 1
      EAL: Detected lcore 10 as core 9 on socket 0
      EAL: Detected lcore 11 as core 9 on socket 1
      EAL: Detected lcore 12 as core 0 on socket 0
      EAL: Detected lcore 13 as core 0 on socket 1
      EAL: Detected lcore 14 as core 8 on socket 0
      EAL: Detected lcore 15 as core 8 on socket 1
      EAL: Detected lcore 16 as core 2 on socket 0
      EAL: Detected lcore 17 as core 2 on socket 1
      EAL: Detected lcore 18 as core 10 on socket 0
      EAL: Detected lcore 19 as core 10 on socket 1
      EAL: Detected lcore 20 as core 1 on socket 0
      EAL: Detected lcore 21 as core 1 on socket 1
      EAL: Detected lcore 22 as core 9 on socket 0
      EAL: Detected lcore 23 as core 9 on socket 1
      EAL: Support maximum 128 logical core(s) by configuration.
      EAL: Detected 24 lcore(s)
      EAL: VFIO modules not all loaded, skip VFIO support...
      EAL: Searching for IVSHMEM devices...
      EAL: No IVSHMEM configuration found!
      EAL: Setting up physically contiguous memory...
      EAL: Ask a virtual area of 0x200000000 bytes
      EAL: Virtual area found at 0x7fa740000000 (size = 0x200000000)
      EAL: Ask a virtual area of 0x200000000 bytes
      EAL: Virtual area found at 0x7fa500000000 (size = 0x200000000)
      EAL: Requesting 8 pages of size 1024MB from socket 0
      EAL: Requesting 8 pages of size 1024MB from socket 1
      EAL: TSC frequency is ~2666753 KHz
      EAL: Master lcore 8 is ready (tid=c9c2e8c0;cpuset=[8])
      EAL: lcore 14 is ready (tid=c58cd700;cpuset=[14])
      EAL: lcore 12 is ready (tid=c68cf700;cpuset=[12])
      EAL: lcore 13 is ready (tid=c60ce700;cpuset=[13])
      EAL: lcore 10 is ready (tid=c78d1700;cpuset=[10])
      EAL: lcore 15 is ready (tid=c50cc700;cpuset=[15])
      EAL: lcore 11 is ready (tid=c70d0700;cpuset=[11])
      EAL: lcore 9 is ready (tid=c80d2700;cpuset=[9])
      EAL: PCI device 0000:06:00.0 on NUMA socket 0
      EAL:   probe driver: 15b3:1007 librte_pmd_mlx4
      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_0" (VF: false)
      PMD: librte_pmd_mlx4: 2 port(s) detected
      PMD: librte_pmd_mlx4: port 1 MAC address is e4:1d:2d:bb:6d:c0
      PMD: librte_pmd_mlx4: port 2 MAC address is e4:1d:2d:bb:6d:c1
      Interactive-mode selected
      Configuring Port 0 (socket 0)
      PMD: librte_pmd_mlx4: 0x20ad4740: TX queues number update: 0 -> 2
      PMD: librte_pmd_mlx4: 0x20ad4740: RX queues number update: 0 -> 2
      Port 0: E4:1D:2D:BB:6D:C0
      Configuring Port 1 (socket 0)
      PMD: librte_pmd_mlx4: 0x20ad5788: TX queues number update: 0 -> 2
      PMD: librte_pmd_mlx4: 0x20ad5788: RX queues number update: 0 -> 2
      Port 1: E4:1D:2D:BB:6D:C1
      Checking link statuses...
      Port 0 Link Up - speed 40000 Mbps - full-duplex
      Port 1 Link Up - speed 40000 Mbps - full-duplex
      Done
      testpmd>


      [root@localhost ~]# python /home/cloud/dpdk-2.1.0/tools/dpdk_nic_bind.py --status

      Network devices using DPDK-compatible driver
      ============================================
      <none>

      Network devices using kernel driver
      ===================================
      0000:03:00.0 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp3s0f0 drv=bnx2 unused=ib_ipoib *Active*
      0000:03:00.1 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp3s0f1 drv=bnx2 unused=ib_ipoib
      0000:04:00.0 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp4s0f0 drv=bnx2 unused=ib_ipoib
      0000:04:00.1 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp4s0f1 drv=bnx2 unused=ib_ipoib
      0000:06:00.0 'MT27520 Family [ConnectX-3 Pro]' if=ens1d1,ens1 drv=mlx4_core unused=ib_ipoib *Active*

      Other network devices
      =====================
      <none>

       

      ========

      TESTING:

      ========

      We tried to bind each of the available drivers to the device; however, none of them caused the device to use a DPDK-compatible driver.

       

      python /home/cloud/dpdk-2.1.0/tools/dpdk_nic_bind.py --bind=ib_ipoib 0000:06:00.0

      Routing table indicates that interface 0000:06:00.0 is active. Not modifying

      [root@localhost cloud]# ifconfig ens1 down

      [root@localhost cloud]# ifconfig ens1d1 down

      [root@localhost cloud]# python /home/cloud/dpdk-2.1.0/tools/dpdk_nic_bind.py --bind=ib_ipoib 0000:06:00.0

      Error: bind failed for 0000:06:00.0 - Cannot open /sys/bus/pci/drivers/ib_ipoib/new_id

       

      //////////////////////////////////////////////////////////////////////////////////

      [root@localhost cloud]# python /home/cloud/dpdk-2.1.0/tools/dpdk_nic_bind.py --bind=mlx4_core 0000:06:00.0

      [root@localhost cloud]# python /home/cloud/dpdk-2.1.0/tools/dpdk_nic_bind.py --status

       

      Network devices using DPDK-compatible driver

      ============================================

      <none>

       

      Network devices using kernel driver

      ===================================

      0000:03:00.0 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp3s0f0 drv=bnx2 unused=ib_ipoib *Active*

      0000:03:00.1 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp3s0f1 drv=bnx2 unused=ib_ipoib

      0000:04:00.0 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp4s0f0 drv=bnx2 unused=ib_ipoib

      0000:04:00.1 'NetXtreme II BCM5709 Gigabit Ethernet' if=enp4s0f1 drv=bnx2 unused=ib_ipoib

      0000:06:00.0 'MT27520 Family [ConnectX-3 Pro]' if=ens1d1,ens1 drv=mlx4_core unused=ib_ipoib *Active*

       

      Other network devices

      =====================

       

        • Re: DPDK-OVS using connectx3-Pro binding issues
          ballle_98

          I think you need to use the IGB UIO module. Going through the steps in tools/setup.sh helped me. mlx4_core (Ethernet) and ib_ipoib (IPoIB) are kernel-mode drivers. A rough sketch of the manual steps is after my output below.

           

          Network devices using DPDK-compatible driver 
          ============================================
          0000:43:00.0 'MT27500 Family [ConnectX-3]' drv=igb_uio unused=
          0000:44:00.0 'Ethernet 10G 2P X520 Adapter' drv=igb_uio unused=
          0000:44:00.1 'Ethernet 10G 2P X520 Adapter' drv=igb_uio unused=
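
          Roughly the steps that tools/setup.sh automates, if you want to do them by hand (just a sketch - the paths assume a stock DPDK 2.x tree built with the default x86_64-native-linuxapp-gcc target, and the PCI address is only an example):

          # modprobe uio
          # insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
          # ./tools/dpdk_nic_bind.py --bind=igb_uio 0000:44:00.0
          # ./tools/dpdk_nic_bind.py --status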
          • Re: DPDK-OVS using connectx3-Pro binding issues
            giladber

            Hello,

             

            I do not use the dpdk_nic_bind script because we do not use UIO (IGB UIO is Intel's UIO driver for IGB). For MLNX NICs, DPDK is just another user-space application written over the Raw Ethernet verbs interface: the control path goes through the mlx4/mlx5 kernel modules and the data path goes directly to the HW from user space (somewhat similar to the RDMA mechanism). This has some nice advantages, like security, and the NIC can still be managed by the kernel as in a non-DPDK environment (i.e. all the standard tools like ethtool, etc.).
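
            So there is nothing to bind. As a quick sanity check (a sketch, reusing the PCI address and testpmd path from your output above), you can just confirm the verbs device is visible and hand the PCI address straight to the DPDK application:

            # ibv_devinfo | grep -E 'hca_id|state|link_layer'
            # ./x86_64-ivshmem-linuxapp-gcc/build/app/test-pmd/testpmd -c 0xff00 -n 4 -w 0000:06:00.0 -- -i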

             

            Lee, if you managed to make this work with the script using a MLNX NIC, it would be great if you could share the details.

             

            I'm currently writing a post for OVS-DPDK and MLNX NICs, so until then here is a very rough guide that might help (sorry for the format, it will be much nicer in the post). This was done on ConnectX-4 but should work the same for ConnectX-3.

            Some typos, bad phrasing and minor errors can be expected; I will correct them in the post.

             

            Find the NIC NUMA node:

            # mst start

            # mst status -v

            MST modules:

            ------------

                MST PCI module loaded

                MST PCI configuration module loaded

            PCI devices:

            ------------

            DEVICE_TYPE          MST                            PCI      RDMA     NET             NUMA

            ConnectX4(rev:0)     /dev/mst/mt4115_pciconf0.1     11:00.1  mlx5_1   net-enp17s0f1   0

            ConnectX4(rev:0)     /dev/mst/mt4115_pciconf0       11:00.0  mlx5_0   net-enp17s0f0   0

            # mst stop
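
            If you do not want to start mst just for this, the NUMA node is also available from sysfs (the PCI address here is the ConnectX-3 Pro from the original post):

            # cat /sys/bus/pci/devices/0000:06:00.0/numa_node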

             

            Configure Hugepages

            OVS needs a system with 1GB hugepage support; 1GB pages can only be allocated during boot. Note that on a NUMA machine the pages will be divided between the NUMA nodes.

            For best performance you might want two separate hugepage mount points, one for QEMU (1G pages) and one for DPDK (2M pages). See here - 29. Vhost Sample Application — Data Plane Development Kit 2.2.0 documentation

            2M pages can be allocated after the machine has booted. Here I used only 1G pages (and no performance tuning was done); a rough sketch of the two mount points is below.
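
            Something like this, if you do want the split (a sketch only - the mount point names and the 2M page count are arbitrary, not taken from any guide):

            # mkdir -p /mnt/huge_1G /mnt/huge_2M
            # mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge_1G      ## 1G pages for QEMU
            # echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
            # mount -t hugetlbfs -o pagesize=2M nodev /mnt/huge_2M      ## 2M pages for DPDK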

             

            Add boot parameters to enable 8 x 1GB HugePages (using grubby here; it can be done in many ways).

            Need to add "default_hugepagesz=1GB hugepagesz=1GB hugepages=<number of pages>" to the kernel boot parameters.

            # yum install grub2-tools

            # grubby -c /boot/grub2/grub.cfg --default-kernel

            /boot/vmlinuz-3.10.0-229.el7.x86_64

            # grubby -c /boot/grub2/grub.cfg --args="default_hugepagesz=1GB hugepagesz=1GB hugepages=8" --update-kernel /boot/vmlinuz-3.10.0-229.el7.x86_64

             

            Verify

            # grubby -c /boot/grub2/grub.cfg --info /boot/vmlinuz-3.10.0-229.el7.x86_64

            index=0

            kernel=/boot/vmlinuz-3.10.0-229.el7.x86_64

            args="ro crashkernel=auto rhgb quiet LANG=en_US.UTF-8 default_hugepagesz=1GB hugepagesz=1GB hugepages=8"

            root=UUID=c4d1bf80-880c-459e-a996-57cb41de2544

            initrd=/boot/initramfs-3.10.0-229.el7.x86_64.img

            title=Red Hat Enterprise Linux Server 7.1 (Maipo), with Linux 3.10.0-229.el7.x86_64

            Reboot the machine

             

            Configure 4 pages on the right NUMA node (note that this should be done by default, I just like to make sure):

            # echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

            By default the hugepages should be mounted on /dev/hugepages
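
            To verify how the pages ended up split between the nodes and where hugetlbfs is mounted (standard checks, nothing Mellanox specific):

            # cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/nr_hugepages
            # grep Huge /proc/meminfo
            # mount | grep hugetlbfs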

             

            Reboot the machine

             

            Build DPDK

            Download DPDK

            Edit config/common_linuxapp (e.g. with vim):

            CONFIG_RTE_BUILD_COMBINE_LIBS=y

            CONFIG_RTE_LIBRTE_MLX5_PMD=y

            Make sure CONFIG_RTE_LIBRTE_VHOST_USER=y

            # make install T=x86_64-native-linuxapp-gcc
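
            Note that this guide was done on ConnectX-4 (mlx5); for the ConnectX-3 Pro from the original post the corresponding option should be the mlx4 PMD. A sketch of the same edit done non-interactively from the DPDK source tree:

            # sed -i 's/CONFIG_RTE_LIBRTE_MLX4_PMD=n/CONFIG_RTE_LIBRTE_MLX4_PMD=y/' config/common_linuxapp
            # grep -E 'COMBINE_LIBS|MLX4_PMD|MLX5_PMD|VHOST_USER' config/common_linuxapp      ## verify the values listed above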

             

            Install OVS

            # wget https://github.com/openvswitch/ovs/tarball/master

            # tar -zxvf master

            # cd openvswitch-ovs-39cc5c4/

            # ./boot.sh

            # export LIBS="-libverbs"

            # ./configure --with-dpdk=/var/soft/dpdk/dpdk-2.2.0/x86_64-native-linuxapp-gcc --disable-ssl

            # make CFLAGS='-O3 -march=native'

            # make install

             

            Create the OVS database

            # mkdir -p /usr/local/etc/openvswitch

            # mkdir -p /usr/local/var/run/openvswitch

            # rm /usr/local/etc/openvswitch/conf.db     ## If not first time run

            # ovsdb-tool create /usr/local/etc/openvswitch/conf.db /usr/local/share/openvswitch/vswitch.ovsschema

             

            Start ovsdb-server

            # export DB_SOCK=/usr/local/var/run/openvswitch/db.sock

            # ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach

             

            Start OVS

            # ovs-vsctl --no-wait init

            # ovs-vswitchd --dpdk -c 0xf -n 4 --socket-mem 1024 -- unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

             

            Create OVS bridge, add DPDK port and vhost-user port

            # ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev

            # ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk

            # ovs-vsctl add-port ovsbr0 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser

            The vhost-user device is created here:

            /usr/local/var/run/openvswitch/vhost-user1
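
            A quick sanity check at this point (standard OVS commands):

            # ovs-vsctl show
            # ls -l /usr/local/var/run/openvswitch/vhost-user1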

             

            Run the VM with the vhost-user back-end device:

            qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 \
              -chardev socket,id=char0,path=/usr/local/var/run/openvswitch/vhost-user1 \
              -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
              -device virtio-net-pci,netdev=mynet1,mac=12:34:00:00:50:2c \
              -object memory-backend-file,id=mem,size=1024M,mem-path=/dev/hugepages,share=on \
              -numa node,memdev=mem -mem-prealloc \
              /data1/vms/rhel6.7-master.qcow2

             

            Hope this helps and that I did not miss anything.

            Anyway, a full community post will be available soon.

              • Re: DPDK-OVS using connectx3-Pro binding issues
                motmot

                Gilad,

                Thanks for your quick response!

                We would like to try your suggested tutorial; however, in order to do so, can you please address our 2nd question? We are not sure which versions we should use.

                Moreover, we are currently working with Fedora OS based on recommendations given by engineers who tested DPDK using Intel's NICs.

                Any recommendations on that as well?

                 

                Looking forward to your reply

                  • Re: DPDK-OVS using connectx3-Pro binding issues
                    giladber

                    Any OS version that OFED works on should be fine from our perspective. I can't really comment on OVS though; I have tested it on RHEL 6.5 (if I remember correctly) and 7.1.

                    See here - http://www.mellanox.com/page/osv_support_ib#InfiniBand

                    Hope this helps and let me know if anything else is needed.

                      • Re: DPDK-OVS using connectx3-Pro binding issues
                        motmot

                        1. Any comments regarding the DPDK version?

                        2. As for ivshmem - we are currently using native. However, you didn't include it in your tutorial. Should I configure it manually or is it already enabled by default (as part of the DPDK package on the MLNX site)?

                        3. Note that in one of your tutorials you mentioned that CONFIG_RTE_BUILD_COMBINE_LIBS should be configured to N, while in your tutorial above you state that it should be configured to Y.

                        4. A more general question: we are trying to transmit data between two guests located on two different hosts (connected by a Mellanox 40GbE switch) using DPDK-OVS. We want to make sure that we are not missing anything: DPDK is installed in both the host and guest OS, while the OVS installation is needed only on the hosts - correct?

                        5. Have you tried configuring DPDK using virt-manager as well?

                          • Re: DPDK-OVS using connectx3-Pro binding issues
                            giladber

                            1. Any comments regarding the DPDK version?

                            [A] I recommend using the latest - DPDK 2.2. I've used MLNX_DPDK 2.1 and DPDK 2.2 from upstream.

                             

                            2. As for ivshmem - we are currently using native. However, you didn't include it in your tutorial. Should I configure it manually or is it already enabled by default (as part of the DPDK package on the MLNX site)?

                            [A] To be honest, I have never used ivshmem. Mellanox does not change the default and therefore I don't think it is enabled. I do need to try it and then I will have a better answer (and tutorial).
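
                            A quick way to check what a given build actually enabled (a sketch - the target name below is the one from your testpmd run; the generated .config lives in the build directory):

                            # grep IVSHMEM x86_64-ivshmem-linuxapp-gcc/.config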

                             

                            3. Note that in one of your tutorials you mentioned that CONFIG_RTE_BUILD_COMBINE_LIBS should be configured to N, while in your tutorial above you state that it should be configured to Y.

                            [A] It is application specific. OVS needs DPDK to be built as a combined lib; other applications do not. By default the combined lib is disabled. For the example applications like testpmd you do not need it.

                             

                            4. A more general question: we are trying to transmit data between two guests located on two different hosts (connected by a Mellanox 40GbE switch) using DPDK-OVS. We want to make sure that we are not missing anything: DPDK is installed in both the host and guest OS, while the OVS installation is needed only on the hosts - correct?

                            [A] Yes, OVS is installed only on the host. DPDK in the guests is not a must (but DPDK does support virtio). The short tutorial above does not cover DPDK in the guests.

                             

                             

                            5. Have you tried configuring DPDK using virt-manager as well?

                            [A] No, only QEMU. I guess you can do it by manually editing the XMLs, but I have never done it myself. Might be a good suggestion for the post; I will try to add it.

                              • Re: DPDK-OVS using connectx3-Pro binding issues
                                motmot

                                Thanks.

                                 

                                [using dpdk 2.1_1.1 and ovs 2.4.0]

                                I have noticed that during the OVS configuration stage I get a link error with DPDK.

                                Exploring config.log, I noticed that it tries to link against a hard-coded -lintel_dpdk.

                                Which library is the equivalent of Intel's lib? Is it librte_eal?

                                  • Re: DPDK-OVS using connectx3-Pro binding issues
                                    giladber

                                    Older versions of DPDK indeed used intel_dpdk as the library name; newer versions use dpdk (it also seems the configuration option for this is gone, but I may be missing something). You have two options: use the latest OVS (as in my reply above), which is preferred, or change the OVS configure file to look for dpdk instead of intel_dpdk.

                                     

                                    You can find the library here -

                                    x86_64-native-linuxapp-gcc/lib/libdpdk.a

                                     

                                    I really hope I'm right here... this is based on my memory.
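
                                    A rough sketch of the second option (untested; the file carrying the hard-coded name may differ between OVS versions, so check config.log / acinclude.m4 first):

                                    # cd x86_64-native-linuxapp-gcc/lib
                                    # ln -s libdpdk.a libintel_dpdk.a      ## quick workaround: lets -lintel_dpdk resolve to the combined lib

                                    or, patching OVS instead:

                                    # sed -i 's/intel_dpdk/dpdk/g' acinclude.m4
                                    # ./boot.sh && ./configure --with-dpdk=<path to the DPDK build dir>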

                          • Re: DPDK-OVS using connectx3-Pro binding issues
                            unxited

                            Hi, I'm using DPDK 2.2 and OFED 3.1.1.x on Linux 3.10.0-327.4.4.el7.x86_64 (CentOS 7.2) with a ConnectX-3 Pro. I was going through your guide, at the stage make install T=x86_64-native-linuxapp-gcc.

                            Output (very end of it) was:

                            INSTALL-APP testpmd

                              INSTALL-MAP testpmd.map

                              LD test

                              INSTALL-APP test

                              INSTALL-MAP test.map

                            Build complete [x86_64-native-linuxapp-gcc]

                            Installation cannot run with T defined and DESTDIR undefined

                            Can you please suggest a workaround to make the installation possible?

                            Thank you a lot.