HowTo Configure IP Multicast (PIM, IGMP) on Mellanox Ethernet Switches

Version 10

    This post explains and shows how to configure IP multicast (PIM and IGMP) over Mellanox Ethernet switches (VMS).

     

    References

    Requirements

    • MLNX-OS Software version 3.3.5200 or later

     

    Setup

    The setup that will be used in this example is based on the setup in this post

     

    It includes four Mellanox Ethernet switches and two servers.

    [Figure: example setup topology (24.png)]

     

    PIM BIDIR

    Bidir-PIM (bidirectional PIM) is a variant in the PIM protocol suite and an extension of PIM-SM (sparse mode).

     

    Rendezvous Point (RP)

    It is recommended to configure the RP on the spine switches.

    In this example, the SX03 and SX04 Ethernet switches are configured as RP candidates for the PIM protocol.

    In addition, it is recommended to set the RP on a loopback interface configured on that switch.

     

    IP Routing

    To run IP Multicast, IP routing must be enabled and running on the switches. In addition, OSPF should be configured on all switches as the unicast protocol.

     

    Configuring IP Multicast (Example) on Mellanox switches:

    To configure IP multicast, follow these steps:

    1. Enable IP routing (L3) on the switches and servers, as described in HowTo Configure OSPF on Mellanox Switches (Running-Config).

    2. Enable IP Multicast and PIM globally on each of the switches:

    switch(config) # ip multicast-routing

    switch(config) # protocol pim

    switch(config) # no ip pim bidir shutdown

    3. Select two spine switches to act as RPs for PIM BIDIR. In our example, we will use SX03 and SX04.

        a. Configure a loopback interface with an IP address on each of the two spines, as follows:

    // Configure SX03

    switch(config) # interface loopback 0

    switch(config interface loopback 0) # ip address 100.100.100.100 /32

    switch(config interface loopback 0) # ip ospf area 0.0.0.0

     

     

    // Configure SX04

    switch(config) # interface loopback 0

    switch(config interface loopback 0) # ip address 99.99.99.99 /32

    switch(config interface loopback 0) # ip ospf area 0.0.0.0

        b. Configure the RP candidate and BSR candidate on the loopback interface:

    switch(config) # ip pim rp-candidate loopback 0 group-list 224.0.0.0 /4 bidir

    switch(config) # ip pim bsr-candidate loopback 0

    4. For each VLAN interface used in the setup (on each switch), enable PIM sparse mode:

    // Configure SX01

    interface vlan 1 ip pim sparse-mode

    interface vlan 3 ip pim sparse-mode

    interface vlan 5 ip pim sparse-mode

    // Configure SX02

    interface vlan 2 ip pim sparse-mode

    interface vlan 4 ip pim sparse-mode

    interface vlan 6 ip pim sparse-mode

    // Configure SX03

    interface vlan 1 ip pim sparse-mode

    interface vlan 2 ip pim sparse-mode

    // Configure SX04

    interface vlan 3 ip pim sparse-mode

    interface vlan 4 ip pim sparse-mode
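
    The per-VLAN commands in step 4 follow one pattern, so they can be generated from a small table. The following Python sketch only builds the text of the commands listed above; the names PIM_VLANS, pim_sparse_mode_commands and render_all are hypothetical, and the VLAN lists are copied from this example rather than discovered from the switches.

```python
# Hypothetical table: switch name -> VLAN interfaces, copied from step 4 of this example.
PIM_VLANS = {
    "SX01": [1, 3, 5],
    "SX02": [2, 4, 6],
    "SX03": [1, 2],
    "SX04": [3, 4],
}


def pim_sparse_mode_commands(vlans):
    """Return one 'interface vlan N ip pim sparse-mode' line per VLAN."""
    return [f"interface vlan {vlan} ip pim sparse-mode" for vlan in vlans]


def render_all(mapping):
    """Render a labelled command block for every switch in the mapping."""
    lines = []
    for switch, vlans in mapping.items():
        lines.append(f"// Configure {switch}")
        lines.extend(pim_sparse_mode_commands(vlans))
    return "\n".join(lines)


# Print the blocks so they can be reviewed before pasting into each switch.
print(render_all(PIM_VLANS))
```

    The output is plain text; it still has to be entered on each switch in configuration mode.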

     

    At this point the switch configuration is done.

     

    Server Configuration:

    Each server should have IGMPv2 configured and a multicast route configured:

    # route add -net 224.0.0.0 netmask 240.0.0.0 dev eth2

    # echo "2" > /proc/sys/net/ipv4/conf/eth2/force_igmp_version
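
    To confirm the server settings, the following Python sketch reads back the forced IGMP version and shows that the netmask 240.0.0.0 used in the route corresponds to 224.0.0.0/4. It assumes a Linux server and the interface name eth2 from this example; the function names are hypothetical.

```python
import ipaddress
from pathlib import Path

# The route above uses netmask 240.0.0.0, which is the same range as 224.0.0.0/4.
MULTICAST_NET = ipaddress.ip_network("224.0.0.0/240.0.0.0")


def igmp_version_path(iface, proc_root="/proc/sys/net/ipv4/conf"):
    """Build the path of the force_igmp_version setting for one interface."""
    return Path(proc_root) / iface / "force_igmp_version"


def forced_igmp_version(iface, proc_root="/proc/sys/net/ipv4/conf"):
    """Return the forced IGMP version as an int (0 means no version is forced)."""
    return int(igmp_version_path(iface, proc_root).read_text().strip())


def covered_by_route(group):
    """Return True if the group address falls inside the multicast route."""
    return ipaddress.ip_address(group) in MULTICAST_NET
```

    On a configured server, forced_igmp_version("eth2") is expected to return 2.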

     

    Verification:

    To verify that IP multicast can run over this setup, you can use the iperf application or a similar tool to generate multicast traffic.

    In this example, server S2 is the multicast generator, while S1 is the listener.

    1. Configure Server S1 to listen to IP multicast group 224.10.10.10:

    # iperf -suB 224.10.10.10 -i 1

    ------------------------------------------------------------

    Server listening on UDP port 5001

    Binding to local address 224.10.10.10

    Joining multicast group  224.10.10.10

    Receiving 1470 byte datagrams

    UDP buffer size: 4.00 MByte (default)

    ------------------------------------------------------------

    2. Configure Server S2 to send IP multicast for 224.10.10.10:

    # iperf -uc 224.10.10.10 -t 10000000 -T 10 -b 100000000000000

    ------------------------------------------------------------

    Client connecting to 224.10.10.10, UDP port 5001

    Sending 1470 byte datagrams

    Setting multicast TTL to 10

    UDP buffer size: 4.00 MByte (default)

    ------------------------------------------------------------

    [  3] local 11.11.5.1 port 52718 connected with 224.10.10.10 port 5001
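
    If iperf is not available, a similar test can be sketched with plain UDP sockets. The following Python sketch is an assumption-based alternative, not part of the original procedure: it reuses the group 224.10.10.10, UDP port 5001 and TTL 10 from the iperf run above, and the function names are hypothetical.

```python
import socket
import struct

GROUP = "224.10.10.10"  # multicast group used in this example
PORT = 5001             # UDP port shown in the iperf output above
TTL = 10                # same value as the -T 10 option above


def build_mreq(group, local_ip="0.0.0.0"):
    """Pack the membership request: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def open_receiver(group=GROUP, port=PORT):
    """Open a UDP socket and join the group (this triggers an IGMP report)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock


def open_sender(ttl=TTL):
    """Open a UDP socket whose multicast packets can cross up to ttl hops."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
    return sock
```

    On the listener, call open_receiver() and then recvfrom(2048) on the returned socket; on the generator, call open_sender().sendto(b"test", (GROUP, PORT)). With the default local address 0.0.0.0 the kernel chooses the interface, so the multicast route configured on the server still applies.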

     

    Switch show output commands:

     

     

    1. Rendezvous Point (RP)

    The RP 100.100.100.100 should be shown on all switches in the setup.

    Run the command:

     

     

     

    switch (config) # show ip pim rp

    PIM RP Status Information for VRF "default"

    BSR: 100.100.100.100, expires: 00:01:52,

         priority: 64, hash-length: 30

    RP: 100.100.100.100, expires: 00:02:12

      priority: 192, RP-source: 100.100.100.100, group ranges:

        224.0.0.0/4
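
    When this check is repeated on several switches, the output can be parsed instead of read by eye. The following Python sketch assumes the output layout shown above (it may differ between MLNX-OS versions) and uses hypothetical function names. It extracts each RP with its group ranges and reports which RP covers a given group.

```python
import ipaddress
import re

# Sample copied from the "show ip pim rp" output above.
SAMPLE = """\
PIM RP Status Information for VRF "default"
BSR: 100.100.100.100, expires: 00:01:52,
     priority: 64, hash-length: 30
RP: 100.100.100.100, expires: 00:02:12
  priority: 192, RP-source: 100.100.100.100, group ranges:
    224.0.0.0/4
"""


def parse_rp(output):
    """Return a list of (rp_address, [group_networks]) found in the output."""
    rps = []
    for line in output.splitlines():
        text = line.strip()
        rp_match = re.match(r"RP:\s*(\d+\.\d+\.\d+\.\d+)", text)
        if rp_match:
            rps.append((rp_match.group(1), []))
        elif rps and re.fullmatch(r"\d+\.\d+\.\d+\.\d+/\d+", text):
            # A bare prefix line belongs to the most recent RP entry.
            rps[-1][1].append(ipaddress.ip_network(text))
    return rps


def rp_for_group(output, group):
    """Return the first listed RP whose ranges contain the group, else None.

    Simplification: with several candidate RPs the switch selects one by
    priority and hash, which this sketch does not model.
    """
    addr = ipaddress.ip_address(group)
    for rp, ranges in parse_rp(output):
        if any(addr in net for net in ranges):
            return rp
    return None
```

    For the sample above, rp_for_group(SAMPLE, "224.10.10.10") returns "100.100.100.100".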

     

    2. IGMP groups

    IGMP groups should be shown on the leaf switches (SX01, SX02 in the example):

    switch (config) # show ip igmp group

    IGMP Connected Group Membership

    Type: S - Static, D - Dynamic

    Group Address  Type     Interface  Uptime     Expires          Last Reporter

    224.10.10.10    D       Vlan6      21:46:19   00:03:17         11.11.6.1

    Spine switches (SX03, SX04 in the example) will not show IGMP groups:

    switch (config) # show ip igmp group

    % Group list is empty
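
    The group table can be checked the same way. The following Python sketch assumes the six-column layout shown above; the function names are hypothetical. It returns one record per group row and an empty list for the "% Group list is empty" case.

```python
# Sample copied from the leaf switch output above.
LEAF_SAMPLE = """\
IGMP Connected Group Membership
Type: S - Static, D - Dynamic
Group Address  Type     Interface  Uptime     Expires          Last Reporter
224.10.10.10    D       Vlan6      21:46:19   00:03:17         11.11.6.1
"""

SPINE_SAMPLE = "% Group list is empty\n"


def parse_igmp_groups(output):
    """Return a list of dicts, one per group row in the table."""
    groups = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly six columns and a type of S or D.
        if len(fields) == 6 and fields[1] in ("S", "D"):
            group, kind, iface, uptime, expires, reporter = fields
            groups.append({
                "group": group,
                "type": kind,
                "interface": iface,
                "uptime": uptime,
                "expires": expires,
                "last_reporter": reporter,
            })
    return groups


def has_group(output, group):
    """Return True if the group appears in the parsed table."""
    return any(row["group"] == group for row in parse_igmp_groups(output))
```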

     

    3. BSR

    BSR information for the elected RP should be shown as follows on the leaf switches (SX01, SX02 in the example):

    switch (config) # show ip pim bsr

    PIMv2 Bootstrap information

      BSR address: 100.100.100.100

      Uptime:      22:35:58, BSR Priority: 64, Hash mask length: 30

      Expires:     00:01:21

      This system is not a candidate-BSR

    The spine switches are candidate BSRs (as configured):

    switch (config) # show ip pim bsr
    PIMv2 Bootstrap information
      BSR address: 100.100.100.100
      Uptime:      22:39:52, BSR Priority: 64, Hash mask length: 30
      Expires:     00:00:26
    This system is a candidate BSR
      Candidate BSR address: 100.100.100.100, priority: 64, hash mask length: 30
                 interval: 60, holdtime: 0

     

    4. Neighbors

    Each switch should show its PIM neighbors.

    For example:

    switch (config) # show ip pim neighbor

    Neighbor          Interface      Uptime    Expires   Ver   DR-Prio Mode

    11.11.1.2         Vlan1          22:40:28  00:01:33  v2    1       DR B

    11.11.4.2         Vlan4          22:40:30  00:01:32  v2    1       DR B
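
    A quick way to spot a missing adjacency is to compare the neighbor table with the interfaces where a neighbor is expected. The following Python sketch assumes the column layout shown above; the function names and the expected-interface list are hypothetical.

```python
import ipaddress

# Sample copied from the "show ip pim neighbor" output above.
SAMPLE = """\
Neighbor          Interface      Uptime    Expires   Ver   DR-Prio Mode
11.11.1.2         Vlan1          22:40:28  00:01:33  v2    1       DR B
11.11.4.2         Vlan4          22:40:30  00:01:32  v2    1       DR B
"""


def _is_ip(text):
    """Return True if text parses as an IP address (used to skip the header row)."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def parse_pim_neighbors(output):
    """Return one dict per neighbor row; the Mode column is kept as a list of flags."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and _is_ip(fields[0]):
            rows.append({
                "neighbor": fields[0],
                "interface": fields[1],
                "uptime": fields[2],
                "expires": fields[3],
                "version": fields[4],
                "dr_priority": int(fields[5]),
                "flags": fields[6:],
            })
    return rows


def interfaces_without_neighbor(output, expected):
    """Return the expected interfaces that have no PIM neighbor in the output."""
    seen = {row["interface"] for row in parse_pim_neighbors(output)}
    return [iface for iface in expected if iface not in seen]
```

    For example, interfaces_without_neighbor(SAMPLE, ["Vlan1", "Vlan3", "Vlan4"]) returns ["Vlan3"], since the sample lists no neighbor on Vlan3.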

     

    5. Show multicast routing table

    switch (config) # show ip mroute
    IP Multicast Routing Table
    Flags: B - Bidir Group, L - Local, P - Pruned, R - RP-bit set, T - SPT-bit set
           J - Join SPT
    Timers: Uptime/Expires
    Interface state: Interface, State/Mode

    (*, 224.0.0.0/4), 01D 18:20:40, RP 100.100.100.100, flags: BR
    Bidir-Upstream: Lo0
    Outgoing interface list:

    (*, 224.10.10.10/32), 00D 01:03:42, RP 100.100.100.100, flags: BR
    Bidir-Upstream: Lo0
    Outgoing interface list:
       Vlan1, Forwarding/Sparse, 00D 01:03:42/00D 00:00:00
       Lo0, Forwarding/Sparse, 00D 01:03:42/00D 00:00:00
    switch (config) # 
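
    The routing table above can also be summarized per group. The following Python sketch assumes the entry layout shown above for bidir (*, G) routes only; the names are hypothetical, and other entry types such as (S, G) are ignored.

```python
import re

# Sample copied from the "show ip mroute" output above (header lines omitted).
SAMPLE = """\
(*, 224.0.0.0/4), 01D 18:20:40, RP 100.100.100.100, flags: BR
Bidir-Upstream: Lo0
Outgoing interface list:

(*, 224.10.10.10/32), 00D 01:03:42, RP 100.100.100.100, flags: BR
Bidir-Upstream: Lo0
Outgoing interface list:
   Vlan1, Forwarding/Sparse, 00D 01:03:42/00D 00:00:00
   Lo0, Forwarding/Sparse, 00D 01:03:42/00D 00:00:00
"""

ENTRY = re.compile(
    r"\(\*,\s*(?P<group>[\d.]+/\d+)\),.*RP\s+(?P<rp>[\d.]+),\s*flags:\s*(?P<flags>\w+)"
)
OIF = re.compile(r"(?P<iface>\S+),\s*Forwarding/")


def parse_mroute(output):
    """Return {group_prefix: {"rp": ..., "flags": ..., "oifs": [...]}}."""
    routes = {}
    current = None
    for line in output.splitlines():
        text = line.strip()
        entry = ENTRY.match(text)
        if entry:
            # A new (*, G) line starts a new route entry.
            current = {"rp": entry["rp"], "flags": entry["flags"], "oifs": []}
            routes[entry["group"]] = current
            continue
        oif = OIF.match(text)
        if oif and current is not None:
            current["oifs"].append(oif["iface"])
    return routes
```

    For the sample above, the entry for 224.10.10.10/32 lists Vlan1 and Lo0 as outgoing interfaces, while the 224.0.0.0/4 entry has none.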

     

    More switch output options related to PIM are available under the prefix "show ip pim ..." in the MLNX-OS CLI.