Tuesday, February 25, 2020

VXLAN Flood and Learn with Multicast

In the introduction to VXLAN lesson, I explained what VXLAN is and how it works. In this lesson, I’ll show you how to configure VXLAN where we use the multicast “flood and learn” system to learn the mapping between a VTEP IP address and a MAC address.

Configuration

Here’s the topology we’ll use:
[Figure: VXLAN lab topology with IP and MAC addresses]

All devices are CSR1000V routers running Cisco IOS XE Software, version 16.06.01. I’m using CSR1000V routers since anyone can use these. I use custom MAC addresses because those are easy to recognize when we do a packet capture.

VTEP1 and VTEP2 are our VTEP devices. The core router is there to simulate our “IP network”.  We are going to create a VXLAN tunnel with VNI 5012 so that H1 and H2 can communicate directly over layer 2.
I pre-configured OSPF so that we have connectivity between the VTEP devices and the core router. These are the startup configurations of each device:
hostname CORE
!
interface Loopback0
 ip address 3.3.3.3 255.255.255.255
!
interface GigabitEthernet2
 mac-address 0000.5e00.5303
 ip address 192.168.13.3 255.255.255.0
!
interface GigabitEthernet3
 mac-address 0000.5e00.5333
 ip address 192.168.23.3 255.255.255.0
!
router ospf 1
 network 3.3.3.3 0.0.0.0 area 0
 network 192.168.13.0 0.0.0.255 area 0
 network 192.168.23.0 0.0.0.255 area 0
!
end
hostname H1
!
interface GigabitEthernet2
 mac-address 0000.5e00.5365
 ip address 192.168.12.101 255.255.255.0
!
end
hostname H2
!
interface GigabitEthernet2
 mac-address 0000.5e00.5366
 ip address 192.168.12.102 255.255.255.0
!
end
hostname VTEP1
!
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
!
interface GigabitEthernet2
 mac-address 0000.5e00.5301
!
interface GigabitEthernet3
 mac-address 0000.5e00.5311
 ip address 192.168.13.1 255.255.255.0
!
router ospf 1
 network 1.1.1.1 0.0.0.0 area 0
 network 192.168.13.0 0.0.0.255 area 0
!
end
hostname VTEP2
!
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
!
interface GigabitEthernet2
 mac-address 0000.5e00.5302
!
interface GigabitEthernet3
 mac-address 0000.5e00.5322
 ip address 192.168.23.2 255.255.255.0
!
router ospf 1
 network 2.2.2.2 0.0.0.0 area 0
 network 192.168.23.0 0.0.0.255 area 0
!
end

Multicast

Let’s start with the configuration of multicast. With VXLAN, we don’t have a typical scenario where we have a few sources and many receivers. All VTEP devices communicate with each other so it makes sense to use bidirectional PIM. The core router will be the RP in this network.
Let’s enable multicast routing and bidirectional PIM on all VTEP devices and the core router:
VTEP1, VTEP2 & CORE
(config)#ip multicast-routing distributed
(config)#ip pim bidir-enable
We need to enable PIM sparse mode on all physical interfaces that connect to the IP network:
VTEP1 & VTEP2 & CORE
(config)#interface GigabitEthernet 3
(config-if)#ip pim sparse-mode
CORE(config)#interface GigabitEthernet 2
CORE(config-if)#ip pim sparse-mode
And don’t forget the loopback interfaces:
VTEP1, VTEP2 & CORE
(config)#interface Loopback 0
(config-if)#ip pim sparse-mode
Last but not least, configure the RP address:
VTEP1, VTEP2 & CORE
(config)#ip pim rp-address 3.3.3.3 bidir
This completes the multicast configuration.

VXLAN

We need to create a Network Virtualization Endpoint (NVE) interface. This is where we configure the VNI and multicast group that we will use. We source this interface from the loopback 0 interface, use VNI 5012, and use multicast group 239.1.1.1.
Here’s how to configure the NVE interface:
VTEP1 & VTEP2
(config)#interface NVE 1
(config-if)#no shutdown
(config-if)#source-interface Loopback 0
(config-if)#member vni 5012 mcast-group 239.1.1.1
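To make the VNI a bit more tangible, here is a minimal Python sketch of what the 8-byte VXLAN header for VNI 5012 looks like on the wire. It assumes the header layout from RFC 7348 (one flags byte with the "VNI valid" bit, a 24-bit VNI, the rest reserved). The function name vxlan_header is hypothetical and only used for illustration; this is not how the router builds the header internally.

```python
import ipaddress
import struct

VXLAN_UDP_PORT = 4789         # default VXLAN destination UDP port
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag from RFC 7348


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI (RFC 7348 layout)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


header = vxlan_header(5012)
group = ipaddress.ip_address("239.1.1.1")

print(header.hex())         # 0800000000139400
print(group.is_multicast)   # True
```

The VNI 5012 is 0x001394 in hexadecimal, which is what you see in bytes 5 to 7 of the header. The second check only confirms that 239.1.1.1 is a valid multicast address for the mcast-group parameter.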
Now we need to configure the Ethernet Flow Point (EFP) service instance. This is a logical interface that connects a bridge domain to a physical port (or EtherChannel). Under the service instance, we configure whether the incoming traffic is tagged or untagged. In our case, the hosts send untagged traffic. This is how to configure it:
VTEP1 & VTEP2
(config)#interface GigabitEthernet 2
(config-if)#service instance 1 ethernet
(config-if-srv)#encapsulation untagged
(config-if-srv)#exit
(config-if)#exit
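Last but not least, we need to configure the bridge domain.
If you also need a layer 3 interface in the bridge domain, you would add a Bridge Domain Interface (BDI), which is the IOS XE equivalent of the IOS Bridge-Group Virtual Interface (BVI). We don’t need one here.
The bridge domain is where we combine the VNI, physical interface, and service instance: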
VTEP1 & VTEP2
(config)#bridge-domain 1
(config-bdomain)#member vni 5012
(config-bdomain)#member GigabitEthernet 2 service-instance 1
This completes our VXLAN configuration.
I showed the two exit commands after the service instance on purpose because I configure the bridge-domain globally. You can also configure the bridge-domain under the service instance.

Verification

Let’s verify our work.

Multicast

First, I’ll check if our multicast configuration is correct:
VTEP1#show ip mroute 239.1.1.1
IP Multicast Routing Table

(*, 239.1.1.1), 00:00:36/00:02:25, RP 3.3.3.3, flags: BCx
  Bidir-Upstream: GigabitEthernet3, RPF nbr 192.168.13.3
  Outgoing interface list:
    Tunnel0, Forward/Sparse-Dense, 00:00:36/00:02:25
    GigabitEthernet3, Bidir-Upstream/Sparse, 00:00:36/stopped
VTEP2#show ip mroute 239.1.1.1
IP Multicast Routing Table

(*, 239.1.1.1), 00:00:36/00:02:24, RP 3.3.3.3, flags: BCx
  Bidir-Upstream: GigabitEthernet3, RPF nbr 192.168.23.3
  Outgoing interface list:
    Tunnel0, Forward/Sparse-Dense, 00:00:36/00:02:24
    GigabitEthernet3, Bidir-Upstream/Sparse, 00:00:36/stopped
CORE#show ip mroute 239.1.1.1
IP Multicast Routing Table

(*, 239.1.1.1), 00:00:49/00:02:45, RP 3.3.3.3, flags: B
  Bidir-Upstream: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet3, Forward/Sparse, 00:00:44/00:02:45
    GigabitEthernet2, Forward/Sparse, 00:00:49/00:02:40
I’m seeing the (*,G) entry for the multicast group 239.1.1.1 and outgoing interfaces. This is looking good.

VXLAN

Let’s try some VXLAN specific commands. First, we’ll check if the NVE interface is up:
VTEP1#show nve interface nve1
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:1.1.1.1 vrf:0)
VTEP2#show nve interface nve1
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:2.2.2.2 vrf:0)
The command above tells us whether the NVE interface is up or not. We can add the detail parameter to also see the number of packets or bytes we transmitted or received on this interface:
VTEP1#show nve interface nve1 detail
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:1.1.1.1 vrf:0)
   Pkts In   Bytes In   Pkts Out  Bytes Out
         0          0          0          0
VTEP2#show nve interface nve1 detail
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:2.2.2.2 vrf:0)
   Pkts In   Bytes In   Pkts Out  Bytes Out
         0          0          0          0
Things are quiet but that will change soon. Let’s check the VNIs and multicast group addresses we use with the NVE interface:
VTEP1#show nve vni
Interface  VNI        Multicast-group VNI state  Mode  BD    cfg vrf                      
nve1       5012       239.1.1.1       Up         L2DP  1     CLI N/A 
VTEP2#show nve vni
Interface  VNI        Multicast-group VNI state  Mode  BD    cfg vrf                      
nve1       5012       239.1.1.1       Up         L2DP  1     CLI N/A 
Take a look at the show nve peers command:
VTEP1#show nve peers
Interface  VNI      Type Peer-IP          RMAC/Num_RTs   eVNI     state flags UP time
VTEP1 doesn’t know about any other VTEPs right now. This will change when we generate some traffic. The last thing we need to check is the bridge domain:
VTEP1#show bridge-domain 1
Bridge-domain 1 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet2 service instance 1
    vni 5012
   AED MAC address    Policy  Tag       Age  Pseudoport
VTEP2#show bridge-domain 1
Bridge-domain 1 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet2 service instance 1
    vni 5012
   AED MAC address    Policy  Tag       Age  Pseudoport
The output above is empty because our hosts haven’t sent anything yet. Let’s change that by sending some ICMP packets between H1 and H2:
H1#ping 192.168.12.102
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.12.102, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 2/2/4 ms
Excellent, at least there is connectivity. Let’s see if our VTEP devices now know about each other:
VTEP1#show nve peers
Interface  VNI      Type Peer-IP          RMAC/Num_RTs   eVNI     state flags UP time
nve1       5012     L2DP 2.2.2.2         
VTEP2#show nve peers
Interface  VNI      Type Peer-IP          RMAC/Num_RTs   eVNI     state flags UP time
nve1       5012     L2DP 1.1.1.1 
VTEP1 and VTEP2 now know about each other. Let’s also look again at the NVE interface:
VTEP1#show nve interface nve1 detail
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:1.1.1.1 vrf:0)
   Pkts In   Bytes In   Pkts Out  Bytes Out
         5        610          5        610
VTEP2#show nve interface nve1 detail
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:2.2.2.2 vrf:0)
   Pkts In   Bytes In   Pkts Out  Bytes Out
         5        610          5        610
Above, we see our 5 packets. Here’s the bridge domain output again:
VTEP1#show bridge-domain 1
Bridge-domain 1 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet2 service instance 1
    vni 5012
   AED MAC address    Policy  Tag       Age  Pseudoport
   0   0000.5E00.5365 forward dynamic   170  GigabitEthernet2.EFP1
   0   0000.5E00.5366 forward dynamic   169  nve1.VNI5012, VxLAN 
                                             src: 1.1.1.1 dst: 2.2.2.2
VTEP2#show bridge-domain 1
Bridge-domain 1 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet2 service instance 1
    vni 5012
   AED MAC address    Policy  Tag       Age  Pseudoport
   0   0000.5E00.5365 forward dynamic   149  nve1.VNI5012, VxLAN 
                                             src: 2.2.2.2 dst: 1.1.1.1
   0   0000.5E00.5366 forward dynamic   149  GigabitEthernet2.EFP1
The output above is interesting. We can see that VTEP1 and VTEP2 learned about the MAC addresses of our hosts.
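Going back to the NVE counters for a moment: 5 packets and 610 bytes means 122 bytes per packet. The sketch below shows one breakdown that adds up. It assumes that the counter includes the inner Ethernet header and the VXLAN header but not the outer IP/UDP headers, and that the "100-byte" ping size is the size of the inner IP packet. I can’t confirm from the output which headers the counter includes, so treat this as a plausibility check and not as a definition of the counter.

```python
# Values taken from "show nve interface nve1 detail" after the ping.
PKTS = 5
TOTAL_BYTES = 610

# Assumptions (not confirmed by the show output):
INNER_IP_PACKET = 100   # "100-byte ICMP Echos" taken as the inner IP packet size
ETHERNET_HEADER = 14    # inner dst MAC + src MAC + EtherType, no FCS
VXLAN_HEADER = 8        # VXLAN header size per RFC 7348

per_packet = TOTAL_BYTES // PKTS
assumed = INNER_IP_PACKET + ETHERNET_HEADER + VXLAN_HEADER

print(per_packet, assumed)
```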
We verified that our configuration works but there are some interesting things we can try with this topology.

Unknown Unicast Traffic

How about we look at the “flood and learn” system in action? To demonstrate this, I’ll clear the MAC address table of the bridge domain on our VTEP devices:
VTEP1 & VTEP2
#clear bridge-domain 1 mac table
Once again, the VTEP devices don’t know about any MAC addresses:
VTEP1#show bridge-domain 1
Bridge-domain 1 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet2 service instance 1
    vni 5012
   AED MAC address    Policy  Tag       Age  Pseudoport
VTEP2#show bridge-domain 1
Bridge-domain 1 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet2 service instance 1
    vni 5012
   AED MAC address    Policy  Tag       Age  Pseudoport
I’ll transmit two ICMP requests from H1. I do this on purpose: the first one will be flooded and the second one will be transmitted with unicast:
H1#ping 192.168.12.102 repeat 2
Type escape sequence to abort.
Sending 2, 100-byte ICMP Echos to 192.168.12.102, timeout is 2 seconds:
!!
Success rate is 100 percent (2/2), round-trip min/avg/max = 2/3/4 ms
Here’s the first ICMP request:
[Packet capture: first ICMP request from H1, VXLAN packet sent to multicast group 239.1.1.1]
Above, we see that this ICMP request is flooded because the destination is multicast group address 239.1.1.1. The ICMP reply from H2 to H1 is transmitted with unicast:
[Packet capture: ICMP reply from H2, VXLAN packet sent with unicast]
The second ICMP request from H1 is now also transmitted with unicast:
[Packet capture: second ICMP request from H1, VXLAN packet sent with unicast]
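The three packets above follow a simple rule that can be sketched in a few lines of Python. This is a simplified model and not how IOS XE implements it: the Vtep class and the send helper are hypothetical, ARP is ignored, and with only two VTEPs the flooded frame always reaches the other VTEP. Each VTEP learns source MAC addresses (local port or remote VTEP IP) and floods to the multicast group when the destination MAC is unknown.

```python
from dataclasses import dataclass, field

MCAST_GROUP = "239.1.1.1"   # multicast group of VNI 5012 in this lesson


@dataclass
class Vtep:
    ip: str
    # MAC address -> "local" or the IP address of the remote VTEP
    mac_table: dict = field(default_factory=dict)

    def from_host(self, src_mac: str, dst_mac: str) -> str:
        """A frame arrives from a local host; return the outer destination."""
        self.mac_table[src_mac] = "local"
        # Unknown destination: flood to the multicast group.
        # Known destination: the learned remote VTEP IP (or "local").
        return self.mac_table.get(dst_mac, MCAST_GROUP)

    def from_tunnel(self, src_mac: str, remote_vtep_ip: str) -> None:
        """A VXLAN packet arrives; learn the inner source MAC behind the sender."""
        self.mac_table[src_mac] = remote_vtep_ip


def send(src_vtep: Vtep, dst_vtep: Vtep, src_mac: str, dst_mac: str) -> str:
    """Forward one frame between the two VTEPs and return the outer destination."""
    outer_dst = src_vtep.from_host(src_mac, dst_mac)
    dst_vtep.from_tunnel(src_mac, src_vtep.ip)
    return outer_dst


H1, H2 = "0000.5e00.5365", "0000.5e00.5366"
vtep1, vtep2 = Vtep("1.1.1.1"), Vtep("2.2.2.2")

first_request = send(vtep1, vtep2, H1, H2)    # VTEP1 doesn't know H2 yet
reply = send(vtep2, vtep1, H2, H1)            # VTEP2 learned H1 from the flood
second_request = send(vtep1, vtep2, H1, H2)   # VTEP1 learned H2 from the reply

print(first_request, reply, second_request)   # 239.1.1.1 1.1.1.1 2.2.2.2
```

The printed outer destinations match the order in the packet capture: multicast for the first request, then unicast for the reply and the second request.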

Broadcast Traffic

What about broadcast traffic? This works the same as unknown unicast. Let me show you an example:
H1#ping 192.168.12.255 repeat 1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 192.168.12.255, timeout is 2 seconds:

Reply to request 0 from 192.168.12.102, 4 ms
[Packet capture: broadcast ICMP request from H1, VXLAN packet sent to multicast group 239.1.1.1]
Above, you can see that this layer 2 broadcast traffic is also flooded to the multicast group.
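Here is a small Python sketch of the decision the VTEP makes for the inner destination MAC address, assuming the usual rule that broadcast, multicast, and unknown unicast frames are flooded. The function names parse_mac and needs_flooding are hypothetical, and the set of known MAC addresses stands in for the bridge domain MAC table.

```python
def parse_mac(mac: str) -> bytes:
    """Convert Cisco dotted notation (0000.5e00.5365) to 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(".", ""))
    if len(raw) != 6:
        raise ValueError(f"not a MAC address: {mac}")
    return raw


def needs_flooding(dst_mac: str, known_macs: set) -> bool:
    """Return True when the frame has to go to the multicast group."""
    raw = parse_mac(dst_mac)
    # The lowest bit of the first octet (I/G bit) is set for
    # broadcast and multicast destination addresses.
    is_group = bool(raw[0] & 0x01)
    is_unknown = dst_mac.lower() not in known_macs
    return is_group or is_unknown


known = {"0000.5e00.5366"}   # VTEP1 already learned H2 (lowercase entries)

print(needs_flooding("ffff.ffff.ffff", known))   # True, broadcast
print(needs_flooding("0000.5e00.5366", known))   # False, known unicast
print(needs_flooding("0000.5e00.5399", known))   # True, unknown unicast
```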
This wraps up this lesson! Here is the final configuration of each device:
hostname CORE
!
ip multicast-routing distributed
!
interface Loopback0
 ip address 3.3.3.3 255.255.255.255
 ip pim sparse-mode
!
interface GigabitEthernet2
 mac-address 0000.5e00.5303
 ip address 192.168.13.3 255.255.255.0
 ip pim sparse-mode
!
interface GigabitEthernet3
 mac-address 0000.5e00.5333
 ip address 192.168.23.3 255.255.255.0
 ip pim sparse-mode
!
router ospf 1
 network 3.3.3.3 0.0.0.0 area 0
 network 192.168.13.0 0.0.0.255 area 0
 network 192.168.23.0 0.0.0.255 area 0
!
ip pim bidir-enable
ip pim rp-address 3.3.3.3 bidir
!
end
hostname H1
!
interface GigabitEthernet2
 mac-address 0000.5e00.5365
 ip address 192.168.12.101 255.255.255.0
!
end
hostname H2
!
interface GigabitEthernet2
 mac-address 0000.5e00.5366
 ip address 192.168.12.102 255.255.255.0
!
end
hostname VTEP1
!
ip multicast-routing distributed
!
redundancy
bridge-domain 1 
 member vni 5012
 member GigabitEthernet2 service-instance 1
!
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
 ip pim sparse-mode
!
interface GigabitEthernet2
 mac-address 0000.5e00.5301
 service instance 1 ethernet
  encapsulation untagged
!
interface GigabitEthernet3
 mac-address 0000.5e00.5311
 ip address 192.168.13.1 255.255.255.0
 ip pim sparse-mode
!
interface nve1
 source-interface Loopback0
 member vni 5012 mcast-group 239.1.1.1
!
router ospf 1
 network 1.1.1.1 0.0.0.0 area 0
 network 192.168.13.0 0.0.0.255 area 0
!
ip pim bidir-enable
ip pim rp-address 3.3.3.3 bidir
!
end
hostname VTEP2
!
ip multicast-routing distributed
!
redundancy
bridge-domain 1 
 member vni 5012
 member GigabitEthernet2 service-instance 1
!
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
 ip pim sparse-mode
!
interface GigabitEthernet2
 mac-address 0000.5e00.5302
 service instance 1 ethernet
  encapsulation untagged
!
interface GigabitEthernet3
 mac-address 0000.5e00.5322
 ip address 192.168.23.2 255.255.255.0
 ip pim sparse-mode
!
interface nve1
 source-interface Loopback0
 member vni 5012 mcast-group 239.1.1.1
!
router ospf 1
 network 2.2.2.2 0.0.0.0 area 0
 network 192.168.23.0 0.0.0.255 area 0
!
ip pim bidir-enable
ip pim rp-address 3.3.3.3 bidir
!
end


Conclusion

You have now learned how to configure VXLAN with Multicast bidirectional PIM, how to verify your configuration, and you have seen the flood and learn system in action. I hope you enjoyed this lesson. If you have any questions feel free to leave a comment!

