8 Lessons Learned - Arista Datacenter Interconnect (DCI) with VXLAN and vARP

Posted by Kevin Giusti on September 28, 2017




Though our name would imply we only work with wide area networks, WAN Dynamics has been working on some very interesting (and fun!) datacenter deployments as well, using primarily Arista Networks gear. First thing to note: this kit and company are fantastic. We genuinely enjoy working with Arista hardware, software and most of all, their people.

Not only are these boxes rock solid, but there is an air of unity and clearly defined vision that permeates the culture of Arista. In a networking world of new features pushed out the door too soon, poorly QA’d code and awful support experiences, it’s a refreshing and welcome change.

During these datacenter deployments we’ve learned a few things, and in what we feel is the true spirit of Arista, we want to share them with the community. Detailed here are 8 things we found interesting and think folks should be aware of when deploying Arista-based solutions with VXLAN and vARP. Enjoy!

Lesson 1: You will likely need the same Virtual ARP (vARP) MAC Address in multiple datacenters.

For subnets that lived in more than one datacenter, we needed the same default gateway in each so that VMs could be vMotioned over to a remote data center and still function properly. Originally, we used different vARP MAC addresses in the primary data center and in the secondary data center. The thought was that each device, after it moved and ARP’d, would grab the local MAC address of its default gateway, that being regionally the closest and fastest to respond. After running through testing we found that occasionally a host or server in the primary data center would end up using the vARP MAC address of the core switches in the secondary data center, and vice versa. In order to fix this, we changed to using the same vARP MAC address in all data centers.
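As a rough sketch of what this looks like on an Arista core switch (the VLAN number and IP addressing below are illustrative only, not from our deployment), the virtual-router MAC address is configured globally and must match on the core switches in every data center, while the shared gateway IP is defined per SVI:

ip virtual-router mac-address 00:1c:73:00:00:99
!
interface Vlan10
   ip address 10.10.10.2/24
   ip virtual-router address 10.10.10.1

The physical SVI address (10.10.10.2 in this sketch) stays unique on each switch; only the virtual-router address and the virtual MAC are shared across switches and data centers.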

Lesson 2: If you have the same MAC address in multiple datacenters, you will get “MAC Flap” log messages in your switches.

Since we were using the same vARP MAC address on four data center core switches, we would get “MAC Flap” errors on the DCI switches being used for VXLAN bridging. Because those switches do not use this MAC address to make a routing decision, we created static MAC address entries to suppress the log message for each of the VLANs that had an SVI in both data centers.

The error:

Oct 20 15:19:45 7280-dci1 PortSec: %ETH-4-HOST_FLAPPING: Host 00:1c:73:00:00:99 in VLAN 1 is flapping between interface Port-Channel30 and interface Vxlan1
Oct 20 15:30:12 7280-dci1 PortSec: %ETH-4-HOST_FLAPPING: Host 00:1c:73:00:00:99 in VLAN 1 is flapping between interface Vxlan1 and interface Port-Channel30

The configuration that stopped the error:

mac address-table static 001c.7300.0099 vlan 1 interface Port-Channel30



Lesson 3: OSPF peering over point-to-point links with MTU 9100


Even if the server admins promise they will not need jumbo frames between data centers, make sure to change the MTU of the OSPF interfaces on the point-to-point circuits between data centers to something like 9100, so that jumbo frames plus the roughly 50 bytes of VXLAN encapsulation overhead will fit. Even if this is not a requirement right now, it most likely will be at some point. It’s easiest to accommodate it in the beginning rather than having to come back later and change interface MTUs.
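A minimal sketch of one end of such a link (the interface name, description and addressing here are illustrative, not from our deployment; the OSPF process and area configuration are omitted):

interface Ethernet49/1
   description DCI point-to-point to secondary DC
   no switchport
   mtu 9100
   ip address 192.0.2.1/31
   ip ospf network point-to-point

Both ends of the circuit need the same MTU: OSPF carries the interface MTU in its database description packets, and a mismatch will keep the adjacency from fully forming.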

Lesson 4: MAC address timeout issue

We ran into an issue that was somewhat difficult to troubleshoot, where traffic traversing from one VLAN to another across VXLAN would intermittently be dropped between data centers. The Arista switches were constantly re-learning MAC addresses, that learning activity was hitting the Control Plane Policing (CoPP) policy on the switches, and the policed traffic was being dropped. After working with Arista support, we decided to bump the global MAC aging time up to 4 hours (14400 seconds). This did the trick and solved our issue.

Our Configuration:

mac address-table aging-time 14400

Lesson 5: Bidirectional Forwarding Detection (BFD) between DCI and core datacenter switches

BFD is an awesome protocol that facilitates fast failover of dynamic routing sessions. Running it between the DCI and core switches provided us with extremely fast path redirection between our switches and quick recovery time for VXLAN across the DCI circuits.

Lesson 6: Spanning-Tree priority on Multi-Chassis Link Aggregated (MLAG’d) switches

Even though only one switch in an MLAG pair is active in spanning tree, we set the spanning-tree priority to the same value (4096) on both switches. That way, when the core switch that is the MLAG primary is rebooted for a software update, the secondary switch becomes active, the root bridge stays where we want it, and a top-of-rack switch does not take over as root.
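As a sketch, assuming MSTP (the default spanning-tree mode on Arista switches), the same priority would be configured on both MLAG peers:

spanning-tree mode mstp
spanning-tree mst 0 priority 4096

If you run rapid-PVST instead, the equivalent per-VLAN priority commands apply; the key point is simply that both MLAG peers carry an identical priority.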

Lesson 7: Internet traffic considerations

The deployment included two active/standby high-availability (HA) firewall pairs, one in each data center. We had both firewall HA pairs advertise a default route into the Arista datacenter core switches. In order to keep traffic flow deterministic, we gave the default route from the primary data center a better metric than the one from the secondary data center. This way, on a normal production day, Internet traffic would only traverse the firewall HA pair in the primary data center. This method kept us from having to worry about NAT translation issues on the firewalls, such as a packet coming in through the firewall in the primary data center, going to a server in the secondary data center, and then leaving via the secondary data center’s firewall HA pair.

Lesson 8: Buffers matter

Time and again we have seen clients underestimate the need to buy a switch with deep buffers for storage and other high-volume networks. This cannot be overstated: heed the warning of your sales engineer (SE) if they tell you that you need a deeper-buffered switch. Yes, it will be more expensive, but if you need big buffers and you don’t get them, there is a far greater cost to deploying a network that can’t sustain its load. Going back to the well to ask for more money to buy the switches you should have bought in the first place reflects very poorly on an engineer in the eyes of IT executives who may not understand things down to the level of buffers. They rely on their trusted engineers as advisors to make the right choices the first time around.

In Closing…
The team at WAN Dynamics cannot speak highly enough of Arista-based solutions. Should you want to learn more and discuss some of the other things we’ve learned, please reach out to us. We’re happy to work up an engagement, do some ad hoc consulting or just shoot the breeze for a bit. We love this stuff and want you to as well, so we’re happy to share the knowledge!