It's been a very busy few weeks. At the Data Storage Innovations (DSI) conference, the Ethernet Summit, and EMC World - and next week at Cisco Live - I've been starting to talk about a new concept in data center storage networks called Dynamic FCoE. Understandably, there have been a lot of questions about it, and I wanted to try to get this blog out as quickly as possible.
The TL;DR version: Dynamic FCoE combines the best of Ethernet fabrics and traditional deterministic storage environments to create massively scalable and highly resilient FC-based fabrics. If you thought you knew what you could do with storage networks, this takes everything to a whole new level.
As conversations about Ethernet networks move into different technology types and uses, the exciting part is the capability for network topologies to morph into a variety of configurations. I've mentioned ages ago that each data center is like a fingerprint - no two data centers are exactly alike.
When I was at the 2014 Ethernet Summit, there were keynotes from Google, Comcast, HP, and Cisco, and each of the speakers talked about how much of the network traffic in their data centers had shifted from the traditional "Access, Aggregation, Core" layered system - a traditional Cisco North-South approach - into a much heavier East-West traffic pattern:
Traditional "layered" approach to networks
In other words, data centers are beginning to see much more server-to-server connectivity, which in turn brings new demands on networks. For networks that are designed to handle higher loads "in and out" of the data center, you can obviously wind up with architectural bottlenecks when traffic patterns change.
For that reason, new architectures are emerging that capitalize on the predictable performance of tried-and-true "Leaf/Spine" architectures (also called Clos). These architectures are often designed with 40 Gbps or even 100 Gbps connections between the leafs and spines, which gives them massive bandwidth capabilities as they scale.
New, high-bandwidth Leaf/Spine topologies
While this does help solve some of the Ethernet issues inside a modern data center, most storage environments are still "North-South."
Traditional storage topologies are North-South
So, many questions arise when we start to figure this out.
Dynamic FCoE is the ability to overlay FCoE traffic across these types of architectures dynamically, but it does require a slightly different way of thinking about how storage is done over the network.
For example, in a typical storage SAN environment, we keep the resiliency and redundancy by keeping "SAN A/B" separation, like the image below:
Redundancy ho! Paranoia is the name of the game
Each of these is a duplication of the other. When designing these types of environments, savvy storage architects have to take into consideration how much bandwidth they need in order to handle not just normal traffic, but bursty traffic as well. That number is based on something called "oversubscription," which is where you calculate how many hosts you need connected to each target (storage) port. Generally speaking, that can fall anywhere between 4:1 (hosts:targets) and 20:1, depending on the application environment.
(For comparison, NFS and iSCSI networks can be an order of magnitude or more oversubscribed.)
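To make the oversubscription arithmetic concrete, here is a minimal sketch. The 4:1 and 20:1 ratios come from the text above; the host count and function name are illustrative assumptions, not a real sizing tool.

```python
# Hypothetical oversubscription sizing sketch. Given a number of host-facing
# ports and a hosts:targets oversubscription ratio, estimate how many storage
# (target) ports the SAN needs. All specific numbers are illustrative.
import math

def target_ports_needed(host_ports: int, oversubscription: int) -> int:
    """Minimum storage (target) ports for a given hosts:targets ratio."""
    return math.ceil(host_ports / oversubscription)

# A 160-host SAN at the conservative 4:1 end of the range:
print(target_ports_needed(160, 4))   # -> 40 target ports
# The same SAN at the aggressive 20:1 end:
print(target_ports_needed(160, 20))  # -> 8 target ports
```

The spread between those two answers is exactly why the "right" ratio depends so heavily on the application environment.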
On top of that, because storage is the most persistent item in the data center (storage can stay in use for years), architects have to plan for growth as well. This kind of overprovisioning gives them comfortable headroom for growth, but can wind up being an expensive use of resources that are simply waiting to be "grown into."
Then they have to plan for what happens if a SAN were to have an issue, like a link failure or, worse, an entire switch going down:
When a SAN goes down, it's considered a Bad Thing
If something like this happens, though, you don't simply lose the use of that particular switch or link. Instead, you lose all of the storage bandwidth on SAN A. If the network has not been architected to accommodate this traffic, SAN B may wind up completely overloaded, and performance can take unacceptable hits.
While this creates a highly redundant and scalable network, and has been popular for many years, it does wind up placing an incredible, inflexible burden on the architecture.
This was the problem that FCoE was designed to solve - the ability to consolidate I/O onto the same wires and switch types so that you can manage bandwidth more efficiently, especially as throughput speeds hit 40 and 100 Gbps. Simply having and paying for bandwidth that you don't use gets very, very expensive. Moving away from a dedicated network to a more flexible, virtualized one upon which we can overlay FCoE traffic makes much more sense in these types of environments.
Dynamic FCoE modifies the way we see storage networks, as it allows us to use FCoE as an overlay on top of Ethernet forwarding technology - in this case, FabricPath. This allows us to use the Equal-Cost Multipathing (ECMP) capabilities of large networks to add resiliency and robustness to our storage networks.
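The core ECMP idea can be sketched in a few lines: the addressing fields of a flow are hashed, and the hash selects one of N equal-cost spine paths. This is only an illustrative model - the names and the use of SHA-256 are my assumptions; real switches compute a hardware hash over frame fields.

```python
# Minimal sketch of ECMP path selection. A flow is hashed deterministically,
# so frames of one flow always take the same spine (no in-flow reordering),
# while different flows spread across all spines. Purely illustrative.
import hashlib

def pick_spine(src: str, dst: str, n_spines: int) -> int:
    """Deterministically map a (src, dst) flow to one of n_spines uplinks."""
    digest = hashlib.sha256(f"{src}->{dst}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_spines

# The same flow always hashes to the same path:
path = pick_spine("leaf1", "storage-leaf", 4)
assert path == pick_spine("leaf1", "storage-leaf", 4)
print(path)  # some value in 0..3
```

Because the mapping is per-flow rather than per-frame, storage traffic keeps its ordering guarantees while still spreading load across every spine.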
Let's take a look at a hypothetical Leaf/Spine architecture:
Hypothetical leaf/spine architecture with servers and storage
In this example we have our server (lower left) connected to a Top of Rack (ToR) switch, which in turn is connected to every Spine switch. In this environment, the leaf switch can capitalize on the bandwidth of each Spine to reach the east-west traffic destination (which may include storage systems).
Because we're using FabricPath between the leafs and spines, we can take advantage of the auto-discovery and lookup capabilities of the technology to dynamically discover and create the links between each of the switches. This means that as we add additional spines and/or leafs into the network, FabricPath does the heavy lifting of discovery and interconnectivity.
From an FCoE perspective, this is very cool. Why?
Because we don't actually have to manually go in and create those connections (called Inter-Switch Links, or ISLs). We set up the leafs as the "FCoE-aware" or "Fibre Channel-aware" FCF switches (yes, it works with native Fibre Channel as well), and then transport the frames across the Leaf/Spine "backplane."
In fact, once we have the FabricPath topology configured, enabling the FCoE feature on the leaf switches lets the network discover and establish those storage links on its own.
In essence, you are setting up a dynamic, robust, multihop FCoE environment without having to do it yourself.
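A hedged sketch of what that leaf-side setup might look like, using standard NX-OS conventions. The VLAN/VSAN numbers are invented for illustration, and the exact syntax and supported platforms should be verified against the release notes for R7.0(1)N1(1):

```
! Illustrative sketch only - verify against the platform documentation.
install feature-set fabricpath
feature-set fabricpath       ! enable the FabricPath "backplane"
feature fcoe                 ! make this leaf an FCoE-aware FCF

vlan 100
  fcoe vsan 100              ! map an FCoE VLAN to its VSAN (e.g., SAN "A")
```

Note what's absent: there is no per-ISL VE_Port configuration here - that's the part the fabric handles dynamically.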
What about the SAN A/B separation? With Dynamic FCoE, physical SAN A/B separation occurs at the edge layer, which is often the most vulnerable part of the storage network.
So, while the physical topology may look like the diagram above, the actual Fibre Channel topology winds up looking like this (from a storage perspective):
Logical SAN A/B separation gives tremendous benefits
This gives storage environments incredible flexibility in terms of deployment. Because we are using the Ethernet topology for this dynamic discovery and instantiation, we can capitalize on eventual growth patterns that may include additional storage leafs, server leafs, or transport spines.
Moreover, we can take advantage of the higher bandwidth that naturally comes from robust leaf-spine communication, namely 40G interconnects and eventually 100G interconnects.
Obviously, this would be a pointless solution if we didn't have the same reliability and redundancy that we've always had - or better. I'm going to argue that we have, in fact, upped the ante on both.
Take a look at the graphic below:
Equal-Cost Multipathing (ECMP) and dynamic VE_Port creation
In our environment we have established links between our leafs and spines, and made our connections across each of the spines. Now, let's take a look at a particular link:
Each ISL is functionally equivalent to traditional North-South storage
Looking at this from the perspective of a "SAN A" link, you effectively have an edge-core topology (with a transport spine in the middle). Our VE_Ports have been created to handle the ISL traffic. In short, this looks precisely like the "North-South" traffic that we were looking at earlier, just drawn on its side.
What happens if we were to lose the spine for this connection?
Because the traffic has been load-balanced across all the spines, the FCF leaf does not lose its connectivity to its destination storage leaf. That means that - unlike in a more traditional storage topology - we don't lose all the bandwidth for that side of the network, nor does the MPIO software kick in. In fact, from an FC perspective, there is no FSPF recalculation, and no RGE (resume-generating event) for the admin.
Losing a "core" spine switch is not as catastrophic to the SAN
In short, in order to lose our SAN capability, our entire data center spine would have to go down. That is what I consider a pretty impressive improvement in reliability.
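The back-of-the-envelope math behind that claim: because each leaf load-balances across all spines, one spine failure removes only 1/N of a leaf's uplink bandwidth. The spine count and link speed below are illustrative assumptions.

```python
# Sketch of why a single spine failure is not catastrophic: with N spines,
# losing one costs a leaf only 1/N of its uplink bandwidth, not a whole SAN.
# The 4-spine, 40 Gbps topology here is hypothetical.
def leaf_uplink_gbps(spines: int, link_gbps: int, failed: int = 0) -> int:
    """Aggregate leaf-to-spine bandwidth with `failed` spines down."""
    return (spines - failed) * link_gbps

full = leaf_uplink_gbps(spines=4, link_gbps=40)                # 160 Gbps
one_down = leaf_uplink_gbps(spines=4, link_gbps=40, failed=1)  # 120 Gbps
print(full, one_down)  # one spine failure costs 25% here, not 100%
```

Contrast that with the traditional SAN A/B picture, where losing the "core" of SAN A takes out all of SAN A's bandwidth at once.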
What happens if we were to lose the edge switch?
Loss of an edge/leaf switch triggers MPIO as normal, but doesn't impact network bandwidth as much
In this case, the MPIO software would kick in exactly as it normally would, and traffic would fail over to SAN B. Crucially, though, we have not lost the remainder of the bandwidth on SAN A. SAN B can use the same spine transports it always could, so we have restricted our failure domain to just the one switch, as opposed to the entire SAN's bandwidth capability.
Generally in blogs I don't like to throw a bunch of products into a technology piece, but it doesn't make much sense to talk about the technology and not say where you can use it.
Right now you can configure Dynamic FCoE using FabricPath; you'll need the R7.0(1)N1(1) software release on each platform for this capability. Nexus 7000 support (as a Spine transport switch) is coming later this year.
There are no additional licenses beyond the normal FabricPath and FCoE (for the leafs) licenses, and storage vendor qualifications are coming over the summer or later this year.
If you happen to be going to Cisco Live, I will be discussing Dynamic FCoE in a number of my breakout sessions, as well as demonstrating it live and in action in the booth. Additionally, I'll be presenting inside Cisco's theater booth.
Or, you can just look for the hat and corral me to ask me more questions.
Personally, I think that Dynamic FCoE is a very cool concept for a number of reasons, not the least of which is that we are giving customers additional options, additional reliability and high-availability mechanisms, and an easier way to deploy storage networks in an ever-changing world of smarter networks.
In short, we can combine the scale of Ethernet fabrics with the resiliency of traditional deterministic storage networks.
Does this mean that you must do this? Absolutely not. You can still do classical multihop FCoE, or even single-hop FCoE that branches off to a traditional FC SAN complete with physical SAN A/B separation. In fact, you can even have a hybrid environment where both run simultaneously. The point here is that if you are looking for one network "to run them all," so to speak, you can have your FC and FCoE at the same time without needing a completely separate network.
In the coming months, Cisco will be releasing additional hardware and software capabilities, including Dynamic FCoE using DFA, but that's a subject for a future blog post.
If you have any questions, please feel free to ask in the comments below.