NFS vs FC vs iSCSI

Which protocol is the best protocol of all? I’ve been designing large and small-scale infrastructures for about 20 years now, and I know which protocol is definitely the best.

The Big FC Shop

Sometimes I work on a project at a large enterprise with a massive FC infrastructure, with millions invested in switches, monitoring software, processes and personnel. Much of the time they’re really conservative about how they manage things. Altering the way they perform backups or provision storage might take months of discussions, an update to the ISO9000 procedures, and signoff by the lawyers.

Let’s say they’re looking for a refresh of a few large arrays, they aren’t interested in change, and they aren’t interested in anything that requires changing their business practices. In such a situation, FC is the best protocol.

The New Cloud Project

From a data point of view, there is one aspect of Cloud that is almost ubiquitous – TCP/IP.

This is how a lot of legacy FC shops start down the road to IP storage protocols. It’s just not feasible to create a Cloud based on FC. I guess it would be possible to do something like Database-as-a-Service (DBaaS) with FC with a lot of scripting, but it would be difficult. The work involved in automating the creation of FC zones, LUNs, LUN masking, LUN mapping, HBA management, OS discovery, and so forth would be substantial.

In contrast, automation of an IP environment is pretty easy. Creating an iSCSI LUN or NFS filesystem and mapping it to a particular IP address endpoint isn’t hard, and there are no network configuration steps such as adding WWNs to a zone and activating the zone changes. It’s just some endpoints.
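To give a feel for it, here’s a minimal sketch of what that automation might look like against a storage array that exposes a REST API. The array hostname, endpoint paths, payload fields, and credentials are hypothetical placeholders rather than any particular vendor’s API:

```python
# Minimal sketch of provisioning an NFS export over a REST API.
# The array hostname, endpoint paths, and payload fields are hypothetical --
# substitute whatever your storage vendor's management API actually exposes.
import requests

ARRAY = "https://array.example.com/api"   # hypothetical management endpoint
AUTH = ("admin", "password")              # placeholder credentials

def create_nfs_export(volume_name: str, size_gb: int, client_ip: str) -> None:
    """Create a volume and export it to a single client IP. No zoning,
    no WWNs, no fabric changes -- just an endpoint and an export rule."""
    requests.post(f"{ARRAY}/volumes", auth=AUTH,
                  json={"name": volume_name, "size_gb": size_gb}).raise_for_status()
    requests.post(f"{ARRAY}/exports", auth=AUTH,
                  json={"volume": volume_name, "protocol": "nfs",
                        "clients": [client_ip]}).raise_for_status()

if __name__ == "__main__":
    create_nfs_export("dbaas-tenant-42", 500, "10.0.20.15")
```

Two HTTP calls and the new tenant has storage. The FC equivalent would also need zoning, masking and host-side discovery scripted before anything mounts.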

In this case, we can’t say whether NFS or iSCSI is better, but it’s clear that IP is the best protocol.

It is worth mentioning that a lot of Cloud projects involve moving files around. For example, you can use something like CloudSync to replicate an on-site filesystem into Amazon S3 for purposes of Cloud analytics. While an iSCSI LUN is a great way to encapsulate a filesystem and move it to and from a Cloud, if you’re dealing with files then NFS is the best protocol.
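As a rough illustration of the file-based case, here’s a sketch that pushes the contents of an NFS-mounted directory into S3 with boto3. The mount point and bucket name are made up, and a tool like CloudSync handles this kind of replication for you; the point is simply that file-granular data is trivial to move:

```python
# Rough sketch: copy files from an NFS-mounted directory into an S3 bucket.
# The mount point and bucket name are placeholders.
from pathlib import Path
import boto3

s3 = boto3.client("s3")
SOURCE = Path("/mnt/analytics_data")   # assumed NFS mount on the local host
BUCKET = "example-cloud-analytics"     # hypothetical bucket

for path in SOURCE.rglob("*"):
    if path.is_file():
        key = str(path.relative_to(SOURCE))
        s3.upload_file(str(path), BUCKET, key)   # one object per file
        print(f"uploaded {key}")
```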

Outsourcing and Turf Wars

Attempting to use a substandard IP network poisons IT professionals against iSCSI or NFS. They start with a perfectly designed FC architecture, managed with care, and then try going to IP where any ol’ cable is plugged into any ol’ port. That doesn’t work out well.

Sometimes that happens due to poor planning; sometimes it’s because the IP network is managed by a wholly different organization that might understand desktop LANs and cross-site WANs but doesn’t understand how to design a high-speed storage network.

I’ve encountered more than a few cases where an IP protocol is outright impossible because the entire IP infrastructure has been outsourced to a third-party, and the contract doesn’t permit creation of a decent storage-class IP network. For example, the contract might specify a per-port charge for 1Gb and 10Gb ethernet and nothing more. There is no ability to specify which switches are used or the number of hops between them.

If it’s just impossible or impractical to leverage IP, then FC is the best protocol.

IP Options

Okay, so you’re going IP. Shall it be iSCSI or NFS?

Personally, I prefer NFS because it’s a native clustered filesystem. You can see the utilization, you can see the files, you can move them around, and you can do everything from a central location. In contrast, if you used iSCSI then each little LUN would be just a chunk of bytes from the storage server’s point of view. The only way to find out what’s happening is to go to each individual host and check the filesystems.

This leads to inefficiency. You end up with unused space trapped in a LUN that can’t easily be deployed elsewhere. Yes, there are some hole-punching capabilities in some cases, but why go through the trouble if you don’t have to?
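To make that manageability point concrete, here’s a rough sketch of the kind of per-host polling you end up doing to find stranded space inside iSCSI LUNs. The host names are made up:

```python
# Why trapped LUN space is annoying: with iSCSI the array only sees raw
# blocks, so finding free-but-stranded capacity means asking every host.
import subprocess

HOSTS = ["db01", "db02", "app01"]   # hypothetical hosts with iSCSI LUNs

for host in HOSTS:
    # Report usage of the ext4/xfs filesystems sitting on the iSCSI LUNs.
    result = subprocess.run(
        ["ssh", host, "df", "-h", "--type=ext4", "--type=xfs"],
        capture_output=True, text=True, check=False)
    print(f"--- {host} ---")
    print(result.stdout)

# With NFS, the equivalent information is one 'df' (or one API call)
# against the storage system itself.
```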

There are some cases where iSCSI is preferable because of established processes that are hard to break. This is similar to The Big FC Shop scenario. I know some banks which have huge operations that are highly dependent on Oracle ASM, which requires block-based storage. They want to go to an IP protocol for all the cost savings and flexibility features, but it’s not easy to change the established processes. In this case, iSCSI is the best protocol.

Also, NFS and iSCSI can coexist on the same IP infrastructure. It’s not an either/or situation. You can mix and match. Maybe there will be older databases that migrate from FC to iSCSI (something that can generally be done transparently) while your newer projects land on straight NFS. In these cases, NFS and iSCSI are the best protocols.

Virtualization

The choice of protocol with a virtualization project is intertwined with plans for leveraging datastores. My presumption is that anything beyond the boot device will not be on a datastore. The real data should be directly accessed by the VM, which means NFS, an iSCSI initiator on the guest OS itself, or if someone really must use FC then an RDM or other pass-through device.
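For illustration, here’s roughly what guest-owned storage looks like from inside the VM; the hypervisor just carries IP traffic. The addresses, export path, and IQN below are placeholders:

```python
# Sketch of "the VM owns its storage": the guest mounts NFS or logs into
# an iSCSI target itself. Addresses, export path, and IQN are placeholders.
import subprocess

STORAGE_IP = "10.0.30.5"   # hypothetical storage endpoint

def mount_nfs(export: str, mountpoint: str) -> None:
    subprocess.run(["mount", "-t", "nfs", f"{STORAGE_IP}:{export}", mountpoint],
                   check=True)

def login_iscsi(target_iqn: str) -> None:
    # Discover targets on the portal, then log in to the one we want.
    subprocess.run(["iscsiadm", "-m", "discovery", "-t", "sendtargets",
                    "-p", STORAGE_IP], check=True)
    subprocess.run(["iscsiadm", "-m", "node", "-T", target_iqn,
                    "-p", STORAGE_IP, "--login"], check=True)

if __name__ == "__main__":
    mount_nfs("/vol/appdata", "/data")                     # NFS path
    login_iscsi("iqn.1992-08.com.example:sn.0123456789")   # iSCSI path
```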

That’s not a requirement, nor is it a best practice. It’s just a presumption. It’s based on the following rationale which I’m largely lifting from my TR-3633:

  • Transparency. When a VM owns its filesystems, it is easier for a storage administrator to identify the source of the filesystems for their data.
  • Performance. Testing has shown that there is a performance effect from channeling all I/O through a hypervisor datastore. It’s not major, but it’s there.
  • Money. Why pay for hypervisor CPU licenses just to manage datastores?
  • Manageability. When a VM owns its filesystems, the use or nonuse of a hypervisor layer does not affect manageability. The same procedures for provisioning, monitoring, data protection, and so on can be used across the entire estate, including both virtualized and nonvirtualized environments.
  • Stability and troubleshooting. When a VM owns its filesystems, delivering good, stable performance and troubleshooting problems are much simpler because the entire storage stack is present on the VM. The hypervisor’s only role is to transport FC or IP frames. When a datastore is included in a configuration, it complicates the configuration by introducing another set of timeouts, parameters, log files, and potential bugs.
  • Portability. When a VM owns its filesystems, the process of moving an environment becomes much simpler. Filesystems can easily be moved between virtualized and nonvirtualized guests.
  • Vendor lock-in. After data is placed in a datastore, moving to a different hypervisor or taking the data out of the virtualized environment entirely becomes very difficult.
  • Snapshot enablement. This is complicated to explain. Backups in a virtualized environment can become a problem because of the relatively limited bandwidth. You might have enough network bandwidth for day-to-day activities, but then the backups come and that connection between your hypervisor and the network at large is a chokepoint. Many snapshot-based backup strategies are simpler without datastores in the way.

I am not anti-datastore; I just see a lot of problems where someone is blindly stuffing everything into a datastore just because (a) they could or (b) the hypervisor vendor said it was a “best practice”. Just because your hypervisor supports a datastore on a single 64TB LUN doesn’t make it a good idea.

If a datastore is providing actual tangible value, by all means use a datastore. Just look at the big picture and long term goals.

If you are using a datastore, NFS is the best protocol in cases where it’s possible to run a quality IP network. The reason is simple – manageability. You can see the individual datastores, copy them, restore them from snapshots instantly without a need for file copies, and so on. If you can’t run a good IP network, then FC is the best protocol.

If you’re not using a datastore, on the whole NFS and iSCSI are the best protocols because you don’t have to care about the hypervisor. It’s just an IP connection between the guest operating system and the storage system. Using a pass-through device such as an RDM requires doing something on the hypervisor, which complicates management and limits automation options.

X is faster than Y

No it isn’t. A lot of people think FC is faster than iSCSI or NFS because long ago they compared 4Gb FC storage to gigabit ethernet storage. The protocol has nothing to do with this; it’s about transport clock speed. The number 4 is larger than the number 1. The result is that 4Gb FC is about 4X faster than gigabit NFS. Obviously.

Try comparing 10Gb ethernet to 4Gb FC and you’ll see your NFS/iSCSI storage is faster. The protocol didn’t get faster; it’s the adapter that got faster.

There’s also an occasional claim that the “overhead” renders one protocol faster than another. That too is misleading. Overall, we’re talking microseconds of overhead. I’ve been presenting on this for about 7 years and I’ve got slides that detail what’s happening.

For the most part, small random IOs have the least overhead on NFS, while large-block IOs have less overhead on FC. Even with current all-Flash arrays, the best you’ll do with latency is around 300 microseconds, and protocol overhead accounts for maybe 10 microseconds of that.

If you’re concerned about speed, the best protocol depends on how old your network infrastructure is. It’s almost irrelevant now, since a simple 40Gb ethernet or 32Gb FC infrastructure will offer more bandwidth than even an all-Flash array can deliver.
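For a back-of-the-envelope check on those numbers, here’s a quick sketch. The figures are the rough ones quoted above, not benchmarks:

```python
# Back-of-the-envelope numbers behind the "X is faster than Y" myth.
# Link speeds dominate; protocol overhead is noise at all-Flash latencies.
link_gbit = {"1GbE": 1, "4Gb FC": 4, "10GbE": 10, "32Gb FC": 32, "40GbE": 40}

for name, gbit in link_gbit.items():
    # Rough usable throughput, ignoring encoding/framing differences.
    print(f"{name:>7}: ~{gbit / 8:.2f} GB/s")

total_latency_us = 300     # typical all-Flash round trip (from the text above)
protocol_overhead_us = 10  # rough protocol cost (from the text above)
print(f"protocol overhead: {protocol_overhead_us / total_latency_us:.0%} of total latency")
```

Run it and the protocol overhead works out to roughly 3% of the round trip, which is why the link speed, not the protocol, is what you should be arguing about.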

I could keep writing for another 10 pages on this topic, but I think you get the idea.
