NetApp NVMe for your database

I’ve been assisting with Oracle tests on NetApp NVMe over Fabric (NVMeoF). The results are really impressive, but there are a few things worth explaining first…

NVMe is not media!

There’s a lot of confusion, some of it deliberately created, that NVMe is storage media.

NVMe is a protocol, not a type of drive. You could build an NVMe-based tape drive if you really wanted to.

Some vendors want you to think that NVMe is some kind of next-generation Flash because that allows them to slap an NVMe drive into a storage array and declare it an NVMe-based system. That’s misleading.

SCSI vs NVMe

NVMe is the next-generation storage protocol. The prior standard for accessing drives, including Flash drives, is based on the SCSI protocol, which was first published in 1981 and was designed to process one command at a time with a maximum bandwidth of around 40Mbit/s.

Almost everything in use today is still SCSI. It might be SCSI over IP (iSCSI) or it could be serial-attached SCSI (SAS) or it could be SCSI over FCP. It’s still SCSI.

The protocol has evolved, but it’s taken a lot of effort to retrofit SCSI to enable its use on modern Flash drives. It won’t be able to meet the requirements of the next generation of solid state media.

That’s where NVMe comes in. The basic improvements are as follows:

  • Designed to process 4,294,967,296 operations at a time, not just one at a time
  • Designed for speeds approaching 100,000Mbit/s, not just 40Mbit/s
  • Designed for media that responds in microseconds, not tens of milliseconds
  • Designed for media without spinning motors. No need for obsolete commands used to eject tape cartridges and read bar codes
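
To put that queueing difference in concrete terms, here’s a minimal Python sketch. It uses the commonly quoted NVMe queue model (up to 64K queues, each up to 64K commands deep), which is where the 4,294,967,296 figure above comes from:

# Sketch of the queueing difference between original SCSI and NVMe.
# Figures reflect the commonly quoted NVMe queue model (64K queues x 64K
# commands per queue); the SCSI figure is the original one-at-a-time design.
scsi_outstanding = 1                       # original SCSI: one command in flight

nvme_queues      = 64 * 1024               # up to 65,536 I/O queues
nvme_queue_depth = 64 * 1024               # each up to 65,536 commands deep
nvme_outstanding = nvme_queues * nvme_queue_depth

print(nvme_outstanding)                    # 4,294,967,296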

NVMe networks

So, we want to replace SCSI with NVMe. How do we do that? There are a number of options, but one of the simplest is to use the same FC network you’re already using. Remember, FC is a networking protocol, not a storage protocol. You can almost certainly use NVMe over the exact same FC network you have now.

We call this NVMeoF – NVMe over Fabric. The reason it’s faster is that you’ve replaced the legacy SCSI storage protocol with the modern NVMe storage protocol.

There are other options as well, including IP networking options, but nothing is simpler than FC. With NetApp, enabling NVMeoF can be nothing more than adding a license code to ONTAP, and your storage becomes available via NVMe over your existing FC network.
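
Once the storage is mapped to a host, it shows up like any other NVMe device. As a rough illustration only (assuming a typical Linux sysfs layout, which can vary by kernel and distribution), a quick Python sketch to list NVMe controllers, the transport each is using, and their devices might look like this:

import glob
import os
import re

# List NVMe controllers, the transport they use (pcie, fc, rdma, tcp), and
# their namespaces (NVMe's equivalent of LUNs, covered below).
# Assumes the usual Linux sysfs layout; adjust paths as needed.
for ctrl in sorted(glob.glob("/sys/class/nvme/nvme*")):
    name = os.path.basename(ctrl)                        # e.g. nvme0
    try:
        transport = open(os.path.join(ctrl, "transport")).read().strip()
    except OSError:
        transport = "unknown"
    # Namespaces appear as block devices named nvme<X>n<Y>
    for ns in sorted(d for d in os.listdir(ctrl)
                     if re.fullmatch(name + r"n\d+", d)):
        sectors = int(open(f"/sys/block/{ns}/size").read())   # 512-byte sectors
        print(f"{ns}: transport={transport}, size={sectors * 512 / 2**30:.1f} GiB")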

Superior Manageability

NVMeoF actually improves manageability too. Oracle DBAs and the OS and storage admins who support them are accustomed to the hassle of dealing with lots of LUNs in order to get good performance. The main value of a logical volume manager was to bond together the performance and capacity of lots of small LUNs into a large, high-performance, dynamically resizable logical unit.

NVMeoF simplifies things. You can place an entire database on a single NVMe namespace (basically a LUN) without the need to create a multi-LUN logical volume. If your database grows, you can just increase the size of the namespace. There’s no need to create and discover a new LUN, add it to ASM, and then start a rebalance.
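
To make the contrast concrete, here’s an illustrative sketch of the arithmetic behind the old multi-LUN approach. The per-LUN figures are hypothetical, purely to show why a logical volume manager was striping many small LUNs together:

# Hypothetical per-LUN figures, just to illustrate why DBAs striped many
# small LUNs together with a logical volume manager.
lun_count   = 16
lun_size_gb = 256          # hypothetical capacity of each LUN
lun_iops    = 10_000       # hypothetical performance ceiling of each LUN

striped_capacity_gb = lun_count * lun_size_gb    # 4,096 GB
striped_iops        = lun_count * lun_iops       # 160,000 IOPS

# With NVMeoF, the same capacity and performance can sit behind a single
# namespace, and growth is a resize of that one object rather than
# create LUN -> discover -> add to ASM -> rebalance.
print(striped_capacity_gb, striped_iops)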

The result is that NVMeoF lets your DBA, storage, and system administration teams focus on productive work.

Performance: The Numbers

This is the sort of performance you can expect from NetApp NVMeoF solutions. What follows is an actual AWR report from a real Oracle database, with the non-storage-related statistics edited out.

What you’re looking at is 23K read IOPS and 8K write IOPS. That’s nothing amazing, but that’s not really the point of NVMeoF. If you want 1M IOPS, you can do that today with AFF systems.

Load Profile                    Per Second
~~~~~~~~~~~~~~~            ---------------
Physical read (blocks):           23,487.1
Physical write (blocks):           7,880.0

The performance value of NVMeoF is in the dramatically improved response time. The random read latency is just over 100 microseconds. The usual latency target for all-flash solutions is 1ms. NVMeoF is 10X faster.

Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                    Avg   
Event                              Wait   
------------------------------  -------- 
db file sequential read         100.08us       

The result is your applications run better and your users get results quicker.
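
To see why that translates into faster applications, consider a chain of dependent single-block reads (the “db file sequential read” waits above), where each read must finish before the next one starts. A minimal sketch, with a hypothetical read count and the two latencies quoted above:

# Back-of-the-envelope sketch: serialized single-block reads are dominated
# by per-read latency. The read count is hypothetical.
serial_reads    = 1_000_000
latency_flash_s = 0.001        # typical 1 ms all-flash target
latency_nvmeof  = 0.0001       # ~100 microseconds measured above

print(serial_reads * latency_flash_s)   # 1,000 seconds spent waiting on reads
print(serial_reads * latency_nvmeof)    # 100 seconds spent waiting on reads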

Opening the NVMe throttle

In addition, many database workloads are limited because storage latency prevents them from reaching the IOPS levels required to complete their operations.
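
One way to think about that ceiling is Little’s Law: the IOPS a workload can reach is roughly the number of I/Os it keeps in flight divided by the average latency. A minimal sketch, with a hypothetical concurrency figure:

# Little's Law sketch: IOPS ceiling ~= concurrent I/Os / average latency.
# The concurrency value is hypothetical, not taken from the AWR report.
outstanding_ios = 64

for label, latency_s in [("1 ms (typical all-flash)", 0.001),
                         ("100 us (NVMeoF)", 0.0001)]:
    print(f"{label}: ~{outstanding_ios / latency_s:,.0f} IOPS ceiling")
# 1 ms   -> ~64,000 IOPS ceiling
# 100 us -> ~640,000 IOPS ceiling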

When you drop the latency to NVMeoF levels, that same database can do things like this:

Load Profile                    Per Second
~~~~~~~~~~~~~~~            ---------------
Physical read (blocks):            443,850
Physical write (blocks):            44,385

Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                      Avg
Event                                Wait
------------------------------   ---------
db file sequential read           133.82us

That’s 488K IOPS at 133 microseconds of latency.
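
As a quick sanity check on those figures (treating the quoted read latency as representative of the whole I/O mix, which is an approximation):

# Check the quoted totals and what they imply about concurrency.
read_iops  = 443_850
write_iops = 44_385
latency_s  = 133.82e-6                     # from the AWR excerpt above

total_iops = read_iops + write_iops        # 488,235 -> the ~488K quoted above
print(total_iops, round(total_iops * latency_s))   # roughly 65 I/Os in flight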
