Kubernetes+ONTAP = DBaaS: Part 1

The following is what happens when someone with zero experience with Kubernetes builds a containerized database environment. In this case, I’ve picked Oracle for the database, but the basic ideas here work with most relational databases.

I apologize to anyone who is an expert on Kubernetes because I’m probably going to mangle a few explanations or overexplain some dead-basic ideas. You’ll still learn something, because I’m addressing a typical database problem using containers in a rather unusual way.

Background

NetApp, and in particular ONTAP, has been very successful with database projects because ultimately, we’re talking about a huge amount of data that needs to be restored and cloned a lot. It’s not hard to see the value. If you’ve got a 50TB mission-critical database that gets toasted, you probably don’t want to sit around waiting to restore 50TB from tape. Even restoring 50TB from a disk-based solution can be unacceptably slow. With ONTAP, you can restore that 50TB database with a single command in just a few seconds. While getting a clone fully operational takes a little more thought, the cloning process still only takes a couple of minutes, and you can clone that 50TB data set 30 or 40 times in a space-efficient manner.
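To put rough numbers on "unacceptably slow", here is a back-of-the-envelope sketch. The throughput figures are my own assumptions for illustration, not measurements of any particular tape or disk product, and real restores rarely sustain a steady rate.

```python
def restore_hours(size_tb, throughput_mb_per_s):
    """Hours needed to copy size_tb terabytes at a sustained rate (decimal units)."""
    size_mb = size_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return size_mb / throughput_mb_per_s / 3600


if __name__ == "__main__":
    # Both rates are assumed values chosen only to show the scale of the problem.
    for label, rate in [("tape at an assumed 300 MB/s", 300),
                        ("disk at an assumed 1000 MB/s", 1000)]:
        print(f"50TB from {label}: {restore_hours(50, rate):.1f} hours")
```

Under those assumptions the copy alone takes roughly 46 hours from tape and 14 hours from disk, before any database recovery work begins.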

DBaaS

Database as-a-Service is the concept of managing a database as a database, as opposed to managing a server and some storage and some application binaries that were used to create a database. Managed offerings from the hyperscaler cloud companies, such as Amazon’s Relational Database Service (RDS), offer the same benefits. You just get a SQL connection to a database, and you shouldn’t have to be concerned with the details of where that database is running.

In my view, NetApp has never had a complete solution for DBaaS. The building blocks were all there, though, and have been for years. The ONTAP APIs allow simple provisioning, backups, restores, and cloning of huge amounts of data. Many customers manually type the required commands and are happy with that approach. It’s fine for a relatively static production environment, but as needs become more dynamic you need something to glue it all together. You need an orchestration layer. That’s what’s been missing.

About a year ago I looked into DBaaS with the hope of developing a reference architecture. The end results needed to fit the following criteria:

  • Fast provisioning of a database
  • Simple backup strategy
  • Rapid cloning of a database
  • Simple teardown of a database
  • Option for secure multitenancy, with each database running in isolation from one another
  • Everything must be easily automatable and be able to integrate with a variety of automation frameworks

You’ll notice that HA and load balancing aren’t part of this. When I started looking around, I was assuming that whatever I built would be based on ESX, with VMware HA providing the HA and load balancing capabilities. OpenStack was obviously an option, but then I found a better way: containers.

Containers

I knew what containers were, I was aware of Docker, and I was even in some conversations with the Docker folks a number of years back. At the time there was virtually no interest in supporting a need for persistent data, so I mostly forgot about it. More recently, I had heard about the NetApp Trident driver for Docker and decided to look into it more.

For those of you who don’t know what a container is, here’s a brief explanation of why you ought to care. Even if you’re a container expert, this is worth a read because I look at containers differently.

First, consider a hypervisor. A long time ago, hypervisors actually did a lot of real work, but then CPUs gained virtualization capabilities etched into the silicon. These days a hypervisor does very little. It’s basically a referee. The hypervisor creates a virtual machine using the CPU’s native capabilities and then says “ok, boot!” and the virtual machine boots. The hypervisor then largely stands back and lets the CPU enforce the isolation between the various virtual machines that are running. There is some value in how a hypervisor provides virtual devices, but overall the selection of a hypervisor isn’t about the hypervisor per se, it’s about the management features that come with the hypervisor.

There’s an efficiency problem with a hypervisor approach. If you have a server with 25 virtual machines running, you’re stuck with 25 operating systems to install and patch. Starting an app requires a booted VM, so you’ll have 25 kernels up and running and doing very similar tasks. Wouldn’t it be nice to remove those 25 operating systems and the workload of 25 running kernels, and dodge the need for installing Oracle 25 times, not to mention patching all 25 instances from time to time?

This is where namespaces come in. The feature has been around for ages, but only became popular recently with Docker. When you run an application such as a database, it will usually be running in the global namespace that encompasses the entire running OS environment. You could, however, limit it. That’s what a container really is. It’s a namespace management framework.

The result in a database context is that you can run something like Oracle sqlplus in its own private namespace. It can see its files and a network interface and nothing else. No other files, no other processes. When you type “startup”, that sqlplus session will fork and execute the various Oracle binaries that make up an operational database environment. Among other things, this means you can run multiple databases with the same Oracle SID on the same system so long as they are in different process namespaces.
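If you want to see namespaces for yourself, here is a minimal sketch, assuming a Linux host where the kernel exposes each process’s namespaces as symbolic links under /proc/<pid>/ns. The helper names are my own, made up for illustration, and on a system without /proc the lookup simply returns nothing.

```python
import os
import re

# Links under /proc/<pid>/ns look like "pid:[4026531836]".
NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(link):
    """Split a namespace link such as 'pid:[4026531836]' into (kind, inode)."""
    match = NS_LINK.match(link)
    if match is None:
        raise ValueError(f"not a namespace link: {link!r}")
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid="self"):
    """Return {entry name: inode} for a process, or {} if it cannot be read."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, no such process, or no permission
    for name in names:
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or parse
        result[name] = inode
    return result


def shared_namespaces(a, b):
    """Names of the namespaces that two processes have in common."""
    return sorted(k for k in a if k in b and a[k] == b[k])


if __name__ == "__main__":
    mine = namespaces_of()
    init = namespaces_of(1)  # usually needs root; otherwise comes back empty
    print("my namespaces:", mine)
    print("shared with PID 1:", shared_namespaces(mine, init))
```

Two processes that report the same inode number for an entry are in the same namespace. Run from inside a container, a process would normally report different numbers for entries such as pid, mnt and net than a process on the host, which is what lets two databases with the same SID coexist.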

Oracle and Containers – Not Supported

So, containers can do cool things. Now what?

A couple years ago, when I first read up on Docker, I was interested to see if anyone had made Oracle run in a Docker container yet. I couldn’t find any reports of success, there was nothing on Oracle’s support site about Docker support, and I found a number of blogs that said it was flat-out impossible.

This wasn’t surprising because the Oracle database binaries are thoroughly embedded into the OS itself. There are lots of filesystems involved, there are config files typically found in /etc or /var/opt, and some utilities need to be in /usr/local/bin. This was WAY beyond what Docker was intended to do in terms of the complexity of the container filesystem layout. You’d basically need the entire OS within the container.

Oracle and Containers – Well Supported!

I checked again last year on the status of Oracle supportability on Docker and was really, really impressed. They’ve got excellent support now. You have to be careful about the words “certified” and “supported” when interpreting Oracle documents. The term “certified” means they’ve thoroughly tested a particular combination, while “supported” means they’ll help you through the normal support process. As far as I can tell, the end result is about the same. The main benefit of certification is extra assurance that you won’t need to call Oracle for support in the first place. I can’t copy the Oracle document directly, but if you have a support login the main details are in Doc ID 2216342.1 at the time I’m writing this.

So, with respect to support for Oracle in a container, things are good. They’ve even added containerized RAC support with Oracle 18. That should tell you how serious Oracle is about containers.

What is Kubernetes?

I started talking to customers about their container plans, and everyone told me that Docker might be interesting, but they really want Kubernetes or OpenShift support to make it manageable. Docker is still the underlying container management framework, but it’s not especially manageable on its own. The two options seem to be very similar to me, and as I understand it OpenShift is Red Hat’s “distribution” of Kubernetes. The only clear difference I’ve seen that I care about is that OpenShift seems better designed to expose containers to the outside network, whereas Kubernetes is more aimed at containers communicating with other containers.

Docker has its own management framework called Docker Swarm. It’s simpler than Kubernetes, but it’s also less powerful. If you want to do something complicated, you’ll probably be happier with the enhanced features of Kubernetes/OpenShift, but DBaaS is about doing very simple things over and over. Due to customer demand I opted for the Kubernetes approach. If someone out there reads these posts and thinks, “gee, I wish there was an OpenShift version of this DBaaS solution”, let me know. I can look at what it takes to change.

The Story

I have documented my efforts in 5 parts:

  • Introduction to DBaaS with Containers
  • Docker setup, including the Dockerfile that defines the container
  • Kubernetes setup
  • Storage configuration, including the NetApp Trident driver
  • Orchestration and Manageability

Part 2 continues with a detailed explanation of how I set up the environment.
