Oracle Databases and Efficiency

I’m the author of TR-3633, which covers the use of ONTAP space efficiency features with Oracle databases, but here’s a longer explanation of the options.

Deduplication

Personally, I recommend disabling deduplication entirely for Oracle databases outside of a few edge cases.

The reason is that there are normally no duplicate blocks to be found. Every Oracle block contains globally unique metadata in its header, whether that block is 8K, 16K, or 32K, and the trailer at the end of the block is almost globally unique as well. If you can get more than 1% deduplication with a database using an 8K block size, I’d be surprised.
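To see why, here’s a toy Python simulation. None of this is Oracle or ONTAP code, and the block layout is a simplification invented purely for illustration, but it shows the mechanics: even when a thousand 8K blocks hold identical row data, the unique header and trailer on each block mean no two 4K chunks ever match, so fixed-block deduplication finds nothing.

import hashlib
import struct

BLOCK = 8192   # Oracle db_block_size
CHUNK = 4096   # fixed dedup/allocation unit

def oracle_block(dba, body):
    # Toy 8K "Oracle" block: a unique header (block address plus an
    # SCN-like counter), a shared body, and a unique tail checksum.
    # Real Oracle block internals are more involved than this.
    header = struct.pack(">IQ", dba, 1000 + dba)
    tail = struct.pack(">I", dba ^ 0xDEADBEEF)
    filler = body * ((BLOCK // len(body)) + 1)
    return header + filler[: BLOCK - len(header) - len(tail)] + tail

# 1,000 blocks all holding identical row data: a best case for dedup.
blocks = [oracle_block(dba, b"repeated row data ") for dba in range(1000)]

# Fixed-block dedup: fingerprint every 4K chunk, count distinct values.
chunks = [blk[i:i + CHUNK] for blk in blocks for i in range(0, BLOCK, CHUNK)]
unique = {hashlib.sha256(c).digest() for c in chunks}
print(f"{len(chunks)} chunks, {len(unique)} unique")   # 2000 chunks, 2000 unique

The first 4K of every block differs because of the header, and the second 4K differs because of the trailer, so even perfectly duplicate table data produces zero duplicate 4K chunks.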

ONTAP is based on 4K blocks, so if you’re using 16K or 32K database blocks you might see a little dedup efficiency because some of the 4K blocks inside a given database block are empty, but the inline zero detection of ONTAP will also take care of those blocks by thin-provisioning the zeroed space.

The only notable exception is a database that was duplicated from another database via Oracle RMAN. That creates a nearly binary-identical copy of the original datafiles, which can deduplicate against the original. On the whole, though, NetApp customers clone data using the native storage cloning capabilities rather than taking the time and trouble to copy everything with RMAN.

Note that deduplication of Oracle RMAN backups to AltaVault works very well, because AltaVault uses a variable block size. I once saw 127:1 efficiency during a POC because AltaVault detected some tiny repeating patterns in the dummy data. I had to pick a more realistic way of creating dummy data.

Compression

The best practice for Oracle database volumes is enabling compression and compaction. By “compression” I mean inline adaptive compression, which uses an 8K block size. Yes, that’s a smaller unit than some other arrays out there use, and that can result in slightly lower efficiency numbers, but it comes with a performance benefit.

Let’s say you used a compression block size of 32K instead. That’s currently how ONTAP “secondary compression” works. It’s more space-efficient because more data is being compressed as a unit, but it’s almost never a good idea with a database, because a database workload involves small-block overwrites of its datafiles.

If you use a compression block size that is larger than the block size used for overwrites, you take a performance hit. For example, updating a single 8K block on an array that compresses data in units of 32K requires the storage system to read the current 32K back from the drives, uncompress it, update the changed 8K, recompress it, and write it back again. Any time you ask a storage system to perform a read in order to complete a write, you hurt performance. It’s obviously more significant when spinning media is involved, but it’s measurable even on an all-SSD system.
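Here’s a minimal sketch of that read-modify-write cycle, assuming a hypothetical array that compresses in 32K units. The zlib library stands in for whatever algorithm a real array would use.

import zlib

GROUP = 32768      # compression unit on the hypothetical 32K array
DB_BLOCK = 8192    # size of a typical Oracle overwrite

# "Media": one compressed 32K group holding four 8K database blocks.
media = zlib.compress(b"x" * GROUP)

def overwrite_8k(offset, new_data):
    # To change 8K inside a 32K compression group, the whole group has
    # to be read back, decompressed, patched, recompressed, and
    # rewritten. That's a read performed just to service a write.
    global media
    group = bytearray(zlib.decompress(media))   # read + decompress 32K
    group[offset:offset + DB_BLOCK] = new_data
    media = zlib.compress(bytes(group))         # recompress + rewrite
    return len(group)                           # bytes read to write 8K

read_bytes = overwrite_8k(8192, b"y" * DB_BLOCK)
print(f"overwrote 8K, but first had to read and decompress {read_bytes} bytes")

With an 8K compression unit, the group being replaced is the same size as the overwrite itself, so there is nothing extra to read back.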

If you ever see a particular vendor (and you know who they are) insisting on performance tests with 32K, 64K, or larger blocks, you know why: they don’t want you to see the performance impact of a real-world IO pattern on their products. I’ve seen performance hits of up to 25%, which is a lot to pay for a little extra efficiency.

ONTAP adaptive inline compression has basically no performance impact. I’m sure somewhere out there is a customer with an ONTAP system burning away at 99.999% CPU capacity who might possibly see a performance improvement by disabling compression, but that’s a fringe situation. I’ve never seen it happen, but I have seen performance improve with compression enabled, because it allows ONTAP to write fewer blocks.

Secondary Compression

We do have some customers using secondary compression on read-only or highly inactive datafiles. Performance was not impacted, and efficiency did improve. Test that carefully, though.

In general, I’d only recommend using inline secondary compression on volumes that are dedicated to archive logs, which aren’t subject to overwrites. If it’s not a significant amount of data, just use adaptive inline compression everywhere for the sake of consistency, but I’ve seen a few customers with 100TB+ of archive logs stored online. An extra 10% savings from storing that data with secondary compression can free up a lot of capacity.

Compaction

Compaction is frequently misunderstood. Unlike compression, it doesn’t require real work by the storage OS. It’s just a different way of storing data.

ONTAP uses a 4K block as the basic unit of allocation. That does NOT mean that every IO performed by ONTAP is 4K; we’ve seen some really bizarre claims on that point from competitors. The 4K block size is the basic unit of physical allocation on the drives. The actual IO block sizes are variable.

Let’s say you have a database where all those 8K blocks compressed down to 1K each. Without compaction, you’d only get about 2:1 efficiency because each of those 1K units would occupy an entire 4K physical block on the drives. The other 3K is wasted space.

Compaction basically allows ONTAP to store blocks within blocks. It requires a few extra bytes of metadata, but that’s about it. Now four of those 8K blocks that compressed down to 1K each can be stored together in a single 4K physical block on storage.
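Here’s a rough Python sketch of the difference. The first-fit packing is my simplification; real ONTAP metadata and placement logic are more sophisticated than this.

import os
import zlib

PHYS = 4096   # ONTAP's 4K physical allocation unit
DB = 8192     # 8K database block

# 1,000 8K blocks that each compress to roughly 1K: an ~800-byte random
# pattern repeated across the block, so zlib squeezes each one down.
blocks = [(os.urandom(800) * 11)[:DB] for _ in range(1000)]
compressed = [zlib.compress(b) for b in blocks]

# Without compaction, each compressed result still claims whole 4K
# physical blocks, so a ~1K result wastes the other ~3K.
no_compaction = sum(-(-len(c) // PHYS) for c in compressed)  # ceiling division

# With compaction, results are packed together into shared 4K blocks.
phys_used, free = 0, 0
for c in compressed:
    if len(c) > free:
        phys_used, free = phys_used + 1, PHYS
    free -= len(c)

logical = len(blocks) * DB
print(f"without compaction: {logical / (no_compaction * PHYS):.1f}:1")
print(f"with compaction:    {logical / (phys_used * PHYS):.1f}:1")

On this data the realized savings go from about 2:1 to roughly 8:1, because four compressed results share each physical block instead of each one claiming its own.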

Results

Most customers seem to get between 1.5:1 and 3:1 savings through compression and compaction, but the results will vary a lot. That’s not a scientific analysis, and that’s not a guarantee. That’s just my impression. I’ve seen up to 7:1 on real-world datafiles, and I’ve seen 1:1. There’s no way to be sure without testing real data.

A newly created database should show about 85:1 efficiency. The internal zeroed bytes compress almost totally out of existence, leaving just a concatenated string of headers and trailers to be stored on the drives.

Some databases include a lot of encrypted or compressed data, and data that is already compressed or encrypted won’t compress further on the storage layer. It’s not necessarily the database itself that is doing the compression or encryption. Sometimes it’s the application using the database.

Thin Provisioning

One thing that tripped me up the first time I tested compression was the need to enable thin provisioning. That didn’t make sense until I thought about it for a while.

Let’s say you store an 800GB database on a fully provisioned 1TB volume, and it compresses down to 400GB. You will not see any savings, because ONTAP won’t let you realize them. The savings would be meaningless: full provisioning means you’ve already reserved the entire 1TB of space.

True, your 800GB database is now occupying only 400GB on the drives, but that does not mean your 1TB volume has 600GB of free space.

Assume you write 400GB of completely incompressible data to that 1TB volume. Add that 400GB to the 400GB consumed by the compressed database and you get 800GB of total physical consumption. You also now have 1.2TB of logical data on a 1TB volume. That requires thin provisioning, because the logical data stored exceeds the available physical space.
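The same arithmetic as a quick sanity check, using just the numbers from the example above, in GB:

volume_size = 1000    # 1TB volume
db_logical  = 800     # 800GB database as the filesystem sees it
db_physical = 400     # what it occupies on the drives after compression
new_data    = 400     # incompressible data written later

physical_used  = db_physical + new_data    # 800GB actually on the drives
logical_stored = db_logical + new_data     # 1200GB of logical data

print(f"physical used:  {physical_used}GB of {volume_size}GB")
print(f"logical stored: {logical_stored}GB")
# 1200GB of logical data in a 1000GB volume only works if the volume is
# thin provisioned; a full space guarantee reserves 1000GB for 1000GB of
# logical data and leaves the compression savings unusable.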

A volume with a space guarantee won’t let you get into a situation where you might run out of space unexpectedly. That’s why you have to explicitly enable thin provisioning to realize any savings.

Procedure

Here’s a demonstration of enabling and forcing efficiency on a volume. The syntax might vary a little depending on the version of ONTAP you’re using.

Note: You will need to delete all snapshots before proceeding, because the presence of snapshots interferes with the ability to force compression of previously existing data. If the snapshots must be preserved, one option may be to create a FlexClone of the volume, split it, and use that compressed copy as the new production data. When the snapshots finally all age off the system, delete the old volume.

First, here’s my current space utilization.

[root@jfs0 ~]# df -k /oradata0
 Filesystem                  1K-blocks      Used Available Use% Mounted on
 fas8060-nfs1:/jfs0_oradata0 398458880 269427968 129030912  68% /oradata0

Next, I enable thin provisioning so I can make use of any space saved. This covers both LUN and file services.

EcoSystems-8060::> volume modify -vserver jfsCloud0 -volume jfs0_oradata0 -space-guarantee none
 Volume modify successful on volume jfs0_oradata0 of Vserver jfsCloud0.

EcoSystems-8060::> volume modify -vserver jfsCloud0 -volume jfs0_oradata0 -fractional-reserve 0
 Volume modify successful on volume jfs0_oradata0 of Vserver jfsCloud0.

Next, enable space efficiency settings. I want inline adaptive compression, compaction and *no* deduplication of any kind.

EcoSystems-8060::> volume efficiency on -vserver jfsCloud0 -volume jfs0_oradata0
 Efficiency for volume "jfs0_oradata0" of Vserver "jfsCloud0" is enabled.

EcoSystems-8060::> volume efficiency modify -vserver jfsCloud0 -volume jfs0_oradata0 -inline-dedupe false -data-compaction true -inline-compression true -policy inline-only -cross-volume-inline-dedupe false
 Warning: The efficiency policy "inline-only" will disable background efficiency operations with deduplication on the volume. Use with "-inline-compression true" or "-inline-dedupe true" to perform compression or deduplication on the new data being added to the volume respectively.

I’m not using an AFF system, so I have to temporarily switch out of the inline-only policy:

EcoSystems-8060::volume efficiency> volume efficiency modify -vserver jfsCloud0 -volume jfs0_oradata0 -policy default

I don’t have a compression policy set any more, so I have to set it manually:

EcoSystems-8060::volume efficiency*> set diag
EcoSystems-8060::volume efficiency*> volume efficiency modify -vserver jfsCloud0 -volume jfs0_oradata0 -compression-type adaptive -compression true

Here’s what I want to see. The output is edited a bit. These are the important settings: adaptive compression, compaction, no dedupe.

EcoSystems-8060::volume efficiency*> volume efficiency show -vserver jfsCloud0 -volume jfs0_oradata0
 Vserver Name: jfsCloud0
 Volume Name: jfs0_oradata0
 Volume Path: /vol/jfs0_oradata0
 State: Enabled
 Status: Idle
 Compression Type: adaptive
 Inline Dedupe: false
 Data Compaction: true
 Cross Volume Inline Deduplication: false

Now I force background compression and compaction:

EcoSystems-8060::volume efficiency*> volume efficiency start -vserver jfsCloud0 -volume jfs0_oradata0 -scan-old-data -compaction true

Warning: This operation scans all of the data in volume "jfs0_oradata0" of Vserver "jfsCloud0". It may take a significant time, and may degrade performance during that time.
 Do you want to continue? {y|n}: y

The efficiency operation for volume "jfs0_oradata0" of Vserver "jfsCloud0" has started.

Now I have to wait. This is a big volume.
