Sunday, October 14, 2012

The Death of the Hard Drive - and the Database Server ?

I'll start this post with a prediction: Hard drives are doomed

Actually, not much to argue with there for the informed observer. The new solid state hard drives can run rings around the mechanical version. Oh, they're not quite there yet - they're smaller and more expensive - but ten years from now you can bet that will all have changed.
And solid state hard drives are actually just memory - the only reason they are packaged as hard drives is that this is the paradigm we are locked into.

They're not the same memory as we buy for our motherboards.  That's dynamic, or volatile, RAM, where the data disappears with the power. The drives use nonvolatile flash memory (a type of EEPROM), which I'll loosely call static memory here.  Writes are orders of magnitude slower than on dynamic RAM, and each cell survives only a limited number of write cycles, so the solid state drives have management units to spread the writing load around (wear levelling) so as not to exhaust this limit.
Managing the write cycle limit is the one compelling reason to keep this memory as an independent unit, but I suspect that soon someone will find a way around this limitation and also a way to greatly decrease the write cycle time.
And it's just a matter of time before the capacity increases beyond the ability of mechanics to compete with.
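
To make the "spread the writing load around" idea concrete, here is a minimal sketch of wear levelling. It is a toy model, not how any real drive controller works: the class name WearLeveller, the block counts and the "pick the least-worn free block" rule are all my own assumptions for illustration, and real flash adds complications (page vs block erase, garbage collection) that are ignored here.

```python
class WearLeveller:
    """Toy model: logical blocks are remapped onto the least-worn free physical block."""

    def __init__(self, physical_blocks):
        self.write_counts = [0] * physical_blocks  # wear per physical block
        self.data = [None] * physical_blocks       # contents of each physical block
        self.mapping = {}                          # logical block -> physical block

    def write(self, logical, payload):
        in_use = set(self.mapping.values())
        free = [p for p in range(len(self.data)) if p not in in_use]
        if not free:
            raise RuntimeError("no free physical block available")
        # Choose the free block that has been written the fewest times.
        target = min(free, key=lambda p: self.write_counts[p])
        old = self.mapping.get(logical)
        self.data[target] = payload
        self.write_counts[target] += 1
        self.mapping[logical] = target
        if old is not None:
            self.data[old] = None  # the previous copy becomes free space

    def read(self, logical):
        return self.data[self.mapping[logical]]


def hammer_one_block(writes=8, physical_blocks=4):
    """Rewrite the same logical block repeatedly and report the wear spread."""
    leveller = WearLeveller(physical_blocks)
    for i in range(writes):
        leveller.write(0, i)
    return leveller
```

Even though the caller keeps rewriting logical block 0, the wear ends up spread evenly over all the physical blocks rather than burning out one of them.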

Once these things change, there will be no reason not to connect static memory directly to the processor. Why pipe it through a slow interface when a 64 bit processor can directly address 18 million terabytes ?
The static memory will replace the role of the hard drive and probably some uses of the dynamic memory, and the dynamic memory will remain the fast, volatile work area it is now.
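
As a quick check on the 18 million terabytes figure: a 64 bit address can name 2^64 distinct bytes. The snippet below just does that arithmetic, in both decimal terabytes and binary tebibytes. It is a theoretical ceiling only - actual processors generally implement fewer address bits than the full 64.

```python
ADDRESS_BITS = 64

addressable_bytes = 2 ** ADDRESS_BITS          # one address per byte
terabytes = addressable_bytes / 10 ** 12       # decimal TB (10^12 bytes)
tebibytes = addressable_bytes // 2 ** 40       # binary TiB (2^40 bytes)

print(f"{addressable_bytes:,} bytes")
print(f"{terabytes / 1e6:.1f} million TB (decimal)")
print(f"{tebibytes / 1e6:.1f} million TiB (binary)")
```

In decimal units that comes to roughly 18.4 million terabytes, which is where the number above comes from.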

But the IDE/SATA/SCSI-connected box we call a hard drive will be gone.

And this brings us to the corollary of the first prediction:
Database servers as we know them are doomed

And that's because database servers are designed specifically to take memory based structures and store them in detached persistent storage - i.e. a hard drive.

Now let's backtrack a fraction.
Database servers perform a lot of valuable and complex tasks, including not just storing the data, but also navigating it and keeping it consistent - transactions and query optimisation being two great examples.
So there's no way I'm predicting that these functions are going to disappear.

ACID is mandatory for any reliable system. Beyond that, I think there is a great litmus test for how any database system performs: assuming competent database design and indexing, can it handle an ad-hoc, highly complex query on millions of rows and still return results with low latency ?
In the case of MS SQL Server and Oracle, the answer is yes (and server clustering is another game altogether, we won't cover it here). But many other database systems fail this test dismally.

But I do strongly believe that there will be a paradigm shift in how the data is actually stored. What concerns me is that none of the standard vendors appear to be gearing up for this.
Most of the serious RDBMSs model the hard disk topology at a very low level, so as to squeeze the maximum performance out of the disk.

So what happens when the hard drive disappears ? Well, the data will be stored persistently in-place in memory. But we'll still need transactions. We'll still need a query optimiser.
And we'll still need to house the data store on a separate server and communicate with it somehow, whether by API or by SQL.
How is that going to look, and who's planning for this ? As far as I can see, as the Johnny Cash song goes: nobody.
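
For a rough feel of what "stored persistently in-place in memory" could look like to a programmer, here is a small sketch using a memory-mapped file as a stand-in for directly addressable persistent memory. That substitution is my assumption - the file still lives on an ordinary drive - and the function names are made up for the example. Note what is missing: there is no transaction, no locking and no crash consistency, which is exactly the part a database engine would still have to supply.

```python
import mmap
import os
import struct
import tempfile

COUNTER = struct.Struct("<Q")  # a single unsigned 64-bit counter, little-endian


def make_store():
    """Create a small backing file holding a counter initialised to zero."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(COUNTER.pack(0))
    return path


def increment_in_place(path):
    """Map the store into memory and update the counter where it lives."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), COUNTER.size) as region:
            (value,) = COUNTER.unpack_from(region, 0)
            COUNTER.pack_into(region, 0, value + 1)  # no separate save step
            region.flush()                           # push the change to the backing store
            return value + 1


def demo():
    """Increment twice, reopening the mapping each time, then clean up."""
    path = make_store()
    try:
        first = increment_in_place(path)
        second = increment_in_place(path)  # sees the value left by the first call
    finally:
        os.remove(path)
    return first, second
```

The second call picks up where the first left off without any explicit load or save of a data structure - the memory is the store.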
