There has been an explosion of talk about “NoSQL” lately (i.e., I’ve seen a few posts about it), and since it is every blogger’s obligation to follow the crowd and rewrite what everyone else is saying, I shall now present my thoughts on NoSQL.
My first thought about NoSQL was: What the heck is NoSQL?
Wikipedia (as of this writing) defines NoSQL as “a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases and ACID guarantees.”
Word salad. To translate, NoSQL (also known as “structured storage”) is a new kind of database, popularized by such big-name companies as Google, Amazon and Facebook, and designed to store and quickly retrieve bajillions of terabytes of data, something that is challenging to pull off with your average enterprise relational database. Another attractive feature of a NoSQL database (to places like Google, Amazon and Facebook, at least) is the ability to scale out across many servers to accommodate bajillions of simultaneous users.
The NoSQL concept also seems to focus on a “non-fixed table schema.” Presumably, that means columns (or whatever the NoSQL equivalent is; “keys,” I guess) could be added or changed at any point in the life of an application without too much trauma. I could see this being useful for rapid software iterations, where you don’t necessarily know what the final schema is going to be when you start out. (E.g., you roll out a version with a new table, and then in the next iteration a week later you find out you need to add or delete a column, which can be a massive pain with a large relational database.) In olden days, the schema would be ironed out in the “design” and “alpha testing” and “beta testing” stages, but obviously we in the industry don’t have time for that stuff anymore.
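To make the “non-fixed schema” idea concrete, here is a minimal Python sketch of how I imagine it. It uses a plain in-memory dict as a stand-in for the data store; the `users` collection, the field names and the `get_email` helper are all made up for illustration and are not the API of any real NoSQL server.

```python
# A plain dict standing in for a schema-less collection
# (hypothetical example, not any real NoSQL product's API).
users = {}

# Iteration 1: records only have a name.
users["u1"] = {"name": "Alice"}

# Iteration 2, a week later: new records carry an extra "email" field.
# No ALTER TABLE and no migration; old records are left as they were.
users["u2"] = {"name": "Bob", "email": "bob@example.com"}


def get_email(user_id):
    """Return the record's email, or None if the record predates the field.

    The application, not the database, copes with the missing key.
    """
    return users[user_id].get("email")


print(get_email("u1"))  # None
print(get_email("u2"))  # bob@example.com
```

The catch, as far as I can tell, is that every piece of code reading these records has to tolerate both shapes.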
From an application developer’s perspective, NoSQL databases appear to shift the burden of data integrity from the database to the application. To the application, I presume the NoSQL database would look like a big dictionary or hashtable (a key/value collection), i.e., a big dumb storage area whose only purpose is to read and write bits. (Similar to, you know, a hard drive; see The Daily WTF’s April Fools’ joke.) I wouldn’t think there’d be any need for a database administrator or database developer in a NoSQL shop; only application developers and system administrators. There wouldn’t be any “query optimization” or “stored procedures” because there’s… wait for it… NoSQL.
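Here is a rough Python sketch of what “the application owns data integrity” might look like in practice. Everything in it is an assumption on my part: `KeyValueStore` is a toy in-memory class, not a real client library, and `save_order` with its `customer:` / `order:` key prefixes is a hypothetical convention I invented for the example.

```python
class KeyValueStore:
    """A toy in-memory key/value store: it keeps whatever it is given."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def save_order(store, order_id, order):
    """Run the checks a relational database would normally enforce with
    NOT NULL, FOREIGN KEY and CHECK constraints, then write the order."""
    customer_id = order.get("customer_id")
    if not customer_id:
        raise ValueError("order needs a customer_id")
    if store.get("customer:" + customer_id) is None:
        raise ValueError("unknown customer")
    if order.get("quantity", 0) <= 0:
        raise ValueError("quantity must be positive")
    store.put("order:" + order_id, order)


store = KeyValueStore()
store.put("customer:c1", {"name": "Alice"})
save_order(store, "o1", {"customer_id": "c1", "quantity": 2})
```

Note that the store itself would happily accept an order for a customer who doesn’t exist; only `save_order` stands in the way, so any code that calls `put` directly skips the checks.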
My second thought about NoSQL was: Why should I care about NoSQL? I’m not writing Google, Amazon or Facebook. My database of choice works fine. I already know how to build columns and tables, and I already know how to write applications against them.
If you’re an IT veteran, you’ll know the answer is: Because someday your gullible CEO will drop by and say, “A 20-year-old consultant told me about how great this new-fangled NoSQL is, so we’re paying him tons of money to migrate our data warehouse to it.” Afterwards, when your whole system is lying on the server room floor in shattered pieces and angry customers are jamming the phone lines, you’ll be the one who has to undo everything the consultant did, so knowing about NoSQL will help you do that.
Of course I kid. There are plenty of cases where I could easily see this being a good idea. (Like, say, if you’re writing Google, Amazon or Facebook.)
But I’m a Microsoft .NET developer, so that’s about as far as I can go in researching NoSQL. Every available server implementation I’ve seen (http://en.wikipedia.org/wiki/Structured_storage) runs on Linux, which means Mono or Java or some other Linux-capable language as a client. So I won’t be firing up Visual Studio 2008 to check out Cassandra anytime soon, and I don’t have the motivation to set up a Linux development environment just to play around with NoSQL.
It may not sound like it, but (unlike most new whizbang trends in the industry) NoSQL appeals to me because I’m primarily an application developer. I’ve often found myself wishing (rashly, in most cases) that the database (or the DBA) would get out of my way and let me handle the data storage. Generating SQL to feed to the database has always been a pain, even if it’s disguised behind ADO.NET or LINQ or an ORM. NoSQL sounds like it should integrate better with applications.
So I say, bring on the NoSQL!