On being Anti-Anti-SQL

There seems to be a draft forming as a result of a new movement of people who are "Anti-SQL" or "Anti-RDBMS". This article, for example, talks about the first meeting of what is being called the NoSQL community, and they are not alone. There are plenty of articles and blog posts online with titles like "Thinking Beyond the Relational Database", "Ten reasons why CouchDB is better then mysql", and "Beyond MySQL, a paradigm shift from RDBMS". Noticing a pattern? While I'm sure there are many reasons why people are Anti-SQL, the key points of contention are generally (in no particular order):

1. Relational databases don't scale well in terms of request rate
2. They don't scale well in terms of data size
3. They have too many unnecessary features (e.g. joins, transactions)
4. They are slow

The overarching argument here is that because relational databases are very general-purpose and have to support just about every use case, they suffer as a result, either in performance or in scalability. This observation has led members of the anti-SQL camp to take the opposite approach, namely to start with a minimal feature set and add only what is necessary. This approach has resulted in key-value stores like BerkeleyDB, Tokyo Cabinet, and Memcached, column-oriented databases like BigTable and its open-source counterparts HBase/HyperTable, document databases, graph databases... and so on. While each of these datastores is unique in its own right, I'm not going to spend time discussing the relative merits of one over the other and will instead point you to the awesomely titled (and informative) talk "Drop ACID and think about databases", this concise writeup which compares their relative features, and this very matter-of-fact review of many of them.

Having used a large number of such datastores, I can confidently say that if your use case can be met by one of the datastores mentioned above, there are some great performance gains to be had. Performance gains are awesome, but the crucial part of the previous sentence is whether you can find a datastore that meets your use case. Like it or not, we have all become accustomed to the relational database. It is an important part of just about every web framework, and is the crucial component of a large portion of web applications (chicken and egg discussion left as an exercise for the reader). While there are likely some cases where problems fit brilliantly into non-relational systems, there are many which simply do not. My advice on this is simple: if you need the performance/scalability, or it is a natural fit for your application, then by all means go ahead. But the second you have to start employing any sort of trickery to get your app to fit into that model, stop right there; you are fighting a battle that you've likely already lost.
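To make the "trickery" concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the orders table, the key naming scheme, and the KVOrders class, which uses a plain dict as a stand-in for a real key-value store. It shows the same question ("which items did this user order?") answered once with SQL and once with a key-value model, where the application has to maintain its own secondary index.

```python
import sqlite3


def orders_by_user_sql(rows, user):
    """Relational version: the database answers the question directly."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user TEXT, item TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT item FROM orders WHERE user = ? ORDER BY id", (user,)
    )
    items = [row[0] for row in cur]
    conn.close()
    return items


class KVOrders:
    """Key-value version. A plain dict stands in for the store (assumption)."""

    def __init__(self):
        self.store = {}

    def put_order(self, order_id, user, item):
        # Primary record, keyed by order id.
        self.store[f"order:{order_id}"] = {"user": user, "item": item}
        # Hand-maintained secondary index: every write path must remember
        # to update it, or lookups by user silently go stale.
        index_key = f"user_orders:{user}"
        self.store[index_key] = self.store.get(index_key, []) + [order_id]

    def orders_by_user(self, user):
        # Two lookups per result: first the index, then each record.
        order_ids = self.store.get(f"user_orders:{user}", [])
        return [self.store[f"order:{i}"]["item"] for i in order_ids]
```

Looking up by user is one line of SQL in the first version. In the second, the index is application code that has to be kept correct on every write, and each new kind of lookup needs another index like it. That is the point at which the model has stopped being a natural fit.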

This leads to my last point, and the one that inspired the title of this post: my position of being Anti-Anti-SQL. That is not to say that I am against non-relational databases; I am in fact a very big believer in them and have extensive knowledge of quite a few. Instead, I have an issue with those who are trying to demonize the relational database as something that is outdated, decrepit, and woefully inept at the task it performs. Granted, the relational database is not without its issues, but realistically the number of applications which would fit into a non-relational database is fairly small, and the number which require it for performance reasons is even smaller. There are the Googles, Amazons, and Yahoo!s of the world, but for every one of them there are countless tiny web services that could only dream of hitting the scalability bottleneck of their database.

If the Anti-SQL community truly feels that the relational database needs to be put out to pasture, then what they need to do is simple: change the way we (application programmers) think. Almost all web applications being developed today assume the existence of a relational database and are coded accordingly. Non-relational databases are generally not considered until 1) they become a necessity as a result of scale/size or 2) the developers have previously experienced 1). Instead, the goal should be to make the non-relational database a first-class citizen, the default unless a relational database is absolutely necessary.

This is clearly not an easy task, but in order to have a chance it is crucial to focus on the things that can make a difference. If the adoption of high-level scripting languages has taught us anything, it is that people are willing to sacrifice performance for ease of use. While you could spend time bit-twiddling your C code trying to eke out a 0.5% improvement, you would do better to spend that time writing a good Ruby/Python/PHP plugin to interact with your datastore. In essence it all boils down to this:

While performance gets your foot in the door, usability makes the sale.

1 comment:

Puneet said...

Nice post. Not to be nitpicky, but I think this RWW article (http://www.readwriteweb.com/enterprise/2009/02/is-the-relational-database-doomed.php), which probably made NoSQL mainstream, deserves a reference.

Also, on a side note, there are some strange problems with the comments field in Firefox 3.5. I couldn't paste anything into the text area (neither with Ctrl-V nor with Edit->Paste; Paste was disabled in the menu).
