Around thirteen years ago, I helped build a prototype for a custom CRM system that ran on an object database (ObjectStore). The idea isn’t quite as crazy as it sounds. The data was extremely hierarchical with parent companies and subsidiaries and divisions and then people assigned to the individual divisions. It was the kind of data model where nearly every query had several recursive joins and there were concerns about performance. Also, the team was really curious about object databases so it was a pretty cool project.
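To make the shape of that data concrete, here is a minimal Python sketch. It is not the ObjectStore API, and every name in it (`OrgUnit`, `Person`, `all_people`, the sample companies) is made up for illustration. It only shows why an object graph felt natural: collecting everyone under a parent company is a plain recursive walk, where a relational schema would need a recursive join.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str


@dataclass
class OrgUnit:
    """A company, subsidiary, or division (hypothetical model)."""
    name: str
    people: list = field(default_factory=list)
    children: list = field(default_factory=list)


def all_people(unit):
    """Yield every person in this unit and in all units beneath it."""
    yield from unit.people
    for child in unit.children:
        yield from all_people(child)


# Sample hierarchy: parent company -> subsidiaries -> division -> people.
acme = OrgUnit("Acme Holdings", children=[
    OrgUnit("Acme Europe", children=[
        OrgUnit("Sales", people=[Person("Ana"), Person("Ben")]),
    ]),
    OrgUnit("Acme Labs", people=[Person("Chen")]),
])

names = [p.name for p in all_people(acme)]
print(names)
```

In an object database the references between units are stored as they are, so a traversal like `all_people` follows pointers instead of joining tables.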
One thing that I learned during that project is that (at least back then) the object database market was doomed. The problem was that when you said “database,” people heard “tables of information.” When you said “data” people wanted to bring the database administrator (DBA) into the discussion. An object database, which has no tables and was alien to most DBAs, broke those two key assumptions and created an atmosphere of fear, uncertainty and doubt. The DBA, who built a career on SQL, didn’t want to be responsible for something unfamiliar. The ObjectStore sales guy told me that he was only successful when the internal object database champion positioned the product as a “permanent object cache” rather than a database. By hiding the word “data,” projects were able to fly under the DBA radar.
Fast forward to the present and it feels like the same conflict is happening over NoSQL databases. All the same dynamics seem to be here. Programmers love the idea of breaking out of old-fashioned tables for their non-tabular data. Programmers also like the idea of data that is as distributed as their applications are. Many DBAs are fearful of the technology. Will this marginalize their skills? Will they be on the hook when the thing blows up?
I don’t know if NoSQL databases will suffer the same fate as object databases did back in the 90s, but the landscape seems to have shifted since then. The biggest change is that DBAs are less powerful than they used to be. It used to be that if you were working on any application that was even remotely related to data, you had to have at least a slice of the DBA’s time allocated to your project. Now, unless the application/business is very data-centric (like accounting, ERP, CRM, etc.), there may not even be a DBA in the picture. This trend is a result of two innovations. The first is object-relational mapping (ORM) technology, where schemas and queries are automatically generated based on the code that the programmer writes. With ORM, you work in an object model and the data model follows. This takes the data model out of the DBA’s hands. The second innovation is cheap databases. When databases were expensive, they were centrally managed and tightly controlled. To get access to a database, you needed to involve the database group. Now, with free databases, the database becomes just another component in the application. The database group doesn’t get involved.
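The ORM point can be illustrated with a toy sketch in Python, using only the standard library and an in-memory SQLite database. This is not how any real ORM is implemented; `Contact`, `SQL_TYPES`, `create_table_sql`, and `insert` are hypothetical names chosen for the example. The point is only that the table definition and the insert statement are derived from the class the programmer wrote, so nobody hands a schema to a DBA.

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Contact:  # hypothetical domain class
    id: int
    name: str
    division: str


def create_table_sql(cls):
    """Derive a CREATE TABLE statement from the dataclass fields."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__.lower()} ({cols})"


def insert(conn, obj):
    """Generate and run an INSERT for any dataclass instance."""
    names = [f.name for f in fields(obj)]
    placeholders = ", ".join("?" for _ in names)
    table = type(obj).__name__.lower()
    sql = f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})"
    conn.execute(sql, [getattr(obj, n) for n in names])


conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(Contact))
insert(conn, Contact(1, "Ada", "Research"))
rows = conn.execute("SELECT name, division FROM contact").fetchall()
print(rows)
```

Adding a field to `Contact` changes the generated schema automatically, which is the sense in which the data model follows the object model.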
Now that the database is a decision made by the programmer, I think non-relational databases have a better chance of adoption. Writing non-SQL queries to modify data is less daunting for a programmer who is accustomed to working in different programming languages. Still, the programmer needs good tools to browse and modify data because he doesn’t want to write code for everything. Successful NoSQL databases will have administration tools. The JCR has the JCR Explorer. CMIS has a cool Adobe AIR-based explorer. Both JCR and CMIS are repository standards that sit above a (relational or non-relational) database, but these tools were critical for their adoption. CouchDB has an administration client called Futon, but most of the other NoSQL databases just support an API. You also want to have the data accessible to reporting and business intelligence tools. I think that a proliferation of administration/inspection/reporting tools will be a good signal that NoSQL is taking off.
Another potential advantage is the trend toward distributed applications which breaks the model of having a centralized database service. Oracle spent so much marketing force building up their database as being the centralized information repository to rule the enterprise. In this world of distributed services talking through open APIs, that monolithic image looks primitive. What is more important is minimal latency, fault tolerance, and the ability to scale to very large data sets. A large centralized (and generalized) resource is at a disadvantage along all three of these dimensions. When you start talking about lots of independent databases, the homogeneity of data persistence becomes less of a concern. It’s not like you are going to be integrating these services with SQL. If you did, your integration would be very brittle because these agilely-developed services are in a constant state of evolution. You just need to have strong, stable APIs to push and pull data in the necessary formats.
The geeky programmer in me (that loved working on that CRM project) is rooting for NoSQL databases. The recovering DBA in me cringes at the thought of battling data corruption with inferior, unfamiliar tools. In a perfect world, there will be room for both technologies: relational databases for relational data that needs to be centrally managed as an enterprise asset; NoSQL databases for data that doesn’t naturally fit into a relational database schema or has volumes that would strain traditional database technology.
Yes, as a CouchDB contributor, you can understand that I don’t see NoSQL databases suffering the same fate, for a lot of reasons.
First of all, most NoSQL databases offer more than just an API: they offer a REST interface, which is language-agnostic by definition.
Also the conflict you mentioned here is not really valid anymore. Again, don’t focus on the API but on everything else. Data can be retrieved in different ways.
Spoken like a true programmer, Nicolas! I don’t know if you caught the link to the cartoon, but you just told Bob to go %^$% himself.
Take a look also at Riak
This is another great db!
And Plone has been using an object database for the past 10 years.
Oh Darn. Limi beat me to it.
I know some people who prefer to run MySQL under their Zope stack, but ZODB remains the default for most people using those tools.
Seth,
I noticed the same thing when we started ‘selling’ Zope systems into enterprises a number of years ago. As soon as you mentioned the word ‘database’ the client inevitably said ‘Ahh yes, we’ll get our DBAs to take a look at it’. They then look at it and get all confused when you explain to them that databases cover a wider field than just SQL.
“But it stores all your data in a big file!”… well where do you think Oracle stores its data?
In the end I learnt the easiest thing was to just not use the word ‘database’ at all and just say it has a ‘high performance transactional object store’.
Whilst NoSQL is nothing new, as you point out, I guess what it has done is get people thinking outside the relational model.
-Matt
Nothing new, well yes and no:
Yes, most of those databases use the same kind of backend, such as a B+ tree, or even the exact same MySQL backend (InnoDB and such), but no, because the big change is how you interact with it (and obviously how the data is organized).
I like the idea of NoSQL — I think it’s exciting and interesting and I hope the entire space succeeds in spades.
But in my experience, projects that would derive actual, practical benefit from a NoSQL approach are few and far between. A simple SQL database will serve the needs of 95% of the projects I come across perfectly well.
object database != document oriented database
While I get the larger point, I don’t see the equivalence. NoSQL “databases” are key-value stores. ORM has been moved out of the equation and into code (where it probably belongs) instead of being hard-wired into an object database. But the real difference for me is in how a company winds up with a NoSQL solution vs how you wound up with an object database (in my experience):
I’m probably going to implement a NoSQL solution for a small section of one of my current projects because it makes sense there: the data isn’t very relational, there’s a lot of it and trying to use a relational database is creating a performance issue. As such, I’ll install a couple of options locally at 0 cost, play with them, see if there’s a pre-built Django solution and implement the one that works best. It’ll get implemented because it makes sense and the only cost is my time.
When I worked at Fair-sized Consultancy in the past, we looked at an object database (it might have been ObjectStore). I knew we were wasting our time when they sent 10 people and gave us a free hard-cover textbook. The upfront cost for the software alone was so high that the only way it would ever have come into our company is if it were championed by one of the owners, not one of the programmers. The cost/licensing model didn’t make sense for 99.99% of projects, so there was no traction in the blogosphere, people didn’t see a reason to build tools or harnesses for them, etc.
Thanks for the comment, Tom. The only parallel that I was referring to is that once again we are talking about working with data in a repository that is not familiar to DBAs. I brought up ORMs to raise the point that DBAs are a lot less powerful now than they were back in the 1990s.
BTW, I have a couple of ideas for using a NoSQL database as well.
Sorry if I sounded harsh, I totally get what you mean. My only concern with the shift of power to developers is that every new toy gets used whether it’s applicable or not (“For a man with a hammer, every problem looks like a nail.”), and I wonder how many people are solving problems that RDBMSs are good at with a NoSQL store, just because.