The point is to use a single word that people can rally behind to be disruptive. In my world, NoSQL is a movement about using the right tool (especially open source) for the job. It is a movement about building these tools to the point where they can be consumed and put into production quickly.
There is no value in being precise (in marketing) since then we are just cats that need to be managed. By being imprecise and using a bullshit marketing term, we gain collective market power.
I mean, you wouldn't want to say "database management systems that differ from classic relational database management systems in some way, which may not require fixed table schemas, and usually avoid join operations and typically scale horizontally" in every third sentence, would you?
Now, that is marketable.
Risk, reliability, performance expectations, etc. I don't know if I've ever met a project of any significant complexity that would have been more of a success in any area, deployed any quicker, or developed any easier by not using a traditional RDBMS.
There are those projects out there, I just don't have any personal experience with them. But what's more, I seriously doubt 99% of people using NoSQL solutions do either.
Databases cross a funny line. There's language, syntax, a lot going on under the covers, and complex technical decisions and trade-offs at every turn to maintain durability and performance on modern hardware.
Sure, I can set work_mem in Postgres to 1GB and sort this query in-memory like a king, but then I'm (worst-case scenario) taking from my 20-connection server the 20GB it might otherwise have used for cache, which would have sped up every other query.
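To put rough numbers on that trade-off, here is a minimal sketch in Python. The function names and the 32GB total RAM figure are made up for illustration. It assumes every connection runs exactly one sort that uses its full work_mem at the same moment, which is the worst case described above; a single query can actually run several sort or hash operations, each with its own work_mem allowance, so the real ceiling can be higher.

```python
def worst_case_sort_memory_gb(work_mem_gb: float, connections: int,
                              sorts_per_connection: int = 1) -> float:
    """Upper bound on memory claimed by sorts if every connection sorts at once.

    sorts_per_connection is an illustrative knob, not a Postgres setting.
    """
    return work_mem_gb * connections * sorts_per_connection


def memory_left_for_cache_gb(total_ram_gb: float, work_mem_gb: float,
                             connections: int,
                             sorts_per_connection: int = 1) -> float:
    """RAM left over for caching after the worst-case sort memory is set aside."""
    claimed = worst_case_sort_memory_gb(work_mem_gb, connections,
                                        sorts_per_connection)
    return max(0.0, total_ram_gb - claimed)


# The scenario from the comment: work_mem = 1GB, 20 connections.
claimed = worst_case_sort_memory_gb(work_mem_gb=1, connections=20)

# Hypothetical 32GB server: what remains for cache in the worst case.
left = memory_left_for_cache_gb(total_ram_gb=32, work_mem_gb=1, connections=20)

print(f"worst-case sort memory: {claimed}GB, left for cache: {left}GB")
```

The point is only that the cost scales with connection count, so a setting that looks harmless for one query is a server-wide decision.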
I can imagine issues are as simple as having the right LRU in front of my data, or I can guarantee durability and achieve really amazing performance if I'm willing to invest myself into learning why particular design decisions, made over years by some genuinely brilliant people, were made.
Truth is, just to use one of these systems, you mostly don't have to care to meet your own requirements. And I think people take that for granted. When performance requirements are no longer being met, it's easy to blame the tool (the RDBMS), but I think a much safer assumption is to blame the developer and hit the books. The PostgreSQL 9.0 Administration Cookbook and PostgreSQL 9.0 High Performance books are surprisingly good reads. It doesn't take all that much personal effort, really. Certainly less than switching tool-chains.
This is one of those comments that you write, that goes off-track, and that you then decide to just delete. Well, I'm going to post it, damn it. :-)
Are you an engineer? I ask because I find it very hard to believe an engineer would write that.
I'm thinking about how to rewrite that. My aim was to say that when we communicate with non-engineers, we should be imprecise and more sales-pitchy.
That is, when we need to sell our ideas. It doesn't help us to build an awesome taxonomy when we need to sell the right tool. More often than not, when we try to get precise with non-engineers, it owns us. It creates the need for a manager to organize the cats.
If we were precise, then we would have 10+ movements: a "column store" movement, a document movement, a map reduce movement.
Think of it as a shallow search. When we start a conversation, I first want you to be overview-y and high-level. Then I will (or you will) steer the conversation into the direction where more detail is required.
Wouldn't want a conversation polluted with unnecessary detail. It would drag on forever and bore even the most engineering of minds ... but damn it would be precise!
It may seem like a fad to most people, but it is still the easiest, shortest way to describe a group of aesthetics, use cases, or features in only one word. Think of it this way: We do it all the time with races, skin colors, and religions. White, Black, Muslims, Christians, Latinos, Asians... RDBMS, and NoSQL...
How about "In this case, the cost of being precise outweighs the benefits"?
NoSQL is a phrase that means "use the right tool for the job" instead of just throwing endless amounts of money at low-risk, but 100x (1000x?) more expensive solutions that will eventually fail at internet scale load anyways.
NoSQL means taking on more of the engineering risk, rather than shifting it to your vendor.
Grouping Cassandra, CouchDB, MongoDB, Redis, etc. is the same as grouping MySQL with git - they all store data in one way or another, but the data they store and the way they store it varies wildly.
It's also sad, because all these "alternative" databases offer a lot of features that RDBMSs don't, and they should be promoted on those merits instead of being stuck under an "umbrella" term to protest against SQL databases.
In contrast, thanks to the NoSQL movement and the exploration of alternative models that it has encouraged, the quality of discussions is extremely different today. More and more engineers are aware of the benefits and issues with various models and are much more open to alternatives. So when I say that I am extremely thankful to whoever coined the damn term and for efforts like NoSQL Summer, I really mean it. It has really improved my quality of life.
Now I get that "NoSQL" isn't a 100% accurate term. But what marketing term ever is? Take something like AJAX — most "AJAX" apps have never dealt with XML, yet the term has been extremely useful. It helped solidify a broad-based effort to explore using JavaScript and thanks to the "AJAX movement" of a few years ago, we now have XHR in all modern browsers and awesome libraries like jQuery!
The real issue as I see it is that projects are keen to differentiate and are thus reacting to being lumped together with extremely different systems. Now no-one who understands the technologies is ever going to compare the likes of Redis, CouchDB, Neo4j, Cassandra and Hadoop as equivalents, but it is understandable that projects are afraid of being considered equivalent by those who are simply choosing a NoSQL system for their project without understanding the differences.
This leads on to another issue: a leader (or two) often emerges once a new domain has been established, and the "smaller" projects are cautious of being sidelined by "big boys" like Cassandra/MongoDB/Hadoop. To continue with the AJAX example, in the early days there used to be a whole bunch of options regarding JavaScript libraries: Prototype, jsolait, MochiKit, MooTools, jQuery, Dojo, etc. In contrast, nowadays, jQuery is the default choice for the vast majority of developers. I don't think that there is such a clear winner in the NoSQL field yet. In fact, given the massive fragmentation, we probably haven't even heard of the final winner yet!
That is not to say that the concerns of the various projects aren't totally valid. But the issue is not with the "NoSQL" term but rather with differentiation and understanding — both of which can only be solved by better communication. Phrases like "Online Request Processing Systems (ORPS)" or even "Alternative Datastores" aren't exactly catchy marketing terms. NoSQL may not be perfect, but it's here and it's more than good enough. So can we please stop bashing it and focus on coming up with clearer differentiators? Thanks!
I'm confused by his five examples for "1.2.3. Access Path Dependence" (page 378), where an app would fail if the data representation changed, because I think an app using a relational store would also fail if the relations were organized differently. I can see some possible resolutions, but the paper doesn't address the issue...
I concede that it's hard to assess a proposed approach when it doesn't actually exist yet; but I think that if you raise an issue with the existing approaches in a paper, it's reasonable to also assess your own proposal with respect to that issue.
E.g. maybe he imagined automatic views to convert the underlying relations (so that different relations are identical if they represent the same information...); or a manual conversion layer with views (but the same could be done for the other store!); or maybe he was only thinking of different physical representations when he wrote that part and it didn't occur to him that different relations might also be used.
EDIT http://www.aisintl.com/case/library/Date_Birth%20of%20the%20...
I think he's saying that while the relational model has the same problem of retaining compatibility for old apps when it evolves, it is *easier* to do this with the relational model, i.e. the "number of access paths" for old apps becomes "excessively large" for non-relational models. He talks a bit about the complexity of representing different queries later on, but somewhat obliquely, and doesn't draw the connection (and I don't quite follow what he means in the second last paragraph of section 1.5, where he mentions n!, 2n-1 and n+1 - I understand it so little that I think there might be a typo).
Ah! He seems to have addressed it more directly in a previous, less-cited IBM-only paper from 1969... to which I happen to have a link right here: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.173.5...
No it's not. It's a typical human response to needing to give names to groups of items. Collective nouns, even misjudged ones, bring concepts to light and allow us to discuss them collectively without lengthy explanations. The elements behind AJAX existed before the term was coined, but its coining gave everyone a single point to discuss and support and its usage exploded rapidly after its coining.
RadioLab's "Words" show - http://www.radiolab.org/2010/aug/09/ - goes into the value of words as a way to represent concepts and feelings and, specifically, how those things fail to exist without the words to define them.
Ouch. Nothing hurts your credibility quite like an egregious spelling mistake just as your rant is taxiing onto the runway.