Someone clearly forgot to tell the MongoDB crowd that they lost. Ever since an anonymous poster on HackerNews called the MongoDB baby "ugly", I've been watching to see if MongoDB's early rise would taper off and fall. After all, my own company, Nodeable, has had to switch from MongoDB to Cassandra due to some significant …
In other words, grok your data
We are using it and have no complaints. Our data is such that a traditional relational database isn't the right tool, and Mongo has been fine. We'll see how the scaling goes, but right now we have other bottlenecks, so any Mongo issues are going to be in the future. Fortunately we've architected things so that switching DBs is going to cause minimal (though not minimum) fuss.
Mine's the one with my Oracle certifications in the pocket.
Liked it a lot, but our application requires a lot of ad hoc queries over large datasets and MongoDB's query planner can only use one index per query. It does not support using multiple indexes combined with a temporary bitmap array like a lot of SQL databases that have been around the block for a decade.
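For readers unfamiliar with the technique mentioned above, here is a minimal Python sketch of combining two single-field indexes through a temporary bitmap. It is an illustration only, not how MongoDB or any particular SQL engine implements it: the sample rows, the field names, and the helpers `build_bitmap_index` and `rows_from_bitmap` are all hypothetical, and plain Python integers stand in for the bitmaps.

```python
from collections import defaultdict


def build_bitmap_index(rows, field):
    """Map each distinct value of `field` to an int whose set bits mark matching row positions."""
    index = defaultdict(int)
    for pos, row in enumerate(rows):
        index[row[field]] |= 1 << pos
    return index


def rows_from_bitmap(rows, bitmap):
    """Return the rows whose position bit is set in `bitmap`."""
    return [row for pos, row in enumerate(rows) if (bitmap >> pos) & 1]


# Hypothetical sample data.
rows = [
    {"id": 1, "country": "NL", "status": "active"},
    {"id": 2, "country": "NL", "status": "closed"},
    {"id": 3, "country": "US", "status": "active"},
    {"id": 4, "country": "NL", "status": "active"},
]

# Two independent single-field indexes.
by_country = build_bitmap_index(rows, "country")
by_status = build_bitmap_index(rows, "status")

# Ad hoc query: country = 'NL' AND status = 'active'.
# Each index yields a bitmap; ANDing them keeps only rows matching both predicates.
combined = by_country["NL"] & by_status["active"]
matches = rows_from_bitmap(rows, combined)

print([row["id"] for row in matches])
```

The point of the design is that neither index needs to know about the other: any pair of predicates can be answered by intersecting bitmaps at query time, which is what makes it attractive for ad hoc queries.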
MongoDB, "ugly"? Part of what attracted me to it was the excellent and clear documentation, unlike Hadoop and Cassandra which are (or were) much messier and ill-defined at the time. As a result it was also much easier to break into than its contemporaries.
Thanks for bringing this to my attention.
I was happily unaware of the spat, and after reading this I still don't see why it matters. To wit:
"Unfortunately for most developers, your application will never get to the size where picking one database over the other will even matter."
This is what enables, say, mysql to retain its popularity. It acts as a glorified concurrent file access method with record granularity for most uses. It's got fanbois all over*, though anyone with half a clue about database theory and who values their data knows to stay well away from it, as well as a number of other offerings. Yet apparently most people have the luxury to remain ignorant, become fanbois, and wage a jolly good holy war or two about their preferences.
Once you understand what the real issues are, the argument becomes moot, much like the need to argue about it in the first place. And then you realise that what you've been busy with is properly called "wasting your time". Best revel in the luxury while you can, eh.
In that light the proliferation of "NoSQL" way, way beyond the niche where it really matters is quite remarkable, really.
* Like the repeated, heated, assertions that the latest versions now really "are fully ACID compliant", honest, with the storage engine now used "by default" for user data. The manual doesn't support this, only says that storage engine "follows ACID". Like fish following migrating caribou. Look, you, as long as you "parse but ignore" constraints clauses, we really don't need to bother looking into "fully compliant" claims at all. It's just not there. Now run along and play with your toys, there's a good lad. At least the NoSQL crowd doesn't go there and claim that, and if speed really isn't going to matter, which indeed it mostly isn't**, then I'll pick ACID and boring old solid engineering. Yes, you may now git off me lawn, thank you.
** And where it does a little effort actually designing the database layout, or even just tinkering with the silliest, slowest queries and adding a few indices goes quite the long way. But that requires understanding the subject matter, or at least skimming an O'Reilly cookbook.
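As a small illustration of the "adding a few indices" remark, the following sketch uses SQLite (chosen only because it ships with Python, not because it is one of the databases discussed here) to compare the query plan before and after creating an index. The `orders` table, its columns, the index name, and the `query_plan` helper are hypothetical. No expected output is shown because the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def query_plan(connection, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    plan_rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in plan_rows)


# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, QUERY, (42,))

# After adding an index on the filtered column, it can look rows up directly.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```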
Re: Thanks for bringing this to my attention.
You were doing well until you went on your anti-MySQL rant.
When colossal sites like YouTube and Facebook can run on MySQL, I really have trouble believing people who say that it isn't good enough to hold data for "anyone with half a clue about database theory and who values their data".
youtube and facebook do not care about ACID.
Facebook doesn't care about you losing a few wall updates. If you care, well, why don't you just post them again? YouTube cares more about Big Content[tm] farting than the "rights" of their freeloading users. Case in point: Complain-bots that get birdsong deleted as copyright infringement; the DMCA requirement that the actual holder of the rights needs to complain is a dead letter to them. You need no ACID to keep track of that sort of content, nor the other user contributions that come with it.
You wouldn't want to run payroll on those systems, but that doesn't stop millions of people happily using them for other purposes. My point was about the former, and your counter with the latter is frankly orthogonal. Being big is no guarantee for good practices. It's easier to argue that good practices are more expensive the bigger you get, especially if you have to introduce them after you've grown big.
So much so that it was easier and cheaper for facebook to hire a crack team of C coders and have them build a php-to-C translator to speed up running the steaming pile of php mess that is the core "IP" of the business, than it is to actually re-architect and re-factor it into something resembling "good practices".
There's plenty of people who know their stuff at big companies, even if it doesn't show for this reason or that. It's the "and who value their data" that's tripping you up: These molochs do not, in fact, care much about any individual's data. It just needs to not leak too much lest too many people take notice, is all.
Re: Thanks for bringing this to my attention.
"anyone with half a clue about database theory and who values their data"
Snobbery from about 10 years ago, now?
Re: Thanks for bringing this to my attention.
"It acts as a glorified concurrent file access method with record granularity for most uses."
I mean, this kind of opinionated FAIL has to be read to be believed.
Re: youtube and facebook do not care about ACID.
"Facebook doesn't care about you losing a few wall updates. If you care, well, why don't you just post them again?"
That is an assumption on your part, namely that A) they don't care and B) any lost wall updates are due to MySQL. Can you back that up?
Although, to be fair, I also assume that they do care about their data; others can judge which assumption is more likely.
"YouTube cares more about Big Content[tm] farting than the "rights" of their freeloading users. Case in point: Complain-bots that get birdsong deleted as copyright infringement; the DMCA requirement that the actual holder of the rights needs to complain is a dead letter to them."
You are totally going off-track here; this is about MySQL, not copyright enforcement.
"You need no ACID to keep track of that sort of content, nor the other user contributions that come with it." "You wouldn't want to run payroll on those systems,"
And yet financial companies do use MySQL for mission-critical applications. A single example, found via Google:
http://www.mysqlconf.com/mysql2009/public/schedule/detail/6235
Of course, you could just say that they do not know what they are doing, but then it is basically your opinion vs. theirs.
"but that doesn't stop millions of people happily using them for other purposes. My point was about the former, and your counter with the latter is frankly orthogonal."
Your original point, as far as I can read, was, in a nutshell, that 'MySQL sucks and so do all its fanbois' and that anyone using it does not have "half a clue about database theory and who values their data".
At no point did you offer a caveat that it was suitable for certain situations by anyone with "half a clue". That, combined with your use of language, is what made me say it was a rant rather than a reasoned discussion.
"Being big is no guarantee for good practices. It's easier to argue that good practices are more expensive the bigger you get, especially if you have to introduce them after you've grown big."
Very true; however, when 'big' fails, which you imply is an inevitability with MySQL, it usually makes headline news in the tech world and, to be honest, I cannot really remember any "Holy crap, $BIG_SITE irrecoverably mangled all its data and it was conclusively proved to be due to MySQL's crappiness" headline.
"So much so that it was easier and cheaper for facebook to hire a crack team of C coders and have them build a php-to-C translator to speed up running the steaming pile of php mess that is the core "IP" of the business, than it is to actually re-architect and re-factor it into something resembling "good practices"."
What has language choice got to do with best practices? Again, your prejudices (this time against PHP, which, to be honest, I rather share) are getting in the way of making a reasoned argument, and again you make the assumption that the facebook code is a "steaming pile of php mess". Unless you have worked there or seen large parts of the source code, you are in no position to comment.
"There's plenty of people who know their stuff at big companies, even if it doesn't show for this reason or that. It's the "and who value their data" that's tripping you up: These molochs do not, in fact, care much about any individual's data. It just needs to not leak too much lest too many people take notice, is all."
Again, an assumption that they do not care about their data (with another derogatory term thrown in for good measure), with no supporting facts.
And finally, I said "colossal sites like YouTube and Facebook". There are plenty of large companies out there that use MySQL in mission-critical situations (where I imagine ACID compliance is rather important).
If you posted some supporting evidence about why the latest version of MySQL sucks, you might actually convert some people. As it is, you just come off as someone ranting, hence my original reply.
Nothing like a good rant to liven up the commentard's day.
I didn't assume B, since there is no need: If A holds, then it doesn't matter where any loss occurs. Thus, there is no need to make sure it doesn't happen in the database backend, meaning you can make do with one that doesn't provide ACID, and thus mysql fits the bill. The out-of-control copyright enforcement is another example of the same: Their priorities aren't with caring for user data. Which was the point.
Example? Video gets deleted frivolously (and we know this happens regularly), account holder fights, amazingly wins, gets reinstated. Where are the comments? I think it was flickr where a couple of years' worth of discussion went irrevocably *poof* that way, quite recently.
The way you choose to read my wording of that, well, that's up to you. I don't particularly like mysql, and (heated, repeated) assertions that it can do things it doesn't actually provide I like even less. I did note that plenty of "NoSQL" software doesn't claim ACID (indeed, it tends to be quite up-front about *not* providing such for speed). You are, however, elevating a footnote to the main thrust, which isn't quite how I wrote it. You might see more if you read again, disregarding the footnote.
By the same token, if I didn't say that people who do know what they're doing can take most any tool and make it dance, well, guess you'll have to sue me now. Where I didn't put in fair defence of mysql, I didn't put in the attacks you imply I did either. The point was, rather, and quite explicitly so, that the requirements of most database use(r)s are such that they can use databases in well-known ill-advised ways and then feel like they're knowledgeable about databases.
Case in point? Weblogs and any other website with scripting language and database backing. Those databases store entire webpage templates, pictures, and all sorts of other things for the purpose of, well, serving all that back. There's no smarts required, it's all "glorified file access". You'd probably be better off running it all through the templater once at each update, then storing the results for serving later, many many times, on a file system. Which is what happens in a roundabout ("heroic") way on most bigger sites, as they snuck a cache, possibly together with a load balancer, in front of the webserving/scripting/database stack.
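A rough sketch of that "render once at each update, serve from the file system" idea, in Python. Everything here is a hypothetical stand-in: the `PAGE` template, the `publish` and `serve` helpers, and the temporary output directory are assumptions for illustration, not a description of how any particular weblog engine works.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page template; a real site would have something far richer.
PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def publish(post, out_dir):
    """Run the templater once, at update time, and store the result as a static file."""
    path = Path(out_dir) / f"{post['slug']}.html"
    path.write_text(
        PAGE.substitute(title=post["title"], body=post["body"]),
        encoding="utf-8",
    )
    return path


def serve(slug, out_dir):
    """Serving is plain file access: no templating and no database query per request."""
    return (Path(out_dir) / f"{slug}.html").read_text(encoding="utf-8")


out_dir = tempfile.mkdtemp()
publish({"slug": "hello", "title": "Hello", "body": "First post"}, out_dir)

# The same stored result can now be served many times over.
html = serve("hello", out_dir)
print(html)
```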
You see a lot of that with the most popular of them all, mysql, but as elsethread noted, not just with them. It just so happens that if you want a good one, you pick a different one. One that actually provides things like referential integrity. Wilfully turning that around and making it an attack on the software is missing the point, really. If it works for you, well, it works for you. But you probably aren't really requiring things like ACID as much as you think you do.
Likewise, what you take to be an attack on php isn't, not quite. It's an illustration of how starting on the wrong foot and letting it fester for too long might require engineering heroics to keep on trucking later. That piece of software is quite impressive. The fact that it was needed, and quite desperately so, is not at all, or at least not without Chinese connotations. That fact, of it being needed --you don't go there unless you have to; they had to and said so too-- explicitly to address scale problems, is really all the evidence you need to infer that the php code doesn't scale.
All in all a rather long counter that seems to hinge on your reading and your assumptions, even your imaginings. Well, I think I'll be content not being you. Converting people would in a sense be cruel and unusual punishment for it would deprive developers of opportunities to engage in pointless heroics (as, curiously, referred to in the linked content), even if it might save some poor sysadmin or ops guy some sanity points down the line. We all know sysadmins don't get to retain those points anyway. If you care, though, you may want to spend some time with db-class dot org. The basics of the theory and a bit of hands-on can be quite handy at times.
Jobs are booming?
Now, pardon my possible ignorance, but if jobs regarding those engines are booming then I'd say you can draw one of two conclusions. Either these environments are (becoming) popular, so more companies want to use them, /or/ several companies have already jumped on the bandwagon and found that the engine requires more maintenance than anticipated, so more people need to be hired.
So the way I see it this could go either way.
Re: Jobs are booming?
I am on the MongoDB mailing list and, to be honest, the number of frankly stupidly basic questions on there really makes me wonder about the people who are choosing to deploy it.
I beg clarification of your stats, Matt
The mongo blue line looks decidedly linear from Jan '09, but I'm not too bovvered by that; I'm more wondering at the percentage growth figures. Hundreds of thousands of percent I don't believe, unless you've started at a very low baseline indeed, i.e. from the original committers perhaps. Howsabout some actual absolute count of users? Where can that be found?
So what you're saying is....
So what you're saying is.... MongoDB is web scale?