I had the misfortune to use MongoDB at a previous job. The replication protocol wasn't atomic. You would find partial records that were never fixed in replicas. They claimed they fixed that in several releases, but never did. The right answer turned out to be to abandon MongoDB.
> Did any of you actually read the article? We are passing the Jepsen test suite and it was back in 2017 already. So, no, MongoDB is not losing anything if you know what you are doing.
https://twitter.com/MBeugnet/status/1253622755049734150?s=20
Can you imagine saying the phrase "if you know what you are doing," in public, to your users, as a DevRel person? Unbelievable.
- The system warns about unsafe usage at either compile time or runtime, and you ignore it at your peril.
- The system does not warn, but official documentation is consistently verbose about what is required for safety.
- Official documentation isn’t consistently helpful and can be downright dangerous, but the community picks up the slack.
- The company gaslights the community into believing it is possible for a non-core-team member to “know what they are doing” from one of the above levels when Jepsen provides written evidence that this is not true.
I’m fine with things that are the third level from the top. I like to live dangerously. But I don’t think anyone can look at that last level and say “people are giving informed consent to this.”
However I can _quite easily_ see how a non-native English speaker could use the phrase “if you know what you are doing” to mean “if you are careful”.
I'm much more concretely worried by a software design for which the authors (not hostile critics) consider "if you know what you are doing" an acceptable safety and quality standard for data integrity.
I imagine things are better now.
I pretty much refuse to deploy a new instance of it now, I've been burned too often.
As an intern at Shopify, I got an email from MongoDB asking us to switch. Shopify was 10 years old at the time. Several coworkers would also receive similar emails two years later (and some in between, of course).
I have a shirt from MemSQL that says "Friends don't let friends NoSQL" and I wear it proudly.
You’d be astounded how common it is at so-called “enterprise” startups. It blew my mind.
A lot of people simply never went through the LAMP stack days and have little/no experience with real databases like Postgres (or even MySQL). It’s disheartening.
But I'd think MongoDB the company's increasing revenue isn't totally related to the quality of MongoDB the database. In fact a lot of their products seem to be targeting the "I don't want to learn how to set it up and understand indexes" crowd.
For situations where you don't know the schema, or where schemas differ per record, Mongo is a great place to dump data.
It's also for data where you care about speed and don't care about losing some of it. Think sending back a game screen when the client moves and requires a redraw: depending on how fast the screen is changing, dropping a frame isn't the biggest deal.
Reporting was a little bit more difficult but somehow rewarding.
Are you sure?
"""
Curiously, MongoDB omitted any mention of these findings in their MongoDB and Jepsen page. Instead, that page discusses only passing results, makes no mention of read or write concern, buries the actual report in a footnote, and goes on to claim:
> MongoDB offers among the strongest data consistency, correctness, and safety guarantees of any database available today.
We encourage MongoDB to report Jepsen findings in context: while MongoDB did appear to offer per-document linearizability and causal consistency with the strongest settings, it also failed to offer those properties in most configurations.
"""
This is a really professional way to tell someone to stop their nonsense.
MongoDB explains that pretty well: https://www.mongodb.com/faq and https://docs.mongodb.com/manual/core/causal-consistency-read...
Postgres most certainly does fsync by default.
It's true, you can disable it, but there is a big warning about "may corrupt your database" in the config file.
Whatever failings MySQL or PostgreSQL may or may not have are not important at all here.
>>> I have to admit raising an eyebrow when I saw that web page. In that report, MongoDB lost data and violated causal by default. Somehow that became "among the strongest data consistency, correctness, and safety guarantees of any database available today"! <<<
It's not wrong, just misleading. Seems overblown given that most practitioners know how to read this kind of marketing speak.
So basically, whatever MongoDB was doing 10 years ago, they are continuing to do. They have not changed at all. Just a day or two ago there were a few people defending Mongo along the lines of: sure, in its early years Mongo wasn't the greatest, but it is now, and people should stop being hung up on the past.
The reason why people lost their trust with mongo wasn't technical, it was this.
* Mongo: I like things easy, even if easy is dangerous. I probably write Javascript exclusively
* MySQL: I don't like to rock the boat, and MySQL is available everywhere
* PostgreSQL: I'm not afraid of the command line
* H2: My company can't afford a database admin, so I embedded the database in our application (I have actually done this)
* SQLite: I'm either using SQLite as my app's file format, writing a smartphone app, or about to realize the difference between load-in-test and load-in-production
* RabbitMQ: I don't know what a database is
* Redis: I got tired of optimizing SQL queries
* Oracle: I'm being paid to sell you Oracle
Did I miss something huge?
Arguably the world's most popular database is Microsoft Excel.
If a customer's API was down, the event would go back on the queue with a header saying to retry it after some time. You can do some sort of incantation to specifically retrieve messages with a suitable header value, to find messages which are ready to retry. We used exponential backoff, capped at one day, because the API might be down for a week.
I didn't think of RabbitMQ as a database when I started that work, but it looked a lot like it by the time I finished.
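The retry-delay scheme described above (exponential backoff capped at one day) can be sketched in a few lines. The 30-second base interval here is an assumption for illustration; the original comment doesn't specify one:

```python
def retry_delay(attempt, base=30, cap=86400):
    """Delay in seconds before the next retry: base, 2*base, 4*base, ...
    capped at one day (86400s), since the API might be down for a week."""
    return min(base * (2 ** attempt), cap)

print(retry_delay(0))   # 30
print(retry_delay(3))   # 240
print(retry_delay(20))  # 86400 -- capped at one day
```

The message's header would carry the attempt count (or the computed retry-at timestamp), and the consumer skips messages whose retry time hasn't arrived yet.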
But also no, RabbitMQ and Kafka and the like are clearly message buses, and though they might technically qualify as a DB, it would be a poor descriptor.
It used to be that bargain basement shared-hosting providers would only give you a LAMP stack, so it was MySQL or nothing. But if you're on RDS, Postgres every time for my money.
I'd probably choose Postgres over MySQL for a new project just to have the improved JSON support, but there's upsides to MySQL too:
- Per-thread vs per-process connection handling
- Ease of getting replication running
- Ability to use alternate engines such as MyRocks
Oracle is great if and only if you have a use case that fits their strengths, you have an Oracle-specific DBA, and you do not care about the cost. I have been on teams where we met those criteria, and I genuinely had no complaints within that context.
Every time I need to work with an Oracle DB it costs me weeks of wasted time.
For a specific example, I was migrating a magazine customer to a new platform, and all of the Oracle dumps and reads would silently truncate long textfields... The "Oracle experts" couldn't figure it out, and I had to try 5 different tools before finally finding one that let me read the entire field (it was some flavor of JDBC or something). To me, that's bonkers behavior, and is just one of the reasons I've sworn them off as anything other than con artists.
I gotta say, as much as I hate it with a passion, and as often as it breaks for seemingly silly reasons (so many deadlocks), it's at least tolerable (even if I feel like Postgres is better by just about every metric).
I'm familiar with the variant, "InfoSec won't let us deploy a DB on the same host".
sqlite> create table foo (n int);
sqlite> insert into foo (n) values ('dave');
sqlite> select count(*) from foo where n = 'dave';
1

I can tell you this emphatically, as I spent 6 months trying to eke out performance with MySQL (5.6). PostgreSQL (9.4) handled the load much better, without me having to change memory allocators or do any kind of aggressive tuning of the OS.
MySQL has some kind of mutex lock that stalls all threads; it's not noticeable until you have 48 cores, 32 databases, and completely unconstrained I/O.
EDIT: it was PG 9.4 not 9.5
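The SQLite session a few comments up is easy to reproduce from Python's standard library. It shows SQLite's type affinity quietly storing a string in an INT column rather than rejecting it:

```python
import sqlite3

# Reproducing the sqlite> session above: a string inserted into an
# INT column is accepted, stored as text, and matched as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (n INT)")
conn.execute("INSERT INTO foo (n) VALUES ('dave')")

count, = conn.execute("SELECT count(*) FROM foo WHERE n = 'dave'").fetchone()
print(count)    # 1 -- the string went in without complaint

storage, = conn.execute("SELECT typeof(n) FROM foo").fetchone()
print(storage)  # text -- stored as text despite the INT column
```

(Recent SQLite versions offer STRICT tables that reject this, but the default behavior is what the session shows.)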
Logical replication or synchronous multimaster replication may meet your needs.
Almost none of this is remotely accurate, e.g. RabbitMQ isn't even a database.
It may be a good idea to take a break from the computer and find something less stressful to do.
We use it for a very specific use case and it's been perfect for us when we need raw speed over everything. Data loss is tolerable.
Edit: never mind, I think the other URL - http://jepsen.io/analyses/mongodb-4.2.6 - deserves a more technical thread, so will invite aphyr to repost it instead. It had a thread already (https://news.ycombinator.com/item?id=23191439) but despite getting a lot of upvotes, failed to make the front page (http://hnrankings.info/23191439/). I have no idea why—there were no moderation or other penalties on it. Sometimes HN's software produces weird effects as the firehose of content tries to make it through the tiny aperture of the frontpage.
I'd pay to watch Kyle screaming at people in the MongoDB offices, not that he screams or anything. Just a spectacular mental image: "IT'S NOT ATOMIC! IT COULDN'T SERIALIZE A DOG'S DINNER!"
The stock market wants to see the product as a competitor with Oracle, so demands all the certifications that say so. MongoDB marketing wants to be able to collect money as if the product were competitive. Many of the customers have management that would be embarrassed to spend that kind of money on a database that is not. And, ultimately, many of the applications do have durability requirements for some of the data.
So, MongoDB's engineers are pulled in one direction by actual (paying) users, and the opposite direction by the money people. It's not a good place to be. They have very competent engineers, but they have set themselves a problem that might not be solvable under their constraints, and that they might not be able to prove they have solved, if they did. Time spent on it does not address what most customers want to see progress on.
The syntax is very nice; I honestly think a lot of its early success came from ease of use.
Also, this isn't 2011. MongoDB is not a competitor to Oracle, and never really has been to anyone who knew that a document DB was not usable as a SQL one. Its real competitors are other SQL databases, e.g. Snowflake and Redshift.
It is possible there are still potential users not buying until they get that story. MDB wants those users.
People have told me that they have since changed, but the evidence is overwhelmingly and repeatedly against them.
They seem to have been successful on marketing alone. Or people care more about speed and ease of use than durability, and my assumptions about what people want in a database are just wrong.
I think it depends. One could say the same about Redis, but it's wildly successful and people love it.
The difference is now they are advertised. Redis makes no claims to be anything other than what it is - a fast in-memory database that has some persistence capability but isn't meant to be a long-term data store. MongoDB, on the other hand, made (and continues to make) claims about being comparable in atomicity and durability to traditional SQL databases (but magically much faster!) that haven't withstood scrutiny.
Keep in mind, too, that most data ain't worth much. It's one thing to entrust data of low value in MongoDB; another to store mission-critical data in it. I would look askew at leadership who didn't ask hard questions about storing data worth millions or billions of dollars in MongoDB without frequent snapshots -- and even then, the value mustn't be contingent on the 100% accuracy of said data.
It's easier to reason about systems if there are fewer things that require durability guarantees; ideally you want to be able to draw data flows that look like a tree instead of a graph.
I find that Redis fits great because it's perfect for a whole bunch of different temporal shared state needs, everything from sessions to partial results. I've also deployed things like Ehcache, MongoDB, and Memcached to fit these needs and found other tools such as Kafka or RabbitMQ to be great "glue".
Having the root of your important data be something "boring" like Postgres or MySQL (or even Oracle!) is just good risk management to me. I wouldn't want to trust Redis or MongoDB for important data because it adds to the things I have to worry about. It's "keeping your eggs in one basket" while making sure that basket is really well looked after.
If the service had lasted longer, scaled bigger, and the business it supported had been more successful, we might have ended up with a now-classic MongoDB to pg migration. That was always an acceptable outcome, and it would have not invalidated going with Mongo at the start.
I assume that you mean write once data. If you mean write only you might as well use /dev/null.
- they don’t know why, it was just the one they learned/heard about first
- there is a lot of tooling for it
A lot of them even knew about the limitations of MongoDB, but they still chose it. We concluded that other databases need to start prioritising usability; something few developer tools care about.
[1] https://supabase.io
I think 90% of the Mongo installs I've been exposed to were set up by people that were tired of fighting with Hibernate configurations and schema migrations.
It's also popular among people whose definition of "legacy software" is "that app I stopped working on after three months because I have something shiny and new."
But, if you need a traditional ACID database, the flexibility comes with punch-in-the-groin technical debt.
I absolutely agree it's been used by people who just don't want to write SQL queries, or used as a text search engine in place of something more appropriate like ElasticSearch, but to mock successful projects that were based on it seems silly. It reminds me of interviewing candidates at a startup that primarily used PHP/MySQL. Most of them openly laughed and called it all horrible. I voted "no" on them, and sometimes injected a somewhat toxic "ah, you're right - we should close up shop. Someone call Facebook - tell them their tech stack is horrible - shut it all down!".
You can learn a lot about a developer by asking "What do you think about Mongo, JavaScript, or PHP", and if their response isn't a shrug, they're probably more concerned with what editor is correct than if the product they're building is useful. It's an exceptional filter to reject zealots and find pragmatists.
All that said, MariaDB with MyRocks is _awesome_, but certainly not with the default settings :)
Sure, if they’re being rude about it. A developer saying that it will not fit the use case, or talking about spending a month of their time fixing a production issue caused by MongoDB, will definitely not get a “no” from me. I’m not hiring subservient people; I’m hiring people who can think for themselves and choose the right tool for the job, which Mongo rarely is.
It's a shame that Rethink did so many things right and failed as a company while Mongo continues to do almost everything wrong as a company and still gets business.
This seems to be more the rule than the exception, doesn't it?
It's not even that hard to come up with explanations for this, the main one certainly being that popularity depends essentially on simplicity.
And simplicity might not even be as economically inept as we would like it to be. Indeed, since only a small minority of all the systems that are designed reach production and stay there for long, it can make sense to use the quickest piece of junk available, at least until it's proven that it will stick.
Easy access to changelogs should be a standard feature in all databases. Event-driven systems aren't rare: the data store needs to be able to tell interested parties that the underlying data has changed.
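As a toy sketch of that idea: a store that notifies subscribers on every write. This is illustrative only; real databases expose the same capability as replication logs, change streams, or triggers:

```python
# Minimal "changelog" sketch: interested parties subscribe a callback
# and are told about every write to the store.
class ObservableStore:
    def __init__(self):
        self._data = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def put(self, key, value):
        self._data[key] = value
        for cb in self._subscribers:
            cb(key, value)  # notify each subscriber of the change

events = []
store = ObservableStore()
store.subscribe(lambda k, v: events.append((k, v)))
store.put("user:1", {"name": "ada"})
print(events)  # [('user:1', {'name': 'ada'})]
```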
"MongoDB’s default level of write concern was (and remains) acknowledgement by a single node, which means MongoDB may lose data by default.
...Similarly, MongoDB’s default level of read concern allows aborted reads: readers can observe state that is not fully committed, and could be discarded in the future. As the read isolation consistency docs note, “Read uncommitted is the default isolation level”.
We found that due to these weak defaults, MongoDB’s causal sessions did not preserve causal consistency by default: users needed to specify both write and read concern majority (or higher) to actually get causal consistency. MongoDB closed the issue, saying it was working as designed"
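Per the quoted report, the stronger guarantees are opt-in: clients must explicitly request majority write and read concern. As a small server-free illustration (parsing only; the host and database names are placeholders), these are the connection-string options a client would need to set:

```python
from urllib.parse import urlparse, parse_qs

# Opting into the stronger settings the report describes, via standard
# MongoDB URI options. Host and database names are placeholders.
uri = "mongodb://db.example.com/app?w=majority&readConcernLevel=majority"
opts = parse_qs(urlparse(uri).query)

print(opts["w"][0])                 # majority
print(opts["readConcernLevel"][0])  # majority
```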
What do I use in this situation:
1) I need to store 100,000,000+ json files in a database
2) query the data in these json files
3) json files come from thousands upon thousands of different sources, each with their own drastically different "schema"
4) constantly adding more json files from constantly new sources
5) no time to figure out the schema prior to adding into the database
6) don't care if a json file is lost once in awhile
7) only 1 table, no relational tables needed
8) easy replication and sharding across servers sought after
9) don't actually require json, so long as data can be easily mapped from json to database format and back
10) can self host, no cloud only lock-in
Recommendations?
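A common answer to requirements like these is Postgres with a JSONB column and GIN indexes. As a self-contained sketch of the same dump-then-query pattern, here it is with SQLite's JSON functions (illustrative only; at 100,000,000+ documents with sharding requirements you would want Postgres or a dedicated store, not SQLite):

```python
import sqlite3

# One table, one JSON column, no upfront schema: documents with
# drastically different shapes coexist, and are queried by path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [('{"source": "sensor-a", "temp": 21.5}',),
     ('{"source": "sensor-b", "humidity": 40}',)],  # different "schemas"
)

rows = conn.execute(
    "SELECT json_extract(body, '$.temp') FROM docs "
    "WHERE json_extract(body, '$.source') = 'sensor-a'"
).fetchall()
print(rows)  # [(21.5,)]
```

In Postgres the equivalent would be a `jsonb` column with `->>` path operators and a GIN index to make those lookups fast.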
Depends on what your queries look like, I guess.
Ironically, because Mongo was such a pain to work with, I once dumped the data from it into ES to get the better API, usability, and Kibana.
That sounds like a valid redress, or am I missing something?
Basically, there are a large number of pitfalls that it's very easy to fall into unless you have an encyclopaedic knowledge of the documentation, and you need to ignore some of the words that are used (like "transaction" or "ACID") because they carry connotations that either do not apply or only apply if you do extra work to make it so.
In Mongo's defense, the defaults are similar to what you would likely have with a replicated MySQL/Postgres cluster (a single node accepting writes, with slaves replicating from there; no concept of write concern). My assumption here is that he means the primary dies before the writes have replicated to the secondaries; that is exactly how master-slave replication fails too. Maybe there are systems folks can use to get write-concern semantics in those databases, but in the companies I've worked for we didn't have them, and we definitely didn't have automated failovers.
Is the argument that Mongo’s documentation isn’t clear?
I'm glad my gut instinct was correct and that it really wasn't worth the hype. It reminds me of Ruby on Rails.
Regardless of technical acumen, I believe RoR doesn't deserve to be compared to Mongo for one reason: the RoR developers never tried to gaslight their users into thinking they're the reason everything broke; they never said only "if you know what you're doing" can you avoid these hidden pitfalls.
If you set w: majority and read concern linearizable/snapshot on the collection, the client, and transactions, and assuming you accept snapshot isolation, how bad are those remaining cases in reality, and how do these issues compare to other databases? The final "read your future writes" error looks quite scary and does not seem to be caused by configuration error; same with "duplicate effects".
- Dwight Merriman, former CEO, and "one of the original authors of MongoDB" [1]
A word to the wise suffices. Sometimes the word in question is implied by other words.
For those who get this oblique post, note that throwing the above bon mot into an interview session for a "distributed systems engineer" and asking for an opinion is an excellent way to differentiate between Peter Principle and Principal Engineer.
[1]: https://web.archive.org/web/20100903213540/http://blog.mongo...
[1] https://community.ui.com/questions/MongoDB-corrupt-after-eve...
https://jepsen.io/analyses/mongodb-4.2.6
... and the corresponding HN thread here:
Data point: I have been running my production system (a fairly complex SaaS) on RethinkDB for the last 4 years.
From my point of view, RethinkDB is not regularly developed and improved. There is progress, but it's slow. Which is a pity, because it's a really good database, and one that tries really hard to be correct above all else.
The only other correct distributed database with strict serializable guarantees that I know of is FoundationDB, which is nowhere near as easy to use as RethinkDB (but it's somewhat easier with their document layer, which pretends to be MongoDB, just done right).
https://news.ycombinator.com/item?id=23253870
(not Mongo obviously)
To repeat my (non)answer:
There is no way to recommend a NoSQL database without knowing what you need it for, because NoSQL databases are highly specialized systems. If you need a general-purpose database, use an SQL one.
It's kind of a weird question, now that I think about it. Why would anyone seek out a database based on what it doesn't have?
If you're starting from just "I need to store some data" I'd look to e.g. Riak or Cassandra before looking to an SQL database.
And the recent change to a restrictive license is worrisome as well. I have been thinking of forking 3.4 and making it "true" open source again, with awesome performance. (If any C++ devs want to help out, reach out to me! username @gmail.com)
Mongo should never be a first choice, but a last choice for edge cases.
Yes, that's the thing, it's just a field type. It's not really that different from dumping your JSON in a TEXT column. MongoDB is fun because it's truly JSON - BSON - so you don't have to run migrations, you can store complex documents, and you have a more object-oriented way of storing your data than SQL.
It's a nice goal but there's likely not much of a commercial market for it, if that's your roadmap.
Please do; someone needs to take that first step, and then many more could potentially contribute.
This corruption is brought on by the stock market.
Have a look also at Shopify. They go and tack on 2% fees when customers use Google Pay or Apple Pay to check out. They recently announced that FB would be pulling ecom sales into its app, and yet Shopify plans to charge 2% on top of FB's fees. That's what I could gather, despite the pricing being rather opaque.
Is this a step forward or backwards? Charging 2% / transaction for modern Internet protocols running on cheap hardware across a public network?
</rant>
https://hackingdistributed.com/2013/01/29/mongo-ft/
MongoDB: Broken By Design
And most of those listed in the blog were fixed many years before 2013.
Well, the warrior has lower upkeep costs. Keep that in mind.
The only thing similar about the two is that they both store data and have the letter D in their name. Otherwise they are completely different: Cassandra is a BigTable-style database and MongoDB a document one.
I hope this is a joke.
Really? Which job do you believe needs a "maybe store some of this data, sometimes" kind of database?
For example, climate data gathered from hundreds of thousands of devices every minute can very much survive some data loss. Or some astronomical observation data.
I wouldn't choose mongoDB for it, though.