Reply to Aphyr attack on Redis Sentinel

(antirez.com)

171 points · mattyb · 13y ago · 40 comments

brown9-2 13y ago

It's very refreshing to see here that "attack" is not used in the way that one might expect from just the headline, meaning "a possibly unwarranted criticism that I didn't like or found unfair, or that I am taking personally".

Legion 13y ago

I am endlessly impressed with how antirez responds to any critique of Redis that I've ever seen. He's always taken it as a positive, and looked for the truth in the critique, rather than searching for something wrong and trying to discredit the critique.

My opinion of him and the Redis project increases further every time.

danso 13y ago

Really? I hadn't seen the original posts before clicking on this one and I assumed this was some kind of security breach...I hadn't heard of Aphyr before but just assumed it was some kind of netsec (white or black hat) group. I actually skimmed the OP's first paragraphs several times because I didn't understand what was going on.

That said, I agree that DB reliability should be taken with the same rigor as net security...but I was kind of under the impression that it already was, in that DBs are pretty serious business. Also, "attack" has the connotation of, well, an "attack"...here, some of the failures happen in regular business operations, which is a problem different from when the system is under "attack".

But at least the OP took the criticism graciously. When I read what the case actually was, I then worried that the OP was having a bunker mentality.

pygy_ 13y ago

Tangentially related:

In the PostgreSQL evaluation[0], Aphyr noticed that, if a packet confirming a transaction is dropped, the client ends up in a deadlock.

Does PostgreSQL keep a record of past transactions and their success or failure? If so, is it possible to query it?

[0] http://aphyr.com/posts/282-call-me-maybe-postgres

aphyr 13y ago

Yes, you can recover from lost acknowledgements by asking for the transaction ID from postgres before committing--or by making up your own flake ID and writing it to a table. Given a queue with at-least-once delivery (which includes, say, durable storage on the client), you can check for the presence of that ID at a later time and re-apply the transaction to recover from network errors safely.

The transaction ID does wrap around, so there's a time limit depending on your transaction throughput. You can also ask for certain transactional properties on rows, though this won't allow you to recover in all (most?) cases.

fdr 13y ago

Database constraints usually catch these problems in the event of re-submission, especially if the client can assign primary keys (e.g., a UUIDv4) a priori, but this also tends to be true in simpler cases.

All in all, I am not sure if anyone should find this surprising: if anyone has ever had a network stall when clicking the 'confirm' button at a web-based store, they are familiar with the uncertainty as to whether the order has been submitted or not (resolved typically by browsing the history or waiting for an email, or no).

I would guess modern e-commerce vendors would send you a UUID or moral equivalent to de-dup cart resubmissions these days...but if not, it'd be interesting to know why not.
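As a rough illustration of a constraint catching a resubmission, here is a sketch with a client-assigned UUIDv4 primary key. SQLite (via Python's `sqlite3`) stands in for whatever database a store would really use, and the `orders` table and `submit_order` helper are made-up names.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, item TEXT NOT NULL)")

def submit_order(conn, order_id, item):
    """Insert an order; report 'duplicate' if this ID was already stored."""
    try:
        with conn:  # commit on success, roll back on constraint violation
            conn.execute(
                "INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item)
            )
        return "created"
    except sqlite3.IntegrityError:
        # The primary-key constraint rejected the second copy.
        return "duplicate"

order_id = str(uuid.uuid4())  # assigned by the client before the first submit
results = [
    submit_order(conn, order_id, "book"),
    submit_order(conn, order_id, "book"),  # user clicked 'confirm' again
]
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```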

aphyr 13y ago

Correct; if your writes are idempotent, retrying is safe. I cover this in the post as well. My above comment shows that it's possible to recover consistency even for writes which are not idempotent--though depending on the semantics of your retries, there may be some locking required.

pygy_ 13y ago

Hey! I didn't expect you to chime in right here.

Thanks for the explanation.

praptak 13y ago

Yet more tangentially related: it is an instance of the Byzantine Agreement problem, which is unsolvable in general: no finite protocol guarantees consistent state in the presence of packet loss.

aphyr 13y ago

Yep, FLP applies here--but if a network works long enough to complete a round eventually, e3PC or similar can succeed. Pretty much all real-world networks do that. :)

Glyptodon 13y ago

Redis is one of those things I both love and love to hate.

I've had good results using Redis as a lock server, but I live in (perhaps misplaced) fear of a client hanging or crashing leaving a lock stranded. Not that this is really Redis's problem.

antirez 13y ago

Hello, you can easily mount a lock that auto-releases itself after some timeout using the new (2.6.13) extended SET command (see http://redis.io/commands/set) or simply a Lua script.
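With the extended SET options, the Redis side of such a lock is a single command along the lines of `SET lock:job <random-token> NX PX 30000`: set only if the key does not exist, and expire it after 30000 ms. The sketch below models just those semantics in memory so it runs without a server. `FakeRedis` and `set_nx_px` are hypothetical names, not a real client API, and the key name and timeout are arbitrary.

```python
import time
import uuid

class FakeRedis:
    """Tiny in-memory model of `SET key value NX PX ms` plus GET."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expiry time in seconds)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # expire lazily on access
            entry = None
        return entry

    def set_nx_px(self, key, value, ttl_ms):
        """Set only if the key is absent; the key expires after ttl_ms."""
        if self._live(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl_ms / 1000.0)
        return True

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None

now = [0.0]  # fake clock so the example does not need to sleep
r = FakeRedis(clock=lambda: now[0])
token_a, token_b = str(uuid.uuid4()), str(uuid.uuid4())

got_a = r.set_nx_px("lock:job", token_a, 30_000)        # client A takes the lock
got_b = r.set_nx_px("lock:job", token_b, 30_000)        # client B is refused
now[0] = 31.0                                           # A crashed; 31 s pass
got_b_later = r.set_nx_px("lock:job", token_b, 30_000)  # lock has auto-released
holder = r.get("lock:job")
```

Storing a random token as the value lets a client check that it is releasing its own lock rather than one acquired by someone else after a timeout.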

Glyptodon 13y ago

Since the jobs we're locking can have somewhat inconsistent run times, we're actually using an implementation where tasks get a lock with a time limit and can extend it for as long as they still hold it, so the locks do potentially auto-release.

Even given this, bad lock timing (not that likely) or a crash (more likely) could let inconsistency in.

Shrugs

Like I said, my problem is not really Redis's. If I can't trust everything that uses a lock not to crash 99.99% of the time I should really be looking at our jobs and not at Redis.

Even then, though, it's probably more a matter of me not trusting things than it is said things not actually being trustworthy.
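A lock with a time limit that the holder can extend might look roughly like the in-memory model below. This is a guess at the general shape, not Glyptodon's implementation; `LeaseLock`, its methods, and the worker names are all hypothetical. Against a real Redis the check-and-extend step would need to be atomic, for example in a Lua script that compares the stored token before calling PEXPIRE.

```python
import time

class LeaseLock:
    """In-memory model of a lock with a time limit that its holder can extend."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._owner = None
        self._expires_at = 0.0

    def _expire_if_due(self):
        if self._owner is not None and self._expires_at <= self._clock():
            self._owner = None  # the lease ran out: auto-release

    def acquire(self, owner, ttl):
        self._expire_if_due()
        if self._owner is not None:
            return False
        self._owner, self._expires_at = owner, self._clock() + ttl
        return True

    def extend(self, owner, ttl):
        """Renew only if `owner` still holds the lock; otherwise refuse."""
        self._expire_if_due()
        if self._owner != owner:
            return False
        self._expires_at = self._clock() + ttl
        return True

now = [0.0]  # fake clock, in seconds
lock = LeaseLock(clock=lambda: now[0])
a1 = lock.acquire("worker-a", ttl=10)  # lease runs to t=10
now[0] = 8.0
a2 = lock.extend("worker-a", ttl=10)   # still held, lease now runs to t=18
now[0] = 20.0                          # worker-a stalls past its lease
b1 = lock.acquire("worker-b", ttl=10)  # worker-b takes over
a3 = lock.extend("worker-a", ttl=10)   # worker-a learns it lost the lock
```

The refused `extend` only tells worker-a about the loss when it next checks in, so between t=18 and that check both workers may believe they hold the lock. That gap is the "bad lock timing" risk described above.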

rmaccloy 13y ago

We're about to open source a similar deal (redis-based "soft guarantee" mutexes) -- ours is written in Python and mostly used as a way to coordinate (very frequent) parallel task execution a la CountDownLatch, so 100% reliable exclusion in the face of failure isn't critical.

I'd be interested to hear about your implementation if you can share (email is HN username at gmail.com)

jamwt 13y ago

I recommend dreadlock:

https://github.com/jamwt/dreadlock

It will release the lock when the client dies (disclaimer: I wrote it).

Or you can go whole hog and use zookeeper + ephemeral nodes. More robust but quite a bit more complex.

nutmeg 13y ago

A response to this article on Redis: http://aphyr.com/posts/283-call-me-maybe-redis

keeran 13y ago

His continued use of "CP" confused me for a while, so TIL about CAP Theorem

http://en.wikipedia.org/wiki/CAP_theorem

krenoten 13y ago

If you have the time, this video by Basho's CTO will give you a much better understanding of the tradeoffs that are involved in distributed system design: http://www.infoq.com/presentations/Concurrency-Scale-Distrib...

A great alternative to thinking about things in terms of CAP that Justin brings up is harvest-yield, where yield is the probability of completing a request and harvest is the fraction of your data that the response actually represents. Here's the paper: http://lab.mscs.mu.edu/Dist2012/lectures/HarvestYield.pdf
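To make the two quantities concrete, here is a small worked example using the definitions above. The request counts and shard counts are invented, and treating harvest as "shards that answered / total shards" assumes the data is spread evenly across shards.

```python
def yield_fraction(completed_requests, total_requests):
    """Yield: the fraction of requests that completed at all."""
    return completed_requests / total_requests

def harvest_fraction(shards_answering, total_shards):
    """Harvest: the fraction of the data reflected in a response,
    assuming data is evenly distributed across shards."""
    return shards_answering / total_shards

# Hypothetical system: 1000 requests arrive while one of four shards is
# unreachable. 990 requests get a response, each built from the three
# reachable shards.
y = yield_fraction(990, 1000)
h = harvest_fraction(3, 4)
```

Here the system keeps yield high (0.99) by answering from partial data, at the cost of harvest (0.75). Refusing to answer until the fourth shard returns would trade the other way.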

bretthoerner 13y ago

And better: http://henryr.github.io/cap-faq/

lacksconfidence 13y ago

hmm, i'm not sure how it could be better worded, but since antirez already links to this, i had thought you were posting a response to antirez's comments

undoware 13y ago

I'm frustrated that when the HN editors deduped the original story, they apparently deleted ALL the instances, leaving only this one. I wanted to read the discussion on the subject of Aphyr's research, not Antirez' response.

It looks bad, HN. We all know that VMWare is litigious (try looking up benchmarks sometime). But to (presumably) cave so quickly and effortlessly suggests... well, I'm not sure.

The other possibility is that Aphyr yanked them himself, probably under duress (or else there'd just be an 'update' at the bottom of the research's page.) Aphyr, is this what happened? I figure you probably can't talk freely if so, but say something.

antirez 13y ago

Hello,

1) I no longer work for VMware, but for Pivotal. Redis is open source and the copyright belongs to the people who wrote the code: me, Pieter Noordhuis, and other contributors.

2) I posted the link to the original article in the very first lines of my reply. Actually, thanks to my reply, the exposure Aphyr's research got for Redis is the greatest compared to the other data stores mentioned. I publicly said thank you to Aphyr on Twitter, and posted his blog post.

So I really don't understand your theories here.

undoware 13y ago

Sorry, to clarify -- I was suggesting that it was possible that VMWare (a sponsor of Redis, correct?) leaned on someone. I didn't mean to besmirch you or redis, antirez, and I enjoyed your response.

It wouldn't be the first time a reputable news site was forced to bury a story by a litigious company. Sponsoring FOSS does not make any organization beyond doubt. Especially if they, say, have a history of suing anyone who benchmarks them.

andypiper 13y ago

As Antirez says, VMware were formerly a sponsor of Redis, and he now works for Pivotal (as do I), who are the current sponsor of the project. Either way, I'm highly skeptical that anyone at either company did such a thing.

aphyr 13y ago

https://www.hnsearch.com/search#request/all&q=aphyr.com

HN stories on my original posts are still there, as far as I can tell. They just never hit frontpage.

antirez 13y ago

Aphyr, this is very lame. It's not common to see work like yours, and none of your stories hit the HN front page? I don't know what to think, but I hope that at least my post will help to show more people your awesome work.

hendzen 13y ago

I think Aphyr's series was a little too meaty for the general HN audience (of today).

Talking about things like the FLP impossibility result, CAP theorem and specifying protocols with TLA+ may be a bit over the heads of many HN readers - clearly, people would rather read stories about the latest funding round, acquisition or frontend UI framework than a substantive article on distributed systems.

undoware 13y ago

Yes, they did -- I saw them do so. There were several, in fact. And then they were gone. You've been robbed?

tptacek 13y ago

It is extremely unlikely that any pressure was put on the HN admins by VMWare or anyone else to get stories scrubbed. It's almost as unlikely that VMWare gives a shit about stories about Redis.

jacquesm 13y ago

It's not unlikely they got a bunch of (unjust) flags.

tptacek 13y ago

Based on pressure from VMWare? No, that's extraordinarily unlikely.

JulianMorrison 13y ago

RethinkDB people, how does your database compare?

contingencies 13y ago

DRBD? http://drbd.org/

aphyr 13y ago

Same limitations as any asynchronously replicated system; if both nodes diverge during a partition, you'll probably have to drop one's writes.

http://aphyr.com/posts/287-asynchronous-replication-with-fai...

contingencies 13y ago

Right. By operating at the block level it's a little more portable than most of the solutions discussed, though. Worth people's consideration, IMHO.

aphyr 13y ago

I'm inclined to think just the opposite. It's often possible to recover divergent data structures logically. Good luck doing that on an arbitrary block store.
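One illustration of what recovering a divergent data structure "logically" can mean: if the replicated value is a set that only ever grows, the two sides of a partition can be merged by union without dropping either side's writes. This is a sketch under that assumption only. The names are hypothetical, and sets that allow removals would need extra bookkeeping that is not shown here.

```python
def merge_grow_only_sets(replica_a, replica_b):
    """Merge two diverged copies of an add-only set by taking the union."""
    return replica_a | replica_b

before_partition = {"alice", "bob"}

# During the partition each side accepts a different write.
side_a = before_partition | {"carol"}
side_b = before_partition | {"dave"}

# After the partition heals, both writes survive the merge.
merged = merge_grow_only_sets(side_a, side_b)
```

A block device sees only two different versions of the same blocks, with no knowledge that they encode a set, so it has nothing comparable to merge on.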
