I could be mistaken, but I believe he reported the security issue through our regular support channel (instead of our security channel), which is why it took three days to see. From the time I saw it, I had it fixed, with the patch going live within an hour or two.
When I DID see it, I tried it myself with a quick shell script that curled and backgrounded the same request a bunch of times, and I just kind of chuckled. It was a good bug. Josip is top-notch.
I believe race conditions are on the rise in terms of severity and importance. Developers are aware of the common OWASP bugs, but this type of race condition is often overlooked, and developers are going to NEED to be just as aware of it. Way to go.
That's the problem with OWASP: a developer from a big company sees a race condition for the first time and is surprised.
I worry that a malicious attacker could finger the service for potential victims.
1. The user submits X number of requests within a second.
2. The system puts each request in a command queue that synchronizes the commands by coupon code, for example.
3. The command is popped off the queue, and an event is generated and saved saying the coupon was redeemed.
4. The next command is picked up, and all events are applied before processing. At this point, the command is no longer valid, so you reject it and send an event saying that an attempt was made to redeem an already-redeemed coupon.
5. Do the same for subsequent requests.
To me, this approach is safer and easier to reason about. You have a log of the fact that someone made the attempt, so you can report on it. I'm not sure you get that benefit from a stored procedure and a transaction unless you build it in, and then you increase the running time of the open transaction.
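The queued-command approach above can be sketched in a few lines. This is a toy illustration, not anyone's actual implementation: a single worker thread drains a queue (standing in for "synchronized by coupon code"), and every outcome, including rejected attempts, lands in an append-only event log.

```python
# Toy sketch of the command-queue / event-log approach.
# All names (events, commands, worker) are made up for illustration.
import queue
import threading

events = []               # append-only event log
commands = queue.Queue()  # serializes all commands for one coupon code

def worker():
    redeemed = set()
    while True:
        cmd = commands.get()
        if cmd is None:                 # sentinel: shut down the worker
            break
        code = cmd["code"]
        if code in redeemed:
            # Command is no longer valid: log the failed attempt anyway.
            events.append(("redeem_rejected", code))
        else:
            redeemed.add(code)
            events.append(("redeemed", code))

t = threading.Thread(target=worker)
t.start()

# The user fires several requests "within a second"; all funnel into one queue.
for _ in range(5):
    commands.put({"code": "SAVE20"})
commands.put(None)
t.join()

print(events)
```

Because a single consumer pops commands one at a time, only the first redemption succeeds, and the rejected attempts are still in the log for reporting.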
It's not necessarily different from using a normal RDBMS, right? You could do a check in SQL outside a transaction and end up writing multiple times. But with an RDBMS, you can easily solve the situation by turning on a transaction and leaving no question about it.
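A sketch of that "turn on a transaction" fix, using SQLite as a stand-in for any RDBMS (the table and function names here are made up): BEGIN IMMEDIATE takes the write lock up front, so the check and the insert execute as one atomic unit and a concurrent redeem has to wait.

```python
# Check-then-insert wrapped in an explicit transaction.
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit; we manage txns
conn.execute("CREATE TABLE redemptions (code TEXT, user_id INTEGER)")

def redeem(conn, code, user_id):
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before the check
    try:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM redemptions WHERE code = ?", (code,)
        ).fetchone()
        if n > 0:
            conn.execute("ROLLBACK")
            return False
        conn.execute("INSERT INTO redemptions VALUES (?, ?)", (code, user_id))
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise

print(redeem(conn, "SAVE20", 1))  # True  - first redemption wins
print(redeem(conn, "SAVE20", 2))  # False - the check sees the committed row
```

Doing the same SELECT-then-INSERT without the BEGIN/COMMIT is exactly the race described in this thread.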
This is why things like VoltDB ("NewSQL") are pushing to keep SQL and ACID, and figure out a way to scale, instead of throwing it all aside and making the developer deal with consistency issues.
It's not that you can't end up with the same functionality using eventual consistency, just that it's harder. Just look at Amazon's "apology-based computing" (I think that was the name) and how they structure their systems to be resilient by being able to resolve multiple conflicting commands in a proper way (deciding, without communication, which command wins; figuring out rollbacks; etc.). It's fantastic, and perhaps it's the only feasible way to operate at their scale. But it's also a hell of a lot more complicated than "UseTransaction = true".
(So my predictions/guesses: if developers who'd otherwise use a traditional ACID RDBMS switch to non-ACID (BASE?) systems, they'll end up introducing bugs due to the shifted responsibility of handling data consistency. And seeing how big servers are, and how far sharding can take you with a normal RDBMS, the scale at which people "need" to drop ACID is probably far higher than the point at which people are actually dropping it.)
In most systems the reward is zero, so you can infer that if a person has taken the time to submit a bug report, it is because he/she is invested in seeing it fixed.
Context: I work at a decent sized company in SV on this type of problem.
Properly designed bug bounty programs are a cornerstone for any company that remotely cares about the security of its product, period.
The idea that incentives are misaligned because poor bug reports are free to submit is ignorant, and worse, toxic, because it sounds so true to an executive who has no actual understanding of the issue.
A quality bug report should take no more than one minute for a reviewer to look at and know whether it's really a bug or not. If it can't, it should be rejected with a request for clearer details. For example, a DOM-based XSS attack could be reported with just a target URL, and it would be quite clear what the problem is. That would take 10 seconds to analyze.
Additionally, most bugs reported to most decent-sized companies are reported by someone who has previously reported a bug to the company before. If someone is constantly reporting good bugs, or the opposite, it's quite easy to prioritize which of those individuals gets their emails read first.
1. Code sent.
2. Check if the code is valid.
3. Redeem the code.
4. Mark the code as invalid.
Now if I send 10 requests at the same time with the same code, maybe 4-6 will get past the check in step 2.
And your window of opportunity is the time it takes to go from step 3 to step 4. Sometimes certain tasks are put in an async queue, you have a slight delay to your database server, or you need to wait for DB replication to kick in.
Because normally there is no code that rechecks how often this code has already been used.
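The four steps above can be reproduced in-process with a deliberately widened window. This is a toy sketch: a shared counter stands in for the coupon table, and the sleep stands in for the async-queue or replication delay mentioned above.

```python
# N threads race through check-then-redeem on shared state.
import threading
import time

redeemed_count = 0

def redeem_unsafe():
    global redeemed_count
    if redeemed_count == 0:   # step 2: check if the code is valid
        time.sleep(0.05)      # the window between the check and the redeem
        redeemed_count += 1   # steps 3-4: redeem and mark used

threads = [threading.Thread(target=redeem_unsafe) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Several threads pass the check before any of them redeems,
# so the single-use coupon is "used" more than once.
print(redeemed_count)
```

With the artificial delay, essentially every thread clears the check before the first one writes, which is the same effect the backgrounded curl requests produce against a real endpoint.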
The problem is concurrency. Whenever you have multiple things happening at once, you have concurrency, and programming concurrent systems is always really hard.
Unfortunately the software industry has never really got a grip on this problem and there are lots of developers who have never really studied multi-threading at all. That's a problem, because it's something that takes a lot of practice and you have to just incorporate it into the way you think. After a while you do get a sixth sense for race conditions, but you'll still write racy code from time to time anyway. It's just tricky to get it right 100% of the time.
spdy has already outlined what is happening here, but this problem is something that is covered in literally any introductory course on database systems or multi-threaded programming. If you have two threads (or processes) in flight simultaneously that are reading and writing a shared data store, then you need some kind of mutual exclusion. That can mean a lock:
1. Request for /reviews/add is received.
2. Database lock on the page reviews table is acquired.
3. Check if the user has already posted a review. If so, release the lock and abort (any good framework will release locks for you automatically if you throw an exception).
4. Add review to table.
5. Release lock.
At the point where the lock is acquired, if another web server is in the middle of the operation then, conceptually speaking, the incoming request will stop and wait for the table to become available.
Real implementations don't actually "stop and wait" - that would be too slow. They use database transactions instead, where both web server processes/threads proceed optimistically, and at the end the database will undo one of the changes if it detects that there was a conflict... but you can imagine it as being like stop-and-wait.
Of course once you have concurrency, you have all the joy that comes with it like various kinds of deadlock.
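The five steps above can be sketched with an in-process lock standing in for the database lock (a real app would lock in the database, not in application memory; the names here are made up):

```python
# Lock-guarded check-then-insert: only one request can win.
import threading
import time

reviews = {}                     # (page, user) -> review text
reviews_lock = threading.Lock()  # step 2's "lock on the reviews table"

def add_review(page, user, text):
    with reviews_lock:                 # acquire; blocks while another holds it
        if (page, user) in reviews:    # step 3: already posted a review?
            return False               # 'with' releases the lock for us
        time.sleep(0.01)               # widen what used to be the race window
        reviews[(page, user)] = text   # step 4: add the review
        return True                    # step 5: lock released on exit

results = []
threads = [
    threading.Thread(target=lambda: results.append(add_review("p1", "alice", "great")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results.count(True))  # exactly one request wins
```

Even with the artificial delay inside the critical section, the lock serializes the check and the write, so duplicates are impossible - at the cost of requests queuing up behind each other, which is why real databases prefer the optimistic transaction approach described above.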
It's funny, a few days ago I was letting my mind wander and ended up thinking about web apps for some reason. Oh yes, now I remember, I was thinking about concurrency strategies in a software library I maintain and how to explain them to people. And then I was thinking how hard multi-threading is and how many people are selling snake-oil silver bullets for it, and started to wonder how many web apps had race conditions in them. And then I just discarded the thought as one of no consequence and got on with my day, haha :) Perhaps I should have tried to earn a bit of money this way instead.
for i in `seq 1 16`; do
  curl ... &   # curl command copied from Chrome dev tools; & to background it
done

I have considered writing a program that will let me send off a bunch of HTTP requests at once, but wait to close all the connections at the exact same time. That would probably be the most effective way to trigger race conditions.
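That idea can be sketched with a thread barrier: each worker prepares its request, then all of them release in the same instant. This is only an illustration of the synchronization, not a working attack tool: the actual HTTP send is commented out so the sketch runs without a target server, and it synchronizes the moment of sending rather than the final byte of each connection.

```python
# Fire N prepared requests at (nearly) the same instant via a barrier.
import threading

N = 16
barrier = threading.Barrier(N)  # trips only once all N threads arrive
fired = []

def fire(i):
    # ... build the HTTP request for the target endpoint here ...
    barrier.wait()  # every thread blocks until all N are ready
    # urllib.request.urlopen(req)  # would send the real request here
    fired.append(i)

threads = [threading.Thread(target=fire, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(fired))
```

Compared to backgrounded curl processes, this removes the process-startup jitter between requests, which makes landing inside a short race window much more likely.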
There are no guarantees in the SQL standard that queries with subqueries should be atomic.
The only truly safe way to protect yourself is to fix the schema so that you can make use of unique indexes. Those are guaranteed to be unique no matter what.
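As a sketch of that backstop (SQLite standing in for any RDBMS; the schema is made up): even if the application-level check races, the second insert fails at the database because the unique index rejects it.

```python
# Unique index as the last line of defense against duplicate redemptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE redemptions (
        code    TEXT    NOT NULL,
        user_id INTEGER NOT NULL,
        UNIQUE (code)            -- at most one redemption per code, enforced by the DB
    )
""")

def try_redeem(code, user_id):
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO redemptions VALUES (?, ?)", (code, user_id))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: the unique index rejected it

print(try_redeem("SAVE20", 1))  # True
print(try_redeem("SAVE20", 2))  # False
```

The application can still do a friendly up-front check for the common case; the index is simply what makes the race unwinnable.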
You can then get the exact request by using Chrome developer tools. (Find the POST request in the Network tab, right-click, and select "Copy as cURL".)
Less cynical answer: commonly you already have some means of handling races - locking, transactions, some other variety of extra check - and the fix for a newly discovered race is "oh, I didn't realise that could happen; add a lock."
Adding a random sleep time might work, but some requests would run noticeably slower.
- update
thanks for the downvotes guys. keep up the good work
One time they paid me $5000 for a bug I never could have found, but they did internally based on my low severity report. (http://josipfranjkovic.blogspot.com/2013/11/facebook-bug-bou...)
Stop jumping on the hate wagon, everybody.