- Someone implemented a YAML parser that executed code. This should have been obviously wrong to them, but it wasn't.
- Thousands of ostensible developers used this parser, saw that it could deserialize more than just data, and never said "Oh dear, that's a massive red flag".
- The bug in the YAML parser was reported and the author of the YAML library genuinely couldn't figure out why this mattered or how it could be bad.
- The issue was reported to RubyGems multiple times and they did nothing.
This isn't the same thing as a complex and accidental bug that even careful engineers have difficulty avoiding, after they've already taken steps to reduce the failure surface of their code through privilege separation, high-level languages/libraries, etc.
This is systemic engineering incompetence that apparently pervades an entire language community, and this is the tipping point where other people start looking for these issues.
http://wouter.coekaerts.be/2011/spring-vulnerabilities
If J2EE is a boring platform to you, pick your favorite and Google for a few variants. You'll find a serialization vulnerability. It's hard stuff, by nature.
> The bug in the YAML parser was reported and the author of the YAML library genuinely couldn't figure out why this mattered or how it could be bad.
Do you have a citation for this? What particular bug in the parser are you referring to? The behavior which is being exploited is a fairly complicated interaction between the parser and client Rails code -- I banged my head against the wall trying to get code execution with Ruby 1.8.7's parser for over 12 hours, for example, without any luck unless I coded a too-stupid-to-be-real victim class. (It's my understanding that at least one security researcher has a way to make that happen, but that knowledge was hard won.)
Yes, this is always a bad idea. It's actually in a similar problem space as the constant stream of vulnerabilities in the Java security sandbox (eg, applets); all it takes is one mistake and you lose.
And thus, people have been saying to turn off Java in the browser for 4+ years, and this is also why Spring shouldn't have implemented such code.
> It's hard stuff, by nature.
Which is why deserializing into executable code is a bad idea, by nature. I'd thought this was well established by now, but apparently it is not.
> Do you have a citation for this? What particular bug in the parser are you referring to?
Wackiness ensued: http://blog.o0o.nu/2010/07/cve-2010-1870-struts2xwork-remote...
It would obviously be unfair to claim on this basis, or the recent problems with the Java browser plugin, that the "entire Java language community" has a bad attitude on security matters. Communities are big, each of them has a range of attitudes within it, and most importantly --- regardless of attitude --- sooner or later, everyone screws up.
the particular issue in the Yaml parser is explained pretty well here: http://www.insinuator.net/2013/01/rails-yaml/
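For concreteness, the core parser behavior boils down to something like this (a minimal sketch with a made-up `AuditLog` class; on 2013-era Rubys plain `YAML.load` was the permissive loader, which modern Psych renames to `unsafe_load`):

```ruby
require 'yaml'

# A made-up class standing in for anything on the load path.
class AuditLog
  attr_reader :path
end

# A !ruby/object:ClassName tag asks Psych to allocate that class and
# fill its instance variables straight from the mapping -- no
# initializer ever runs, and the attacker picks all the values.
payload = "--- !ruby/object:AuditLog\npath: /etc/passwd\n"

# Modern Psych calls the permissive loader unsafe_load.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
obj = YAML.public_send(loader, payload)
obj.class  # AuditLog
obj.path   # "/etc/passwd"
```

Instantiation alone isn't code execution, but it hands the attacker fully attacker-controlled objects, which is the raw material the Rails exploit builds on.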
First, given how many times I've seen a deserialization library "helpfully" allow you to deserialize into arbitrary objects in a language that is sufficiently dynamic to turn this into arbitrary code execution, evidence suggests this is not an accurate summary. I'd like to see "Don't deserialize into arbitrary objects" become General Programming Wisdom, but it is not there yet.
It's not like we live in a world where XSS is rare or anything anyhow. The general level of programming aptitude is low here. That's bad, regrettable, something I'd love to see change and love to help change, but it is also something we have to deal with as a brute fact.
Secondly, there are still the points of A: even if you don't use Ruby on Rails, your life may still be adversely affected by the Severity: Apocalyptic bug, and B: what are you going to do when the Severity: Apocalyptic bug is located in your codebase? And that's putting aside the obvious matter of what to do if you use Ruby on Rails and this was your codebase. The exact details of today's Severity: Apocalyptic bug are less relevant than you may initially think. Go back and read the piece, strike every sentence that contains "YAML". It's still a very important piece.
At which point a re-quoting of my favorite line in the piece is probably called for: "If you believe in karma or capricious supernatural agencies which have an active interest in balancing accounts, chortling about Ruby on Rails developers suffering at the moment would be about as well-advised as a classical Roman cursing the gods during a thunderstorm while tapdancing naked in the pool on top of a temple consecrated to Zeus while holding nothing but a bronze rod used for making obscene gestures towards the heavens." Epic.
I think that's pifflesnort's point.
You're definitely right that the security reports should be handled better. I hope that this whole situation results in a better security culture in the Ruby community.
Regarding your tone ("intellectually dishonest", "trained monkey", "systemic engineering incompetence pervades an entire language community"), it's a bit of hyperbole and active trolling. You are certainly right in many of your points, and you are certainly coming off as a jerk. It may not be as cathartic for you, but I'd suggest toning it down to "reasonable human being" level in the future.
The Rails community has exhibited such self-assured, self-promotional exuberance for so long (and continues to do so here), it feels necessary to rely on equivalently forceful and bellicose language to have a hope of countering the spin and marketing messaging.
Case in point, the article seriously says, with a straight face:
"They’re being found at breakneck pace right now precisely because they required substantial new security technology to actually exploit, and that new technology has unlocked an exciting new frontier in vulnerability research."
Substantial new security technology? To claim that a well known vulnerability source -- parsers executing code -- involves not only substantial new technology, but is a new frontier in vulnerability research?
This is pure marketing drivel intended to spin responsibility away from Ruby/Rails, because the problems are somehow advanced and new. This is not coming from some unknown corner of the community, but from a well-known entity with a significant voice.
Also, more than other communities, Ruby has a cultural gap between the people developing the language and core libraries and the people using it to write web apps and frameworks.
Here's two good technical writeups of the exploit as it applies to Rails apps: http://blog.codeclimate.com/blog/2013/01/10/rails-remote-cod... http://ronin-ruby.github.com/blog/2013/01/09/rails-pocs.html
My point is that it's 'taken so long' because all this code is stuff that was written in a totally different time and place. And then was built on top of, after years and years and years.
Now that it _is_ being examined, that's why you see so many advisories. This is a good thing, not a bad one! It's all being looked through and taken care of.
And then, as someone else said, because of layering. The next downstream user using YAML might not have even realized that YAML had this feature, on top of not realizing the danger of this feature. And then someone else downstream of THAT library, etc.
Maybe it _should_ have been obvious, but it wasn't, as evidenced, as you say, by all the people who have done it before. After the FIRST time it was discovered, it should have been obvious, so why did it happen even a second time?
In part, because for whatever reason, none of those exploits got the (negative) publicity that the rails/yaml one is getting. Hopefully it (the dangers of serialization formats allowing arbitrary class/type de-serialization) WILL become obvious to competent developers NOW, but it was not before.
20 years ago, you could write code thinking that giving untrusted user input to it was a _special case_. "Well, I guess, now that you mention it, if you give untrusted input that may have been constructed by an attacker to this function it would be dangerous, but why/how would anyone do that?" Things have changed. There's a lot more code where you should be assuming that passing untrusted input to it will be done, unless you specifically and loudly document not to. But we're still using a lot of code written under the assumptions of 20 years ago -- assumptions that were not necessarily wrong cost/benefit analyses 20 years ago. And yeah, some people are still WRITING code under the security assumptions of 20 years ago too, oops.
At the same time, we have a LOT MORE code _sharing_ than we had 20 years ago. (internet open source has changed the way software is written, drastically) And ruby community is especially 'advanced' at code sharing, using each other's code as dependencies in a complex multi-generation dependency graph. That greatly increases the danger of unexpected interactions of features creating security exploits that would not have been predicted by looking at any part in isolation. But we couldn't accomplish what we have all accomplished without using other people's open source code as more-or-less black box building blocks for our own, we can't do a full security audit of all of our dependencies (and our dependencies' dependencies etc).
Of course, you could argue that developers should always be thinking about and searching for security related issues in whatever field they're working in, but that doesn't appear to be the norm at the moment.
I thought you could unpickle untrusted input in Python? Sure there's a great big red warning message on the documentation, and hence it's currently rare for people to do it, but it is technically allowed, right?
This is master level, "Captain Obvious"-style trolling, beyond me how this is the top comment in a place like HN.
Someone implemented a YAML parser that could serialize and de-serialize arbitrary objects referenced by class name.
It was not obvious that this meant it 'executed code', let alone that this meant it could execute _arbitrary_ code, so long as there was a predictable class in the load path with certain characteristics, which there was in Rails.
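To sketch that interaction (with an entirely hypothetical victim class -- not the actual Rails gadget -- just to show what "certain characteristics" means):

```ruby
require 'yaml'

# Hypothetical victim class: it trusts an instance variable that
# deserialization lets an attacker set. Deserializing it is harmless
# by itself; the danger appears when ordinary code later calls render.
class TemplateCache
  def initialize
    @source = "''"
  end

  def render
    eval(@source)  # dangerous sink: runs whatever @source holds
  end
end

payload = "--- !ruby/object:TemplateCache\nsource: 2 + 2\n"
# Modern Psych calls the permissive loader unsafe_load.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
tpl = YAML.public_send(loader, payload)
tpl.render  # attacker-chosen Ruby runs
```

The parser author only sees "I set instance variables"; the exploit needs some class like this already sitting in the load path, which is exactly why it was so hard to see in isolation.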
In retrospect it is obvious, but I think you over-estimate the obviousness without hindsight. It's always easy to say everyone should have known what nobody actually did but which everyone now does.
As others have pointed out, an almost identical problem existed in Spring too (de-serializing arbitrary objects leads to arbitrary code execution). It wasn't obvious to them either. Maybe it _should_ have been obvious _after_ that happened -- but that vulnerability didn't get much publicity. Now that the YAML one has, maybe it hopefully WILL be obvious next time!
Anyhow, that lack of obviousness applies to at least your first two points if not first three. It was not in fact obvious to most people that you could execute (arbitrary) code with YAML. If it was obvious to you, I wish you had spent more time trying to 'paul revere' it.
> The issue was reported to RubyGems multiple times and they did nothing.
Now, THAT part, yeah, that's a problem. I think 'multiple times' is 'two' (yeah, that is technically 'multiple'), and only over a week -- but that still indicates irresponsibility on the rubygems.org maintainers' part. A piece of infrastructure that, if compromised, can lead to the compromise of almost all of rubydom -- that is scary, and it needs a lot more responsibility than it got. We're lucky the exploit was in fact publicized rather than kept secret and exploited to inject an attack into the code of any ruby gem an attacker wanted -- except, of course, we can't know for sure if it was or not.
Is that seriously what happened? It sounds oddly similar to the Rails issue from about a year ago (the one in which the reporter was able to commit to master on Github), even though I believe that was a separate set of developers altogether.
If so, then that might suggest a larger community/cultural issue, which makes me wonder what other exploits exist but haven't been reported (publicly) yet...
Surprisingly, yes: https://github.com/tenderlove/psych/issues/119
And the RubyGems folks are trying to handle this with whitelisting specific classes that the YAML parsing will still be allowed to instantiate:
Hey, yes, the yaml bug is _very_ similar. A whitelist is better than no list at all.
Er, there would have been trouble on that end too ...
Indeed. It's the "fallacy of gray". Nothing is black or white, hence everything is gray. Nothing is 100% secure, nothing is 100% insecure, hence everything is "semi-secure": it's bad, but not too bad, because every language / API / server can be attacked.
You've effectively substituted a black/white dichotomy with something even worse: instead of having only two options (black or white), you now only have one: gray.
It is probably one of the most intellectually dishonest logical fallacies of all time, and we keep seeing it more and more.
It's really concerning.
There are many developers who are not presently active on a Ruby on Rails project who nonetheless have a vulnerable Rails application running on localhost:3000. If they do, eventually, their local machine will be compromised. (Any page on the Internet which serves Javascript can, currently, root your Macbook if it is running an out-of-date Rails on it. No, it does not matter that the Internet can’t connect to your localhost:3000, because your browser can, and your browser will follow the attacker’s instructions to do so. It will probably be possible to eventually do this with an IMG tag, which means any webpage that can contain a user-supplied cat photo could ALSO contain a user-supplied remote code execution.)
That reminded me of an incredible presentation WhiteHat did back in 2007 on cracking intranets. Slides[1] are still around, though I couldn't readily find the video.

[1]: https://www.whitehatsec.com/assets/presentations/blackhatusa...
https://community.rapid7.com/community/metasploit/blog/2013/...
In addition to common port numbers and stuff like redmine, their tipoffs include looking for Rails-style session cookies, and HTTP response headers emitted by Rails or support machinery. These include "X-Rack-Cache:" and the "X-Powered-By:" header that Phusion Passenger tosses in even if you've configured Apache itself to leave version numbers and component identifiers out of the response. (I'm not sure there's any better way to suppress this stuff than adding mod_headers to the Apache config and using "Header unset")
There are also a lot fewer headaches once you've decided to move it into production.
You see this in things such as security issues being marked as wontfix until they are actively exploited (e.g. the Homakov/GitHub incident), in the attitude that developer cycles are more expensive than CPU cycles, and on a more puerile level in the tendency towards swearing in presentations.
I've always had the impression that the Rails ecosystem favours convenience over security, in an Agile Manifesto kind of way (yes, we value the stuff on the right, but we value the stuff on the left even more). One of the attractions of Rails is that it is very easy to get stuff up and running with it, but some of the security exploits that I've seen cropping up recently with it make me pretty worried about it. I get especially concerned when I see SQL injection vulnerabilities in a framework based on an O/R mapper, for instance.
Many start-ups are built by well-meaning people who have no formal CS or even engineering background and thus are somewhat out of touch with what it means to build a robust system. It's natural for people to focus on "what's important" and ignore boundary/edge conditions, while in reality 90% of sound engineering is getting boundary/edge cases right.
And as most such start-ups use Ruby/Rails due to the ease of "getting it up and running", they inject the Ruby/Rails ecosystem with this "focus on what's important" mindset, and important boundary issues, including security, are neglected.
I think in 2006/2007, there was a simplicity to the basic "get up and running" aspect, but Rails 3.x+ is a pretty large ecosystem with quite a lot of decision points to educate yourself on to do any sized project beyond 'hello world'.
The exploits have happened in ways that have exposed and hammered home the myriad places many applications expose unexpected side channels and larger attack surfaces than you'd think. These issues have opened a broader range of people to vulnerability, and I think opened a lot of people's eyes to the need for a sense of security and what that really means.
Top that off with the level of explanation we've seen in at least the Rails and Ruby exploits, and it's been a tremendous educational opportunity for a lot of people who will benefit greatly from it, and by proxy their users.
When the idea of a "SQL Injection" first became really prevalent, we saw an uptick in concern for security amongst framework developers, as far as I could tell. I think this will help get some momentum going again.
Speaking as a non-expert on the subject, security is all about a healthy sense of paranoia, across the board :)
I was going to post something similar. Also we often see people insulting others when they post exploits too early or describe exploits in depth too early. Posting stuff like: "You're an .ssh.le, wait a few days before posting that".
I don't think so. I think exploits should be publicly posted as soon as possible, affecting as many people as possible. Maybe even damaging exploits, actively deleting users' data or servers' data.
The bigger the havoc, the sooner the entire industry is going to realize security is a very real concern.
People are still considering buffer overflow, SQL injection, query parameters objects instantiation through deserialization exploits, etc. to be "normal" because "everybody creates bugs" and "a lot of bugs can be exploited".
I think it's the wrong mindset. Security is of utmost importance and should be thought of from the start.
For example, I'm amazed by the recent seL4 microkernel, which makes buffer overflows provably impossible (inside the microkernel), or even the Java VM (the JVM), which makes buffer overflows in Java code impossible. It's not perfect -- we've seen lots of major Java exploits, some in 3rd party C libs -- but zero of them were buffer overruns/overflows in Java code itself.
So security exploits are not inevitable.
All we need is people, from the very start, to conceive systems more resilient to attacks.
The more attacks, the more exploits, the more bad reputation and shame on clueless developers, the better.
I actually start to love these exploits, because they fuel healthy research by the white-hat community.
And one day we'll have more secure microkernels, more secure OSes, more secure VMs, more secure protocols, etc.
Let the security exploits come.
If you are like me, you would expect that YAML was used in the configuration files and nowhere else. A small framework like Sinatra wouldn't have been big enough to hide an issue like this.
I understand the appeal of "magic" to solve issues when you are under a deadline. It is just that trusting it is dangerous.
What technology is he talking about here?
When I first read your blog post I got the impression that you were saying that the YAML vulnerability were found with some new code scanning technology that lets us find bugs in Rails faster. Or are you just saying discovering the existence of the YAML.load() class of vulnerability is "new security technology?"
Or are you talking about the ronin support module people are using in some of the PoCs?
I think that to suggest that this bug could not have been found before is wrong, but the reason we're seeing such a cascade is because security almost never happens in a bubble.
Previously you had to send something to rails and find a way to cause rails to execute that. Not so easy.
Now? You just have to send some YAML to rails.
I had been meaning to get some context for the recent spate of security problems and this provided that in spades. Thanks for taking the time to write it up and post it.
What was the first reported compromise of a production system?
1) Is it currently safe to "bundle update" and be confident that only verified Gems will be provided? I don't mind errors on any unverified ones but don't want to download them.
2) Is there a drop in replacement for RubyGems? The problems that have occurred this month would have been multiplied if RubyGems was unavailable at the time Rails had an apocalyptic bug.
1. I wouldn't say so. Not until they're all the way through.
2. Not at the moment, but general guidance is that we should all have local gem repos that we maintain ourselves and only rely on external sources when needed. It is something I'm going to look into ASAP.
[1] https://docs.google.com/document/d/10tuM51VKRcSHJtUZotraMlrM...
It's a shame that they seem to have put the service back up in an unsafe mode, I would have hoped that they could have quarantined the unverified Gems.
Edit: Looking at the status page the API is down so it can't be accessed from Bundler so they are doing it the good/safe way.
Obviously then it is up to you to verify everything, including that you're using the right versions and what not.
I hope they learn from this and stop chanting "convention over configuration" when told that explicit is better than implicit.
Or should I basically just not run Rails on any machine ever anymore, get a different web server, and start implementing my own request routing and ORM without any sort of YAML-parsing magic?
>One of my friends who is an actual security researcher has deleted all of his accounts on Internet services which he knows to use Ruby on Rails. That’s not an insane measure.
So anyone who uses Twitter, for example, could have their passwords and other data stolen through this exploit?
Long story short: There's a variety of things that can be done to mitigate this vulnerability and an active conversation on which is the best option. My go-to suggestion would be having Rails ship with either a non-stdlib YAML serialization/deserialization parser or have it modify the stdlib one, with the major point of departure being "Raise an exception immediately if the YAML encodes any object not on a configurable whitelist, and default that whitelist to ~5 core classes generally considered to be safe."
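A rough sketch of that whitelist behavior, using the `permitted_classes` mechanism modern Psych eventually shipped (the class list and helper name here are illustrative, not the actual proposal):

```ruby
require 'yaml'
require 'date'

# Raise immediately if the document encodes any object outside a small
# configurable whitelist; plain hashes/arrays/strings/numbers still work.
def load_whitelisted(yaml, permitted: [Symbol, Date, Time])
  YAML.safe_load(yaml, permitted_classes: permitted)
end

load_whitelisted("--- {answer: 42}")  # plain data is fine

begin
  load_whitelisted("--- !ruby/object:Object {}")
rescue Psych::DisallowedClass
  # the "raise an exception immediately" behavior described above
end
```

Failing closed like this means an application that never expected rich objects in its YAML keeps working, while any attempt to smuggle one in dies loudly before a single object is built.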
> Or should I basically just not run Rails on any machine ever anymore, get a different web server, and start implementing my own request routing and ORM without any sort of YAML-parsing magic?
That is astonishingly unlikely to be a net-win for your security.
> So anyone who uses Twitter, for example, could have their passwords and other data stolen through this exploit?
I'd expect that Twitter (in particular) has a better handle on it than your average startup, but successful exploitation of this means the attacker owns the server, if the attacker owns the server they probably get all the servers, and they will tend to gain control of any information on all of the servers. That can include, but is certainly not upper-bounded by, passwords/hashes stored in the database. It is absolutely possible, and indeed likely, that many people will be adversely affected by this vulnerability without themselves running Rails or even, for that matter, knowing what Rails is.
>That is astonishingly unlikely to be a net-win for your security.
In the long run, you are probably right. Once this gets fixed, which will probably be soon considering how much attention is on it.
But in the short run, is there anything worse than a vulnerability that allows a remote attacker to automatically detect, penetrate, and execute arbitrary code on your machine? To the point where it's not even safe to run the framework on localhost on your dev box?
By making that the default schema, developers would have to explicitly request the dangerous "ruby" schema that makes arbitrary Ruby objects.
My question: do these security issues affect Sinatra apps?
Why are you running Rails as the root user? This is a bad idea.
EDIT: I'm not really into client-side JavaScript these days, but when did browsers start allowing JavaScript to connect to anything except the server from which it came? That would be yet another Bad Idea.
1. You load the evil JavaScript.
2. That JavaScript adds an image with a URL pointing at localhost:3000.
3. When you load that URL, it causes code execution, causing your computer to open a connection somewhere and start taking instructions.
4. The instructions that arrive includes downloading and installing software that takes advantage of known local root vulnerabilities in OS X.
5. Congratulations! Someone rooted your machine!
Nothing in this path required Rails to be run as root, or JavaScript to directly connect anywhere.
There are several tricks that can be used by JavaScript to connect to non-origin servers, in limited ways.
To create a GET, inject an <img>, <script>, <iframe>, or <style> tag. (Or several others.)
To create a POST, inject a <form> tag, and call form.submit()
I am still convinced that configs and templates should be treated as executable code and are best implemented in the same language they're used from. At least it makes certain things blatantly obvious. (It also makes a lot of other things possible without any extra coding/learning.)
So I think it only helps if you are likely to need to deploy additional/alternative servers of the same versions. For significant deployed services this makes sense but if you are only in development/testing or using a service like Heroku it doesn't really help you very much does it?
At least your deployments will be consistent. This is a great starting point. Now all you have to do is check your cache against the backdoored version, and you instantly and verifiably know where your deployment stands.
bundle package
will cache all of your deps in vendor/cache.
You can install from this cache using: bundle install --local

There is also https://github.com/dtao/safe_yaml (hat tip @patio11, who also points out that this has not been audited for completeness/correctness)
I could see it as a service company that shares blacklist info between sites and can even find new exploits from the "bad" requests.
There was a time when anyone who claimed to have the ability could design and build things like bridges and buildings. After enough of them collapsed due to repeated, avoidable mistakes, we said no, you can't do that anymore, you need to be licensed to design and build buildings, and furthermore you have to follow some basic minimum conventions that are proven to work. And you and your firm has to take on personal liability when you certify that your design and construction follows those basic best practices.
It would be good if all this was a clarion call to the Ruby community to improve things holistically, rather than the current trend of band-aid fixes they seem to apply.
Every popular technology goes through this. (C, Java, PHP, etc)
What is encouraging to me is the speed with which these issues get patched in Ruby and Rails, and how the ecosystem is paying attention to these lessons and learning from them.
Contrast this with the length of time recent Java flaws took to get patched (6 months or more) or some of the bugs reported in TOSSA got fixed years later.
The deal is to learn from each of these incidents.
Very few people want to take the trouble to write and use correct programs. We, as an industry, would rather Ship Early and Often. It takes a lot of energy and time to write correct programs. Very few do that. Three that come to mind are Dijkstra, Knuth, and DJB.
Because other frameworks are rock-solid. Yup. None of this happens anywhere else on the internet.
It does happen everywhere. It should be stopped everywhere. But it happens more frequently in some places. There are special conditions that permit it to happen in some places. And if it is a serious concern of yours, knowing where it is and isn't most likely to happen again is important.
The Perl YAML warning is less obvious but they at least mention in their LoadCode docs (http://search.cpan.org/~mstrout/YAML-0.84/lib/YAML.pm) that you have to specifically enable code deserialization since untrusted evaluation is a bad idea.
Python's YAML is only slightly worse, with an available safe_load method that refuses to run code (and a failure to use it appropriately led to vulns in popular Django plugins a little more than a year ago).
There's no easy equivalent to safe_load or UseCode for Ruby's YAML (http://apidock.com/ruby/Psych) as far as I can tell, at least while still using the high-level parser. And I'll note that the API docs I provided are for the new YAML parser introduced with 1.9.3. I would like to think that by 2010 there would be a general awareness of the risk of using deserializers/code emitters on untrusted input.
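One workaround under that constraint, assuming you control the loading code, is to stop at Psych's parse tree and never ask it to construct Ruby objects at all:

```ruby
require 'psych'

# Psych.parse builds an AST of Nodes -- scalars, mappings, sequences.
# Tags like !ruby/object:Whatever come back as inert strings on the
# nodes instead of triggering instantiation.
doc  = Psych.parse("--- !ruby/object:Whatever\nx: payload\n")
root = doc.root               # a Psych::Nodes::Mapping, nothing instantiated
root.tag                      # the ruby/object tag, now just data
root.children.map(&:value)    # raw scalar strings you can vet yourself
```

Walking the AST by hand is clumsy compared to a real safe_load, which is exactly the gap being complained about here.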
In Common Lisp, for example, as far as I know you can set a flag so that the reader is set to "no evaluation ever" (if I understand things correctly) and, hence, if you're not using eval yourself specifically, nothing is ever going to be evaluated.
But how would that work in Clojure? And what about other languages? Ruby? Haskell? Java? C#?
I think the ability to execute code has become the most important security issue (more than buffer overflows/overruns, which can now be prevented -- sometimes even proven impossible thanks to theorem provers).
More thought should be put into explaining how/when a language / API can execute code and how it should/can be used to prevent such a thing from happening.
As someone who loves Rails, to someone who presumably likes Rails, it is imperative that you understand how serious this issue is. If you use Rails, you need to have addressed this already. If you have not, drop what you're doing and go fix it right now.
The Fear seems appropriate.