But at one point it was the source of everything for Heroku.
Over time things were moved out, so this isn't a case of everything that exists having been leaked; but there is no guarantee that the attacker didn't move from one area to another.
As someone with some apps on Heroku, having worked there, but no knowledge of the details of the incident more than others... I would:
1. Rotate all creds
2. Ensure logging of all connections to the DB (I can't recall how much of this you can do on Heroku)
3. Audit GitHub commits and Heroku releases extra heavily
4. Maybe keep rotating all creds?
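For steps 1–3 on Heroku specifically, something like the following (the app name is a placeholder, `<id>` stays as whatever `authorizations` prints, and the exact subcommand spellings may have drifted, so check `heroku help` before relying on them):

```shell
heroku authorizations                                # list OAuth authorizations on your account
heroku authorizations:revoke <id>                    # revoke any you don't recognize
heroku pg:credentials:rotate DATABASE_URL -a myapp   # rotate Postgres credentials
heroku pg:settings:log-statement all -a myapp        # log all statements (closest I recall to connection logging)
heroku releases -a myapp                             # audit recent releases for anything unexpected
```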
The incident notification makes it seem that customers using the GitHub integration are the ones who were compromised. If the attacker gained access to other accounts, that needs to be clarified so that we can take the repository-level mitigations you've mentioned; otherwise most will just reset account passwords and be done with it.
This is the equivalent of saying "the car was stolen because the car keys were laying on the kitchen table." They still don't know how they got into the house to get the car keys.
GitHub was just one branch that the attacker took to further access, another being the download of the accounts database. We don't know how many other things they did.
I've done security work for multiple cloud service providers and know a lot of people in the industry. I'm not really privy to give details.
I can say: dev teams face limits on what they can build securely, platform teams face limits on what secure-by-default and monitoring features they have time to implement, security operations teams have a lot of data points to look at, and in theory even changes in personnel on a couple of teams can have an impact on the threat posture of a given company.
People are trying. But if you do the math on attack surface introduced over time versus risk-mitigation effort spent on that attack surface over time, you can derive some estimates for the likelihood of attack.
If your cloud provider isn't providing you that data as a customer, it's not simple to judge the likelihood of risk introduction and breach at your cloud provider.
This hasn't really been something people talk about because there was a tacit assumption that the biggest companies are mostly getting this right.
I'm not trying to say the cloud is inherently broken, I run a lot of workloads in the cloud and trust sensitive data to the cloud. But I do wish there were better ways for customers to have data points with which they could evaluate platforms besides extrapolating the history of public breach reports.
> dev teams face limits on what they can build securely, platform teams face limits on what secure-by-default and monitoring features they have time to implement, security operations teams have a lot of data points to look at, and in theory even changes in personnel on a couple of teams can have an impact on the threat posture of a given company.
I couldn't agree more. It's too bad, because I believe most companies should be solving this by building on a battle-tested platform that provides a safe path for devs. In theory, platforms like Heroku improve cloud security by reducing margin for error. In practice though (as we're seeing), these platforms can introduce new security vulnerabilities in the layer they introduce on top of IaaS.
I also very much appreciate your comment about having a better way to evaluate the security of platforms without relying on public breach reports, or implicitly trusting what platforms say. I think the best thing is for platforms to be 100% transparent in how they implement security, namely by:
1. Running alongside IaaS services instead of layering a black box on top of them (coordinating, not fully abstracting)
2. Providing clear accountability for security defaults: every security default enforced by the platform should be represented in a validation that end users can view (if not alter)
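To make point 2 concrete, here's a hypothetical sketch of what I mean by "security defaults as visible validations" (the check names and resource fields are made up; the idea is that every default the platform enforces is a named check the customer can list and run, not something trusted invisibly):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Validation:
    name: str
    check: Callable[[dict], bool]  # inspects a resource description

# Each platform-enforced default is represented as an explicit, viewable check.
VALIDATIONS = [
    Validation("db_not_public", lambda r: not r.get("publicly_accessible", False)),
    # Fail closed: a missing field counts as unencrypted.
    Validation("encryption_at_rest", lambda r: r.get("storage_encrypted", False)),
]

def audit(resource: dict) -> dict:
    # Returns a per-check pass/fail report the end user can view.
    return {v.name: v.check(resource) for v in VALIDATIONS}

report = audit({"publicly_accessible": False, "storage_encrypted": True})
```

The point is less the mechanism than the accountability: a customer can diff the report over time instead of taking the platform's word for it.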
Isn't Aptible another layer on top of IaaS?
Privy means "sharing in the knowledge of (something secret or private)", but it has nothing to do with sharing that knowledge with others.
Sure they take security further (I hope) than I would provisioning my own hardware, but I also worry about how large a target they are and how much more complex their systems are (increasingly large attack surface).
It's appealing to just let someone else handle my hosting, including security, but I also wonder if I'd be better off running colocated metal. I mean, I still have to worry about the host's network and physical security when colocating, but it's a much smaller attack surface, and also a far smaller target than massive hosts like AWS.
I think cloud providers might need to level up their security practices, even if it incurs some friction with their customers and staff. I don't even know if it is feasible, from a cost perspective, for many (or all) of them to evolve as fast as potential attackers over the long term.
Not to single anyone out, but take DigitalOcean as an example. They're constantly adding features to their platform. I assume this increases their attack surface? How much weight are they giving to security as they add features? I assume quite a lot, but I don't really know! Another example is AWS, who are constantly growing their features and services (to the point where it's near impossible to navigate all of their offerings!). Every cloud provider is doing the same.
There's also no real accountability. If my database on Heroku (or another provider) is compromised it could have a massive impact on my business, but at most it might mean losing a customer for my hosting provider, and maybe some bad press (not really enough though IMO). So the incentives aren't perfect.
I mean, there have been numerous incidents of a random developer having copies of customer data on their systems, or accidentally opening up a database or an Elasticsearch instance to the world.
And I'm afraid the only way to help mitigate that is to restrict what an individual can access and do on the one hand, and bureaucracy on the other. And full-time staff whose only job is to maintain security and juggle access rights around.
Either cloud providers need to assume more responsibility for security, or a federal agency like the FBI or NIST needs to be more proactively engaged in improving the security posture of cloud-hosted US corps.
So no different from every VC funded startup (or startup seeking VC funding) then?
The sentiment of imposing tighter regulations around data security feels counter to the general idea that the lack of regulation around data security (e.g. strong data protection laws) is what allows the US tech industry to dominate compared to its EU equivalents.
I'm not disagreeing with this, I'm just pointing out the contradiction and wondering how those who believe the latter would reconcile that belief with demanding the former.
Can some experienced security professionals weigh in on the cultural and organizational factors that allow this kind of major breach to go unnoticed for a week, especially at a reputable company like Heroku?
I'm not asking this rhetorically or in bad faith. It's a genuine question I have based on a project I did. I researched cybersec tech like SOAR, XDR, security logging, and SIEM in depth. On paper, the marketing for such tech gives the impression that by using them, such breaches can be detected and prevented in real-time. But there seems to be a mismatch between the claims and ground realities. If so, why?
In doing so, they typically lose everyone who set up the SIEM and ran the SecOps center. Everything "security" ends up looking the same.
They don't pay well, executives have pulled talks and fired speakers who do things they disagree with (the same executives are promoted and remain there - no accountability), and they've got a pretty bad rap within the industry.
Some questions:
1) Is this tech not enough to enable others (perhaps less experienced, or experienced but not on a particular product) to take over while maintaining the same posture?
2) What kind of additional (perhaps intangible) security does an experienced team add to the posture that gets lost when they leave?
3) As I understand them, things like risk frameworks, NIST CSF, security assessments are all supposed to anticipate people problems (resignations, malicious insiders, etc) and make the posture as independent of them as possible, probably relying on automated tools like XDR and SOAR to do their thing regardless of who's sitting at the console. Does it not work like that in reality?
Btw, thank you for your reply and insights (and to everyone else who replies)! Pardon my probably naive questions. I'm an outsider looking in and having trouble understanding this phenomenon of data breaches in the face of all the tech marketing.
Can you please give an example? I am genuinely curious about what kinds of things you mean.
It's beyond a joke how the "cyber" industry operates.
These new leaders are typical VPs who have long since lost any technical chops, and it's a huge task to explain any complex technical topic to them. On top of that, they don't bother understanding the fundamental business model and just want to push their agenda onto everyone. So they have added "security processes" that require checking boxes. The more boxes you check, the more metrics it generates, and the better the leader looks. These security leaders are so disconnected from the ground reality that they don't even realize that all they are doing is adding hurdles in the path of engineers without improving security at all.
Agreed, there's a lot of interesting stuff that came out of the Security (or related orgs) at Salesforce. Red team tools, chaos tools and JA3 which we use at my current work as well for SSL/TLS fingerprinting.
The three days after being notified before actually revoking the tokens isn't ideal either. Surely if GitHub comes to you and warns you of suspected unauthorised access, you'd spend a very limited amount of time investigating and then revoke the credentials to be on the safe side.
Heroku _used_ to have their own security team which was quite good and had some scary talented people on it. However, over the last 3 years or so Salesforce has been forcing Heroku to adopt Salesforce's operations practices, and this has not only wrecked productivity but completely destroyed morale and caused many, many of those talented people to quit. I for one decided to quit after only working there for around 8 months due to a horrific overreach by Salesforce into Heroku's operations.
Among other things, Salesforce forced us to adopt:
- their internal ticket tracking system, which _runs in an instance of salesforce_ (barf)
- their slack instance, which lost us many of our customizations and broke a bunch of integrations for weeks (I wouldn't be altogether surprised if this was one of the causes of the delay in notifying Herokai as to what was going on)
- their incident management process, which requires us to notify "Salesforce ops HQ" anytime there's an outage that meets certain criteria.
This last one was especially bad, and meant that we no longer had full agency to act during incident response. In one incident I responded to, the problem got worse while we waited for Salesforce IM to spin up, so what would have been a 10-minute outage turned into a 2-hour outage because the issue got out of control.
In short, the problem isn't the people trying to administer Heroku; they're great folks under a lot of pressure with very few resources. The problem is, and has always been, Salesforce's "leadership" deciding what's best for a cloud platform they couldn't give less of a damn about.
On 17 May 2019, Salesforce performed maintenance on their databases that cleared permission sets for users. My team was able to piece together that the incident happened at about 0200 CDT, and Salesforce didn't take ANY noticeable action for at least 9 hours, when they locked all customers out of the platform. Salesforce "fixed" the issue, which meant our Admins had to go in and reapply a bunch of profile settings... no big deal, right? Just a little bit of work for everyone to fix their own accounts. Salesforce acted like it wasn't a big deal.
Wrong.
If you were a Salesforce customer that built a tool using the Portal or Community tools Salesforce provides for external users, there was a 9-hour window when a customer could log in and, instead of seeing the data you were sharing with them, see all data for all users. The permissions indicating that a user should only be able to see their own data were gone.
The only reason we knew about this was because we were paying extra for advanced logging. We were able to see a few of our users logged in during this time and looked at customer records they should not have had access to.
Salesforce stood fast that exposing data through their Community and Portal tools did not constitute a breach, or even a violation of their SOC 2 Type II compliance. We were lucky that the only people who had access at the time were licensed partners. Nevertheless, our users lost their jobs and were stripped of their licenses.
Anyone that was using those tools at the time for any sort of direct customer interaction that shared order history, customer engagement, referral programs, etc. was not so lucky; doubly so if they weren't paying for advanced logging and/or didn't know what to look for. Salesforce was more concerned about covering up their mistakes than they were about telling their customers that there was a problem.
Seeing the Heroku notification page gives me PTSD. This looks all-too-familiar to me and I sympathize with those affected by this. I still feel like they were negligent back then, and I wish I knew who to tell to warn others.
Why? Maybe I'm assuming good intentions, but a) did they know they were seeing records they shouldn't have; and b) did they report it? Even with yes and no, firing for that seems a little much.
I don't recall there being any notes of material deficiency in their SOX reporting for the fiscal year either.
What else was in this database? Typically the password field is stored alongside the rest of the user record. So was this the entire customer database that was stolen? Usernames, emails, salted/hashed passwords, what else?
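For reference, "salted and hashed" typically means something like this sketch (stdlib only; the iteration count and the example password are illustrative, not Heroku's actual scheme):

```python
import hashlib
import secrets

def hash_password(password, salt=None):
    # A unique random salt per user means identical passwords still
    # produce different stored hashes, defeating precomputed tables.
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

# What a stolen row would contain: the salt and hash, never the password itself.
salt, stored = hash_password("hunter2")

# Verification re-derives the hash with the stored salt and compares.
assert hash_password("hunter2", salt)[1] == stored
```

Even stolen, hashes like these have to be brute-forced per user, which is why forced resets are a mitigation rather than proof the passwords themselves are known.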
I feel like at some point they're just going to go completely radio silent because the extent of the breach will become such that they'll have no choice but to just lawyer up.
> According to GitHub, the threat actor began enumerating metadata about customer repositories with the downloaded OAuth tokens on April 8, 2022. On April 9, 2022, the attacker downloaded a subset of the Heroku private GitHub repositories from GitHub, containing some Heroku source code.
>I feel like at some point they're just going to go completely radio silent because the extent of the breach will become such that they'll have no choice but to just lawyer up.
I feel like they are heading this route as well. Possibly even withholding information in order to save the company from mass exodus due to the incident. I'm sure they'll be fine.
While the nature of the limited things they had disclosed to date pointed to this situation, part of me wanted to believe it wasn't as bad as I was assuming. And now the trend line on this suggests I should have already done everything I've outlined above. And I have low confidence anybody is going to be proactive in telling me until it's absolutely obvious that these things have been compromised and exploited.
This is gonna suck.
This is where you are supposed to have encrypted as many data fields of the record as possible, in addition to the conventional database encryption that encrypts the database as a whole.
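A minimal sketch of what I mean (this assumes the third-party `cryptography` package; the field names are made up): each sensitive column is encrypted with an application-held key, so a raw database dump yields ciphertext rather than usable values.

```python
from cryptography.fernet import Fernet

# In practice the key lives in a KMS/secrets manager, never next to the data.
key = Fernet.generate_key()
f = Fernet(key)

record = {"email": "user@example.com", "github_token": "gho_example"}
encrypted = {k: f.encrypt(v.encode()) for k, v in record.items()}

# A dump of `encrypted` alone is useless without the key;
# the application decrypts on read.
assert f.decrypt(encrypted["github_token"]).decode() == "gho_example"
```

The tradeoff is that a compromised application server (which holds the key) can still read everything; field encryption mainly protects against exactly the scenario here, where only the database is exfiltrated.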
If that's the case...yikes.
They can't; they surely have EU customers, so they have to follow GDPR disclosure rules, which they may already have run afoul of.
Who knows how long Heroku's internal systems were compromised.
A lot has changed at Heroku in the past 8 years since I left, particularly in the direction of being subsumed into the greater Salesforce org. My working assumption is that everything left there is being done "the Salesforce way" at this point. Take that to mean what you will, but it seems pretty clear we're long past the days of openly communicating with customers as quickly as you have relevant/important information to share with them.
>On April 7, 2022, a threat actor obtained access to a Heroku database and downloaded stored customer GitHub integration OAuth tokens. Access to the environment was gained by leveraging a compromised token for a Heroku machine account. According to GitHub, the threat actor began enumerating metadata about customer repositories with the downloaded OAuth tokens on April 8, 2022. On April 9, 2022, the attacker downloaded a subset of the Heroku private GitHub repositories from GitHub, containing some Heroku source code.
...
> Separately, our investigation also revealed that the same compromised token was leveraged to gain access to a database and exfiltrate the hashed and salted passwords for customers’ user accounts. For this reason, Salesforce is ensuring all Heroku user passwords are reset and potentially affected credentials are refreshed. We have rotated internal Heroku credentials and put additional detections in place. We are continuing to investigate the source of the token compromise.
I’m keen to get off Heroku, but waiting for one of the newer alternatives (Render/Fly+others) to implement WAL point in time restore for Postgres. It’s the only thing keeping me on Heroku now, but is indispensable.
Anyone here from them have any update on when we could see that feature made available?
The only weird hitch with Google Cloud Run (and their other serverless products) is that you need to either use public IPs for e.g. Memorystore cache or other things in your network (minus your database), or set up a VPC Access Connector [0]. Admittedly, that was easy once I realized I needed it, but it was very annoying to figure out (because Django's default Redis socket timeout is "never"...).
You also don't get SSH access, as it's fully managed. K8s is still the easiest version where you get that, unfortunately.
[0]: https://cloud.google.com/vpc/docs/configure-serverless-vpc-a...
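The fix on my end looked roughly like this django-redis cache config (the IP is a placeholder for a private Memorystore address): explicit socket timeouts make an unreachable Redis fail fast instead of hanging forever.

```python
# Django settings fragment (assumes the django-redis backend).
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://10.0.0.3:6379/0",  # placeholder private Memorystore IP
        "OPTIONS": {
            "SOCKET_CONNECT_TIMEOUT": 5,  # seconds to establish the connection
            "SOCKET_TIMEOUT": 5,          # seconds for individual read/write ops
        },
    }
}
```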
From what I’ve seen in the market, Fly, Render, and Railway are the only real Heroku competitors. All seem to be missing a few critical pieces of functionality that are preventing me from migrating. Railway doesn’t appear to allow me to run Python commands remotely, and they don’t have automated nightly backups. Fly and Render each have a handful of missing features of their own. It would be amazing if one of these services came out with 1:1 feature parity with Heroku. I’d move in a heartbeat.
https://cloud.google.com/run/docs/quickstarts
https://cloud.google.com/appengine/docs/standard
Unfortunately some things are more difficult than necessary =/ OTOH the GCP ecosystem, network, and data-center presence are vast.
They also have managed postgres db with very powerful tools.
Coherence (disclosure- I’m a cofounder) - https://www.withcoherence.com - is one option. A defined workflow for production-quality full-stack web apps with dev and production built in alongside automated test environments, including CI/CD and cloud IDEs - all configured with one high-level YAML. We’re in a very early private beta on google cloud right now - if you’re interested, please check out our site above and let us know!
I'm not saying it would have been impossible to detect the first action, but it's substantially easier to detect the second.
I'm pretty confident Google will pick up on someone trying to brute-force a 6-character password, and that Google will notice connections from new/different IPs or browsers. That's because Google asks for my 2FA in various situations but doesn't annoy me by asking for 2FA all the time.
I use one govt system that has something like a 14-character password requirement. For even more security, if you don't log in for 90 days your account goes inactive and the password EXPIRES! Very secure, you say? Well, to regain access you have to provide the answer to a security question: favorite pet! That's a 5-letter word that doesn't change (and is probably pretty guessable).
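The arithmetic on that recovery path is stark. A back-of-the-envelope sketch (assuming a lowercase pet name and roughly 94 printable ASCII characters for the password alphabet):

```python
# Keyspace of the 5-letter lowercase "favorite pet" answer vs. the
# 14-character password it can bypass.
pet_keyspace = 26 ** 5
password_keyspace = 94 ** 14

# Small enough to enumerate offline in seconds.
assert pet_keyspace == 11_881_376

# The account is only as strong as its weakest authentication path.
assert pet_keyspace < password_keyspace
```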
Here is another example:
"(b) Information systems must be designed to require passwords to be changed not less frequently than every sixty (60) days." - SBA IT Security Policy - 90 47 4
I'm sure for some it may be an unwieldy amount of work but for others (depending on tech stack etc.) it ought to be fairly doable. In the long run it'll save money too.
We (like many others I assume) pay more for Heroku than AWS as it allows us to “outsource” our dev ops. We are a small team (sub 15) with a decent sized, decade old app. We’ve had it on AWS before (and used platforms like BuildKite) but both required much more overhead (in terms of employee salary). Anecdotally I’ve heard the same from friends, though I understand AWS works well and is cheaper if you know AWS well.
I looked at Render but they move data out of the chosen region (so out of EU) and that is a huge issue for our clients. They also proxy through Cloudflare which is another big problem when you are dealing with sensitive data.
I haven't heard _anything_ from Heroku on this, my colleague has been getting updates since this started...
We are both admins of our companies account.
Given how much of a web of interdependent and undocumented pain Heroku is, I'm not surprised it's taken this long. A security team without any context on Heroku must have had to trace through everything system by system. Especially if core-db got popped.
https://www.digitalocean.com/products/app-platform
Railway.app is pretty nice, with a very slick interface; this has the most "Heroku feel".
It's been solid since I got up and running but took me about a day to get a simple CRUD app moved over to it.
If they can improve the documentation it'll be much better. Right now I'm not considering it for future projects though just because platforms like Fly.io and Render seem to have better docs and additional functionality that DO Apps doesn't have yet.
^ Note: I used to work for Aptible. Great company. Great people. Now working for one of their spin-outs.
Lots of awesome products here, I'd argue that only replit is a true 10x change from the Heroku innovations in terms of providing a next-gen developer experience.
I'm the cofounder of a new company called Coherence, that we think creates a new direction and offers a better platform for the next leap forward. By integrating from dev to prod and capturing the whole SDLC, as well as by operating in your own cloud, we're focused on delivering the best developer experience possible, without compromising anywhere else. Check us out at https://www.withcoherence.com. We're in an early closed beta so not yet a fit for all teams, but feedback is welcome!
On a scale of Slack to Oracle on breach notifications, this was definitely closer to Oracle.
I went ahead and reset my PlanetScale passwords and moved my app out of Heroku. But I only had two apps hosted there; I feel for anyone with a large number of apps on there.
Salesforce is doing a terrible job managing this lucrative platform. I have no idea why they'd muck up a good service like that. They have some plugins and a Postgres connector, but the drive to innovate, the drive to even care, stops there. All this news acts as a reminder that I should move my code to a "real" cloud provider.
Any idea if this involves AWS EC2 Instance Roles? It’s incredibly convenient, but has got to be the scariest feature to enable on a platform that allows arbitrary user code to execute.
Newer integrations like GitHub Apps are more granular and can restrict the scope, and SSH deploy keys are an option for other purposes, but the tokens issued for the Heroku Dashboard specifically can write to the public repos of a user or org.
This morning, I was unable to log into my account and had to reset again. And update our services again.
> Due to the nature of this issue, you may be required to reset your passwords again in the future.
We are a small team and were hoping to migrate all our infra to Heroku in the upcoming quarter.
Not all hackers are threat actors.