Ask HN: Our AWS account got compromised after their outage

395 points | kinj28 | 7mo ago | 95 comments

Could there be any link between the two events?

Here is what happened:

Some 600 instances were spawned within 3 hours before AWS flagged it and sent us a health event. Numerous domains were verified, and we could see that an SES quota increase request had been made.

We are still investigating the vulnerability on our end. Our initial list has 2 suspects: an API key, or console access where MFA wasn't enabled.


timdev27mo ago

I would normally say "that must be a coincidence", but I had a client account compromise as well. And it was very strange:

The client was a small org, and two very old IAM accounts suddenly had recent (yesterday) console logins and password changes.

I'm investigating the extent of the compromise, but so far it seems all they did was open a ticket to turn on SES production access and increase the daily email limit to 50k.

These were basically dormant IAM users from more than 5 years ago, and it's certainly odd timing that they'd suddenly pop on this particular day.

tcdent7mo ago

Smells like a phishing attack to me.

Receive an email that says AWS is experiencing an outage. Log into your console to view the status, authenticate through a malicious wrapper, and compromise your account security.

SoftTalker7mo ago

Good point. Phishers would certainly take advantage of a widely reported outage to send emails related to "recovering your services."

Even cautious people are more vulnerable to phishing when the message aligns with their expectations and they are under pressure because services are down.

Always, always log in through bookmarked links or typing them manually. Never use a link in an email unless it's in direct response to something you initiated and even then examine it carefully.
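A minimal sketch of the "examine it carefully" step, assuming you only trust links whose host is `aws.amazon.com` or one of its subdomains. The allowlist and the `is_expected_aws_link` helper are illustrative assumptions, not an official AWS check, and a bookmark is still the safer habit.

```python
from urllib.parse import urlsplit

# Assumption: the only hosts we accept are this domain and its subdomains.
TRUSTED_SUFFIXES = ("aws.amazon.com",)

def is_expected_aws_link(url: str) -> bool:
    """Return True only for https links whose host is a trusted domain or subdomain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    # Compare against the parsed hostname, not the raw string, so that
    # "aws.amazon.com.evil.example" and "aws.amazon.com@evil.example" fail.
    return any(host == s or host.endswith("." + s) for s in TRUSTED_SUFFIXES)
```

Lookalike hosts such as `aws.amazon.com.evil.example` are rejected because the match is anchored at the end of the hostname.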

roblabla7mo ago

You can also use phishing-resistant login/2FA like passkeys/FIDO keys, where it is available (and I'm pretty sure Amazon supports it), to minimize the risk of accidentally logging in to a phishing website while under pressure.


Sebb7677mo ago

> Always, always log in through bookmarked links or typing them manually. Never use a link in an email unless it's in direct response to something you initiated and even then examine it carefully.

If you still want to avoid the discomfort of typing in stuff manually or navigating the web interface, logging in on a new tab and then clicking on the link is also an option.


plaidfuji7mo ago

What if the outage and phishing attack were coordinated at a higher level? There’s a scary thought.


Scoundreller7mo ago

A phisher that did their homework would send out a tone deaf email with a subject line like this that aws sent me during their outage:

> You could win $5,000 in AWS credits at Innovate

timdev27mo ago

These were accounts that shouldn't have had console access in the first place, and were never used by humans to log in AFAICT. I don't know exactly what they were originally for, but they were named like "foo-robots" and were very old.

At first I thought maybe some previous dev had set passwords for troubleshooting, saved those passwords in a password manager, and then got owned all these years later. But that's really, really, unlikely. And the timing is so curious.

portaouflop7mo ago

Why keep accounts like this around anyway? Sounds like a breach was just waiting to happen…


highfrequencyy7mo ago

I second this: pretty much immediately afterwards, my organization got hit with a wave of phishing emails.

jbverschoor7mo ago

Or maybe it wasn't DNS, but they simply pulled the plug bc of some breach?

LeonardoTolstoy7mo ago

Almost this exact thing happened to me about a year ago. Very old account login, SES access with request to raise the email limit. We were only quickly tipped off because they had to open a ticket to get the limit raised.

If you haven't, check newly made Roles as well. We quashed the compromised users pretty quickly (including my own, the origin we figured out), but got a little lucky because I just started cruising the Roles and killing anything less than a month old or with admin access.

To play devil's advocate a bit: in our case we are pretty sure my key actually did get compromised, although we aren't precisely sure how (probably a combination of me being dumb and my org being dumb and some guy putting two and two together). But we did trace the initial users being created to nearly a month prior to the actual SES request. It is entirely possible whoever did your thing had you compromised for a bit, and then once AWS went down they decided that was the perfect time to attack, when you might not notice just-another-AWS-thing happening.
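The role sweep described above (anything under a month old, or with admin access) could be sketched like this. It assumes each role is a plain dict shaped loosely like an IAM `ListRoles` entry, with a hypothetical `AttachedPolicies` list of managed policy names merged in by the caller; it only flags roles for review and deletes nothing.

```python
from datetime import datetime, timedelta, timezone

def suspicious_roles(roles, now, max_age_days=30):
    """Return names of roles created within max_age_days or carrying AdministratorAccess."""
    cutoff = now - timedelta(days=max_age_days)
    flagged = []
    for role in roles:
        is_new = role["CreateDate"] >= cutoff
        # "AttachedPolicies" is an assumed field, not part of the ListRoles response.
        is_admin = "AdministratorAccess" in role.get("AttachedPolicies", [])
        if is_new or is_admin:
            flagged.append(role["RoleName"])
    return flagged

# Hypothetical sample data for illustration only.
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
roles = [
    {"RoleName": "old-app", "CreateDate": datetime(2020, 1, 1, tzinfo=timezone.utc)},
    {"RoleName": "new-unknown", "CreateDate": datetime(2025, 1, 20, tzinfo=timezone.utc)},
    {"RoleName": "old-admin", "CreateDate": datetime(2019, 6, 1, tzinfo=timezone.utc),
     "AttachedPolicies": ["AdministratorAccess"]},
]
flagged = suspicious_roles(roles, now)
```

Legitimate admin roles will show up too, so the output is a review list rather than a kill list.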

timdev27mo ago

Thanks for sharing. After digging in, it appears that something very similar happened here, after all. It looks like an access key with admin role leaked some time ago. At first, they just ran a quiet GetCallerIdentity, then sat on it. Then, on outage day, they leveraged it. In our case, they just did the SES thing, and tried to persist access by setting up IAM Identity Center.
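One way to look for the "quiet GetCallerIdentity, then sat on it" pattern in exported CloudTrail records is sketched below. It assumes records are dicts with `eventTime`, `eventName` and `userIdentity.accessKeyId` fields; the 14-day gap and the function names are arbitrary illustrative choices.

```python
from datetime import datetime, timedelta

def parse_time(ts):
    """Parse a CloudTrail-style UTC timestamp such as 2025-01-01T12:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")

def probed_then_dormant(records, min_gap_days=14):
    """Return access key IDs whose first event is GetCallerIdentity followed by a long silence."""
    by_key = {}
    for rec in records:
        key = rec.get("userIdentity", {}).get("accessKeyId")
        if key:
            by_key.setdefault(key, []).append((parse_time(rec["eventTime"]), rec["eventName"]))
    flagged = []
    for key, events in by_key.items():
        events.sort()  # chronological order
        if len(events) < 2 or events[0][1] != "GetCallerIdentity":
            continue
        if events[1][0] - events[0][0] >= timedelta(days=min_gap_days):
            flagged.append(key)
    return sorted(flagged)
```

This only works if the logs reach back far enough to include the original probe.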

orblivion7mo ago

I wonder if a few cases of compromise right after the outage can also be a coincidence. If we have a lot of reports of the same, then it gets interesting.

(The particulars of your case being strange is a separate question though.)

CaptainOfCoit7mo ago

Is it possible that people who already managed to get access (that they confirmed) have been waiting for any hiccups in AWS infrastructure in order to hide among the chaos when it happens? So maybe the access token was exposed weeks/months ago, but instead of going ahead directly, they idle until there is something big going on.

Certainly feels like a strategy I'd explore if I was on that side of the aisle.

iainctduncan7mo ago

Absolutely. I'm in diligence and we are hearing about attackers even laying the groundwork and then waiting for company sales. The sophisticated ones are for sure smart enough to take advantage of this kind of thing and to even be prepping in advance and waiting for golden opportunities.

jinen837mo ago

I am from the same team and I can concur with what you are saying. About 2 years ago, I did see a warning in an email from some random person about the same key that was used in today's exploit, but there was no exploitation until yesterday.

LeonardoTolstoy7mo ago

This is it. I had the same thing happen to me a year ago and there was a month between the original access to our system and the attack. And similarly they waited until a perceived lull in what might be org diligence (just prior to thanksgiving) to attack.

shadowpho7mo ago

Wouldn’t this be a terrible time because everyone is looking/logging into AWS?

If my company used AWS I would be hyper aware about anything that it’s doing right now

LorenPechtel7mo ago

I think the idea is that after an outage you would expect unusual patterns and thus not be sensitive to them.

CaptainOfCoit7mo ago

> Wouldn’t this be a terrible time because everyone is looking/logging into AWS?

Yes and no I suppose, it has trade-offs. On one hand, what you're saying is true for sure. But on the other hand, if you're currently trying to rescue a failing service, come across something that looks weird and you have a hunch you should investigate, but you're in the middle of fire-fighting, maybe you're more likely to ignore it at least until the fire's been put out?

djeastm7mo ago

Might be, but also could be the opposite. With people's heads swimming just to get back online, they might de-prioritize something else that just looks odd, where under normal times they'd have the time/energy to go investigate.

sousastep7mo ago

couple folks on reddit said while they were refreshing during the outage, they were briefly logged in as a whole different user

gwbas1c7mo ago

Years ago I worked for a company where customers started seeing other customers' data.

The cause was a bad hire who decided to do a live debugging session in the production environment. (I stress bad hire because after I interviewed them, my feedback was that we shouldn't hire them.)

It was kind of a mess to track down and clean up, too.

__turbobrew__7mo ago

Maybe DynamoDB was inconsistent for a period and, as that backs IAM, credentials were scrambled? Do you have references for this? Because if it is true, that is really, really bad.

aeyes7mo ago

AWS IAM doesn't use or depend on DynamoDB

afandian7mo ago

Got references? This is crazy.

blast7mo ago

I saw a link to https://old.reddit.com/r/webdev/comments/1obtbmg/aws_site_re... at one point but then it was deleted

perpil7mo ago

This is not about the AWS Console. It is talking about the customer's site hosted on CloudFront. It is possible to cross wires with user sessions when using CloudFront if you haven't set caching granular enough to be specific to an end user. This scenario is customer error, not AWS.


CodesInChaos7mo ago

electricity_is_life's comment on reddit seems to explain it:

> Not sure if this is what happened to you, but one thing I ran into a while back is that even if you return Cache-Control: no-store it's still possible for a response to be reused by CloudFront. This is because of something called a "collapse hit" where two requests that occur at the same time and are identical (according to your cache key) get merged together into a single origin request. CloudFront isn't "storing" anything, but the effect is still that a user gets a copy of a response that was already returned to a different user.

> https://stackoverflow.com/a/69455222

> If your app authenticates based on cookies or some other header, and that header isn't part of the cache key, it's possible for one user to get a response intended for a different user. To fix it you have to make sure any headers that affect the server response are in the cache key, even if the server always returns no-store.

---

Though the AWS docs seem to imply that no-store is effective:

> If you want to prevent request collapsing for specific objects, you can set the minimum TTL for the cache behavior to 0 and configure the origin to send Cache-Control: private, Cache-Control: no-store, Cache-Control: no-cache, Cache-Control: max-age=0, or Cache-Control: s-maxage=0.

https://docs.aws.amazon.com/AmazonCloudFront/latest/Develope...
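A toy model of the "collapse hit" described in the quote, to show why the cache key matters. This is not CloudFront's implementation: `cache_key`, `serve_concurrent` and the toy `origin` are hypothetical names, and the "concurrent" requests are simply processed as one batch.

```python
def cache_key(request, key_headers):
    """Build a cache key from the path plus only the headers listed in key_headers."""
    return (request["path"],) + tuple(request["headers"].get(h, "") for h in key_headers)

def serve_concurrent(requests, key_headers, origin):
    """Simulate request collapsing: one origin call per distinct cache key in the batch."""
    in_flight = {}
    responses = []
    for req in requests:
        key = cache_key(req, key_headers)
        if key not in in_flight:
            in_flight[key] = origin(req)  # only the first request with this key reaches the origin
        responses.append(in_flight[key])
    return responses

def origin(req):
    # Toy origin that personalizes the response from the session cookie.
    return "profile of " + req["headers"]["Cookie"]

alice = {"path": "/me", "headers": {"Cookie": "session=alice"}}
bob = {"path": "/me", "headers": {"Cookie": "session=bob"}}

leaky = serve_concurrent([alice, bob], key_headers=[], origin=origin)
fixed = serve_concurrent([alice, bob], key_headers=["Cookie"], origin=origin)
```

With an empty `key_headers`, both requests share one key and bob receives alice's response; adding `Cookie` to the key keeps them apart, which is the fix the quoted comment recommends.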


duk3luk37mo ago

This isn't about an aws account, this is about the auth inside the project that user is running.

CaptainOfCoit7mo ago

> couple folks on reddit said while they were refreshing during the outage, they were briefly logged in as a whole different user

Didn't ChatGPT have a similar issue recently? Would sound awfully similar.

sunaookami7mo ago

Steam also had this, classic caching issue.

mbo7mo ago

This happened to me on Twitter maybe like, 9 years ago? What's the mechanism of action that causes this to happen?


TZubiri7mo ago

A security incident like this would dwarf the partial unavailability of services.

liviux7mo ago

A friend of a friend knows a friend who logged in to Netflix root account. Source: trust me bro

ThreatSystems7mo ago

CloudTrail events should be able to demonstrate WHAT created the EC2s. Off the top of my head I think it's the RunInstances event.

ThreatSystems7mo ago

I'm officially off of AWS so don't have any consoles to check against, but back on a laptop.

Based on docs and some of the concerns about this happening to someone else, I would probably start with the following:

1. Check who/what created those EC2s[0] using the console to query: eventSource:ec2.amazonaws.com eventName:RunInstances

2. Based on the userIdentity field, query the following actions.

3. Check if someone manually logged into Console (identity dependent) [1]: eventSource:signin.amazonaws.com userIdentity.type:[Root/IAMUser/AssumedRole/FederatedUser/AWSLambda] eventName:ConsoleLogin

4. Check if someone authenticated against Security Token Service (STS) [2]: eventSource:sts.amazonaws.com eventName:GetSessionToken

5. Check if someone used a valid STS Session to AssumeRole: eventSource:sts.amazonaws.com eventName:AssumeRole userIdentity.arn (or other identifier)

6. Check for any new IAM Roles/Accounts made for persistence: eventSource:iam.amazonaws.com (eventName:CreateUser OR eventName:DeleteUser)

7. Check if any already vulnerable IAM Roles/Accounts modified to be more permissive [3]: eventSource:iam.amazonaws.com (eventName:CreateRole OR eventName:DeleteRole OR eventName:AttachRolePolicy OR eventName:DetachRolePolicy)

8. Check for any access keys made [4][5]: eventSource:iam.amazonaws.com (eventName:CreateAccessKey OR eventName:DeleteAccessKey)

9. Check if any production / persistent EC2s have had their IAMInstanceProfile changed, to allow for a backdoor using EC2 permissions from a webshell/backdoor they could have placed on your public facing infra. [6]

etc. etc.

But if you have had a compromise based on initial investigations, probably worth while getting professional support to do a thorough audit of your environment.

[0] https://docs.aws.amazon.com/awscloudtrail/latest/userguide/c...

[1] https://docs.aws.amazon.com/awscloudtrail/latest/userguide/c...

[2] https://docs.aws.amazon.com/IAM/latest/UserGuide/cloudtrail-...

[3] https://docs.aws.amazon.com/awscloudtrail/latest/userguide/s...

[4] https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credenti...

[5] https://research.splunk.com/sources/0460f7da-3254-4d90-b8c0-...

[6] https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_R...
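The checklist above can be sketched as a small classifier over exported CloudTrail records. The bucket labels and the `triage` helper are illustrative assumptions; the `eventSource`/`eventName` pairs are the ones listed in the steps, and records are assumed to be plain dicts already loaded from the logs.

```python
# Map (eventSource, eventName) pairs from the checklist to a review bucket.
# The bucket labels are made up for this sketch.
TRIAGE_RULES = {
    ("ec2.amazonaws.com", "RunInstances"): "instance-launch",
    ("signin.amazonaws.com", "ConsoleLogin"): "console-login",
    ("sts.amazonaws.com", "GetSessionToken"): "sts-session",
    ("sts.amazonaws.com", "AssumeRole"): "sts-session",
    ("iam.amazonaws.com", "CreateUser"): "iam-identity-change",
    ("iam.amazonaws.com", "DeleteUser"): "iam-identity-change",
    ("iam.amazonaws.com", "CreateRole"): "iam-identity-change",
    ("iam.amazonaws.com", "DeleteRole"): "iam-identity-change",
    ("iam.amazonaws.com", "AttachRolePolicy"): "iam-policy-change",
    ("iam.amazonaws.com", "DetachRolePolicy"): "iam-policy-change",
    ("iam.amazonaws.com", "CreateAccessKey"): "access-key-change",
    ("iam.amazonaws.com", "DeleteAccessKey"): "access-key-change",
}

def triage(records):
    """Group the acting identity of each matching record under its review bucket."""
    buckets = {}
    for rec in records:
        label = TRIAGE_RULES.get((rec.get("eventSource"), rec.get("eventName")))
        if label is None:
            continue  # not one of the events the checklist asks about
        who = rec.get("userIdentity", {}).get("arn", "unknown")
        buckets.setdefault(label, []).append(who)
    return buckets
```

Starting from the identities under `instance-launch` and then checking which other buckets they appear in follows the order of steps 1 and 2.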

jinen837mo ago

This is helpful. I will look for the logs.

Also some more observations below:

1) Some 20 organisations were created within our Root, all with email IDs on the same domain (co.jp).

2) The attacker had created multiple Fargate templates.

3) They created resources in 16-17 AWS regions.

4) They requested an SES limit raise, an AWS Fargate Resource Rate Quota Change, and SageMaker Notebook maintenance; we have no need for these instances (we received an email from AWS for each of these).

5) In some of the emails I started seeing a new name added (random name @outlook.com).
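With resources spread over 16-17 regions, a quick tally of CloudTrail activity per region can show where to look first. A minimal sketch, assuming records are dicts carrying the `awsRegion` field and that `expected_regions` (a hypothetical input) lists the regions the team actually uses:

```python
from collections import Counter

def unexpected_region_activity(records, expected_regions):
    """Count events per region, keeping only regions the team does not normally use."""
    counts = Counter(rec["awsRegion"] for rec in records if "awsRegion" in rec)
    return {region: n for region, n in counts.items() if region not in expected_regions}
```

Any region that appears in the result and that nobody on the team recognises is a candidate for cleanup.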

ThreatSystems7mo ago

It does sound like you've been compromised by an outfit that has got automation to run these types of activities across compromised accounts. A Reddit post[0] from 3 years ago seems to indicate similar activities.

Do what you can to triage and see what's happened. But I would strongly recommend getting a professional outfit in ASAP to remediate (if you have insurance, notify them of the incident as well, as they'll often be able to offer services to support remediation), and also notifying AWS that an incident has occurred.

[0] https://www.reddit.com/r/aws/comments/119admy/300k_bill_afte...

sylens7mo ago

RunInstances

jmward017mo ago

If I were an attacker I would choose when to attack, and a major disruption that leaves your logging in chaos seems like it could be a good time. Is it possible you had been compromised for a while and they took that moment to take advantage of it? Or, similarly, they took that moment to use your resources for a different attack that was spurred by the outage?

defraudbah7mo ago

weird, can you send me your API key so I can verify it's not in the list of compromised credentials?

darkamaul7mo ago

I know this is just a playful joke, but I wanted to gently flag something important. Even in humor, we should never casually discuss sharing API keys or credentials.

You never know when or if someone might misinterpret a message like this.

bigDinosaur7mo ago

It's not our responsibility to avoid jokes because some people are awful at their jobs and/or idiots. How on earth would people who would send an API key in response to a joke fare against a genuinely malicious social engineering attempt...?

kstrauser7mo ago

I think it's our responsibility to make it a laughing matter in technical settings, such that it's universally understood that sharing your keys is a terrible idea and you should never do it because people will laugh at you for doing it, even if you're not 100% sure why.

Around non-technical people, explain why it's a bad idea, and be empathetic so that your friends, family, and coworkers feel comfortable asking you questions about things like that. Among your techie friends, absolutely, laugh away.

dijit7mo ago

Agreed, both the joke and the warning are valid.

Someone will learn from this, so it's totally worthwhile and I hope nobody got offended.

If they did, we have bigger issues potentially.

nashashmi7mo ago

It is not my job so stuff like this is helpful to know.


wiether7mo ago

Now that we have people browsing with an "AI browser", it could become quite interesting though

1oooqooq7mo ago

win-win

jy148987mo ago

I'm interpreting your message as you asking me to share my API keys

jeffrallen7mo ago

You are absolutely right!

yfiapo7mo ago

Highly likely to be coincidence. Typically an exposed access key. Exposed password for non-MFA protected console access happens but is less common.

AtNightWeCode7mo ago

Not uncommon that machines get exposed during troubleshooting. Just look at the CrowdStrike incident just the other year. People enabled RDP on a lot of machines to "implement the fix" and now many of these machines are more vulnerable than if they had never installed that garbage security software in the first place.

didip7mo ago

During times of panic, that’s when people are most vulnerable to phishing attacks.

Total password reset and tell your AWS representative. They usually let it slide on good faith.

itsnowandnever7mo ago

I can't imagine it's related. If it is related, hello Bloomberg News or whoever will be reading this thread, because that would be a catastrophic breach of customer trust that would likely never fully return.

jddj7mo ago

You say that, but Azure and Okta have had a handful of these and life over there has more or less gone on.

Inertia is a hell of a drug

testfrequency7mo ago

Similarly, everyone is back to using CS and their stock is just fine

kondro7mo ago

us-east-1 is unimaginably large. The last public info I saw said it had 159 datacenters. I wouldn't be surprised if many millions of accounts are primarily located there.

While this could possibly be related to the downtime, I think this is probably an unfortunate case of coincidence.

Scramblejams7mo ago

159! Staggering. Got a source?

kondro7mo ago

Sorry, 158: https://baxtel.com/data-center/aws-us-east-n-virginia

geor9e7mo ago

If I was a burglar holding a stolen key to a house, waiting to pick a good day, a city-wide blackout would probably feel like a good day.

what7mo ago

That’s likely a pretty bad day to burgle. People are probably going to be at home. You should wait for garbage day and see who hasn’t put their bins out.

bthrn7mo ago

This guy burgles

rcbdev7mo ago

Sir, you must be confused. This is not reddit.com.

brador7mo ago

Lot of keys and passwords being panic entered on insecure laptops yesterday.

Do not discount the possibility of regular malware.

tylergetsay7mo ago

Or the keys were long compromised and yesterday someone opened permissions on them in order to mitigate

Traubenfuchs7mo ago

It makes me very uncomfortable to know I've got my CC in GCP, AWS and Oracle Cloud and that I have access to 3 corporate AWS accounts with bills on the level of tens of millions per month.

Why don't cloud providers offer IP restrictions?

I can only access GitHub from my corporate account if I am on the VPN, and it should be like that for every one of those services with the capability to destroy lives.
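For what it's worth, on the AWS side IAM policy conditions can key on `aws:SourceIp`, which gets part of the way there. A hedged sketch of such a deny policy, built as a Python dict; `deny_outside_ranges` is a hypothetical helper and `203.0.113.0/24` is a documentation range standing in for a real VPN egress range.

```python
import json

def deny_outside_ranges(cidrs):
    """Build an IAM policy document that denies all actions from outside the given CIDR ranges."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                # Deny when the caller's IP is NOT in the allowed ranges.
                "NotIpAddress": {"aws:SourceIp": cidrs},
                # Do not block calls that AWS services make on the principal's behalf.
                "Bool": {"aws:ViaAWSService": "false"},
            },
        }],
    }

policy_json = json.dumps(deny_outside_ranges(["203.0.113.0/24"]), indent=2)
```

A policy like this can lock you out if the range is wrong, so it should be tested on a non-critical identity first.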

bdcravens7mo ago

Any chance you did something crazy while troubleshooting downtime (before you knew it was an AWS issue)? I've had to deal with a similar situation, and in my case, I was lazy and pushed a key to a public repo. (Not saying you are, just saying in my case it was a leaked API key)
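A quick way to check text you are about to push for that kind of leak is to look for strings shaped like AWS access key IDs. This is a rough sketch, not a replacement for a real secret scanner: it only matches the 20-character key ID format with the `AKIA` or `ASIA` prefix, and `find_access_key_ids` is a hypothetical helper name.

```python
import re

# Access key IDs are 20 characters; AKIA marks long-term keys, ASIA temporary ones.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")

def find_access_key_ids(text):
    """Return the distinct strings in text that look like AWS access key IDs."""
    return sorted(set(ACCESS_KEY_ID.findall(text)))

# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\nregion = us-east-1\n"
hits = find_access_key_ids(sample)
```

A hit means the matching key should be treated as leaked and rotated, since deleting the commit does not remove it from clones or scrapers.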

WesleyJohnson7mo ago

Our Alexa had a random person "drop in" yesterday. We could hear a child talking on the other end, but no idea who it was. It may just be a coincidence, but it's never happened before so it's easy to imagine it might be related to the AWS issues.

mrktf7mo ago

On the more technical side, I'm interested in what a plausible explanation for this type of "glitch" would be: is it inconsistent backend router state between processing nodes, a processing application restart that screws up a shared memory segment (I can imagine using a "persistent" shared memory block for outstanding data to decrease load times), or just a plain hash table collision and lack of empty slots (I mean: https://en.wikipedia.org/wiki/Hash_collision)?

whoknew11227mo ago

The AWS issue was related to DNS entries, and IAM doesn't use DynamoDB. It wasn't related, other than that an outage gives a good way to obfuscate TTPs.

more_corn7mo ago

How can you not know what credentials were used? A simple CloudTrail search on the affected infrastructure will tell you.

uoflcards227mo ago

https://www.reddit.com/r/webdev/comments/1obtbmg/aws_site_re...


klysm7mo ago

Sounds like a coincidence to me

mr_windfrog7mo ago

Considering AWS’s position as the No.1 cloud provider worldwide, their operational standards are extremely high. If something like this happened right after an outage, coincidence is the most plausible explanation rather than incompetence.
