On the one hand, this is obviously the right decision. The number of giant data breaches caused by incorrectly configured S3 buckets is enormous.
But... every year or so I find myself wanting to create an S3 bucket with public read access so I can serve files out of it. And every time I need to do that I find something has changed and my old recipe doesn't work any more and I have to figure it out again from scratch!
Even if you have a terrible and permissive bucket policy or ACLs (legacy but still around) configured for the S3 bucket, if you have Block Public Access turned on - it won't matter. It still won't allow public access to the objects within.
If you turn it off but you have a well scoped and ironclad bucket policy - you're still good! The bucket policy will dictate who, if anyone, has access. Of course, you have to make sure nobody inadvertently modifies that bucket policy over time, or adds an IAM role with access, or modifies the trust policy for an existing IAM role that has access, and so on.
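For what it's worth, the recipe that currently works looks roughly like this boto3 sketch. The bucket name is a placeholder, and the exact PublicAccessBlock flag combination is my assumption of the minimal set you need to loosen:

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-public-assets-example"

# Block Public Access overrides everything, so the policy-related flags
# must be off for a public bucket policy to take effect at all.
s3.put_public_access_block(
    Bucket=bucket,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,         # keep legacy ACLs locked down
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": False,      # allow attaching a public policy
        "RestrictPublicBuckets": False,  # allow that policy to apply
    },
)

# The classic anonymous-read bucket policy.
s3.put_bucket_policy(
    Bucket=bucket,
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }),
)
```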
Yeah, what month?
Once I have that I can also ask it for the custom tweaks I need.
I now have the daunting challenge of deploying an Azure Kubernetes cluster with... shudder... Windows Server containers on top. There's a mile-long list of deprecations and missing features that were fixed just "last week" (or whatever). That is just too much work to keep up with for mere humans.
I'm thinking of doing the same kind of customised chatbot but with a scheduled daily script that pulls the latest doco commits, and the Azure blogs, and the open GitHub issue tickets in the relevant projects and dumps all of that directly into the chat context.
I'm going to roll up my sleeves next week and actually do that.
Then, then, I'm going to ask the wizard in the machine how to make this madness work.
Pray for me.
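For the curious, the daily pull I have in mind is nothing fancier than this sketch; the repos and URLs here are stand-ins rather than a vetted source list, and how the snapshot lands in the chat context is up to you:

```python
import datetime
import json
import urllib.request

SOURCES = {
    "docs_commits": "https://api.github.com/repos/MicrosoftDocs/azure-docs/commits",
    "aks_issues": "https://api.github.com/repos/Azure/AKS/issues?state=open",
}

def fetch(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

snapshot = {name: fetch(url) for name, url in SOURCES.items()}
path = f"context-{datetime.date.today()}.json"
with open(path, "w") as f:
    json.dump(snapshot, f)  # this file gets dumped into the chat context
```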
You're braver than me if you're willing to trust the LLM here - fine if you're ready to properly review all the relevant docs once you have code in hand, but there are some very expensive risks otherwise.
Hasn't it been this way for many years?
>Spot instances used to be much more of a bidding war / marketplace.
Yeah, because there's no bidding any more at all, which is great: you don't get those super-high price spikes as availability drops, where only the people who bid absurdly high to make sure they wouldn't be priced out could still get instances.
>You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.
This one was a nightmare, and it took ages to convince some of my more pig-headed coworkers in the past that they didn't need to do it any more. The funniest part is that they were storing their data as millions and millions of 10-100 KB files, so the S3 backend scaling wasn't the thing bottlenecking performance anyway!
>Originally Lambda had a 5 minute timeout and didn’t support container images. Now you can run them for up to 15 minutes, use Docker images, use shared storage with EFS, give them up to 10GB of RAM (for which CPU scales accordingly and invisibly), and give /tmp up to 10GB of storage instead of just half a gig.
This was/is killer. It used to be such a pain to have to manage pyarrow's package size if I wanted a Python Lambda function that used it. One thing I'll add that took me an embarrassingly long time to realize is that your Python global scope is actually persisted, not just the /tmp directory.
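To illustrate the global-scope trick, a minimal sketch (bucket name is a placeholder): anything you initialize at module level survives across warm invocations of the same execution environment, not just files written to /tmp.

```python
import boto3

s3 = boto3.client("s3")  # created once per execution environment
_cache = {}              # survives across warm invocations, like /tmp

def handler(event, context):
    key = event["key"]
    if key not in _cache:  # only hits S3 the first time per container
        obj = s3.get_object(Bucket="my-example-bucket", Key=key)
        _cache[key] = obj["Body"].read()
    return {"bytes": len(_cache[key])}
```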
Sorry, this is absolutely still the case if you want to scale throughput beyond the few thousand IOPS a single shard can serve. S3 will automatically reshard your key space, but if your keys are sequential (eg leading timestamp) all your writes will still hit the same shard.
Source: direct conversations with AWS teams.
I had a theory (based on no evidence I'm aware of except knowing how Amazon operates) that the original Glacier service operated out of an Amazon fulfillment center somewhere. When you put in a request for your data, a picker would go to a shelf, pick up some removable media, take it back, and slot it into a drive in a rack.
This, BTW, is how tape backups on timesharing machines used to work once upon a time. You'd put in a request for a tape and the operator in the machine room would have to go get it from a shelf and mount it on the tape drive.
https://www.reddit.com/r/DataHoarder/comments/12um0ga/the_ro...
Which is basically exactly what you described but the picker is a robot.
Data requests go into a queue; when your request comes up, the robot looks up the data you requested, finds the tape and the offset, fetches the tape and inserts it into the drive, fast-forwards to the offset, reads the file to temporary storage, rewinds the tape, ejects it, and puts it back. The latency of offline storage is in fetching/replacing the cassette and in forwarding/rewinding the tape, plus waiting for an available drive.
Realistically, the systems probably fetch the next request from the queue, look up the tape it's on, and then process every request from that tape so they're not swapping the same tape in and out twenty times for twenty requests.
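Purely hypothetically, that batching would look something like this: group queued retrievals by tape so each cartridge is mounted once, then read in offset order.

```python
from collections import defaultdict

queue = [
    {"tape": "T042", "offset": 120, "file": "a.bin"},
    {"tape": "T017", "offset": 300, "file": "b.bin"},
    {"tape": "T042", "offset": 950, "file": "c.bin"},
]

by_tape = defaultdict(list)
for req in queue:
    by_tape[req["tape"]].append(req)

for tape, reqs in by_tape.items():
    print(f"mount {tape}")
    for req in sorted(reqs, key=lambda r: r["offset"]):  # minimize seeking
        print(f"  seek {req['offset']}, read {req['file']}")
    print(f"eject {tape}")
```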
It's been a long time, and features launched since I left make clear some changes have happened, but I'll still tread a little carefully (though no one probably cares there anymore):
One of the most crucial things to do in all walks of engineering and product management is to learn how to manage the customer expectations. If you say customers can only upload 10 images, and then allow them to upload 12, they will come to expect that you will always let them upload 12. Sometimes it's really valuable to manage expectations so that you give yourself space for future changes that you may want to make. It's a lot easier to go from supporting 10 images to 20, than the reverse.
It feels odd that this is some sort of secret. Why can't you talk about it?
Then the existing pickers would get special instructions on their handhelds: Go get item number NNNN from Row/shelf/bin X/Y/Z and take it to [machine-M] and slot it in, etc.
Even worse, if you run self-hosted NAT instance(s), don't use an EIP attached to them. Just use an auto-assigned public IP (no EIP).
NAT instance with EIP
- AWS routes it through the public AWS network infrastructure (hairpinning).
- You get charged $0.01/GB regional data transfer, even if in the same AZ.
NAT instance with auto-assigned public IP (no EIP)
- Traffic routes through the NAT instance’s private IP, not its public IP.
- No regional data transfer fee — because all traffic stays within the private VPC network.
- Auto-assigned public IP may change if the instance is shut down or re-created, so have automation to handle that. Though you should be using the network interface ID reference in your VPC routing tables.

My understanding is that transfer gets charged on both sides as well. So if you own both sides you'll pay $0.02/GB.
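To make the routing-table point concrete, a hedged boto3 sketch (all IDs are placeholders): pointing the default route at the ENI rather than the instance means a replacement NAT instance attached to the same ENI needs no route-table change.

```python
import boto3

ec2 = boto3.client("ec2")
ec2.replace_route(
    RouteTableId="rtb-0123456789abcdef0",
    DestinationCidrBlock="0.0.0.0/0",
    NetworkInterfaceId="eni-0123456789abcdef0",  # route to the ENI, not an instance ID
)
```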
I have the opposite philosophy, for what it's worth: if we are going to pay for AWS, I want to use it correctly and maximally. So for instance if I can offload a given thing to Amazon and it's appropriate to do so, it's preferable. Step Functions, Lambda, DynamoDB, etc. have, over time, come to supplant their alternatives, and it's overall more efficient and cost effective.
That said, I strongly believe developers don't give enough consideration to how to maximize vendor usage in an optimal way.
Truly a marketing success.
Because it's not straightforward. 1) You need to have general knowledge of AWS services and their strong and weak points to be able to choose the optimal one for the task, 2) you need to have good knowledge of the chosen service (like DynamoDB or Step Functions) to be able to use it optimally; being mediocre at it is often not enough, 3) local testing is often a challenge or plain impossible, you often have to do all testing on a dev account on AWS infra.
AWS can be used in a different, cost effective, way.
It can be used as a middle-ground capable of serving the existing business, while building towards a cloud agnostic future.
The good AWS services (s3, ec2, acm, ssm, r53, RDS, metadata, IAM, and E/A/NLBs) are actually good, even if they are a concern in terms of tracking their billing changes.
If you architect with these primitives, you are not beholden to any cloud provider, and can cut over traffic to a non AWS provider as soon as you’re done with your work.
If you don't use the E(lasticity) of EC2, you're burning cash.
For prod workloads, if you can go from 1 to 10 instances during an average day, that's interesting. If you have 3 instances running 24/7/365, go somewhere else.
For dev workloads, being able to spin up instances in a matter of seconds is bliss. I installed the wrong version of a package on my instance? I just terminate it, wait for the auto-scaling group to pop a fresh new one, and start again. No need to waste my time trying to clean up my mess on the previous instance.
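That whole workflow is essentially a one-liner (instance ID is a placeholder):

```python
import boto3

boto3.client("ec2").terminate_instances(InstanceIds=["i-0123456789abcdef0"])
# The auto-scaling group notices the missing capacity and launches a
# clean replacement; no cleanup of the old box needed.
```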
You speak about Step Functions as an efficient and cost-effective AWS service, and I must admit it's one that I avoid as much as I can... Given the absolute mess that it is to set up and maintain, and that you completely lock yourself into AWS with it, I never pick it for anything. I'd rather have a containerized workflow engine running on ECS, even though I miss out on the few nice features that SF offers within AWS.
The approach I try to have is:
- business logic should be cloud agnostic
- infra should swallow all the provider's pills it needs to be as efficient as possible
There's a sweet spot somewhere in between raw VPSes and insanely detailed least-privilege serverless setups that I'm trying to revert to. Fargate isn't unmanageable as a candidate, not sure it's The One yet but I'm going to try moving more workloads to it to find out.
Want to set up MFA ... login required to request device.
Yes, I know, they warned us far ahead of time. But not being able to request one of their MFA devices without a login is ... sucky.
> I understand your situation is a bit unique, where you are unable to log in to your AWS account without an MFA device, but you also can't order an MFA device without being able to log in. This is a scenario that is not directly covered in our standard operating procedures.
The best course of action would be for you to contact AWS Support directly. They will be able to review your specific case and provide guidance on how to obtain an MFA device to regain access to your account. The support team may have alternative options or processes they can walk you through to resolve this issue.
Please submit a support request, and one of our agents will be happy to assist you further. You can access the support request form here: https://console.aws.amazon.com/support/home
That last URL? You need to login to use it ...
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpo...
(Disclaimer: I work for AWS, opinions are my own.)
They spent the effort of branding private VPC endpoints "PrivateLink". Maybe it took some engineering effort on their part, but it should be the default out of the box, and an entirely unremarkable feature.
In fact, I think if you have private subnets, the only way to use S3 etc is Private Link (correct me if I'm wrong).
It's just baffling.
S3 can use either, and we recommend establishing VPC Gateway endpoints by default whenever you need S3 access.
(Disclaimer: I work for AWS, opinions are my own.)
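For anyone who hasn't set one up, a Gateway endpoint (free, unlike a PrivateLink Interface endpoint) is a couple of lines of boto3; VPC, route table, and region here are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # S3 routes get added here
)
```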
People who probably shouldn't be on AWS - but they usually have to be for unrelated reasons, and they will work to reduce their bill.
This just sounds like a polite way of saying "we're taking peoples' money in exchange for nothing of value, and we can get away with it because they don't know any better".
Hideous.
They should be, of course, at least when the destination is an AWS service in the same region.
[edit: I'm speaking about interface endpoints, but S3 and DynamoDB can use gateway endpoints, which are free to the same region]
S3 can use either, and we recommend establishing VPC Gateway endpoints by default whenever you need S3 access.
(Disclaimer: I work for AWS, opinions are my own.)
Ultimately AWS doesn't have the right leadership or talent to be good at GenAI, but they do (or at least used to) have decent core engineers. I'd like to see them get back to basics and focus there. Right now leadership seems panicked about GenAI and is just throwing random stuff at the wall, desperately trying to get something to stick. That's really annoying to customers.
TGW is... twice as expensive as vpc peering?
But unlike peering TGW traffic flows through an additional compute layer so it has additional cost.
WTH?
It was fine, until there started to be ways of wiring up networks between accounts (e.g. PrivateLink endpoint services) and you had to figure out which AZ was which so you could be sure you were mapping to the same AZs in each account.
I built a whole methodology for mapping this out across dozens of AWS accounts, and built lookup tables for our internal infrastructure… and then AWS added the zone ID to AZ metadata so that we could just look it up directly instead.
The canonical AZ naming was provided because, I bet, they realized that the users who needed canonical AZ identifiers were rarely the same users that were causing hot spots via always picking the same AZ.
From my understanding, I don't think this is completely accurate. But, to be fair, AWS doesn't really document this very well.
From my (informal) conversations with AWS engineers a few months ago, it works approximately like this (modulo some details I'm sure the engineers didn't really want to share):
S3 requests scale based on something called a 'partition'. Partitions form automatically based on the smallest common prefixes among objects in your bucket, and how many requests objects with that prefix receive. And the bucket starts out with a single partition.
So as an example, if you have a bucket with objects "2025-08-20/foo.txt" and "2025-08-19/foo.txt", the smallest common prefix is "2" (or maybe it considers the root as the generator partition, I don't actually know). (As a reminder, a / in an object key has no special significance in S3 -- it's just another character. There are no "sub-directories"). Therefore a partition forms based on that prefix. You start with a single partition.
Now if the object "2025-08-20/foo.txt" suddenly receives a ton of requests, what you'll see happen is S3 throttle those requests for approximately 30-60 minutes. That's the amount of time it takes for a new partition to form. In this case, the smallest common prefix for "2025-08-20/foo.txt" is "2025-08-2". So a 2nd partition forms for that prefix. (Again, the details here may not be fully accurate, but this is the example conveyed to me). Once the partition forms, you're good to go.
But the key issue here with the above situation is you have to wait for that warm up time. So if you have some workload generating or reading a ton of small objects, that workload may get throttled for a non-trivial amount of time until partitions can form. If the workload is sensitive to multi-minute latency, then that's basically an outage condition.
The way around this is that you can submit an AWS support ticket and have them pre-generate partitions for you before your workload actually goes live. Or you could simulate load to generate the partitions. But obviously, neither of these is ideal. Ideally, you should just really not try and store billions of tiny objects and expect unlimited scalability and no latency. For example, you could use some kind of caching layer in front of S3.
Hit it when building an Iceberg lakehouse using pre-existing data. Using object prefixes fixed the issue.
Ex: your prefix is /id=12345. S3, under the hood, generates partitions named `/id=` and `/id=1`. Now, your id rolls over to `/id=20000`. All read/write activity on `/id=2xxxx` falls back to the original partition. Now, on rollover, you end up with read contention.
For any high-throughput workloads with unevenly distributed reads, you are best off using some element of randomness, or some evenly distributed partition key, at the root of your path.
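A minimal sketch of that approach: derive a short, evenly distributed prefix from the key itself, so writes spread across partitions instead of piling onto one date prefix. The names and the 4-character width are illustrative, not a recommendation from AWS.

```python
import hashlib

def spread_key(tenant: str, date: str, filename: str) -> str:
    # First 4 hex chars of a stable hash give 65,536 evenly
    # distributed prefixes while keeping keys deterministic.
    prefix = hashlib.md5(f"{tenant}/{date}".encode()).hexdigest()[:4]
    return f"{prefix}/{tenant}/{date}/{filename}"

print(spread_key("customer1", "2025-08-20", "foo.txt"))
# -> something like '3fa2/customer1/2025-08-20/foo.txt'
```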
As of when? According to internal support, this is still required as of 1.5 years ago.
The auto partitioning is different. It can isolate hot prefixes on its own and can intelligently pick the partition points. Problem is the process is slow and you can be throttled for more than a day before it kicks in.
They can do this with manual partitioning indeed. I've done it before, but it's not ideal because the auto partitioner will scale beyond almost anything AWS will give you with manual partitioning unless you have 24/7 workloads.
> you can be throttled for more than a day before it kicks in
I expect that this would depend on your use case. If you are dropping content you need to scale out to tons of readers, that is absolutely the case. If you are dropping tons of content with well-distributed reads, then the auto partitioner is The Way.
They actually used to have the upstream docs in GitHub, and that was super nice for giving permalinks but also building the docs locally in a non-pdf-single-file setup. Pour one out, I guess
Everything you know is wrong.
Weird Al. https://www.youtube.com/watch?v=W8tRDv9fZ_c
Firesign Theatre. https://www.youtube.com/watch?v=dAcHfymgh4Y
My recent interactions with them would probably have been better if they were an LLM.
Perhaps they trained the LLM using that data though.
(Small customer though: yearly AWS spend around 80k. Support is 10% of that.)
The cloud moves fast. Compliance processes need to keep up. Manual annual reviews aren't enough when your infrastructure is changing constantly.
This is also why we built automated compliance monitoring - because what worked last quarter might not work today.
Ideally it should be a stream of important updates that can be interactively filtered by time range. For example, if I have not been actively consuming the AWS updates firehose for the last 18 months, I should be able to "summarize" that length of updates.
Why this is not already a feature of the "What's New" section of AWS and other platforms, I don't know. Waiting to be built -- either by the OEM or by the community.
Turns out there are many incorrect implementations of Happy Eyeballs that cancel the IPv4 connection attempts after the timeout, then switch to trying the AAAA records and subsequently throw a "Cannot reach host" error. For context, in Happy Eyeballs you're supposed to keep trying both network families in parallel.
This only impacts our customers who live far away from the region they're accessing, however, and there's usually a workaround - in Node you can force the network family to be v4 for instance
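For what it's worth, Python's asyncio ships a native RFC 8305 implementation, which shows what "both families in parallel" should look like; the delay value here is just an example, and it staggers the parallel attempts rather than acting as a per-family timeout.

```python
import asyncio

async def connect(host, port):
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=0.25  # start next attempt 250ms later
    )
    writer.close()
    await writer.wait_closed()

asyncio.run(connect("example.com", 443))
```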
No. They break existing customer expectations.
There are heaps of dualstack API endpoints https://docs.aws.amazon.com/general/latest/gr/rande.html#dua... if that's what the client wants.
The reason the amazonaws.com domain endpoints did not introduce ipv6/AAAA directly is (mostly) access control. For better or worse there are a lot of "v4-centric" IAM statements, like aws:SourceIp, in identity/resource/bucket policies. Introducing a new v6 value would break all of those existing policies with either unexpected DENYs or, worse, ALLOWs. That's a pretty poor customer experience: unexpectedly breaking your existing infrastructure or compromising your access-control intentions.
AWS _could_ have audited every potential IAM policy and run a MASSIVE outreach campaign, but something as simple as increasing (opaque!) instance ID length was a multi year effort. And introducing backwards compatibility on a _per policy_ basis is its own infinite security & UX yak shaving exercise as well.
So that's why you have opt-in usage of v6/dualstack in the client/SDK/endpoint name.
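To make the aws:SourceIp hazard concrete, a hypothetical policy fragment of the kind that breaks (bucket name and CIDR are made up): a Deny guarded by NotIpAddress against a v4-only range fires for every IPv6 caller, including the office users it was meant to allow.

```python
# Hypothetical "office-only" statement, written before IPv6 existed here.
deny_outside_office = {
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:*",
    "Resource": "arn:aws:s3:::my-example-bucket/*",
    "Condition": {
        "NotIpAddress": {"aws:SourceIp": ["203.0.113.0/24"]}  # v4 only
    },
}
# A request from 2001:db8::1 is "not in" 203.0.113.0/24, so the Deny
# matches; the office's v6 range would need adding explicitly.
```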
Does anyone have experience running Spot in 2025? If you were to start over, would you keep using Spot?
- I observe with pricing that Spot is cheaper
- I am running on three different architectures, which should limit Spot unavailability
- I've been running about 50 Spot EC2 instances for a month without issue. I'm debating turning it on for many more instances.

1. Spot with autoscaling to adjust to demand and a savings plan that covers the ~75th percentile scale
2. On-demand with RIs (RIs will definitely die some day)
3. On-demand with savings-plans (More flexible but more expensive than RIs)
4. Spot
5. On-demand
I definitely recommend spot instances. If you're greenfielding a new service and you're not tied to AWS, some other providers have hilariously cheap spot markets - see http://spot.rackspace.com/. If you're using AWS, auto-scaling spot with savings plans is definitely the way to go. If you're using Kubernetes, the AWS Karpenter project (https://karpenter.sh/) has mechanisms for determining the cheapest spot price among a set of requirements.
Overall tho, in my experience, ec2 is always pretty far down the list of AWS costs. S3, RDS, Redshift, etc wind up being a bigger bill in almost all past-early-stage startups.
Oh yeah, we were in the same row!
Wouldn't this always depend on the length of the queue to access the robotic tape library? Once your tape is loaded it should move really quickly:
https://www.ibm.com/docs/en/ts4500-tape-library?topic=perfor...
Your assumption holds if they still use tape. But this paragraph hints at it not being tape anymore. The eternal battle between tape and drive backup takes another turn.
Not strictly true.
Please see the documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...
This 2024 re:Invent session "Optimizing storage performance with Amazon S3 (STG328)" which goes very deep on the subject: https://www.youtube.com/watch?v=2DSVjJTRsz8
And this blog that discusses Iceberg's new base-2 hash file layout which helps optimize request scaling performance of large-scale Iceberg workloads running on S3: https://aws.amazon.com/blogs/storage/how-amazon-ads-uses-ice...
"If you want to partition your data even better, you can introduce some randomness in your key names": https://youtu.be/2DSVjJTRsz8?t=2206
FWIW, the optimal way we were told to partition our data was this: 010111/some/file.jpg.
Where `010111/` is a random binary string, which will please both the automatic partitioning (503s => partition) and any manual partitioning you ask AWS for. "Please" as in: the cardinality of partitions grows more slowly with each character than with prefixes like `az9trm/`.
We were told that the latter scheme makes manual partitioning a challenge, because as soon as you reach two characters you've already created 36x36 partitions (1,296).
The issue with that: your keys are no longer meaningful if you're relying on S3 to have "folders" by tenant, for example (customer1/..).
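A sketch of the binary-prefix scheme as I understood it (the 6-bit width is an assumption): cardinality grows as 2^n per character instead of 36^n, which keeps manually requested partition counts manageable.

```python
import secrets

def binary_prefixed_key(key: str, bits: int = 6) -> str:
    # 6 random bits -> 64 possible prefixes like '010111'
    prefix = format(secrets.randbits(bits), f"0{bits}b")
    return f"{prefix}/{key}"

print(binary_prefixed_key("customer1/some/file.jpg"))
```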
If key prefixes don’t matter much any more, then it’s a very recent change that I’ve missed.
I just started working with a vendor who has a service behind API Gateway. It is a bit slow(!) and times out at 30 seconds. I've since modified my requests to chunk subsets of the whole dataset, to keep things under the timeout.
Has this changed? Is 30 secs the new or the old timeout?
Not true for GPU instances; they're stuck for 5 minutes in a stopping state because they run some GPU health checks.
When was this changed? I think this is still an issue; I've had some such errors quite recently.
> DynamoDB now supports empty values for non-key String and Binary attributes in DynamoDB tables. Empty value support gives you greater flexibility to use attributes for a broader set of use cases without having to transform such attributes before sending them to DynamoDB. List, Map, and Set data types also support empty String and Binary values.
https://docs.aws.amazon.com/amazondynamodb/latest/developerg...
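A quick illustration with boto3 (table name is a placeholder); before this change, the empty string would be rejected with a validation error:

```python
import boto3

table = boto3.resource("dynamodb").Table("example-table")
table.put_item(Item={"pk": "user#1", "note": ""})  # empty non-key value, now accepted
```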
That's a reasonable approach, but the fact this post exists shows that this practice is a reputational risk. By all means do this if you think it's the right thing to do, but be aware that first impressions matter and will stick for a long time.
Then you can self-cloud. Several startups are in this space. It gets you the best of both worlds: scaling, freedom, cost control.
And no marketing jargon that you need to learn, and then unlearn!