The new pricing page for Lambda ("Example 2") shows the cost of a 100M invocation/month workload with provisioned capacity as $542/month. For that same cost you could run ~61 Fargate instances (0.25 vCPU, 0.5 GB RAM) 24/7, or ~160 instances with Spot. For context, I have run a simple Node.js workload on both Lambda and Fargate, and was able to handle 100M events/mo with just 3 instances.
Serverless developers take note: it's time to learn Docker and how to write a task-definition.json.
Fargate is a great product, but it doesn't completely remove all operational work to the degree that Lambda does.
Now this is like throwing your hands up and saying the user's bursts are too big for AWS.
Yes, only paying for the compute you actually use is great, but so is having basically limitless compute power (your wallet willing) without the ops overhead and system maintenance.
Cold starts have been a problem for a while, and while there may be a better way than this long term, ultimately to some degree the solution will always be keeping a function warm. And that's ultimately compute, and AWS is not likely to give that away.
They also support autoscaling the provisioned concurrency. See here:
https://docs.aws.amazon.com/autoscaling/application/userguid...
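Wiring that up amounts to registering the alias's provisioned concurrency as a scalable target and attaching a target-tracking policy. A minimal sketch of the request parameters (function name, alias, and the 70% target are placeholders, not anything the docs mandate):

```python
def pc_scaling_target(function_name, alias, min_capacity, max_capacity):
    # Parameters for application-autoscaling's register_scalable_target;
    # the function/alias names here are placeholders.
    return {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }

def pc_scaling_policy(function_name, alias, target_utilization=0.7):
    # Target-tracking policy that scales provisioned concurrency to keep
    # its utilization near the target (0.7 is an illustrative choice).
    return {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        "PolicyName": f"{function_name}-pc-utilization",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"
            },
        },
    }
```

You would pass these to `boto3.client("application-autoscaling").register_scalable_target(**...)` and `put_scaling_policy(**...)` respectively.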
An answer could help us try something new. We currently use large Google App Engine 'apps' after failing to get Functions to scale quickly enough (and hitting limits). We have SUPER bursty traffic that needs to scale up to hundreds of instances very fast.
AWS 2014: "Run your workloads on serverless so you don't have to deal with those pesky EC2 instances 24/7 anymore."
AWS 2019: "Click a checkbox and you can have your serverless workloads get dedicated EC2 instances 24/7!"
"Click a checkbox and we'll run your code for you, take care of OS security updates, compliance requirements, autoscaling, load balancing, AZ resiliency, getting logs off your box, restarting unhealthy processes, ..."
It would only be used for user impacting APIs.
There are a few types of processes that I have had to create.
1. A Windows service that processed a queue. We have 20x more messages at peak. Of course, since it was tied to Windows, Lambda wasn't an option. I had to create an autoscaling group based on queue length. That also involves CloudWatch alarms to trigger scaling, and now we either have one instance running all the time (production) or a min of zero, launching an instance only when there is a message in the queue (non-prod). Not only is the process slower to scale, but because it's Windows, AWS does hourly billing.
Of course, the deployment process and CloudFormation template are a lot more complicated than Lambda's.
2. Same sort of process on lambda. The CloudFormation template using SAM is much simpler and the process is faster to scale in and out.
Also, you can configure everything on the web and export the template.
3. A Node/Express API using lambda proxy integration behind API Gateway.
Again this was easy to set up but cold start times were killing us and we knew that we were going to have to move it off of lambda because of the 6MB request/response limit.
4. The same API as above running in Fargate.
Since we knew in advance that this was the direction we wanted to go, I opted to use Node/Express for the Lambda, so we didn't require any code changes. But creating a registry, Docker containers, services, clusters, load balancers, autoscaling groups, etc. took a lot longer to get right, and then automating everything with CloudFormation was more complicated.
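The queue-length trigger in process 1 above boils down to a CloudWatch alarm on the SQS backlog metric whose action is the autoscaling group's scale-out policy. A sketch of the alarm parameters (queue name, policy ARN, and the threshold of 100 are all illustrative):

```python
def backlog_alarm_params(queue_name, scale_out_policy_arn, threshold=100):
    # CloudWatch alarm on the SQS backlog; when it fires, the autoscaling
    # group's scale-out policy runs. Names and threshold are illustrative.
    return {
        "AlarmName": f"{queue_name}-backlog-high",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Average",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scale_out_policy_arn],
    }
```

These would be passed to `boto3.client("cloudwatch").put_metric_alarm(**...)`; a mirror-image alarm with a low threshold handles scale-in.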
Provisioned Concurrency (PC) is an interesting feature for us, as we've gotten so much feedback over the years about the pain point of the service overhead leading up to your code execution (the cold start). With PC we basically remove most of that service overhead by pre-spinning up execution environments.
This feature is really for folks with interactive, super latency-sensitive workloads. It will bring any overhead from our side down to sub-100ms. Realistically, not every workload needs this, so don't feel like you need it to have well-performing functions. There are still a lot of things you need to do in your code, as well as knobs like memory, which impact function perf.
- Chris Munns - https://twitter.com/chrismunns
1. the time to initialize the VM
2. the time to create an ENI if you are connecting to a VPC[1](until the NAT alternative rolls out globally)
3. the time to initialize your language runtime (Java seems to be the worst, scripting languages the best)
4. any program initialization done outside of your handler that runs once per cold start of your lambda runtime.
A fully “warm” instance avoids all four when invoked.
Is my understanding correct that a “provisioned” runtime that isn’t “warm” will only avoid the first two?
What state is a “provisioned” instance in?
[1] I refuse to use the colloquial but incorrect statement that the lambda is “running inside your VPC”.
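Item 4 in the list above is simply whatever runs at module scope, which Lambda executes once per cold start. A minimal Python handler sketch (the names here are illustrative):

```python
import time

# Module-level code runs once per cold start of the execution
# environment (item 4 above). In real code this is where you'd parse
# config, build SDK clients, open DB pools, etc.
INIT_TIME = time.time()

def handler(event, context):
    # Warm invocations reuse INIT_TIME; only a cold start re-executes
    # the module-level code above.
    return {"init_time": INIT_TIME, "invoke_time": time.time()}
```

Two back-to-back invocations in the same environment return the same `init_time`, which is exactly what "warm" buys you.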
This covers you straight through 4.
Now it's possible that your execution environment could be sitting for some time waiting for any action, and so pre-handler DB connections and things like that might need to be tweaked in this model.
Thanks, - munns
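One way to do the tweak mentioned above is a staleness check around the cached pre-handler connection. A sketch (the 300-second threshold and the helper name are arbitrary choices for illustration):

```python
import time

_CONN = None
_CONN_CREATED = 0.0
MAX_CONN_AGE = 300  # seconds; arbitrary threshold for this sketch

def get_connection(now=None):
    """Return the cached pre-handler connection, recreating it if the
    environment sat idle long enough for it to have gone stale."""
    global _CONN, _CONN_CREATED
    now = time.time() if now is None else now
    if _CONN is None or now - _CONN_CREATED > MAX_CONN_AGE:
        _CONN = object()  # stand-in for a real DB connection
        _CONN_CREATED = now
    return _CONN
```

The handler calls `get_connection()` instead of touching the module-level connection directly, so a long-idle provisioned environment transparently reconnects.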
The pricing examples include using PC on a limited duty cycle, and billing is defined to start from the moment it's enabled (rather than from when it's ready), so it'd be reasonable to expect there's some level of certainty that the concurrency level is ready within a defined timeframe. What might that timeframe be, and to what level of certainty?
https://docs.aws.amazon.com/autoscaling/application/userguid...
I also find it deeply ironic that their solution to cold starts is to keep the function running 24/7...
Could I include openssh and Apache in my Lambda instance? Maybe run a Minecraft server? :P
As others have said, the previous workaround was a cron event that would invoke a function every few minutes to keep it warm. This is a lot better than that.
They're still working to get cold starts as fast as possible, but this helps a lot in the meantime.
Unless I am terribly mistaken, it doesn't seem like allowing AWS to handle this and not doing it in code (warmup plugin, cron job, etc.) is worth the cost.
When you do that, it only keeps one instance warm. If you have 10 concurrent requests, even if one is warm, the other 9 requests will still experience a cold start.
The only way around this is to send a request that holds the connection open long enough to make sure concurrent requests start a new lambda instance. While you are keeping the request open, that lambda instance isn’t available for a real call.
If the entire purpose of lambda is to make things easier, once you start down the Rube Goldberg path of trying to keep enough instances warm, it kind of defeats the purpose. Just spend the money and the time to set up an autoscaling group of the smallest instances of EC2 or use Fargate if you don’t want the cold start times.
The timed pings are just a hack and don't solve all the issues.
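On the function side, that hack looks roughly like the sketch below. The `warmup` event key and hold duration are conventions invented for this example, not anything Lambda defines:

```python
import time

def handler(event, context):
    if event.get("warmup"):
        # Hold this sandbox busy so that N concurrent pings fan out to
        # N separate instances instead of all reusing one warm one.
        # While sleeping, this instance can't serve a real request.
        time.sleep(event.get("hold_seconds", 0.5))
        return {"warmed": True}
    # ... real work would go here ...
    return {"warmed": False, "result": "handled"}
```

The caller then fires N of these pings concurrently from a scheduled event, which is exactly the Rube Goldberg machine Provisioned Concurrency replaces.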
> @ben11kehoe @kondro @mwarkentin You pay for the configured Provisioned Concurrency with a flat hourly charge. Lambda usage gets billed the same, but with a discount on unit pricing ($0.035/GB-hour vs $0.06 on "on demand").
http://twitter.com/ajaynairthinks/status/1202125357144391680
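Working through those numbers: with the discounted duration rate of $0.035/GB-hour vs $0.06 on demand, and assuming a flat PC charge of roughly $0.015/GB-hour (an assumption for this sketch; check the actual pricing page), there is a utilization break-even point above which PC is cheaper:

```python
ON_DEMAND_RATE = 0.06     # $/GB-hour, from the tweet above
PC_DURATION_RATE = 0.035  # $/GB-hour, discounted duration rate
PC_FLAT_RATE = 0.015      # $/GB-hour flat charge -- assumed, verify on the pricing page

# Cost per provisioned GB-hour at utilization u in [0, 1]:
#   on demand:   ON_DEMAND_RATE * u
#   provisioned: PC_FLAT_RATE + PC_DURATION_RATE * u
# Break-even solves ON_DEMAND_RATE*u = PC_FLAT_RATE + PC_DURATION_RATE*u:
break_even = PC_FLAT_RATE / (ON_DEMAND_RATE - PC_DURATION_RATE)
print(break_even)  # ~0.6: PC is cheaper above ~60% utilization (given these rates)
```

So under these assumed rates, a function that keeps its provisioned capacity more than about 60% busy comes out ahead with PC; below that, on demand wins.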
People used to write cron jobs to keep their functions warm, which besides being ugly didn't even work well -- you could at best keep one instance warm with infrequent pinging, i.e. a provisioned concurrency of 1. So this feature addresses that use case in a much more systematic way.
There's some precedent for features like this -- provisioned IOPS and reserved instances come to mind. In both those cases you tradeoff elasticity and get some predictability in return (performance in one case, cost in another).
If you have a reliable base-load of a few requests a second and you don't have some constraint that forces you to use lambda, you are going to get much better value running your application on ecs/ec2.
I've always hated the term "serverless", but its usage in this context is even more ridiculous.
Also presumably more reliable. With the 5 minute ping the underlying container will be reprovisioned every few hours. At which point it’s a race to see whether the next ping comes before the real user request to swallow the cold start.
Very unlike other AWS APIs and very annoying.
* https://docs.aws.amazon.com/systems-manager/latest/APIRefere...
Huh. Turns out fewer of the ones I use than I thought provide that level of detail. That just happens to be the closest approximation of what I'm currently trying to automate.
Anyway, there are several API endpoints in Lambda which supply "LastModified" but none that I can find supply "LastModifiedUser".
* https://docs.aws.amazon.com/lambda/latest/dg/API_GetFunction...
* https://docs.aws.amazon.com/lambda/latest/dg/API_GetFunction...
* https://docs.aws.amazon.com/lambda/latest/dg/API_ListFunctio...
1. Retrieve all layers with list_layers and index them by ARN
2. Retrieve all function metadata
3. For each function metadata item, extract all layer version ARNs
4. For each layer version ARN, call get_layer_version_by_arn
5. Extract the layer ARN from that result
6. Use that layer ARN to retrieve the name from the data we retrieved in step 1
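The steps above can be sketched with boto3 like this (pagination is omitted for brevity, and `client` is injectable so the logic can be tested without AWS):

```python
def layer_names_by_function(client=None):
    """Map function name -> layer names, following steps 1-6 above.
    Pagination of list_layers/list_functions is omitted for brevity."""
    if client is None:
        import boto3  # only needed when talking to real AWS
        client = boto3.client("lambda")

    # Step 1: index all layers by their (unversioned) layer ARN.
    names_by_layer_arn = {
        layer["LayerArn"]: layer["LayerName"]
        for layer in client.list_layers()["Layers"]
    }

    result = {}
    # Step 2: retrieve all function metadata.
    for fn in client.list_functions()["Functions"]:
        names = []
        # Step 3: each entry in fn["Layers"] carries a layer *version* ARN.
        for layer in fn.get("Layers", []):
            # Steps 4-5: resolve the version ARN to its layer ARN.
            version = client.get_layer_version_by_arn(Arn=layer["Arn"])
            # Step 6: look the name up in the step-1 index.
            names.append(names_by_layer_arn[version["LayerArn"]])
        result[fn["FunctionName"]] = names
    return result
```

Note the extra round trip per layer version in steps 4-5; there's no single call that returns a function's layer names directly.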