Lambda is usually very cost-effective: you only pay when your functions run (with per-ms billing), and they scale to zero.
But a combination of misconfiguration and a high-throughput use case can give you a nasty surprise when you get your AWS bill, especially if you don't have billing alerts set up.
A few of my clients have run into this type of problem before. For example, when a function that is invoked millions of times a day is allocated way too much memory. Or when provisioned concurrency is enabled on a function with a lot of memory.
Cloud cost control is a complex topic and these types of mistakes are easy to make. I have seen many folks get caught out by the cost of their ECS clusters (bad scaling configuration) or VPCs (VPC endpoints, NAT gateways, etc.), or even API Gateway and CloudWatch!
As far as Lambda is concerned, the most effective way to cut costs is to make sure you're not over-allocating memory.
@alex_casalboni's Lambda power tuning project is the go-to solution for right-sizing Lambda functions.
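If you haven't used it, kicking off a tuning run is just a Step Functions execution. Here's a minimal sketch (assuming you've already deployed the power tuning state machine; the ARNs are placeholders):

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Placeholder ARNs - substitute your deployed state machine and target function
response = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:powerTuningStateMachine",
    input=json.dumps({
        "lambdaARN": "arn:aws:lambda:us-east-1:123456789012:function:my-function",
        "powerValues": [128, 256, 512, 1024, 2048],  # memory sizes to test
        "num": 50,           # invocations per memory size
        "payload": {},       # a representative test payload for your function
        "strategy": "cost",  # optimize for cost (other options: "speed", "balanced")
    }),
)
print(response["executionArn"])
```

The state machine invokes your function at each memory size and shows you the cost/performance trade-off, so you can pick the sweet spot.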
@alex_casalboni But, most of the time, there's no point in optimizing because there are no meaningful cost savings to be had - saving 30% on $0.01/month is a wasted afternoon...
The trick is finding the functions that are worth optimizing for cost.
@alex_casalboni You should always right-size functions that use provisioned concurrency because the uptime cost is tied to memory allocation.
For 1 unit of provisioned concurrency:
128MB function = $1.40 per month
10GB function = $111.61 per month
scary when you can be 2 orders of magnitude wrong!
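The math checks out if you apply the provisioned concurrency rate. A quick sanity check (the $0.0000041667/GB-second us-east-1 rate and the 31-day month are my assumptions, current at the time of writing):

```python
# Provisioned concurrency uptime cost (us-east-1 rate at the time of writing)
PC_RATE = 0.0000041667               # $ per GB-second of provisioned concurrency
SECONDS_PER_MONTH = 31 * 24 * 3600   # a 31-day month

def pc_uptime_cost(memory_mb: float, concurrency: int = 1) -> float:
    """Monthly cost of keeping `concurrency` instances provisioned."""
    return (memory_mb / 1024) * concurrency * PC_RATE * SECONDS_PER_MONTH

print(f"${pc_uptime_cost(128):.2f}")    # ~$1.40
print(f"${pc_uptime_cost(10240):.2f}")  # ~$111.60
```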
@alex_casalboni You should also optimize functions that are executed often (e.g. millions of times a month) and: 1. have a long avg execution time, and/or 2. have a high memory allocation
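To see why these are the functions that matter, here's a back-of-envelope example (assuming us-east-1 on-demand rates of $0.0000166667 per GB-second and $0.20 per 1M requests; the function itself is hypothetical):

```python
# On-demand Lambda cost (us-east-1 rates at the time of writing)
DURATION_RATE = 0.0000166667  # $ per GB-second
REQUEST_RATE = 0.20 / 1e6     # $ per request

# Hypothetical function: 10M invocations/month, 500ms avg duration, 1GB allocated
invocations, avg_duration_sec, memory_gb = 10_000_000, 0.5, 1.0
duration_cost = invocations * avg_duration_sec * memory_gb * DURATION_RATE
request_cost = invocations * REQUEST_RATE
print(f"${duration_cost + request_cost:.2f}/month")  # ~$85.33
```

Halve the memory here and you save ~$40/month on this one function (assuming the duration doesn't grow as a result); do the same on a $0.01/month function and the saving isn't worth mentioning.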
If you use cost allocation tags, you can find the functions with a high individual cost in AWS billing.
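For example, you can pull last month's Lambda spend grouped by a tag with the Cost Explorer API. A rough sketch (the `team` tag is hypothetical - use whatever cost allocation tags you have activated):

```python
import boto3

ce = boto3.client("ce")

# Last month's Lambda spend, grouped by a (hypothetical) "team" cost allocation tag
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2022-05-01", "End": "2022-06-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    Filter={"Dimensions": {"Key": "SERVICE", "Values": ["AWS Lambda"]}},
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    tag = group["Keys"][0]
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{tag}: ${cost:.2f}")
```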
@alex_casalboni If you're a @Lumigo customer, you can do this easily by sorting your functions by cost (descending) and looking at the top of the list.
Look for functions that: 1. have a high cost; and 2. have a low avg. memory usage (a telltale sign of over-allocation).
Another easy win is switching your functions to ARM (Graviton2). Per ms, it's about 20% cheaper compared to x86 functions.
But, you need to test your workload on ARM vs x86 and make sure there's no significant perf difference.
@alex_casalboni @Lumigo In the launch post, AWS mentioned 19% better performance in their tests.
Since the launch, there have been a lot of reports from others who found their workloads perform significantly worse on ARM. In the worst case, I've seen 60% longer exec time running on ARM.
@alex_casalboni @Lumigo IO-heavy functions are usually good candidates for ARM. These functions spend most of their time idle, waiting for API responses, so CPU cycles are wasted anyway, and CPU perf likely doesn't vary much in these cases either.
Take that 20% saving and say thank you :-)
@alex_casalboni @Lumigo Lastly, a counter-intuitive one... you can save money by using provisioned concurrency if you have a busy function and you know what you're doing!
PC has an uptime cost, but its duration cost is ~40% cheaper than on-demand ($0.0000097222 vs $0.0000166667 per GB-second in us-east-1).
@alex_casalboni @Lumigo If the provisioned concurrency is kept busy ~60% of the time, then you reach break-even, and beyond that, cost savings.
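Here's where the ~60% comes from, using the us-east-1 rates (my numbers, current at the time of writing):

```python
# All rates in $ per GB-second (us-east-1 at the time of writing)
ON_DEMAND = 0.0000166667    # on-demand duration
PC_UPTIME = 0.0000041667    # provisioned concurrency uptime
PC_DURATION = 0.0000097222  # duration when running on provisioned concurrency

# u = utilization, i.e. the fraction of time a PC instance is busy
# on-demand cost per GB-s of wall clock: ON_DEMAND * u
# PC cost per GB-s of wall clock:        PC_UPTIME + PC_DURATION * u
# break-even is where the two are equal
u = PC_UPTIME / (ON_DEMAND - PC_DURATION)
print(f"break-even utilization: {u:.0%}")  # ~60%
```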
Risky approach - there's a steep penalty if you get it wrong because of the uptime cost. I wouldn't recommend this; use PC for eliminating cold starts instead (as intended).
@alex_casalboni @Lumigo If you prefer reading this in a long-form blog post instead, you can find it here:
Great question from my current cohort of students, paraphrased:
"Should you always use Step Functions to chain together a few Lambda functions? Are there patterns to simplify this? How about using SQS between the functions?"
Here are my thoughts 🧵
Firstly, on the broader topic of orchestration vs choreography, I've written my thoughts before. TL;DR is that I prefer orchestration for intra-service workflows, and use events for inter-service communication.
As the Lambda service becomes more mature and fully featured, there's also more confusion around when to use these new features. Function URL came up in a conversation today, so let's talk about that!
🧵
Let me start by saying that I'm quite excited by its release and I think it's great that it's now an option.
But I also think it shouldn't be the default for most of the people who are using Lambda today.
The no. 1 question I get about #serverless is around testing - how should I test these cloud-hosted functions? Should I use local simulators? How do I run these in my CI/CD pipeline?
Here are my thoughts on this 🧵
There's value in testing YOUR code locally, but don't bother with simulating AWS locally - it's too much effort to set up and too brittle to maintain. I've seen many teams spend weeks trying to get localstack running and then waste even more time whenever it breaks in mysterious ways 😠
Much better to use temporary environments (e.g. one for each feature, or even each commit). Remember, with serverless components you only pay for what you use, so these environments are essentially free 🤘
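This way, your tests exercise the real thing instead of a simulator. A minimal sketch of what such a test can look like (the function name and stage are hypothetical):

```python
import json
import boto3

lambda_client = boto3.client("lambda")

def test_add_item_to_cart():
    # Invoke the real function deployed to this branch's temporary stage
    response = lambda_client.invoke(
        FunctionName="my-service-feature-x-add-to-cart",  # hypothetical name
        Payload=json.dumps({"cartId": "123", "itemId": "456"}),
    )
    result = json.loads(response["Payload"].read())
    assert response["StatusCode"] == 200
    assert result["cartId"] == "123"
```

Tear the stage down when the branch is merged and you've paid pennies, if anything at all.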
If you want to learn about the internal details of Lambda, then check out @MarcJBrooker's session "Deep dive into AWS Lambda security: Function isolation"
"For amazon.com we found the "above the fold" latency is what customers are the most sensitive to"
This is an interesting insight: not all service latencies are equal, and improving the overall page latency might actually end up hurting the user experience if it negatively impacts the "above the fold" latency as a result. 💡
This is far more complex than the most complex CD pipeline I have ever had! But just because it's complex doesn't mean it's over-engineered. Given the blast radius, I'm glad they do releases carefully and safely.
If you look closely, beyond all the alpha, beta, and gamma environments, it's one box in a region first, then the rest of the region - starting, I assume, with the least risky regions first.