Yan Cui (@theburningmonk), 14 tweets, 4 min read

A really good Q during the last Q&A session of my Production-Ready Serverless workshop. Thought I'd share it here.

Q: "I have a nightly task that loads some data from RDS and then writes it to S3. Should I have 2 functions for this, with an API between them?"

#serverless #aws

1/14

I always start from the simplest solution and go up from there, and stop at the least complicated solution that meets all my criteria.

In this case, if everything can be done within the 15 mins limit, I see no reason to split the functions.

Follow the KISS principle.

2/

But, there are many reasons to split the steps into multiple functions.

E.g. to increase parallelism, you may split the task up into many small tasks, and fire off an invocation (separate function) to handle each.
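
A minimal sketch of that splitting step, assuming the nightly job can be partitioned by a list of record IDs. `chunk` and `fan_out` are hypothetical helpers, and `publish` is a stand-in for whichever queue client you end up with; none of these names come from an AWS SDK.

```python
import json
from typing import Callable, Iterator, List, Sequence


def chunk(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fan_out(record_ids: Sequence[int], batch_size: int,
            publish: Callable[[str], None]) -> int:
    """Publish one JSON message per batch and return how many were sent.

    `publish` is an assumption: pass in a function that sends a string
    to the queue or topic of your choice.
    """
    sent = 0
    for batch in chunk(record_ids, batch_size):
        publish(json.dumps({"record_ids": batch}))
        sent += 1
    return sent
```

Each published message would then trigger one invocation of the worker function, which only handles the IDs in its own batch.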

Which means you need some sort of queue, but which?

3/

For maximum throughput, SNS is your best bet: there's no limit on max no. of reqs you can send to SNS, and it'll try to process them with as much Lambda concurrency as possible; and you get built-in retry and DLQ support.

4/

This can consume A LOT of concurrency quickly. Great if you're after max throughput, but it can cause other user-facing API functions to get throttled if you hit the Lambda regional concurrency limits, or the burst capacity limit.

To mitigate, run these in a separate account.

5/

What about EventBridge?

It'd work, but it's 2x the cost per million msgs compared to SNS, and has a default (soft) limit of 2,400 msgs/sec for incoming events.

In this case, I don't see an upside, but there are good reasons to choose EB over SNS in many scenarios: lumigo.io/blog/5-reasons…

6/

What if you need to talk to a legacy DB and don't want to overwhelm it with a sudden spike of 3000 (burst limit) Lambda invocations?

Then consider a queue where scaling behaviour is more gradual.

7/

E.g. SQS starts with 5 pollers and goes up by 60 per min, so offers much slower ramp up in concurrency.

Kinesis gives you precise control of max concurrency via the no. of shards and the parallelism per shard.

You can play games with these options.
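
To get a feel for those numbers, here's a back-of-the-envelope sketch. It uses the figures quoted above (5 pollers to start, +60 per min); the `cap` of 1000 is an assumed upper bound for illustration, and both function names are made up for this example.

```python
def sqs_pollers_after(minutes: int, start: int = 5,
                      step_per_min: int = 60, cap: int = 1000) -> int:
    """Approximate no. of concurrent SQS pollers after `minutes` of backlog.

    `start` and `step_per_min` are the figures from the thread;
    `cap` is an assumption, adjust it to your account's limits.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return min(start + step_per_min * minutes, cap)


def kinesis_max_concurrency(shards: int, parallelism_per_shard: int = 1) -> int:
    """Upper bound on concurrent invocations for a Kinesis stream."""
    if shards < 0 or parallelism_per_shard < 1:
        raise ValueError("shards must be >= 0 and parallelism_per_shard >= 1")
    return shards * parallelism_per_shard
```

With these assumptions, SQS reaches 185 pollers after 3 mins, whereas a 4-shard stream with a parallelism of 10 per shard never goes above 40 concurrent invocations.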

8/

Ok. That's great, lots of different options for fanning out. But what if I need to perform some aggregation logic at the end, like a fan-in step?

In that case, how about Step Functions' Map state? Great for map-reduce type workloads.
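
A rough sketch of what such a state machine could look like, built as a Python dict and dumped to JSON. The ARNs, state names and the `MaxConcurrency` value are all hypothetical. It uses the `Iterator` field for the Map state; newer definitions may use `ItemProcessor` instead.

```python
import json

# Hypothetical function ARNs: replace with your own.
SPLIT_ARN = "arn:aws:lambda:us-east-1:123456789012:function:split-work"
PROCESS_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-batch"
AGGREGATE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:aggregate"

definition = {
    "StartAt": "Split",
    "States": {
        # Fan-out: produce a list of batches at $.batches.
        "Split": {
            "Type": "Task",
            "Resource": SPLIT_ARN,
            "ResultPath": "$.batches",
            "Next": "ProcessBatches",
        },
        # Map: run the inner workflow once per batch.
        "ProcessBatches": {
            "Type": "Map",
            "ItemsPath": "$.batches",
            "MaxConcurrency": 10,  # assumed value, caps parallel iterations
            "Iterator": {
                "StartAt": "ProcessBatch",
                "States": {
                    "ProcessBatch": {
                        "Type": "Task",
                        "Resource": PROCESS_ARN,
                        "End": True,
                    }
                },
            },
            "ResultPath": "$.results",
            "Next": "Aggregate",
        },
        # Fan-in: the reducer receives the collected results at $.results.
        "Aggregate": {"Type": "Task", "Resource": AGGREGATE_ARN, "End": True},
    },
}

print(json.dumps(definition, indent=2))
```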

9/

The thing to keep in mind though, is the 32KB limit on state size. To work around it, you might have to save the output from each task to S3/DynamoDB and return only a key so the reducer can load it.
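
A minimal sketch of that workaround. `InMemoryStore` is a stand-in for S3/DynamoDB so the example runs anywhere, and `map_task`, `reduce_task` and the key format are hypothetical; in a real function you'd swap the store for your SDK client of choice.

```python
import json
import uuid
from typing import Dict, List


class InMemoryStore:
    """Stand-in for S3/DynamoDB (an assumption, for illustration only)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def put(self, key: str, body: str) -> None:
        self._blobs[key] = body

    def get(self, key: str) -> str:
        return self._blobs[key]


def map_task(batch: List[int], store: InMemoryStore) -> Dict[str, str]:
    """Process one batch, save the full output, return only its key."""
    result = {"total": sum(batch), "count": len(batch)}
    key = f"results/{uuid.uuid4()}.json"
    store.put(key, json.dumps(result))
    return {"result_key": key}  # small payload, stays well under the state limit


def reduce_task(refs: List[Dict[str, str]], store: InMemoryStore) -> int:
    """Load each saved output by key and aggregate the totals."""
    return sum(json.loads(store.get(ref["result_key"]))["total"] for ref in refs)
```

The state machine only ever carries the keys around, so the size of each task's real output no longer matters to the workflow.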

10/

Another way to cut up the problem is to create nested workflows: parent wf->map->child wf->map->reduce->(parent wf) reduce

There's some powerful stuff you can do with nested workflows, but it shouldn't be your first call! Go for simpler solutions first.

11/

Back to reasons for splitting that original function.

If some of the steps are prone to error, then another good reason to split is to be able to retry individual steps without failing the whole thing.

Putting a queue between each step gives you that.

12/

But when it's within one bounded context, as is the case here, I default to orchestration and would use Step Functions to model the workflow.

Between bounded contexts, use choreography (i.e. events).
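
For the original RDS-to-S3 question, an orchestrated version with per-step retries might look something like this. It's a sketch only: the ARNs and state names are hypothetical, and the retry values are assumptions to tune for your own workload.

```python
import json

# Hypothetical function ARNs: replace with your own.
LOAD_ARN = "arn:aws:lambda:us-east-1:123456789012:function:load-from-rds"
WRITE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:write-to-s3"

# Assumed retry policy: up to 3 attempts, waiting 5s, 10s, then 20s.
RETRY = [{
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 5,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}]

workflow = {
    "Comment": "Nightly RDS to S3 export, each step retried on its own",
    "StartAt": "LoadFromRds",
    "States": {
        "LoadFromRds": {
            "Type": "Task",
            "Resource": LOAD_ARN,
            "Retry": RETRY,
            "Next": "WriteToS3",
        },
        "WriteToS3": {
            "Type": "Task",
            "Resource": WRITE_ARN,
            "Retry": RETRY,
            "End": True,
        },
    },
}

print(json.dumps(workflow, indent=2))
```

If the write step fails, only that step is retried; the load step doesn't run again.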

13/

In summary, lots of ways to tackle this depending on your criteria. Start with the simplest solution. If I don't need a 2nd function, why pay for it in eng time (writing & configuring) and $ (an extra invocation, and worse if the 1st function has to wait for the response)?

end/
