I have been working with AWS Lambda (serverless) and I am now running into third-party rate limits.
Context
A Lambda invocation triggered by an event:
calls a third-party API 4 times,
runs some logic,
writes to my database.
As volume increased, we started getting 429s from the third party.
How I handle it today
Lambda is triggered by SQS.
On failures I use partial batch item failures so SQS will retry with exponential backoff.
Inside the Lambda I only retry when I see a 429 from the third party.
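For reference, the handling above looks roughly like this. The handler shape follows Lambda's partial batch response contract for SQS (`batchItemFailures` / `itemIdentifier`); `callVendor` is a hypothetical stand-in for the third-party call:

```typescript
// Sketch of an SQS-triggered handler using partial batch item failures:
// only failed messages are reported back, so SQS redelivers just those
// according to the queue's redrive/backoff settings.
type SQSRecord = { messageId: string; body: string };
type SQSEvent = { Records: SQSRecord[] };
type SQSBatchResponse = { batchItemFailures: { itemIdentifier: string }[] };

// Hypothetical vendor call; throws on 429s and transport errors.
async function callVendor(body: string): Promise<void> {
  if (body === "boom") throw new Error("vendor error");
}

async function handler(event: SQSEvent): Promise<SQSBatchResponse> {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of event.Records) {
    try {
      await callVendor(record.body);
    } catch {
      // Report only this message as failed; SQS retries it, the rest succeed.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}
```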
Problems I am having now
This is reactive: it depends on the vendor returning a clean 429. Under load we also see transport errors (e.g., socket hang-ups) that don't come back as 429s, so those messages slip through to my dead-letter queue.
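That gap can be narrowed by classifying transport-level failures as retryable too, not just clean 429s. A minimal sketch of such a predicate; the error shapes (`status` on HTTP errors, `code` on Node transport errors) are assumptions to adapt to your HTTP client:

```typescript
// Common Node.js transport error codes worth sending back to SQS for
// backoff instead of letting them fall through to the dead-letter queue.
const RETRYABLE_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "EPIPE", "ECONNABORTED"]);

// Assumed error shape: HTTP errors carry `status`, transport errors carry
// `code`; some clients only surface "socket hang up" in the message.
function isRetryable(err: { status?: number; code?: string; message?: string }): boolean {
  if (err.status === 429) return true;                             // explicit rate limit
  if (err.status !== undefined && err.status >= 500) return true;  // vendor-side failure
  if (err.code !== undefined && RETRYABLE_CODES.has(err.code)) return true;
  if (err.message !== undefined && /socket hang up/i.test(err.message)) return true;
  return false;                                                    // e.g. 4xx client errors: don't retry
}
```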
I tried using an internal in-process queue (e.g., p-queue) to serialize calls, but that only works inside one invocation. With multiple Lambdas running concurrently (driven by SQS), each process has its own queue, so collectively they still exceed the vendor's global limit.
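The per-process limitation is easy to see with a minimal stand-in for an in-process limiter (this models what p-queue's `concurrency` option enforces, without the dependency): two invocations each honouring a local cap of 3 still reach 6 calls in flight globally.

```typescript
// Minimal in-process concurrency limiter: tryAcquire() succeeds while
// fewer than `limit` calls are in flight *in this one process*.
class LocalLimiter {
  private active = 0;
  constructor(private readonly limit: number) {}
  tryAcquire(): boolean {
    if (this.active >= this.limit) return false;
    this.active += 1;
    return true;
  }
  release(): void { this.active -= 1; }
  get inFlight(): number { return this.active; }
}

// Two concurrent Lambda invocations, each with its own independent limiter.
const invocationA = new LocalLimiter(3);
const invocationB = new LocalLimiter(3);

// Each invocation tries to start 5 calls; each local cap holds at 3...
for (let i = 0; i < 5; i++) { invocationA.tryAcquire(); invocationB.tryAcquire(); }

// ...but globally they are at 6 in flight, double the vendor's cap of 3.
const globalInFlight = invocationA.inFlight + invocationB.inFlight; // 6
```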
A cache like Redis won't solve it either, because each invocation is isolated and I'm updating different entities per call, so there is nothing to reuse between calls.
What I want is a cross-invocation limiter that guarantees we never exceed the third-party API's limits across all concurrent Lambdas, preferably something serverless-friendly. The limits I need to take into account for the API are:
1. can't do more than 60 calls per minute,
2. can't do more than 5000 calls per day,
3. can't have more than 3 calls resolving concurrently.
What are the best options for this? I thought of using a centralised DynamoDB table with a counter for each limit, but enforcing that no more than 3 calls resolve at the same time is where that approach is giving me trouble.
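For the concurrency-of-3 part specifically, the usual trick with DynamoDB is a conditional increment: acquire a slot with an `UpdateItem` guarded by a `ConditionExpression` like `in_flight < :max`, and decrement on completion (plus a timestamp/TTL to reap slots leaked by crashed invocations, which this sketch omits). The same conditional write can maintain the per-minute and per-day counters keyed by time window. Below is a sketch of that logic against an in-memory stand-in for the table; the item shape and attribute names are assumptions, not a finished design:

```typescript
// In-memory stand-in for a single DynamoDB item holding all three limits.
// In DynamoDB, each check-and-increment below becomes one UpdateItem whose
// ConditionExpression makes the check and the write atomic, even across
// concurrent invocations.
interface LimiterItem {
  inFlight: number;      // concurrent calls (cap 3)
  minuteWindow: string;  // e.g. "2024-05-01T10:07" — keys the minute counter
  minuteCount: number;   // calls this minute (cap 60)
  dayWindow: string;     // e.g. "2024-05-01" — keys the daily counter
  dayCount: number;      // calls today (cap 5000)
}

const LIMITS = { concurrent: 3, perMinute: 60, perDay: 5000 };

function tryAcquire(item: LimiterItem, now: Date): boolean {
  const minute = now.toISOString().slice(0, 16);
  const day = now.toISOString().slice(0, 10);
  // Roll the fixed windows over when the minute/day changes.
  if (item.minuteWindow !== minute) { item.minuteWindow = minute; item.minuteCount = 0; }
  if (item.dayWindow !== day) { item.dayWindow = day; item.dayCount = 0; }
  // All three conditions must hold; in DynamoDB a failed condition raises
  // ConditionalCheckFailedException, and the caller returns the message to
  // SQS (via a batch item failure) so it retries with backoff.
  if (item.inFlight >= LIMITS.concurrent) return false;
  if (item.minuteCount >= LIMITS.perMinute) return false;
  if (item.dayCount >= LIMITS.perDay) return false;
  item.inFlight += 1;
  item.minuteCount += 1;
  item.dayCount += 1;
  return true;
}

function release(item: LimiterItem): void {
  item.inFlight = Math.max(0, item.inFlight - 1); // in DynamoDB: ADD in_flight -1
}
```

The key property is that rejection is cheap and safe: when the conditional write fails, the Lambda doesn't call the vendor at all and simply hands the message back to SQS, so the backoff you already have does the waiting.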