n8n and 429s – Dealing with API Rate Limits
You’re running an n8n workflow that processes 500 items through an external API. Everything runs smoothly until item 347 hits a 429 “Too Many Requests” error. Existential dread sets in… it’s over.

Here are the most common outcomes I see when throttling issues happen:
- Stop on Error – this is the default for n8n
- Retry on Fail – if you configure this, the node will try a few more times and then give up
- Continue on Error – execution chugs along, but the failed items carry error data that hard-fails something downstream
I have a microcap trading bot built on n8n (no, I do not recommend this), and it frequently runs into 429s while pulling pricing data. Load every ticker from the NYSE and NASDAQ, then try to fetch prices at a fairly high frequency – you're bound to run into 429s. I thought I would share a pattern I use to handle rate limiting.
Why Built-in n8n Error Handling Falls Short
n8n provides solid error handling features out of the box. The “Retry on Fail” setting handles transient errors well, and the Error Trigger workflow catches catastrophic failures. However, there’s a gap between “retry a few times” and “the whole workflow crashed.” We need to fail gracefully, keep the workflow moving, and still get the data eventually.
Consider these scenarios:
- An API has sporadic rate limiting that clears after 60 seconds – longer than the max 5-retry window
- A batch of 1000 items has 3 malformed records that will never succeed
- An external service goes down for 10 minutes during your 2-hour batch job
In each case, you need a way to:
- Capture the failed items with full context
- Continue processing the remaining items
- Retry the failures later, automatically
- Escalate items that keep failing after multiple attempts
For this, I use the Dead Letter Queue pattern (DLQ).
What is a Dead Letter Queue?
A Dead Letter Queue (DLQ) is a holding area for messages or items that couldn’t be processed successfully. The term comes from message queue systems like RabbitMQ and AWS SQS, but the pattern works anywhere you’re processing batches of data.
Main Workflow:
├─ Success → Normal processing
└─ Failure → Send to DLQ table
Retry Workflow (runs hourly):
├─ Pull items from DLQ
├─ Attempt processing
├─ Success → Mark resolved in DLQ
└─ Failure → Increment attempt counter
      └─ If attempts ≥ max_attempts → Flag for manual review
Instead of losing failed items or blocking your entire workflow, failures get captured in a database table (I already run Postgres as the backend for n8n itself and for the trading data, so that’s where mine lives). A separate workflow periodically retries them. Items that fail repeatedly stay in the queue for manual investigation.
Lab Setup
I like a clean lab that clearly demonstrates the concept.
- A Postgres database (it’s what I already have – feel free to use whatever queuing mechanism makes sense in your environment)
- If your n8n environment is running on SQLite and you don’t have Postgres available, switch over now before you run into SQLite’s concurrency limits
- Two n8n workflows: the main processing workflow and a retry workflow
- A chaos-testing endpoint to simulate real-world API failures – for the lab, this is a Cloudflare Worker that returns 429s at random, with periodic bursts
Step 1: Create a DLQ Table
If you already have a Postgres database outside of n8n’s internal database where you’re storing application data – great! Put the new table there. If not, create one:
CREATE DATABASE workflow_data;
Then, connect to the new database and create the table:
CREATE TABLE workflow_dlq (
    id SERIAL PRIMARY KEY,
    workflow_name VARCHAR(100) NOT NULL,
    item_payload JSONB NOT NULL,
    error_message TEXT,
    error_code VARCHAR(10),
    attempts INTEGER DEFAULT 1,
    max_attempts INTEGER DEFAULT 5,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_attempt_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    resolved_at TIMESTAMP,
    status VARCHAR(20) DEFAULT 'pending'
);
CREATE INDEX idx_dlq_status ON workflow_dlq(status);
CREATE INDEX idx_dlq_workflow ON workflow_dlq(workflow_name);
Key fields:
- item_payload: The complete original item as JSON, so you have everything needed to retry
- attempts: How many times we’ve tried to process this item
- status: “pending”, “retrying”, “resolved”, or “failed” (exceeded max attempts)
Now, all we need is a credential in n8n that lets a workflow connect to the new table.

And then the actual creds. In my lab, n8n and Postgres run in Docker containers. The Postgres container is literally named “postgres”, so that’s the host name I use. Make sure you have a network route you understand from n8n to your database instance.

Step 2: Configure Your Main Workflow for Error Routing
The critical change: enable the error output on nodes that call external APIs. Some of you may already have this – if not, enable it.
- Open your HTTP Request or API node
- Go to Settings
- Optionally set Retry on Fail (recommended)
- Set On Error to “Continue (using error output)”

This gives the node two outputs:
- Output 1: Items that succeeded
- Output 2: Items that failed (with error details attached)
When you process 100 items and 3 fail, 97 items flow out of Output 1 and 3 items flow out of Output 2. Each failed item carries its original data plus the error information. The “Retry on Fail” in combination with the DLQ path (On Error – Continue) gives us the best chance for success in getting the data from the target APIs.
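n8n’s built-in retry waits a fixed interval between tries, but a well-behaved client honors the Retry-After header that 429 responses often carry. If you ever roll your own retry loop in a Code node, a delay calculator might look like this – a minimal sketch, and `backoffDelayMs` is a hypothetical helper, not an n8n built-in:

```javascript
// Hypothetical helper: pick the delay before the next retry attempt.
// Honors a Retry-After value (in seconds) when the server provides one,
// otherwise falls back to capped exponential backoff.
function backoffDelayMs(attempt, retryAfterSeconds = null) {
  if (retryAfterSeconds !== null) {
    return retryAfterSeconds * 1000; // trust the server's hint
  }
  // 1s, 2s, 4s, 8s, ... capped at 60s
  return Math.min(1000 * 2 ** attempt, 60000);
}
```

With no header, attempt 0 waits 1 second and attempt 3 waits 8 seconds; a `Retry-After: 3` always wins and yields 3 seconds.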
Now, from the Error output for the HTTP Request node, we can attach a Postgres Insert node:

Now, when a fail happens it should get inserted into the database:
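If you prefer the Postgres node’s Execute Query mode over the column mapper, the insert can be written out explicitly. This is a sketch against the schema from Step 1 – the workflow name is made up, and passing values via the node’s query parameters (rather than interpolating JSON into the SQL string) is my assumption about the cleanest way to avoid quoting problems:

```sql
INSERT INTO workflow_dlq (workflow_name, item_payload, error_message, error_code)
VALUES (
    'price_fetch',   -- hypothetical workflow name
    $1::jsonb,       -- the original item, passed as a query parameter
    $2,              -- error message from the node's error output
    $3               -- HTTP status code, e.g. '429'
);
```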


Success!
Step 3: Build the Retry Workflow
Here’s the layout for the DLQ retry workflow. It’s a little more cautious than the original flow: we process one item at a time, and we kick it off only about once per hour (or at least at a slower cadence than the original flow). The goal here is to drain the queue successfully – we’re already delayed, so there’s no reason to make it worse.

- Schedule Trigger – at a slower cadence than the original workflow
- GET DLQ – a query to our workflow table to get the current DLQ
SELECT * FROM workflow_dlq
WHERE status = 'pending'
AND attempts < max_attempts
ORDER BY created_at ASC
LIMIT 50;
- Loop – we’re going to process each item in the DLQ 1 at a time – no concurrency
- HTTP Request – identical to the original request, configured with the same Retry on Fail and On Error – Continue settings
- Resolve on Success – flag the item in the DLQ from pending -> resolved if the HTTP request is successful
UPDATE workflow_dlq
SET status = 'resolved', resolved_at = CURRENT_TIMESTAMP
WHERE id = {{ $('Loop Over Items').item.json.id }}
- Increment Error Attempts – updates the DLQ attempts column if we have another fail
UPDATE workflow_dlq
SET attempts = attempts + 1,
last_attempt_at = CURRENT_TIMESTAMP,
status = CASE WHEN attempts >= max_attempts - 1 THEN 'failed' ELSE 'pending' END
WHERE id = {{ $('Loop Over Items').item.json.id }}
- Get Attempts – queries the DB for attempts for that specific id/record
SELECT id, attempts, max_attempts FROM workflow_dlq WHERE id = {{ $('Loop Over Items').item.json.id }};
- IF – checks if attempts >= max_attempts
- If the IF is true, we send a failure notification
Nothing too magical here – we’re just going to process each item in the DLQ one at a time and give it the highest chance for success. Hopefully, we’re able to get our data and the DLQ will move to resolved.
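Once items start exceeding max_attempts, a quick query surfaces what needs human eyes (this assumes the schema from Step 1):

```sql
SELECT id, workflow_name, error_code, error_message, attempts, last_attempt_at
FROM workflow_dlq
WHERE status = 'failed'
ORDER BY last_attempt_at DESC;
```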
You can download the sample workflows here: https://github.com/scomurr/blog_post_assets/raw/refs/heads/main/DLQ.zip
Additionally, here is the Cloudflare worker code I used for the chaos endpoint:
export default {
  async fetch(request) {
    const now = Math.floor(Date.now() / 1000);
    const cyclePosition = now % 20;
    const inFlurry = cyclePosition >= 12 && cyclePosition <= 17;
    const random = Math.random();
    const should429 = inFlurry ? random < 0.8 : random < 0.05;
    if (should429) {
      return new Response(JSON.stringify({
        error: "Too Many Requests",
        retry_after: Math.floor(Math.random() * 5) + 1
      }), {
        status: 429,
        headers: {
          "Content-Type": "application/json",
          "Retry-After": "3"
        }
      });
    }
    const url = new URL(request.url);
    const id = url.searchParams.get("id") || "unknown";
    return new Response(JSON.stringify({
      id: id,
      processed: true,
      timestamp: new Date().toISOString()
    }), {
      status: 200,
      headers: { "Content-Type": "application/json" }
    });
  }
};
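If you want to sanity-check the flurry timing without deploying the worker, the 429 decision can be lifted into a plain function and exercised locally:

```javascript
// Same throttling logic as the worker, extracted for local testing.
// Each 20-second cycle has a "flurry" window (seconds 12-17) where
// 80% of requests get a 429; outside the window, only 5% do.
function should429(nowSeconds, random) {
  const cyclePosition = nowSeconds % 20;
  const inFlurry = cyclePosition >= 12 && cyclePosition <= 17;
  return inFlurry ? random < 0.8 : random < 0.05;
}
```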
Real-World Example: Batch LLM Processing
Here’s how this pattern applies to a real workflow for me: scoring 500 stock tickers using multiple LLM providers.
Each ticker is one item. The workflow sends all 500 items through the same nodes – there’s no branching per ticker. But when one item fails (rate limit, timeout, bad data), I need to capture it without stopping the other 499.
The Challenge:
- OpenAI, Gemini, and local Llama models all have different rate limits
- Processing takes about an hour per run – I can’t afford to lose progress
- Some payloads have malformed data that will always fail
The Solution:
- Main workflow sends each ticker to the LLM with error routing enabled
- Successful scores go directly to the results table
- Failed items (429s, timeouts, malformed data) go to the DLQ
- Retry workflow runs every 30 minutes
- Items failing 5+ times get flagged for manual review
Result: A batch that previously required babysitting now runs overnight. Failed items get captured and retried automatically. The handful of truly broken records get flagged instead of silently disappearing.
When to Use the DLQ Pattern
Good candidates:
- Batch processing workflows (100+ items)
- External API calls with rate limits
- Long-running workflows where partial failure is acceptable
- Workflows where data loss is unacceptable
I obviously wouldn’t use this pattern if the workflow is all-or-nothing – if the data is too time-sensitive, going back for it later might not work. And if the workflow will never process more than a handful of items (say 10) at a time, there’s no reason to implement something like this either.
Summary
The Dead Letter Queue pattern fills a gap in n8n’s built-in error handling: what to do with items that fail after retries but shouldn’t be lost.
By routing failed items to a database table and processing them with a separate retry workflow, you get:
- Zero data loss: Every failed item is captured with full context
- Automatic recovery: Transient failures get retried without manual intervention
- Visibility: You know exactly what failed, when, and why
- Graceful degradation: Main workflow keeps processing instead of blocking on failures
This is a pattern that scales. Whether you’re processing 100 items or 100,000, the DLQ keeps your workflows resilient without adding complexity to your main processing logic.