BullMQ: The Background Job Queue Your Node Backend Has Been Begging For
If you have ever written a web request handler that needs to send a welcome email, generate a PDF, resize an upload, or call some maddeningly slow third-party API, you already know the feeling: the user is staring at a spinner while your server does something that has nothing to do with rendering a response. The fix is to get that work out of the request/response cycle and into the background, where it can run on its own schedule, retry when it fails, and scale across as many machines as you like. That is exactly the problem BullMQ solves.
bullmq is a Redis-backed distributed job queue for Node.js. It is the TypeScript rewrite and spiritual successor to the older bull package, rebuilt from the ground up by the same team at Taskforce.sh for better atomicity, stability, and a cleaner typed API. It is not a React library, but it is the kind of thing every developer who ships the backend behind a React or Next.js app eventually needs. It powers background work at companies like Microsoft, and frameworks like NestJS, Vendure, Novu, and Langfuse lean on it. With roughly 5.6 million weekly downloads and an MIT license, it is about as standard as a job queue gets.
Why a Queue Instead of Just Awaiting It
The mental model is small and hard to outgrow. You have a producer (your API) that adds jobs to a named queue, and one or more workers (separate processes or containers) that pull jobs off and run them. Redis sits in the middle, holding job state in its own data structures and using Lua scripts for atomic, race-condition-free transitions. Because the state lives in Redis rather than in your process memory, jobs survive application restarts and deploys, can be shared across machines, and are fully inspectable. Surviving a restart of Redis itself depends on Redis persistence being enabled.
That single architecture buys you a lot:
- Delayed jobs that run after a delay or at a specific timestamp.
- Repeatable cron jobs for nightly reports, cleanup, and polling.
- Automatic retries with backoff so transient failures don't lose work.
- Priorities so urgent jobs jump ahead of the rest.
- Rate limiting to respect a third-party API's quota across your whole fleet.
- Concurrency to process many jobs in parallel per worker.
- Flows that model parent/child dependencies between jobs.
- Stalled-job recovery that re-queues work whose worker died mid-process.
All of it is atomic, all of it is typed, and all of it is coordinated through Redis.
Getting It Installed
BullMQ runs on Node and needs a reachable Redis server (or a compatible store such as Dragonfly). Install the package with your favorite manager:
npm install bullmq
# or, with Yarn
yarn add bullmq
You will also want Redis running locally for development. One important production note up front: configure Redis with maxmemory-policy=noeviction. If Redis is allowed to evict keys under memory pressure, it can silently drop your job data and corrupt the queue.
Producing Work: Queues and Jobs
The producer side is cheap. You create a Queue by name, hand it a Redis connection, and call add(). The first argument is the job's name (useful for routing different kinds of work through one queue), the second is the data payload, and the third is an options object.
import { Queue } from 'bullmq';
const connection = { host: 'localhost', port: 6379 };
const emailQueue = new Queue('emails', { connection });
// Simplest form: a job name plus a data payload.
await emailQueue.add('welcome', { userId: 42, to: 'user@example.com' });
// The same job with retry and cleanup options.
await emailQueue.add(
  'welcome',
  { userId: 42, to: 'user@example.com' },
  {
    attempts: 5, // total tries, including the first run
    backoff: { type: 'exponential', delay: 1000 },
    removeOnComplete: true,
    removeOnFail: 1000, // keep only the latest 1000 failed jobs
  },
);
That second call is where BullMQ starts earning its keep. attempts: 5 with an exponential backoff means a job hitting a flaky SMTP server is tried up to five times in total (the first run plus four retries), with growing delays between tries, before it is declared failed. removeOnComplete: true keeps Redis tidy by discarding successful jobs, while removeOnFail: 1000 retains the last thousand failures so you can inspect them. Because Queue is generic, you can type the data and return value too: new Queue<EmailJob, EmailResult>('emails', { connection }).
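To make the retry timing concrete, here is a minimal sketch of the delay schedule for those options. It assumes the formula given in BullMQ's retry documentation, 2^(attemptsMade - 1) * delay with no jitter; exponentialRetryDelays is a hypothetical helper, not part of the library.

```typescript
// Sketch only: reproduces the documented exponential backoff formula,
// 2^(attemptsMade - 1) * delay, without jitter. Not a BullMQ API.
function exponentialRetryDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // `attempts` counts every try, so there are at most attempts - 1 retries.
  for (let attemptsMade = 1; attemptsMade < attempts; attemptsMade++) {
    delays.push(Math.round(Math.pow(2, attemptsMade - 1) * baseDelayMs));
  }
  return delays;
}

// Waits before retry 1, 2, 3 and 4 for { attempts: 5, delay: 1000 }.
console.log(exponentialRetryDelays(5, 1000)); // [1000, 2000, 4000, 8000]
```

Under that assumption a job that keeps failing is given up on roughly fifteen seconds after its first attempt, plus however long each attempt itself takes.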
Consuming Work: Workers and Processors
A Worker takes the same queue name and a processor function. It pulls jobs, runs your code, and treats whatever you return as the job's result and whatever you throw as a failure (which feeds into the retry logic).
import { Worker } from 'bullmq';
// `connection` is the same Redis options object used by the producer;
// `sendEmail` is your own mail helper, not part of BullMQ.
const worker = new Worker(
  'emails',
  async (job) => {
    await job.updateProgress(10);
    await sendEmail(job.data);
    await job.updateProgress(100);
    return { sent: true }; // stored as the job's return value
  },
  {
    connection,
    concurrency: 5,
  },
);
worker.on('completed', (job, result) => {
  console.log(`${job.id} done`, result);
});
worker.on('failed', (job, err) => {
  console.error(`${job?.id} failed`, err.message);
});
// Always attach an error listener to the worker.
worker.on('error', (err) => console.error(err));
concurrency: 5 lets a single worker process up to five jobs in parallel, which is exactly what you want for I/O-bound work like sending email or calling APIs. Add more worker processes or containers and Redis coordinates them automatically. That last error listener is not optional decoration: an unhandled worker error can crash your process, so always attach it.
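Because one queue can carry several job names, a processor often starts by dispatching on job.name. The following is a small sketch of that idea; createRouter, the NamedJob shape, and both handlers are hypothetical and only mirror the name and data fields used in the examples above.

```typescript
// Minimal shape of the job fields this sketch reads; a real BullMQ Job has more.
interface NamedJob {
  name: string;
  data: unknown;
}

type Handler = (data: any) => Promise<unknown>;

// Builds a processor function that picks a handler by job name.
function createRouter(handlers: Map<string, Handler>) {
  return async (job: NamedJob): Promise<unknown> => {
    const handler = handlers.get(job.name);
    if (handler === undefined) {
      // Throwing fails the job, so an unknown name is visible instead of silently dropped.
      throw new Error(`No handler registered for job name "${job.name}"`);
    }
    return handler(job.data);
  };
}

// Hypothetical handlers standing in for real email-sending code.
const processEmailJob = createRouter(
  new Map<string, Handler>([
    ['welcome', async (data: { to: string }) => ({ template: 'welcome', to: data.to })],
    ['reminder', async (data: { id: number }) => ({ template: 'reminder', id: data.id })],
  ]),
);
```

A function like processEmailJob would then be passed as the second argument to new Worker('emails', ...) in place of the inline processor.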
One connection detail worth knowing: workers use blocking Redis commands, so when you build your own ioredis connection for a worker, set maxRetriesPerRequest: null so it retries forever and survives transient connection loss. For producers, keep the default so an add() fails fast and you can surface the error to your caller. And never set ioredis's keyPrefix option, since BullMQ does its own key prefixing.
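Written out as plain option objects, that advice might look like the sketch below. The RedisOptionsSketch type is a hypothetical stand-in for the ioredis options type, and the host and port are the development values used earlier.

```typescript
// Stand-in for the ioredis options type; only the fields discussed here.
interface RedisOptionsSketch {
  host: string;
  port: number;
  maxRetriesPerRequest?: number | null;
}

const baseRedis = { host: 'localhost', port: 6379 };

// Workers: null means a blocked command is retried indefinitely.
const workerConnection: RedisOptionsSketch = {
  ...baseRedis,
  maxRetriesPerRequest: null,
};

// Producers: leave the default in place so add() rejects quickly when Redis is down.
const producerConnection: RedisOptionsSketch = { ...baseRedis };
```

Neither object sets keyPrefix. If you need a namespace for your keys, BullMQ's own prefix queue option is the place for it.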
Scheduling Work for Later
Two of the most common requests for any backend are "do this in a little while" and "do this every night." BullMQ handles both.
Delayed Jobs
Pass a delay (in milliseconds) and the job sits in the delayed state until its timer fires.
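// Run roughly five seconds from now.
await emailQueue.add('reminder', { id: 1 }, { delay: 5000 });
// Run at a specific time. With no UTC offset in the string, the date is parsed as local time.
const target = new Date('2035-07-03T10:30:00');
await emailQueue.add(
  'reminder',
  { id: 1 },
  { delay: Number(target) - Date.now() },
);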
Keep in mind that a delay is a lower bound, not an exact schedule: a job fires no earlier than its scheduled time, but actual execution still depends on worker availability and load. If you used BullMQ 1.x, you may remember needing a separate QueueScheduler to move delayed jobs back into the wait list. That class was removed in BullMQ v2; the Worker now handles it internally, so any sample showing new QueueScheduler(...) is outdated.
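The timestamp arithmetic is easy to get subtly wrong when the target is already in the past or the date failed to parse. A small helper such as the hypothetical delayUntil below, a sketch rather than anything BullMQ ships, keeps the result a valid non-negative number of milliseconds.

```typescript
// Converts an absolute target time into a delay in milliseconds.
// `nowMs` is injectable so the function can be tested deterministically.
function delayUntil(target: Date, nowMs: number = Date.now()): number {
  const delay = target.getTime() - nowMs;
  if (Number.isNaN(delay)) {
    throw new RangeError('target is not a valid date');
  }
  // A target in the past means "run as soon as possible".
  return Math.max(0, delay);
}
```

The result can be passed directly as the delay option, for example { delay: delayUntil(target) }.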
Cron Schedulers
For recurring work, use upsertJobScheduler. The "upsert" naming is the whole point: it creates or updates a scheduler idempotently, so you can call it on every boot without spawning duplicate cron jobs.
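// Fixed interval: every 1000 ms.
await emailQueue.upsertJobScheduler('heartbeat', { every: 1000 });
// Cron pattern plus a template for the jobs the scheduler produces.
await emailQueue.upsertJobScheduler(
  'nightly-report',
  { pattern: '0 15 3 * * *' },
  {
    name: 'report',
    data: { kind: 'daily' },
    // A bare number for backoff is a fixed delay in milliseconds.
    opts: { attempts: 5, backoff: 3, removeOnFail: 1000 },
  },
);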
The first scheduler fires every second; the second runs daily at 03:15 using a cron pattern and stamps out report jobs from a template. This replaced the older add(..., { repeat }) API and is the preferred way to register recurring jobs today.
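The six-field pattern can be surprising if you are used to five-field crontab lines. The sketch below labels each field, assuming the cron-parser style syntax in which an optional leading field holds seconds; labelCronFields is a hypothetical helper for reading patterns, not a validator.

```typescript
const FIELD_NAMES = ['second', 'minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek'] as const;
type CronFields = Record<(typeof FIELD_NAMES)[number], string>;

// Splits a cron pattern and names each field. Does not validate field values.
function labelCronFields(pattern: string): CronFields {
  const parts = pattern.trim().split(/\s+/);
  // A five-field pattern has no seconds column; treat it as second 0.
  if (parts.length === 5) {
    parts.unshift('0');
  }
  if (parts.length !== 6) {
    throw new Error(`Expected 5 or 6 cron fields, got ${parts.length}`);
  }
  const labelled = {} as CronFields;
  FIELD_NAMES.forEach((name, index) => {
    labelled[name] = parts[index];
  });
  return labelled;
}
```

For '0 15 3 * * *' this yields second 0, minute 15, hour 3, and wildcards for the rest, which is how the pattern comes to mean 03:15:00 every day.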
Respecting Other People's Limits
When your jobs hammer an external API, you need to throttle them, and you need that throttle to apply across every worker, not just one process. BullMQ's limiter does exactly that.
// `processor` is your own async job handler.
const worker = new Worker('api-calls', processor, {
  connection,
  limiter: { max: 10, duration: 1000 }, // at most 10 jobs per 1000 ms
});
This caps the queue at ten jobs per second globally, no matter how many worker instances you run. Rate-limited jobs simply wait; they are not failed. For APIs that tell you when you have hit a wall, you can rate-limit dynamically from inside the processor:
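const worker = new Worker(
  'api-calls',
  async (job) => {
    // callExternalApi is your own client; retryAfter is expected in milliseconds.
    const { isRateLimited, retryAfter } = await callExternalApi(job.data);
    if (isRateLimited) {
      await worker.rateLimit(retryAfter);
      // Tells BullMQ to put the job back in the wait list instead of failing it.
      throw Worker.RateLimitError();
    }
  },
  { connection, limiter: { max: 1, duration: 500 } },
);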
When the external service returns a 429, you call worker.rateLimit(retryAfter) to hold the queue back for retryAfter milliseconds and throw Worker.RateLimitError(), which re-queues the current job without counting it as a failure. The job goes back to waiting and resumes once the window passes.
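One detail to watch is units: worker.rateLimit takes milliseconds, while the HTTP Retry-After header carries either a number of seconds or an HTTP date. A client like callExternalApi might normalize the header with something like the sketch below; retryAfterToMs is a hypothetical helper.

```typescript
// Converts a Retry-After header value into a delay in milliseconds.
// Returns undefined when the value is in neither of the two standard forms.
function retryAfterToMs(headerValue: string, nowMs: number = Date.now()): number | undefined {
  const trimmed = headerValue.trim();
  // Form 1: delay-seconds, a non-negative integer.
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }
  // Form 2: an HTTP date, converted to a delay relative to now.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) {
    return undefined;
  }
  return Math.max(0, dateMs - nowMs);
}
```

When the helper returns undefined, falling back to a conservative fixed delay is a reasonable choice so the worker still backs off.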
Orchestrating Multi-Step Workflows with Flows
Sometimes a job is not one thing but a tree of things: fetch every region's data, then aggregate the results into one report. FlowProducer models exactly that parent/child relationship, where the parent only runs after all of its children complete.
import { FlowProducer } from 'bullmq';
const flow = new FlowProducer({ connection });
await flow.add({
  name: 'aggregate-report',
  queueName: 'reports',
  data: { month: '2026-05' },
  children: [
    { name: 'fetch-region', queueName: 'reports', data: { region: 'us' } },
    { name: 'fetch-region', queueName: 'reports', data: { region: 'eu' } },
  ],
});
The two fetch-region children run first, and only when both finish does aggregate-report move out of the waiting-children state and execute. Inside the parent's processor, job.getChildrenValues() hands you the child results so you can combine them. This is how you build fan-out/fan-in pipelines without inventing your own coordination logic.
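Here is a sketch of what the parent's side of that fan-in could look like. It assumes each fetch-region child returns an object of the hypothetical RegionResult shape; ParentJobLike models only the one method used, so the function can be exercised without Redis.

```typescript
// Assumed return value of each fetch-region child (hypothetical shape).
interface RegionResult {
  region: string;
  total: number;
}

// Only the part of the job this sketch relies on.
interface ParentJobLike {
  getChildrenValues(): Promise<Record<string, RegionResult>>;
}

// Combines the children's return values into one report.
async function aggregateReport(job: ParentJobLike) {
  // Keys identify the child jobs; the values are whatever each child returned.
  const childValues = await job.getChildrenValues();
  const results = Object.values(childValues);
  return {
    regions: results.map((r) => r.region).sort(),
    grandTotal: results.reduce((sum, r) => sum + r.total, 0),
  };
}
```

Since the parent and its children share the reports queue in this example, the worker's processor would need to check job.name and call something like aggregateReport only for aggregate-report jobs.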
Watching the Whole Queue with QueueEvents
A worker only sees the jobs it processes. If you want a global, real-time view of a queue's lifecycle, from a dashboard process that runs nowhere near your workers, QueueEvents subscribes to a Redis stream of queue-wide events.
import { QueueEvents } from 'bullmq';
const queueEvents = new QueueEvents('emails', { connection });
queueEvents.on('completed', ({ jobId, returnvalue }) => {
  /* update a dashboard, emit a websocket message, etc. */
});
queueEvents.on('failed', ({ jobId, failedReason }) => {
  /* alert on failures */
});
queueEvents.on('progress', ({ jobId, data }) => {
  /* stream live progress to a UI */
});
It pairs well with monitoring dashboards like bull-board, and it is the clean way to push live job progress to a React front end over WebSockets without coupling your UI to your worker code.
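One way to keep the UI decoupled is to translate each event payload into a small message format of your own before it goes over the wire. The sketch below assumes the payload fields shown in the listeners above; DashboardMessage, toDashboardMessage, and encode are hypothetical names.

```typescript
// Wire format owned by the application, independent of BullMQ's payload names.
type DashboardMessage =
  | { type: 'job:completed'; jobId: string; result: unknown }
  | { type: 'job:failed'; jobId: string; reason: string }
  | { type: 'job:progress'; jobId: string; progress: unknown };

// One mapper per event, taking the fields destructured in the listeners above.
const toDashboardMessage = {
  completed(e: { jobId: string; returnvalue: unknown }): DashboardMessage {
    return { type: 'job:completed', jobId: e.jobId, result: e.returnvalue };
  },
  failed(e: { jobId: string; failedReason: string }): DashboardMessage {
    return { type: 'job:failed', jobId: e.jobId, reason: e.failedReason };
  },
  progress(e: { jobId: string; data: unknown }): DashboardMessage {
    return { type: 'job:progress', jobId: e.jobId, progress: e.data };
  },
};

// Serialize in one place so every transport sends the same format.
function encode(message: DashboardMessage): string {
  return JSON.stringify(message);
}
```

Inside a listener you would then send encode(toDashboardMessage.progress(event)) to whatever WebSocket broadcast function your server already has.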
A Few Habits That Keep You Out of Trouble
BullMQ gives you at-least-once delivery, which is the right tradeoff for reliability but has consequences you should design for. Because of retries and stalled-job recovery, a job can run more than once, so make your processors idempotent: re-running them should be safe. On shutdown, always await worker.close() and await queue.close() in your SIGTERM and SIGINT handlers; close() waits for in-flight jobs to finish, which prevents work from being abandoned in the active state and re-run later as a stalled job. For CPU-heavy or crash-prone processors, consider sandboxed processors, which run your code in a separate Node process by passing a file path instead of a function, isolating crashes from the main worker.
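As an illustration of the idempotency habit, here is one possible guard: record a key once the work succeeds and skip any later run that carries the same key. Everything in it is hypothetical, including the in-memory store, which stands in for a shared database or Redis set. The check and the write are not atomic, so this narrows the window for duplicates rather than closing it.

```typescript
interface JobLike<T> {
  id?: string;
  data: T;
}

// Where completed keys are remembered; must be shared by all workers in production.
interface CompletionStore {
  has(key: string): Promise<boolean>;
  add(key: string): Promise<void>;
}

// Wraps a handler so a job whose key was already completed is skipped.
function makeIdempotent<T, R>(
  store: CompletionStore,
  keyOf: (job: JobLike<T>) => string,
  handler: (job: JobLike<T>) => Promise<R>,
): (job: JobLike<T>) => Promise<R | { skipped: true }> {
  return async (job) => {
    const key = keyOf(job);
    if (await store.has(key)) {
      return { skipped: true };
    }
    const result = await handler(job);
    // Record only after success, so a failed attempt is still retried.
    await store.add(key);
    return result;
  };
}

// Demonstration store only: state is lost on restart and not shared between processes.
class MemoryStore implements CompletionStore {
  private readonly seen = new Set<string>();
  async has(key: string): Promise<boolean> {
    return this.seen.has(key);
  }
  async add(key: string): Promise<void> {
    this.seen.add(key);
  }
}
```

A key built from business data, such as the job name plus the user ID, tends to hold up better than the job ID alone, because it also catches the same work being enqueued twice.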
If you ever outgrow the open-source feature set, there is a commercial BullMQ Pro tier that adds observables, group-based rate limiting, and batches, but the vast majority of applications never need it.
Where BullMQ Fits
Among Node job queues, BullMQ is the de facto modern choice. It is the actively maintained successor to bull, far richer than the minimal bee-queue, and the obvious upgrade from long-dead options like kue. If you already run Postgres and value having one fewer piece of infrastructure, queues like pg-boss are genuinely appealing and offer transactional enqueue in the same database. But once you have Redis available and you want the fullest feature set (delayed jobs, cron schedulers, retries with backoff, priorities, rate limiting, concurrency, flows, and deduplication), all atomic and all TypeScript-first, BullMQ is the most capable option on the table.
The honest catch is that you have to run and operate Redis, write idempotent jobs, and shut down gracefully. Accept those three responsibilities and you get a queue that scales horizontally just by adding workers, that has been proven at the largest scales, and that has a mental model small enough to teach in an afternoon. For any full-stack JavaScript developer who has ever wished their request handler could just hand the slow stuff to someone else, BullMQ is that someone else.