How I use PostgreSQL as a job queue and why you probably should too
Every guide says to use Redis or BullMQ for background jobs. I used PostgreSQL's SKIP LOCKED instead, and it handled thousands of jobs per day with zero additional infrastructure.
The first time I needed a background job queue at FinanceOps, I did what every developer does: I searched “node.js job queue” and got a wall of results pointing at BullMQ, Redis, and managed queue services. All of them require additional infrastructure. Redis needs a server, monitoring, persistence configuration, and connection management. Managed queues cost money and add another vendor dependency. I was running a single PostgreSQL instance that was handling our entire application. Why would I add a second stateful system for a job queue?
PostgreSQL has had advisory locks since version 8.2 and SKIP LOCKED since version 9.5. Together, they give you a fully functional job queue with exactly-once processing, retry logic, and the ability to join job data with your business tables in a single query. I have run this in production for six months and it handles thousands of jobs per day without breaking a sweat.
In that first stretch at FinanceOps, I was still learning how to wear the Head of Engineering title without hiding behind it. This approach also builds on what I learned earlier in “Building a real-time notification system with zero external dependencies.” The only credibility that mattered was whether the decision survived contact with real money, ugly edge cases, and the next person I would eventually hire. That same bias toward strict boundaries later shaped how I approached ftryos and pipeline-sdk: make correctness boring before you make the API clever.
The Schema
The job queue is a single table. Nothing fancy. The status column tracks the job lifecycle, and the scheduled_for column supports delayed execution.
```sql
CREATE TABLE job_queue (
  id            UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  queue_name    TEXT        NOT NULL,
  payload       JSONB       NOT NULL,
  status        TEXT        NOT NULL DEFAULT 'pending'
    CHECK (status IN ('pending', 'processing', 'completed', 'failed', 'dead')),
  attempts      INT         NOT NULL DEFAULT 0,
  max_attempts  INT         NOT NULL DEFAULT 3,
  scheduled_for TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  started_at    TIMESTAMPTZ,
  completed_at  TIMESTAMPTZ,
  error         TEXT,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
```sql
CREATE INDEX idx_job_queue_pending
  ON job_queue (queue_name, scheduled_for)
  WHERE status = 'pending';
```

The partial index on status = 'pending' is essential. Without it, the dequeue query scans every row in the table, including completed and failed jobs. With it, the planner only looks at jobs that are actually ready to run. As completed jobs accumulate, the index stays small.
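Enqueueing is just an INSERT into that table. A minimal helper might look like the sketch below; the `enqueueJob` name and the injected `query` function are my own framing for illustration (any parameterized-SQL executor, such as `pool.query` from node-postgres, fits the shape), not code from the original system:

```typescript
// `query` is any function that executes parameterized SQL and returns rows,
// e.g. pool.query from node-postgres. Injected here so the helper is testable.
type QueryFn = (sql: string, params: unknown[]) => Promise<{ rows: any[] }>;

async function enqueueJob(
  query: QueryFn,
  queueName: string,
  payload: object,
  runAt: Date = new Date() // defaults to "run as soon as a worker polls"
): Promise<string> {
  const { rows } = await query(
    `INSERT INTO job_queue (queue_name, payload, scheduled_for)
     VALUES ($1, $2, $3)
     RETURNING id`,
    [queueName, JSON.stringify(payload), runAt]
  );
  return rows[0].id; // UUID assigned by gen_random_uuid()
}
```

Delayed jobs fall out for free: pass a future `runAt` and the dequeue query's `scheduled_for <= NOW()` filter ignores the row until then.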
The Dequeue Query
The magic is in the dequeue query. SKIP LOCKED lets multiple worker processes poll the same table concurrently without blocking each other. Each worker grabs the next available job, locks it, and moves on. No two workers ever process the same job.
```sql
UPDATE job_queue
SET status = 'processing', started_at = NOW(), attempts = attempts + 1
WHERE id = (
  SELECT id FROM job_queue
  WHERE queue_name = $1
    AND status = 'pending'
    AND scheduled_for <= NOW()
  ORDER BY scheduled_for ASC
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING *;
```

The inner SELECT finds the oldest pending job with FOR UPDATE SKIP LOCKED. If another worker has already locked that row, it skips to the next one. The outer UPDATE atomically transitions the job to processing and records the start time. The entire operation is a single round trip to the database.
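Wrapped in application code, the dequeue is one call. This is a sketch rather than the original implementation: `QueryFn` stands in for any parameterized-SQL executor (such as `pool.query` from node-postgres), the `Job` interface lists only the columns the worker loop touches, and I pass the query function explicitly for testability:

```typescript
type QueryFn = (sql: string, params: unknown[]) => Promise<{ rows: any[] }>;

// Only the columns the worker loop actually reads.
interface Job {
  id: string;
  queue_name: string;
  payload: unknown;
  attempts: number;
  max_attempts: number;
}

async function dequeueJob(query: QueryFn, queueName: string): Promise<Job | null> {
  const { rows } = await query(
    `UPDATE job_queue
     SET status = 'processing', started_at = NOW(), attempts = attempts + 1
     WHERE id = (
       SELECT id FROM job_queue
       WHERE queue_name = $1 AND status = 'pending' AND scheduled_for <= NOW()
       ORDER BY scheduled_for ASC
       FOR UPDATE SKIP LOCKED
       LIMIT 1
     )
     RETURNING *`,
    [queueName]
  );
  return rows[0] ?? null; // null when no job is ready
}
```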
The Worker Loop
Each worker runs a simple poll loop. It queries for a job, processes it, marks it complete or failed, and repeats. When no jobs are available, it sleeps for a configurable interval before polling again.
```typescript
async function processQueue(queueName: string) {
  while (true) {
    const job = await dequeueJob(queueName);
    if (!job) {
      await sleep(1000); // No jobs, back off
      continue;
    }
    try {
      await handlers[queueName](job.payload);
      await markComplete(job.id);
    } catch (error) {
      if (job.attempts >= job.max_attempts) {
        await markDead(job.id, error);
      } else {
        await markFailed(job.id, error);
        // Exponential backoff: retry after 2^attempts seconds
        await reschedule(job.id, new Date(Date.now() + 2 ** job.attempts * 1000));
      }
    }
  }
}
```

Failed jobs get exponential backoff. The first retry is after 2 seconds, the second after 4 seconds, the third after 8 seconds. If a job fails all three attempts, it moves to the dead status, where I can inspect it manually. In six months, we have had exactly four dead jobs, all caused by malformed webhook payloads from a payment processor.
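The backoff arithmetic in the catch branch is easy to get subtly wrong, so it is worth pulling into a pure helper. `nextRetryAt` is a name I am introducing here for illustration, not a function from the original code:

```typescript
// Exponential backoff: schedule the retry 2^attempts seconds after `now`.
// Because attempts is incremented at dequeue time, attempt 1 retries after
// 2s, attempt 2 after 4s, attempt 3 after 8s.
function nextRetryAt(attempts: number, now: Date = new Date()): Date {
  return new Date(now.getTime() + 2 ** attempts * 1000);
}
```

A pure function like this is trivially unit-testable, which matters because an off-by-one here silently turns "retry in 8 seconds" into "retry in 16".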
Why This Works Better Than Redis for Us
- Transactional consistency: We can enqueue a job in the same transaction that creates the payment record. If the payment insert fails, the job is never created. With Redis, you need two-phase patterns to keep the queue and the database in sync.
- Queryable history: Failed jobs are rows in a table. I can JOIN them against payment records to find patterns. Try doing that with a Redis list.
- No additional infrastructure: Zero new servers, zero new monitoring, zero new connection pools. The job queue shares the same PostgreSQL instance as everything else.
- Familiar tooling: I debug job issues with psql and SQL queries. No need for a Redis CLI, a BullMQ dashboard, or a queue-specific monitoring tool.
- Exactly-once dequeue: SKIP LOCKED guarantees no two workers ever pick up the same job, and transactional status updates keep the lifecycle consistent, so in normal operation no job is processed twice. Redis-based queues require careful at-least-once patterns with idempotency keys.
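The transactional-consistency point is the one that saves the most debugging time, so here is a sketch of what it looks like in practice. The `payments` table, the function name, and the parameters are hypothetical; `QueryFn` again stands in for any parameterized-SQL executor in the shape of node-postgres's `client.query`:

```typescript
type QueryFn = (sql: string, params: unknown[]) => Promise<{ rows: any[] }>;

// Create the payment and enqueue its follow-up job in ONE transaction.
// If the business write fails, the job is never visible to workers.
async function createPaymentWithJob(
  query: QueryFn,
  paymentId: string,
  amountCents: number
): Promise<void> {
  await query("BEGIN", []);
  try {
    await query(
      `INSERT INTO payments (id, amount_cents) VALUES ($1, $2)`,
      [paymentId, amountCents]
    );
    await query(
      `INSERT INTO job_queue (queue_name, payload) VALUES ($1, $2)`,
      ["payment-webhooks", JSON.stringify({ paymentId })]
    );
    await query("COMMIT", []);
  } catch (e) {
    await query("ROLLBACK", []); // payment and job roll back together
    throw e;
  }
}
```

With Redis, the equivalent is a committed database row plus a separate LPUSH that can fail independently, which is exactly the two-phase headache this pattern avoids.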
The tradeoff is throughput. PostgreSQL as a job queue handles hundreds of jobs per second comfortably. If we needed tens of thousands per second, Redis would be the right choice. But at our scale, PostgreSQL is not just sufficient. It is simpler, more reliable, and more debuggable than any alternative.
The builder phase was less glamorous than people imagine. It was mostly a series of stubborn, unfashionable choices that kept future-me out of 2 a.m. incident calls. I still make the same kind of choices inside ftryos and pipeline-sdk.
You do not need a dedicated queue system until you do. PostgreSQL SKIP LOCKED is production-ready, battle-tested, and eliminates an entire category of infrastructure. Use it until your scale proves otherwise.