
PgQue: Zero-Bloat Postgres Queue (github.com)

146 points by gmcabrita | 37 comments


[−] halfcat 26d ago
So if I understand this correctly, there are three main approaches:

1. SKIP LOCKED family

2. Partition-based + DROP old partitions (no VACUUM required)

3. TRUNCATE family (PgQue’s approach)

And the benefit of PgQue is the failure mode when a worker gets stuck:

- Table grows indefinitely, instead of

- VACUUM-starved death spiral

And a table growing is easier to reason about operationally?

[−] samokhvalov 26d ago
The taxonomy is correct, but the benefit isn't "table grows indefinitely vs. VACUUM-starved death spiral": in all three approaches, if the consumer falls behind, events accumulate.

The real distinction is the cost per event under MVCC pressure. Under a held xmin horizon (idle-in-transaction sessions, a long-running writer, a lagging logical slot, a physical standby with hot_standby_feedback=on):

1. SKIP LOCKED systems: every DELETE or UPDATE creates a dead tuple that autovacuum can't reclaim while the xmin horizon is held back. Indexes bloat, and each subsequent FOR UPDATE SKIP LOCKED scan has to step over the growing pile of dead tuples.

2. Partition + DROP (some SKIP LOCKED systems already support it, e.g. PGMQ): old partitions drop cleanly, but the active partition is still DELETE-based and accumulates dead tuples, so you get the same pathology within the active window, just bounded by retention. DROPping and attaching/detaching partitions is also operationally more painful than keeping a fixed set of partitions and TRUNCATEing them.

3. PgQue / PgQ: the active event table is INSERT-only. Each consumer independently remembers its own pointer (the ID of the last event processed). CPU stays flat under xmin pressure. (Both consume patterns are sketched below.)
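
To make the per-event cost concrete, here is a minimal sketch of the two consume patterns (illustrative table and column names, not PgQue's actual schema):

    -- Pattern 1: SKIP LOCKED dequeue. Every DELETE leaves a dead tuple
    -- that autovacuum can't reclaim while the xmin horizon is held.
    DELETE FROM jobs
    WHERE id IN (
        SELECT id FROM jobs
        ORDER BY id
        LIMIT 100
        FOR UPDATE SKIP LOCKED
    )
    RETURNING id, payload;

    -- Pattern 3: append-only table + per-consumer pointer. Nothing in
    -- the event table is ever updated or deleted, so a held xmin horizon
    -- costs nothing here.
    SELECT id, payload
    FROM events
    WHERE id > $1   -- $1 = ID of the last event this consumer processed
    ORDER BY id
    LIMIT 100;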

I posted a few more benchmark charts on my LinkedIn and Twitter, and plan to post an article explaining all this with examples. Among them was a demo of a 30-min-held-xmin bench at 2000 ev/s: PgQue sustained the full producer rate at ~14% CPU, while SKIP LOCKED queues were pinned at 55-87% CPU with throughput dropping 20-80%. What's even worse, after the xmin horizon got unblocked, not all of them recovered and caught up within the next 30 min.

[−] pierrekin 26d ago
I think there are two kinds of partition-based approach, which may cause some confusion if they're lumped together in this kind of comparison.

Insert-and-delete with old-partition drop, vs. insert-only with old-partition drop.

The semantics of the two approaches differ by default, but you can achieve the same semantics from either with some higher-order changes (partitioning the event space, tracking a cursor per consumer, etc.).

How does PgQue compare to the insert-only partition-based approach?

[−] samokhvalov 26d ago
1. Partitions are never dropped; they get TRUNCATEd (gracefully) during rotation.

2. INSERT-only. Each consumer remembers its position, the ID of the last event consumed; this pointer advances independently for each consumer. It's much closer to Kafka than to task-queue systems like ActiveMQ or RabbitMQ.

When you run a long transaction that holds a real XID, or a read-only one in REPEATABLE READ (e.g., a long pg_dump), or a logical slot is unused or lagging, performance suffers badly if dead tuples from DELETEs/UPDATEs accumulate and aren't promptly vacuumed.

PgQue event tables are append-only, and consumers know how to find the next batch of events to consume, so a blocked xmin horizon doesn't affect them, by design.
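
Roughly, the shape is this (illustrative DDL, not PgQue's actual schema):

    -- Rotated event table; IDs come from a shared sequence so they stay
    -- monotonically increasing across rotations.
    CREATE SEQUENCE event_id_seq;
    CREATE TABLE events_a (
        id      bigint NOT NULL DEFAULT nextval('event_id_seq'),
        payload jsonb
    );

    -- Each consumer tracks its own position independently.
    CREATE TABLE consumer_cursors (
        consumer_name text PRIMARY KEY,
        last_event_id bigint NOT NULL DEFAULT 0
    );

    -- Consume a batch: read past the cursor, process, then advance it.
    -- No UPDATE or DELETE ever touches the event table itself.
    SELECT id, payload FROM events_a
    WHERE id > (SELECT last_event_id FROM consumer_cursors
                WHERE consumer_name = 'billing')
    ORDER BY id LIMIT 100;

    UPDATE consumer_cursors
    SET last_event_id = 12345   -- max ID of the batch just processed
    WHERE consumer_name = 'billing';

    -- Rotation: once every consumer's cursor has moved past the end of
    -- events_a, it is emptied instantly, leaving no dead tuples behind.
    TRUNCATE events_a;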

[−] adhocmobility 26d ago
Why insist on calling this a queue when it doesn't really have queue semantics? Queues do the job of load balancing between different workers: when workers acknowledge tasks, the tasks get deleted, and there are visibility timeouts.

This is a log.

It's not really solving the problems you claim it solves. It's not, for instance, a replacement for SKIP LOCKED based queues.

[−] saberd 26d ago
I don't understand the latency graph. It says it has 0.25ms consumer latency.

Then in the latency trade-off section it says end-to-end latency is between 1 and 2 seconds.

Is this under heavy load or always? How does this compare to pgmq's end-to-end latency?

[−] mind-blight 26d ago
The vacuum pressure is real. Using a system with the SKIP LOCKED technique plus polling caused massive DB perf issues as the queue depth grew. The query to see the current jobs in the queue ended up being the main performance bottleneck, which caused slower throughput, which caused a larger queue depth, and so on.

Scaling the workers sometimes exacerbates the problem because you run into connection limits or polling hammering the DB.

I love the idea of pg as a queue, but I'm more skeptical of it after dealing with it in production.

[−] odie5533 26d ago
Postgres durability without having to run Kafka or RabbitMQ clusters seems pretty enticing. I may reach for it next time I need an outbox pattern or a small fan-out.
[−] ozgrakkurt 26d ago
What do you think about trusting something LLM-coded with your production data?
[−] cout 26d ago
I think it's great that projects like this exist, where people are building middleware in different ways than others. Still, as someone who routinely uses shared-memory queues, the idea of considering a queue built inside a database to be "zero bloat" leaves me scratching my head a bit. I can see why someone would want that, but one person's feature is another person's bloat.
[−] andrewstuart 26d ago
Postgres is not the only database that does queues.

Any database that supports SKIP LOCKED is fine, including MySQL, MSSQL, Oracle, etc.

Even SQLite makes a fine queue, not via SKIP LOCKED but because writes are atomic.
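
For the SQLite case, a minimal sketch of an atomic pop (assumes SQLite 3.35+ for DELETE ... RETURNING; names are illustrative):

    -- BEGIN IMMEDIATE takes the write lock up front, so only one worker
    -- dequeues at a time; the pop itself is a single atomic statement.
    BEGIN IMMEDIATE;
    DELETE FROM queue
    WHERE id = (SELECT id FROM queue ORDER BY id LIMIT 1)
    RETURNING id, payload;
    COMMIT;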

[−] wewewedxfgdf 26d ago
How many messages per second does this do, I wonder?
[−] bfivyvysj 26d ago
Cool
[−] killingtime74 26d ago
I got Claude to analyze the code, and it's not really comparable to SKIP LOCKED queues; it's more like Kafka. There are no job-queue semantics: no acks, no workers taking from the same job pool.

It's Kafka-like: one event stream and multiple independent worker cursors.

It's more SNS than SQS, and more Kafka than RabbitMQ/NATS.