The way CTRL-C in Postgres CLI cancels queries is incredibly hack-y (neon.com)

by andrenotgiant 47 comments 139 points

47 comments

[−] rlpb 54d ago
TCP has an "urgent data" feature that might have been used for this kind of thing; it is used for Ctrl-C in telnet, etc. It can bypass any pending send buffer and be received by the server ahead of any unread data.
[−] mike_hearn 54d ago
Fun fact: Oracle implements cancellation this way.

The downside is that sometimes connections are proxied in ways that lose these unusual packets. Looking at you, Docker...

[−] ralferoo 54d ago
Just googling it now and TCP urgent data seems to be a mess.

Reading the original RFC 793 it's clear that the intention was never for this to be OOB data, but to inform the receiver that they should consume as much data as possible and minimally process it / buffer it locally until they have read up to the urgent data.

However, the way it was historically implemented as OOB data seems to be significantly more useful - you could send flow control messaging to be processed immediately even if you knew the receiving side had a lot of data to consume before it'd see an inline message.

It seems nowadays the advice is just to not use urgent data at all.
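
For readers who have not seen the mechanism, here is a minimal sketch of sending one urgent byte over a loopback TCP connection with MSG_OOB. It assumes Linux-style handling where the urgent byte is kept out of band (SO_OOBINLINE not set); other platforms and any proxies in between may behave differently, which is the mess described above. The function name and the payload are made up for illustration.

```python
import select
import socket


def demo_urgent_byte():
    """Send ordinary data, then one urgent byte, over a loopback TCP connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        # Ordinary in-band bytes that the server has not read yet.
        client.sendall(b"SELECT pg_sleep(3600);")
        # One byte flagged as urgent; TCP carries at most one such byte at a time.
        client.send(b"!", socket.MSG_OOB)

        # select() reports pending urgent data as an "exceptional condition".
        _, _, exceptional = select.select([], [], [server], 5.0)
        if not exceptional:
            raise TimeoutError("no urgent data reported")

        # The urgent byte can be read before the unread in-band data.
        urgent = server.recv(1, socket.MSG_OOB)
        inband = server.recv(1024)
        return urgent, inband
    finally:
        server.close()
        client.close()
        listener.close()
```

Calling demo_urgent_byte() returns the urgent byte and the in-band bytes as a pair, with the urgent byte having been read first even though it was sent last.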

[−] ZiiS 54d ago
A good write-up explaining how assumptions of network and security design have changed so much over the years. Also, you have to give credit nowadays for not overly sensationalizing: 'heebie-jeebies level 6'. I certainly continue reusing a connection I assumed was TLS after a cancel, so I was vulnerable to a DoS; but equally, if the next statement was canceled, I would switch to a new connection: no harm, no foul.
[−] jtwaleson 54d ago
From the title I was hoping for this being hacky on the server application side, like how it aborts and clears the memory for a running query.

Still an interesting read. Just wondering, why can't the TCP connection of the query be used to send a cancellation request? Why does it have to be out of band?

[−] michalc 54d ago
I think I can understand why this wasn’t addressed for so long: in the vast majority of cases if your db is exposed on a network level to untrusted sources, then you probably have far bigger problems?
[−] kelnos 53d ago

> There are architectural reasons why psql doesn’t yet use libpq’s encrypted cancellation functions (it “would need a much larger refactor to be able to call them due to the new functions not being signal-safe”)

This surprised me. I was like, "surely socket()/connect()/send()/recv() aren't async signal safe!" But after a quick trip to man signal-safety, it turns out they are. I guess that shouldn't be surprising: likely all of those functions are little more than wrappers around the corresponding syscalls, so there isn't any libc state to possibly corrupt or deadlock you if you use them in a signal handler. And I assume the kernel needs to keep itself in a consistent, non-deadlockable state before it calls a signal handler anyway.

(And I'm not at all surprised that whatever TLS library they're using either calls things that aren't async signal safe or isn't async signal safe itself.)

Either way, wow! In 2026 it feels absolutely bonkers that a software dev team would continue to put out something like this. Honestly, once psql got TLS support, when you make a TLS connection it should have put up a big warning and asked you, "This program cannot cancel queries over a secure channel; do you still want to enable query cancellation?" Or hell, just disable query cancellation in those cases and not even give an option.

I guess this is "just" a DoS, though, and only in cases where someone authorized is poking around using psql while connected to a server exposed to the public internet. Hopefully that situation isn't common. And even if it is, there's no opportunity for data exfiltration or RCE, so... the author's "heebie-jeebies level 6" feels appropriate.

(And there's an easy mitigation if you know the issue: once you cancel a query with ctrl+c, quit the psql session and start a new one. That will give you a new process with a new "cancellation key", and the old one from the old process won't work for an attacker anymore.)
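
For context on what an eavesdropper would see, here is a rough sketch of the cancel request as laid out in version 3.0 of the PostgreSQL wire protocol: four big-endian 32-bit integers (message length 16, the request code 80877102, the backend process ID, and the secret key), sent on a fresh connection. The helper names are hypothetical, and newer protocol versions may carry a longer key, so treat this as an illustration rather than a reference client.

```python
import socket
import struct

# Cancel request code from the protocol docs: 1234 in the high 16 bits, 5678 in the low.
CANCEL_REQUEST_CODE = (1234 << 16) | 5678  # 80877102


def build_cancel_request(backend_pid: int, secret_key: int) -> bytes:
    """Pack a protocol 3.0 CancelRequest: length, code, backend PID, secret key."""
    return struct.pack(
        "!IIII",
        16,                        # total message length, including this field
        CANCEL_REQUEST_CODE,
        backend_pid & 0xFFFFFFFF,  # values the server sent in BackendKeyData
        secret_key & 0xFFFFFFFF,
    )


def send_cancel(host: str, port: int, backend_pid: int, secret_key: int) -> None:
    """Open a new, unencrypted connection and send only the cancel request."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(build_cancel_request(backend_pid, secret_key))
        # The server sends no reply; it simply closes the connection.
```

Anyone who can read those 16 bytes off the wire has everything needed to replay the request for as long as that backend process lives, which is why starting a new session invalidates the old key.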

[−] kardianos 54d ago
In general I love postgres. There are two problems with postgresql in my book: the protocol (proto3) and no great way to directly query using a different language.

The protocol has no direct in-protocol cancellation, like TDS has. TDS does this by making a framed protocol, at the application protocol level it can cancel queries. It has two variants (text and binary) and can cause fragmentation, and at the query and protocol level only supports positional parameters, no named parameters.

Once a query is on the server, it doesn't support directly acting on a language mode. I don't want to go into SQL mode and create a PL/SQL proc, I just want direct PL/SQL. Can't (really) do that well. Directly returning multiple result sets (e.g. for a matrix: separate rows, columns, and fields) or related queries in a single round trip is technically possible, but hard to do. So frustrating.

[−] gpderetta 54d ago
TLS is not async signal safe. But having a dedicated thread whose responsibility is to only send cancel tokens via a TLS connection and is woken up by a posix semaphore seems a small, self contained change that doesn't require any major refactoring.
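
A rough sketch of that shape, in Python for brevity: the signal handler only posts a semaphore, and a dedicated thread does the actual sending. This illustrates the structure only. Python runs signal handlers in the main thread between bytecode instructions, so it does not exercise async-signal-safety the way C would; in C the handler would call sem_post(), which POSIX lists as async-signal-safe. CancelThread and its send_cancel callback are hypothetical names, and the callback stands in for whatever would send the cancel over TLS.

```python
import signal
import threading


class CancelThread:
    """Dedicated thread that sends a cancel each time the semaphore is posted."""

    def __init__(self, send_cancel):
        self._wake = threading.Semaphore(0)  # starts at 0, so the thread sleeps
        self._stopping = False
        self._send_cancel = send_cancel      # hypothetical: would do the TLS work
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def handle_sigint(self, signum, frame):
        # The handler does nothing except wake the thread.
        self._wake.release()

    def _run(self):
        while True:
            self._wake.acquire()             # block until the handler posts
            if self._stopping:
                return
            self._send_cancel()

    def stop(self):
        self._stopping = True
        self._wake.release()                 # wake the thread so it can exit
        self._thread.join()
```

It would be installed with signal.signal(signal.SIGINT, worker.handle_sigint) from the main thread, after which each Ctrl-C wakes the thread once.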