The elephant in the room here is that there are hundreds of millions of embedded devices that cannot be upgraded easily and will be running vulnerable binaries essentially forever. This was a problem before of course, but the ease of chaining vulnerabilities takes the issue to a new level.
The only practical defense is for these frontier models to generate _beneficial_ attacks that inoculate older binaries via remote exploits. I dubbed these 'antibotty' networks in a speculative paper last year, but never thought things would move this fast! https://anil.recoil.org/papers/2025-internet-ecology.pdf
No, the elephant in the room is that bad actors will now find it far easier to discover vulnerabilities in widely used or critical software, maintained or not. Unmaintained, remotely accessible devices should be discarded as soon as possible; you can't sit around waiting for some of the good guys to donate time to your niche but critical unmaintained piece of software. If there is profit to be made from exploiting it, it will be probed and exploited.
And you can't assume that whatever vulnerability they have will let the good guys do the extra (and legally risky) work of closing the hole.
And this is precisely why so many of these devices should not be connected to the Internet.
Things like Internet-connected central heating seem absolutely insane to me, yet people look at me like I'm crazy when I say so. Do you really want your home's heating entirely controlled by a publicly accessible device that will likely never be upgraded when security issues surface?
> The only practical defense is for these frontier models
Another practical defence for many of these devices would be to just disconnect them... I feel like an old man yelling at a cloud, but too much is connected to the Internet these days.
I'd love to see them point at a target that's not a decades-old C/C++ codebase. Of the targets, only the browsers should be considered hardened, and their biggest lever is sandboxing, which requires a long chain of exploits to bypass. We're seeing that LLMs are fast at discovering bugs, which means they can build those chains more easily. But bug density in these codebases is known to be extremely high, especially in the underlying operating systems, which are always the weak link for sandbox escapes.
I'd love to see them go for a wasm interpreter escape, or a Firecracker escape, etc. They say that these aren't just "stack-smashing" but it's not like heap spray is a novel technique lol
> It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses.
I think this sounds more impressive than it is, for example. KASLR has a terrible history for preventing an LPE, and LPE in Linux is incredibly common. Has anything changed here? I don't pay much attention but KASLR was considered basically useless for preventing LPE a few years ago.
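To put numbers on why KASLR bypasses aren't that impressive, here's a back-of-the-envelope sketch using commonly cited x86-64 parameters (a 1 GiB randomization range at 2 MiB slide alignment — an assumption for illustration, not a claim about any particular kernel config): the slide has only 9 bits of entropy, so a single leaked kernel pointer, or a few hundred brute-force guesses, defeats it.

```python
# Rough KASLR entropy estimate (assumed, typical x86-64 Linux
# parameters: kernel text randomized within a 1 GiB region at
# 2 MiB alignment -- real configs vary).
import math

region = 1 << 30       # 1 GiB randomization range
alignment = 2 << 20    # 2 MiB slide granularity
slots = region // alignment
entropy_bits = math.log2(slots)
print(slots, entropy_bits)  # 512 slots, 9.0 bits
```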
> Because these codebases are so frequently audited, almost all trivial bugs have been found and patched. What’s left is, almost by definition, the kind of bug that is challenging to find. This makes finding these bugs a good test of capabilities.
This just isn't true. Humans find new bugs in all of this software constantly.
It's all very impressive that an agent can do this stuff, to be clear, but I guess I see this as an obvious implication of "agents can explore program states very well".
edit: To be clear, I stopped about 30% of the way through. Take that as you will.
My two cents is that LLMs are way stronger in areas where the reward function is well known, such as exploitation: you break the security, you succeed.
It's much harder to judge what counts as a usable, well-architected, novel piece of software, so progress there isn't nearly as fast, while here you can just gradient-descent your way to world domination, provided you have enough GPUs.
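That's the key asymmetry: exploitation has a crisp, machine-checkable reward. A sketch of what that looks like (a hypothetical CTF-style flag check, not any lab's actual training harness):

```python
# Exploitation reward as a binary predicate: 1.0 iff the agent's run
# produced a secret it was not supposed to reach. No human judgment
# needed, so it scales cheaply for RL-style training loops.
def reward(run_output: str, flag: str = "FLAG{demo}") -> float:
    return 1.0 if flag in run_output else 0.0

print(reward("root shell: cat /secret -> FLAG{demo}"))  # 1.0
print(reward("permission denied"))                      # 0.0
```

"Is this codebase well architected?" admits no such predicate, which is the commenter's point.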
Interestingly, it sounds like OpenBSD held up very well:
> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview. Across a thousand runs through our scaffold, the total cost was under $20,000, and we found several dozen more findings.
The vulnerability in question is a DoS in the TCP implementation, which is nasty, but it's far from the multiple local privilege escalations found in the Linux kernel.
A very good outcome for AI safety would be if, when improved models get released, malicious actors use them to break society in very visible ways. Looks like we're getting close to that world.
System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258
Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121
I can't tell which of the current threads, if any, should be merged - they all seem significant. Anyone?