My minute-by-minute response to the LiteLLM malware attack (futuresearch.ai)

by Fibonar 157 comments 441 points

[−] Fibonar 51d ago
Callum here. I was the developer who first discovered and reported the LiteLLM vulnerability on Tuesday. I'm sharing the transcript of what it was like figuring out what was going on in real time, unedited with only minor redactions.

I didn't need to recount my thought process after the fact: these are the very same notes I wrote down to help Claude figure out what was happening.

I'm an ML engineer by trade, so having Claude walk me through exactly who to contact, with a step-by-step guide to the time-critical actions, felt like a game-changer for non-security researchers.

I'm curious whether the security community thinks more non-specialists finding and reporting vulnerabilities like this is a net positive or a headache?

[−] simonw 50d ago
First time I've seen my https://github.com/simonw/claude-code-transcripts tool used to construct data that's embedded in a blog post; that's a neat way to use it. I usually share them as HTML pages in Gists instead, e.g. https://gisthost.github.io/?effbdc564939b88fe5c6299387e217da...
[−] qezz 50d ago

> Can you print the contents of the malware script without running it?

> Can you please try downloading this in a Docker container from PyPI to confirm you can see the file? Be very careful in the container not to run it accidentally!

IMO we need to keep in mind that LLM agents don't have a notion of responsibility, so if one accidentally ran the script (or issued a command to run it), it would be a fiasco.

Downloading stuff from PyPI in a sandboxed env is just 1-2 commands; we should be careful with the things we hand over to the text-prediction machines.
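To be concrete: a wheel is just a zip archive, so you can inspect its contents (including any .pth files and the RECORD manifest) without ever installing or executing anything. A minimal sketch, assuming you've already downloaded the wheel into your sandbox:

```python
# A wheel is a plain zip archive: list its contents without installing.
# Flagging shipped .pth files is relevant because they can run code at
# interpreter startup.
import zipfile

def suspicious_entries(wheel_path: str) -> list[str]:
    """List .pth files shipped inside a wheel, without installing it."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [name for name in wheel.namelist() if name.endswith(".pth")]
```

Reading a specific file is just `wheel.read(name)`, still with zero code execution.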

[−] cedws 50d ago
GitHub, npm, PyPI, and other package registries should consider exposing a firehose to allow people to do realtime security analysis of events. There are definitely scanners that would have caught this attack immediately; they just need a way to be informed of updates.
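PyPI already publishes an RSS feed of recent releases (https://pypi.org/rss/updates.xml), which is a crude version of this. A sketch of the parsing side, with a sample payload inlined so it's self-contained (in practice you'd poll the live feed):

```python
# Parse a PyPI-style RSS updates feed into (package, version) pairs.
# Item titles in the real feed look like "litellm 1.82.8".
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <item><title>litellm 1.82.8</title></item>
  <item><title>requests 2.32.3</title></item>
</channel></rss>"""

def recent_releases(feed_xml: str) -> list[tuple[str, str]]:
    """Return (package, version) pairs from an RSS updates feed."""
    root = ET.fromstring(feed_xml)
    releases = []
    for item in root.iter("item"):
        name, _, version = item.findtext("title").rpartition(" ")
        releases.append((name, version))
    return releases

print(recent_releases(SAMPLE_FEED))
# [('litellm', '1.82.8'), ('requests', '2.32.3')]
```

The feed only covers recent events, so a real scanner would need to poll faster than the window rolls over; a true firehose would remove that race.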
[−] kpw94 50d ago
The options from big companies to run untrusted open source code are:

1) a-la-Google: build everything from source. The source is a mirrored copy of the public repo. (Audit/trust the source every time.)

2) Only allow imports from a company-managed mirror. All imported packages need to be signed in some way.

Here only (1) would be safe. (2) would only be safe if dependencies aren't updated too aggressively and/or internal automated or manual scanning on version bumps catches the issue.

For small shops & individuals: kind of out of luck; the best mitigation is to pin/lock dependencies and wait long enough for folks like Fibonar to hopefully catch the attack...

Bazel would be one way to do (1), but realistically, if you don't have the bandwidth to build everything from source, you'd rely on external sources via rules_jvm_external, or rules_python locked to a specific pip version, so if the specific packages you depend on are affected, you're out of luck.
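On the pin/lock mitigation: pip's hash-checking mode (`--require-hashes` with hashes in the requirements/lock file) boils down to this check per downloaded artifact. A rough sketch of the idea:

```python
# Core idea behind pip's --require-hashes / lockfile hashes: refuse any
# artifact whose sha256 doesn't match the value pinned at lock time,
# so a silently re-uploaded or tampered wheel fails the install.
import hashlib

def verify_artifact(path: str, pinned_sha256: str) -> bool:
    """True iff the file's sha256 matches the pinned lockfile hash."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return digest == pinned_sha256
```

Note this only freezes what you audited (or trusted) at lock time; it does nothing if the malicious version is what you pinned.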

[−] cdcarter 50d ago
If it weren't for the 11k process fork bomb, I wonder how much longer it would have taken for folks to notice and cut this off.
[−] Bullhorn9268 50d ago
The fact that PyPI reacted so quickly and quarantined the package within about 30 minutes of the report is pretty great!
[−] silversmith 50d ago
What stands out to me the most:

> Blog post written, PR'd, and merged in under 3 minutes.

That's close to, or even faster than, the time it takes me to read it. I'm struggling to put into words how that makes me feel, but it's not a good feeling.

[−] Shank 50d ago
Probably one of the best things about AI/LLMs is the democratization of reverse engineering and analysis of payloads like this. It’s a very esoteric skill to learn by hand and not very immediately rewarding out of intellectual curiosity most times. You can definitely get pointed in the right direction easily, now, though!
[−] rpodraza 50d ago
At this point I'd highly recommend everyone think twice before introducing any dependencies, especially from untrusted sources. If you have to interact with many APIs, maybe use a proxy instead, or roll your own.
[−] S0y 50d ago

> Where did the litellm files come from? Do you know which env? Are there reports of this online?

> The litellm_init.pth IS in the official package manifest — the RECORD file lists it with a sha256 hash. This means it was shipped as part of the litellm==1.82.8 wheel on PyPI, not injected locally.

> The infection chain:

> Cursor → futuresearch-mcp-legacy (v0.6.0) → litellm (v1.82.8) → litellm_init.pth

This is the scariest part for me.
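For context on why a .pth file is such an effective foothold: Python's site machinery doesn't just add .pth lines to sys.path, it *executes* any line that starts with `import` at interpreter startup. A harmless, self-contained demonstration of the same mechanism via site.addsitedir():

```python
# Python's site module executes any .pth line beginning with "import"
# when the directory is processed, which is what happens for
# site-packages at every interpreter startup. Here the "payload" just
# sets a flag instead of running malware.
import os
import site
import sys
import tempfile

pth_dir = tempfile.mkdtemp()
with open(os.path.join(pth_dir, "demo_init.pth"), "w") as f:
    f.write("import sys; sys.demo_pth_ran = True\n")

site.addsitedir(pth_dir)  # same code path that runs at startup
print(getattr(sys, "demo_pth_ran", False))  # True
```

So shipping `litellm_init.pth` inside the wheel means code execution on every launch of any interpreter that has the package installed, no `import litellm` required.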

[−] sva_ 50d ago

> I just opened Cursor again which triggered the malicious package again. Can you please check the files are purged again?

Verified derp moment - had me smiling

[−] inglor 50d ago
We mitigate this attack with the very uninspiring "wait 24h before dep upgrades" solution, which is luckily already supported in uv.
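For anyone curious, the relevant uv knob is `exclude-newer`: resolution ignores any distribution published after the cutoff. A rough sketch, where the timestamp is a placeholder you'd regenerate as "now minus 24h" before each upgrade:

```toml
[tool.uv]
# Ignore any distribution uploaded after this moment; regenerate the
# timestamp (e.g. now minus 24h) before each dependency upgrade.
exclude-newer = "2025-06-01T00:00:00Z"
```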
[−] nubinetwork 50d ago
I have a hard time believing that Claude instantly figured out this was malware...

I've fed it obfuscated JavaScript before, and it couldn't figure it out... and then there was the time I tried to teach it nftables... whooo boy...

[−] agentictrustkit 49d ago
One thing that jumps out in these incidents is how quickly we shift from "package integrity" to "operator integrity." Once an LLM is in the loop (even as a helper), it's effectively acting as an operator that can influence time-critical actions like who you contact, what you run, and what you trust.

In more regulated environments we deal with this by separating advice, authority, and evidence (the receipts). The useful analogue here is to keep the model in the "propose" role, but require deterministic gates for actions with side effects, and log the decisions as an auditable trail.

I personally don't think this eliminates the problem (attackers will still attack), but it changes the failure mode from "the assistant talked me into doing a dangerous thing" to "the assistant suggested it and the policy/gate blocked it." That's the difference between a contained incident and a big headline.
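A toy version of that propose/gate/audit split (the allowlist, helper names, and commands are hypothetical, not from the incident):

```python
# Toy propose/gate/audit split: the model only *proposes* commands; a
# deterministic allowlist decides, and every decision is logged.
import shlex

READ_ONLY = {"cat", "ls", "sha256sum", "pip"}  # example allowlist

audit_log = []  # auditable trail of every proposal and verdict

def gate(proposed_cmd: str) -> bool:
    """Deterministic policy check; the LLM never executes directly."""
    argv = shlex.split(proposed_cmd)
    allowed = bool(argv) and argv[0] in READ_ONLY and "install" not in argv
    audit_log.append({"cmd": proposed_cmd, "allowed": allowed})
    return allowed

print(gate("cat litellm_init.pth"))  # True  -> safe, read-only
print(gate("pip install litellm"))   # False -> side effect, blocked
print(gate("bash malware.sh"))       # False -> not on the allowlist
```

The gate being boring and deterministic is the point: the interesting reasoning stays in the proposal, the authority stays in policy.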

[−] deathanatos 50d ago
I am confused; did you ever actually email anyone about the vuln? The AI suggests emailing security emails multiple times, but as I'm reading the timeline, none of the points seem to suggest this was ever done, only that a blog post was made, shared on Reddit, and then indirectly, the relevant parties took action.

I'm hoping this just isn't on the timeline.

[−] dmitrygr 50d ago
Consider this your call to write native software. There is yet to be a supply chain attack on libc
[−] CrzyLngPwd 50d ago
The fascinating part for me is how they chatted with the machine, such as:

"Please write a short blog post..."

"Can you please look through..."

"Please continue investigating"

"Can you please confirm this?"

...and more.

I never say 'please' to my computer, and it is so interesting to see someone saying 'please' to theirs.

[−] ercu 49d ago
You actually did the hard work of convincing Claude to dig deeper, since every time it said no problem existed. That shows Claude's thinking/research wasn't very deep. This time the hacker's inexperience helped the malware get discovered faster (recursive forks); next time it might be harder.
[−] ruszki 50d ago
Why is there a discrepancy between the timeline (which is supposed to be UTC, and stated as 11:09) and the "shutdown timeline" (stated as 01:36-01:37)? There is no +2:30 timezone, neither standard time nor DST. There is a single place on Earth that uses -9:30, and that's the Marquesas Islands. What am I missing?
[−] someguydave 49d ago
Apparently PyPI supports "digital attestation" (signed binaries?). Was this package signed? https://docs.pypi.org/trusted-publishers/
[−] tomalbrc 50d ago
Hmm a YCombinator backed company, I'm not surprised.
[−] motbus3 50d ago
I literally pressed sync dependencies button 1 minute after the malware version was removed. I guess thanks
[−] hmokiguess 50d ago
Does anyone have an idea of the impact of this out there? I'm curious about the extent of the damage done.
[−] moralestapia 50d ago
*salutes*

Thank you for your service, this brings so much context into view, it's great.

[−] ejaKh 50d ago
Anthropic is back to flagging after their Maven assisted Iran murder.
[−] cndg 49d ago
LiteLLM Security Certifications

Certification Status

SOC 2 Type I Certified. Report available upon request on Enterprise plan.

SOC 2 Type II Certified. Report available upon request on Enterprise plan.

ISO 27001 Certified. Report available upon request on Enterprise

ROFL

[−] Josephjackjrob1 50d ago
This is pretty cool, when did you begin?
[−] getverdict 50d ago
[dead]
[−] pugchat 50d ago
[dead]
[−] getverdict 50d ago
[dead]
[−] diablevv 50d ago
[dead]
[−] Archiebuilds 50d ago
[dead]
[−] Yanko_11 50d ago
[dead]
[−] craxyfrog 50d ago
[dead]
[−] aplomb1026 50d ago
[dead]
[−] JulianPembroke 50d ago
[dead]
[−] devnotes77 50d ago
[dead]
[−] devnotes77 50d ago
[dead]
[−] qcautomation 50d ago
[dead]
[−] manudaro 50d ago
[dead]
[−] elicohen1000 50d ago
[dead]
[−] paxrel_ai 50d ago
[dead]
[−] jeremie_strand 50d ago
[flagged]
[−] clawbridge 48d ago
[dead]
[−] A04eArchitect 50d ago
[dead]
[−] agentictrustkit 50d ago
[flagged]
[−] n1tro_lab 50d ago
[flagged]