Python Supply-Chain Compromise

This is news:

A malicious supply chain compromise has been identified in the Python Package Index package litellm version 1.82.8. The published wheel contains a malicious .pth file (litellm_init.pth, 34,628 bytes) which is automatically executed by the Python interpreter on every startup, without requiring any explicit import of the litellm module.
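For readers unfamiliar with the mechanism: at startup the interpreter's `site` module reads every `.pth` file in the site directories, and any line beginning with `import` is passed straight to `exec()`. This sketch triggers the same code path deliberately, using a harmless, hypothetical `demo.pth` in a temporary directory (the file name and payload are mine, not the actual malicious `litellm_init.pth`):

```python
# The .pth execution mechanism abused by the malicious wheel: the site
# module exec()s any .pth line that starts with "import", with no explicit
# import of the package required. Here we invoke the same routine manually
# on a temporary directory containing a harmless demo payload.
import os
import site
import sys
import tempfile

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "demo.pth"), "w") as f:
        # A single "import" line can carry arbitrary code after a semicolon.
        f.write("import sys; sys.pth_payload_ran = True\n")
    site.addsitedir(d)  # the routine that processes site-packages at startup

print(getattr(sys, "pth_payload_ran", False))  # True
```

Note that deleting such a file after the fact does not undo anything: the payload has already run in every interpreter that started while it was installed.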

There are a lot of really boring things we need to do to help secure all of these critical libraries: SBOMs, SLSA, Sigstore. But we have to do them.

Posted on April 8, 2026 at 6:25 AM

Comments

Python User April 8, 2026 9:29 AM

If you wanna screw somebody – just use python – I do it all the time. And mutual satisfaction is guaranteed every single time! After all – imagine penetration testing or the penetration without testing, sans python? Much better with python. Thing is, it’s free to boot.

Anselm April 8, 2026 12:19 PM

The problem here is with .pth files, which are used in Python to tell the Python interpreter about the location of various resources. These can, in principle, contain arbitrary Python code and are sourced by the Python interpreter whenever it starts up. Apparently a more declarative approach that functions without arbitrary Python is in the works, in order to counter such attacks in the future.
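That `exec()` behaviour also suggests a simple audit: scan a site directory for `.pth` files containing lines the interpreter would execute at startup. A rough sketch (the helper name is mine; this is not a standard tool), which should flag files like the malicious one described in the post:

```python
# Rough audit sketch: find .pth lines that the site module would exec() at
# interpreter startup -- i.e. any line starting with "import".
import os

def executable_pth_lines(directory):
    """Map each .pth file in `directory` to the lines exec()'d at startup."""
    hits = {}
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".pth"):
            continue
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            flagged = [
                line.rstrip("\n")
                for line in f
                if line.startswith(("import ", "import\t"))
            ]
        if flagged:
            hits[name] = flagged
    return hits
```

Running it over each entry of `site.getsitepackages()` gives a quick inventory of code your interpreter runs before your program's first line.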

Ray Dillinger April 8, 2026 1:42 PM

I have been avoiding most “modern” programming languages because each program tends to come with a transitive closure of dependencies imported from so many sources all over the Internet that it seems flatly impossible that these sources are all trustworthy and flatly impossible that there is any unified team trying to make sure that they are all trustworthy.

And that dependency tree can change after the program is written. Usually these dependencies, if missing, get dynamically loaded from some random Internet site, and run without so much as a notification to the user that this is happening. Some library maintainer decides it’s okay to change their dependencies, and suddenly the project you wrote three years ago and haven’t changed is now downloading and running code that’s entirely outside the set of things you have accepted the risk to run, from a source you have no idea whether to trust or not, without your permission.
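One practical counter to exactly this failure mode is hash pinning: record the exact version and artifact digest of every dependency, transitive ones included, so a silently re-published file fails to install (pip supports this via `--require-hashes`, and pip-tools can generate such files with `pip-compile --generate-hashes`). The underlying check is simple enough to sketch; the bytes and digest below are illustrative stand-ins, not a real wheel:

```python
# Sketch of hash pinning: a re-published or tampered artifact fails the
# check even if its version number is unchanged. Bytes here are stand-ins.
import hashlib

def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    """Accept the artifact only if its SHA-256 matches the pinned digest."""
    return hashlib.sha256(data).hexdigest() == pinned_sha256

original = b"wheel contents as first published"
pinned = hashlib.sha256(original).hexdigest()  # recorded at pin time

print(verify_artifact(original, pinned))                        # True
print(verify_artifact(b"silently re-published wheel", pinned))  # False
```

This does not make the first download trustworthy, but it does make later substitution detectable.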

And these libraries are for the most part undocumented – or at least undocumented in any way that would be discoverable and verifiable by any standard procedure or by someone who’s not already downloading and using them. I mean, oh, yeah, there’s a website somewhere. But all these websites are organized in different ways and may omit crucial information or hide it six levels down in a click-random-things hierarchy most of which doesn’t have much to do with the crucial information you’re after.

The organization or hosting of these websites can change on somebody’s whim, the URLs are random hosts you never heard of, any of them could be run by anybody including attackers, and neither the libraries nor their documentation is even signed (other than some which is “trust me bro” self-signed by the authors). None of it is resident on your system where you’ll know if it has changed. None of it is resident on my system where I could use it in places or circumstances where I do not want the machine connected to the internet. None of it is even resident on my system where I could rely on being notified if it had been changed.

Just, no. I try building a program and watch, aghast, as the machine goes and downloads code from a thousand different completely unknown and unvetted hosts, fails to install locally resident documentation for any of it, fails to provide any uniform interface to whatever documentation exists, and spits out something I am expected to run.

There is not even an accepted stable specification for all these libraries, and nobody testing new versions of each to see that they meet any such specification. Nor is there a central repository with a gatekeeper and a unified bugtracker for all of this stuff. Finally, none of it is packaged for, signed by, and served by any consistent, identifiable signing authority, like a distro maintainer, that tracks and announces security issues and bugs. Therefore no distribution maintainer is testing specific new versions of it in combination with the specific versions of other things accepted into the same distro as dependencies.

Nobody other than the packager/author is tracking bugs or security issues or signing new releases, so it basically all comes down to “trust me bro”, with no reference to anybody whose work depends (and the value of whose signature depends) on all of it being tested and clean.

No fecking way am I ever going to take the risk of having all that on my own system to develop software. And if I did build software that way I’d wind up with a program I could not in good conscience ask any client, user, or customer to ever trust.

Seth Wells April 8, 2026 5:48 PM

The dependency problem is also an authority problem. Every layer in the chain that can interpret, authorize, or execute becomes a trust surface. I’ve been working on a constraint architecture where the intermediary layer is stateless, holds no secrets, makes no decisions, and has zero dependencies — 1.7KB total. It just binds and forwards. Authority stays at the provider boundary mechanically, not by policy. Live demo under sustained adversarial load: challenge.xer0trust.com

Clive Robinson April 9, 2026 2:34 AM

@ Bruce,

You say,

“There are a lot of really boring things we need to do to help secure all of these critical libraries … But we have to do them.”

@ Anselm and @ Ray Dillinger give a partial overview of the problem at the source-code level.

But the problem goes both lower than source code in a high-level language, and also higher, through the design of the language, its specifications, and theoretical design ideals.

That is, many high-level languages require code on the other side of the ISA that can be not just machine-architecture dependent but in many cases dependent on assembler and lower functionality that is not in the language design but represents strong security effectors in hardware. People finally woke up to this with the “Xmas gift that kept on giving”, and as recent news shows it still does: amongst other things, RowHammer has “come a’knocking again”…

But there is also the issue of hardware security enhancements that need to be correctly integrated, and that are going to come in time: from the “Memory Tagging Extension” (MTE) from UK CPU designers ARM, through to the likes of Cambridge University’s “Capability Hardware Enhanced RISC Instructions” (CHERI). The high-level languages need them “designed in” from “day zero”, not “bolted on” at some later date…

https://securityboulevard.com/2024/03/a-faster-path-to-memory-safety-cheri-memory-tagging-and-control-flow-integrity/

The fact that we’ve still got very real security issues arising from something as conceptually simple as “data serialisation” at the API level, let alone at hardware levels, indicates that we have a very hard uphill climb ahead of us.
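The serialisation point is easy to demonstrate in Python itself: `pickle` will call any callable smuggled into the byte stream, so loading untrusted pickles is code execution, not data loading. A minimal sketch, with the common mitigation of a restricted `Unpickler` (the class names here are mine):

```python
# pickle deserialisation as code execution: __reduce__ lets a pickled object
# name a callable that pickle.loads() will invoke on load.
import io
import pickle

class Evil:
    def __reduce__(self):
        return (print, ("payload executed",))  # any callable would do

pickle.loads(pickle.dumps(Evil()))  # prints "payload executed" on load

class RejectGlobals(pickle.Unpickler):
    # Mitigation sketch: refuse to resolve any global, so only plain data
    # (dicts, lists, strings, numbers) can be reconstructed.
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked: {module}.{name}")

print(RejectGlobals(io.BytesIO(pickle.dumps({"a": 1}))).load())  # {'a': 1}
try:
    RejectGlobals(io.BytesIO(pickle.dumps(Evil()))).load()
except pickle.UnpicklingError as e:
    print(e)  # blocked: builtins.print
```

For untrusted input, a format whose data model cannot name code at all, such as JSON, avoids the problem entirely.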

In the past I’ve produced my own version of a reduced C programming language compiler, with my own extending library that enables entire sections of code to be loaded out “sight unseen”, with faster “long integer math”; and yes, I got some things wrong when crossing from one CPU architecture to another.

But I also got hit by a very low flying “Black Swan” I could not have seen.

It made me realise that there are basic instances of attacks that fall in classes of attacks, and why some are predictable and some are not. Hence why I talk about Instances of attacks in Classes of attacks as,

1, Known, Knowns.
2, Unknown, Knowns.
3, Unknown, Unknowns.

And it shows why certain things can be demonstrated from this simple model. Such as why “Current AI LLM and ML Systems” will find new “Known Knowns” variants and a very limited number of “Unknown Knowns”, but have little or no chance with by far the majority of “Unknown Knowns” and the even greater number of “Unknown Unknowns”, and actually cannot do so “except by chance happenstance”. So why you should treat Google and Co’s announcements as “advertising fluff” more than actual reality, and more importantly keep pushing the required levels of professionalism in humans who can reason their way to “Black Swans”.

But there are other issues, just one being how to explain,

“What is good to remove, what is desirable to remain, and what is shall we say multi-purpose and should be catered for even though it does not yet exist to build with”.

In our “new designs”.

Swede April 9, 2026 3:33 AM

I am fluent in many programming languages, but for the last ten years I only use Go. Many reasons, but related to this post is go.dev/blog/supply-chain
