Dependency Confusion: Another Supply-Chain Vulnerability

Alex Birsan writes about being able to install malware into proprietary corporate software by naming public code files the same as internal code files. From a ZDNet article:

Today, developers at small or large companies use package managers to download and import libraries that are then assembled together using build tools to create a final app.

This app can be offered to the company’s customers or can be used internally at the company as an employee tool.

But some of these apps can also contain proprietary or highly-sensitive code, depending on their nature. For these apps, companies will often use private libraries that they store inside a private (internal) package repository, hosted inside the company’s own network.

When apps are built, the company’s developers will mix these private libraries with public libraries downloaded from public package portals like npm, PyPI, NuGet, or others.

[…]

Researchers showed that if an attacker learns the names of private libraries used inside a company’s app-building process, they could register these names on public package repositories and upload public libraries that contain malicious code.

The “dependency confusion” attack takes place when developers build their apps inside enterprise environments, and their package manager prioritizes the (malicious) library hosted on the public repository instead of the internal library with the same name.

The research team said they put this discovery to the test by searching for situations where big tech firms accidentally leaked the names of various internal libraries and then registered those same libraries on package repositories like npm, RubyGems, and PyPI.

Using this method, researchers said they successfully loaded their (non-malicious) code inside apps used by 35 major tech firms, including the likes of Apple, Microsoft, PayPal, Shopify, Netflix, Yelp, Uber, and others.

Clever attack, and one that has netted him $130K in bug bounties.
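The confusion described in the quoted passage can be sketched in a few lines. This is a simplified model, not the algorithm of any particular package manager: it assumes a resolver that merges candidates from every configured index and takes the highest version number. The package name `acme-billing` and the version numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # e.g. (1, 4, 0)
    index: str      # "internal" or "public"


def naive_resolve(name, candidates):
    """Pick the highest version of `name` across ALL indexes, ignoring origin."""
    matches = [c for c in candidates if c.name == name]
    return max(matches, key=lambda c: c.version)


candidates = [
    Candidate("acme-billing", (1, 4, 0), "internal"),
    # Hypothetical attacker upload: same name, deliberately huge version.
    Candidate("acme-billing", (99, 0, 0), "public"),
]

chosen = naive_resolve("acme-billing", candidates)
print(chosen.index)
```

Under that assumption the public upload wins purely on version number, which is why knowing an internal package name is all the attacker needs.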

More news articles.

Posted on February 23, 2021 at 6:18 AM • 23 Comments

Comments

Jeff February 23, 2021 9:47 AM

One of the advantages of Go’s package management is that it uses the explicit full path to the package, e.g. github.com/foo/bar.

Clive Robinson February 23, 2021 2:52 PM

@ Bruce, ALL,

Clever attack, and one that has netted him $130K in bug bounties.

Yes, but the vulnerability class that is behind it is ages old, probably by six decades or more.

It’s an obvious consequence of,

1, The “Code is data” or “Data is code” comments you hear. Where in fact both are just “information” held as a “bag of bits”[1].

2, “Bag of bits” files that are just information and can be treated as Code or Data.

3, File info / name having no real relationship to the file contents.

The most obvious to the eye examples of where people have tried to fix this are magic numbers at the beginning of text and similar file contents, and file type extensions appended onto the file names. Both of which are so easy to fake, it’s hardly worth mentioning it, which is why the problem continues.
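A minimal sketch of how little those two checks prove, assuming a naive validator that trusts only the extension and the leading bytes. The 8-byte PNG signature is real; the file name and the payload are made up for illustration.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the real 8-byte PNG signature


def looks_like_png(filename, data):
    """Naive check: trusts the extension and the leading magic bytes only."""
    return filename.lower().endswith(".png") and data.startswith(PNG_MAGIC)


# Hypothetical payload: a valid-looking header followed by arbitrary bytes.
payload = PNG_MAGIC + b"#!/bin/sh\necho not an image\n"

fooled = looks_like_png("holiday.png", payload)
print(fooled)
```

Both checks pass although nothing after the first eight bytes is image data.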

Likewise “signing” of a file will not really solve this problem, just make an attacker take a different attack route.

This problem might well be a “better mouse trap” problem and may not be solvable.

To any one who thinks they have a solution, I would suspect they are not “thinking hinky” enough, and almost certainly not like an attacker.

[1] For those not familiar with the “bag of bits” name, a “bag” is the most primitive of abstract aggregate data types. The only form it has is that it is a “set” of “things” in this case bits.

However to be of use in the case of a file, it has to have the concept of order of things as well. That is whilst many bits can have the same value, in a file you are interested in the bit value at a given place in the file. It is in effect the same as the primitive tape in a Turing Engine.

It is up to the “user” of the bag of bits to define structure and meaning. Thus in the case of a “bag of bytes” not just the size of the bytes is defined but also which end is the most significant bit or start bit depending on the function of the bytes (numbers, characters, bit fields, etc).
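As a small illustration of the user defining the meaning, the sketch below reads the same four bytes as characters, as a number with either byte order, and as a bit field. The byte values are arbitrary and chosen only because they are printable ASCII.

```python
raw = bytes([0x41, 0x42, 0x43, 0x44])  # the same four bytes throughout

as_text = raw.decode("ascii")              # read as characters
as_big = int.from_bytes(raw, "big")        # read as a big-endian integer
as_little = int.from_bytes(raw, "little")  # read as a little-endian integer
# First byte read as a bit field, most significant bit first.
as_bits = [(raw[0] >> i) & 1 for i in range(7, -1, -1)]

print(as_text, hex(as_big), hex(as_little), as_bits)
```

Nothing in the bytes themselves says which of these readings is the intended one.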

https://www.d.umn.edu/~jallert/cs1/projects/TheBagADT.pdf

lurker February 23, 2021 4:11 PM

@Clive, Likewise “signing” of a file will not really solve this problem, just make an attacker take a different attack route.

Again displaying my slowness of wit: for years I have noted systems that religiously check the hash of downloaded files, but surely this is only to protect against transmission errors. If a bad guy can upload a fake file then he must also upload a [fake-file]hash.

Does having the file and its hash on the same server make this task any easier?
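One way to look at the question, as a sketch: a digest only helps against a malicious upload if it was obtained somewhere the attacker could not also change. The file contents below are invented, and the "pinned" value stands in for a digest recorded earlier, for example in a lockfile kept under the project's own version control.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def matches(data, expected_digest):
    """True if the SHA-256 of `data` equals the expected hex digest."""
    return sha256_hex(data) == expected_digest


genuine = b"print('hello')\n"
tampered = b"print('hello'); steal_secrets()\n"

# Recorded when the genuine release was first reviewed (separate channel).
pinned = sha256_hex(genuine)

# An attacker who controls the server replaces both the file and its digest.
server_file, server_digest = tampered, sha256_hex(tampered)

same_server_ok = matches(server_file, server_digest)
pinned_ok = matches(server_file, pinned)
print(same_server_ok, pinned_ok)
```

The digest served next to the file still agrees with the tampered file, so that check only catches transmission errors; the independently recorded digest does not agree.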

xcv February 23, 2021 10:06 PM

Another perennial supply-chain issue:

Executioners sanitized accounts of deaths in federal cases

A closer look at midazolam and pentobarbital

Whether it’s a federal or state execution, namely that of obtaining legal supplies of drugs for execution of inmates.

LOS ANGELES COUNTY SHERIFF’S DEPARTMENT
Los Angeles Deputy Says Colleagues are Part of Violent Gang

The allegations against the Compton deputies follow accusations of other gangs in the department — called the Spartans, Regulators, Grim Reapers and Banditos — that also share tattoos and a history of violence, the Times said.
A violent gang of Los Angeles County sheriff’s deputies who call themselves “The Executioners” control a patrol station in Compton through force, threats, work slowdowns and acts of revenge against those who speak out, a deputy alleges in a legal claim.

Mr C February 24, 2021 12:27 AM

@ Clive, ALL
“This problem… may not be solvable.”

It appears you’re thinking of this problem so broadly I have trouble following. Taking the problem in narrow terms the solution is obvious.

This problem, defined narrowly as “build system grabs dependencies from the wrong repo” is readily solvable. Just eliminate the ambiguity. Like Jeff said, the build system should require unambiguous paths, meaning full urls for any resources that aren’t on the local machine.
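A rough sketch of that idea, under the assumption that every dependency is declared with exactly one fully qualified URL and that the fetcher never searches anywhere else. The URLs, the package name, and the `available` dictionary (which stands in for the network) are all hypothetical.

```python
class UnresolvedDependency(Exception):
    """Raised instead of guessing an alternative source."""


def fetch_exact(name, manifest, available):
    """Return the artifact at the one URL the manifest names, or fail loudly.

    manifest:  dependency name -> fully qualified URL
    available: URL -> artifact bytes (a stand-in for the network)
    """
    if name not in manifest:
        raise UnresolvedDependency(f"{name!r} has no declared source")
    url = manifest[name]
    if url not in available:
        raise UnresolvedDependency(
            f"{name!r} not found at {url}; refusing to search elsewhere"
        )
    return available[url]


manifest = {"acme-billing": "https://pkgs.internal.example/acme-billing/1.4.0"}
available = {
    "https://pkgs.internal.example/acme-billing/1.4.0": b"internal build",
    "https://public.example/acme-billing/99.0.0": b"attacker build",
}

artifact = fetch_exact("acme-billing", manifest, available)
print(artifact)
```

A same-named package published elsewhere is never considered, and if the declared URL stops resolving the fetch raises rather than substituting something.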

The related, broader problem of “today’s clean v1.0 dependency becomes tomorrow’s poisoned v1.1 dependency” also has an obvious, if painful, solution. Clone a known clean version of the dependency to the local machine and point your build system to the local copy. Review commits fastidiously before updating the local copy. No one is going to like doing this, since it adds a huge labor cost to what they like to think of as a free product, but that’s the price to be paid if you want to be sure that free code is also safe code.

Two more general things that might have prevented this: (1) Don’t treat dependency names as if they’re secret; they’re not. (2) Don’t use javascript for anything important.

JonKnowsNothing February 24, 2021 1:42 AM

@Mr C @Clive @All

re: Eliminating Ambiguous Dependencies by fully qualified paths

Just eliminate the ambiguity. Like Jeff said, the build system should require unambiguous paths, meaning full urls for any resources that aren’t on the local machine.

I may have missed something but… what do you plan to do when the URL changes?

Lots of repositories re-org regularly. Re-sort, re-layer, re-tree and move servers. URLs are not permanent. Some URLs may get blocked by LEAs, Government rules or laws, TOS/EULA fights or just plain dueling mega corps. What worked today may not be in the same spot tomorrow. They are not permanent because someone has to pay for the name and if they don’t pay or get bought up, your path isn’t going to be where you thought it was.

In the Old DOS Days, we did fully qualified path names like:

  F:/Docs/My Docs/The Docs/That Docs/New Docs/FindMEaDoc

It didn’t work any better…

Clive Robinson February 24, 2021 2:52 AM

@ Mr C,

This problem, defined narrowly as “build system grabs dependencies from the wrong repo” is readily solvable. Just eliminate the ambiguity.

That’s a “moving the problem” reaction not a “Solving the problem” solution. It’s akin to playing “whack-a-mole”.

To do what you suggest requires things be set in concrete, but the network and the rest of the infrastructure are not, for good reason.

So look at things in that light and you see the application gets told by the OS or Network that “Red is Green or Green is Red” and thus the application believes it because it has no way of telling.

Somehow it is the “bag of bits” itself that has to prove to the application it is valid.

The obvious next step would be that the file is somehow encrypted and signed… But when loading multiple files of code from different places you have multiple signatures, and we know what fun can happen there with the problems with CA’s. Then there is the problem of how do you know the signature you get is valid? Baking it in is likewise not a good idea especially when the latest thinking about PK certs is you have to change them once a year anyway…

I could go on but as I’ve said,

“To any one who thinks they have a solution, I would suspect they are not “thinking hinky” enough, and almost certainly not like an attacker.”

Clive Robinson February 24, 2021 3:05 AM

@ SpaceLifeForm, ALL,

Small fire. Just a flesh wound. Sound on.

Well at least it looks like they had nice weather… Oh and no gas line got ruptured yet…

But a “Thank God” from the local Sheriff after the event is not a valid safety precaution, by any measure.

Oh and that black smoke is almost certainly full of carcinogens that are going to end up being breathed in or ending up in the local water course, plants and live stock etc. So there may well be a death toll down the line.

Clive Robinson February 24, 2021 3:48 AM

@ yosef,

As always such claims appear to be founded on the notion,

“Americans brilliant, every one else is stupid, therefore must be stolen US technology”

Ever heard of “coevolution” where two entirely different branches of species respond to each other. The usual example is “flowers and bees” which is mutually beneficial. However thorns and stings to deter predation might be a better example.

But evolution has other tricks, bats and birds independently developed flight in response to the environment.

Even the Wired article eventually, albeit diminutively, makes the point,

“Check Point’s findings aren’t the first time that Chinese hackers have reportedly repurposed an NSA hacking tool—or at least, an NSA hacking technique.”

But nobody has yet dared say,

“Hang on it’s a two way street”

Both the Chinese and Russians are quite advanced when it comes to software. Russia, because it was behind in hardware, did not have the luxury of just throwing code at a problem and assuming the users could just have more RAM and a faster CPU; they learned how to live well within very restricted resources. The Chinese got to where they are via a different route. The point is they are both more than a match for the US and likewise the NSA.

There are two ways you can find a new attack,

1, Code analysis.
2, Random trials.

Both are actually probabilistic in nature. Which does not rule out two people throwing double six at the same time. Or for a lone player to throw six of them in a row.

The fact that China, Russia the US, or any other country has a zero day does not mean that the others can not also have found it, or even found it first.

So there is a very distinct probability that the NSA is stealing and using other peoples attacks.

I certainly would, as it would help with “deniability” of actions or “false flag” operations.

My father impressed on me four or five decades ago that,

“When there is trouble, the best place to be is somewhere else”

However if you think about it if you have skin in the game then,

“When there is trouble, the best person to be is someone else”.

It’s why you will see,

Attribution is Hard

Mentioned on this blog fairly repeatedly.

The most dangerous thing you can do when carrying out an investigation is “to make assumptions” which is by and large what a lot of these “It’s the Buttler wot dunit” claims are at the end of the day. Worse they are often made for political or patronage reasons, with the result the water is so muddy nobody can see even as far as the end of their nose, which is also convenient if you want to hide your activities.

Dave C February 24, 2021 7:28 AM

@Jeff:

One of the advantages of Go’s package management is that it uses the explicit full path to the package, e.g. github.com/foo/bar.

Further than that, as of Go 1.13 (Sept 2019) they by default use and check “an auditable checksum database” (here checksum==cryptographic checksum) for all dependencies. This is for reproducible builds as well as security.

For anyone interested, see sum.golang.org and the link there for “Secure the Public Go Module Ecosystem Proposal”.

The end result is that if I first add a dependency on, say github.com/foo/bar, then I get some exact version that has been seen before and is permanently recorded by the public database (trust on first use) and thereafter, unless I manually make a change to the required version, anyone (inside or outside my organisation) that builds my package will, by default, use the exact same version of the dependency that I used (and hopefully have fully tested against). If someone puts a modified version on github then depending on how they do it (e.g. a man-in-the-middle substitution or a github account take over or whatever) various checks will fail or at absolute minimum (if I blindly ask the tool to “update” to the “latest” version) there will be a record of the bogus hash in the public database for later auditing by anyone. In practice, not perfect, but quite nice.
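The trust-on-first-use behaviour described here can be mimicked with a toy ledger. This is only an illustration of "record on first sight, compare ever after": it is not how the Go tooling or sum.golang.org is implemented, the ledger is a plain in-memory dictionary, and the module contents are invented.

```python
import hashlib


class ChecksumMismatch(Exception):
    pass


def check_tofu(ledger, module, version, content):
    """Record the digest on first sight; afterwards require an exact match."""
    key = (module, version)
    digest = hashlib.sha256(content).hexdigest()
    recorded = ledger.setdefault(key, digest)  # stores only if key is new
    if recorded != digest:
        raise ChecksumMismatch(
            f"{module}@{version}: content changed since first use"
        )
    return digest


ledger = {}
first = check_tofu(ledger, "github.com/foo/bar", "v1.0.0", b"original source")
again = check_tofu(ledger, "github.com/foo/bar", "v1.0.0", b"original source")
print(first == again)
```

Re-fetching the same version with different bytes raises `ChecksumMismatch`, while a genuinely new version is accepted but leaves its own entry in the ledger, which is the auditing property the comment points out.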

SpaceLifeForm February 24, 2021 5:58 PM

@ lurker

Does having the file and its hash on the same server make this task any easier?

I would ask SolarWinds.

MrC February 24, 2021 7:51 PM

@ JonKnowsNothing:

I may have missed something but… what do you plan to do when the URL changes?

The build fails. Then a human being has to sort out where the reorg moved it to and update the dependency list. “Failing hard/loud” is preferable to letting the build system make a guess about what to substitute for the missing files.

Clive Robinson February 24, 2021 10:56 PM

@ Mr C, JonKnowsNothing, lurker, SpaceLifeForm, ALL,

Then a human being has to sort out where the reorg moved it to and update the dependency list.

“Bang you are dead” or more politely QED.

That is a “Grade A1” exploitable vector via “social engineering” and other techniques.

As I indicated it’s a game of “Whack-a-Mole” and I suspect not one for which a realistic security solution exists, or is likely to exist any time soon.

But look at it another way,

1, All software has bugs.
2, We do not want buggy software.
3, All software needs patching.

Thus no matter what we do, a human has to go in there to make changes to existing code.

In essence, that’s the root of how SolarWinds got used.

SpaceLifeForm February 25, 2021 1:04 AM

@ MrC, Clive, JonKnowsNothing, lurker, ALL

When doing your software build, you want to FAILFAST.

You want to know there is a problem as soon as possible.

This is why you want the complete codebase on your build machine(s).

You do not want to encounter unexpected dependency failures.

You certainly do not want to rely upon pulling code on the fly over the internet. Why have a dependency on internet connection when you should not need it?

That is not the smart option. Pull the source. Manage your own repository. Then you can build offline.

FAILFAST.
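A minimal sketch of such a fail-fast check, assuming a hypothetical layout in which each dependency lives in its own directory under a local vendor folder. The dependency names and the directory convention are illustrative, not taken from any particular build tool.

```python
from pathlib import Path
import tempfile


def missing_vendored(deps, vendor_dir):
    """Return declared dependencies that have no directory under vendor_dir."""
    root = Path(vendor_dir)
    return sorted(d for d in deps if not (root / d).is_dir())


def preflight(deps, vendor_dir):
    """Abort before any compilation starts if the local copy is incomplete."""
    missing = missing_vendored(deps, vendor_dir)
    if missing:
        raise SystemExit(f"FAILFAST: not vendored locally: {', '.join(missing)}")


# Demonstration with a throwaway directory holding one of two dependencies.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "lib-a").mkdir()
    gaps = missing_vendored(["lib-a", "lib-b"], tmp)

print(gaps)
```

Running `preflight` as the first build step means a missing dependency stops the build immediately, with no network access attempted.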

SpaceLifeForm February 25, 2021 1:14 AM

@ MrC, Clive, JonKnowsNothing, lurker, ALL

FAILFAST

Worked really well before. 11 or 17 lines? Your call.

hx tps://arstechnica.com/information-technology/2016/03/rage-quit-coder-unpublished-17-lines-of-javascript-and-broke-the-internet/

ht xps://www.theregister.com/2016/03/23/npm_left_pad_chaos/

name.withheld.for.obvious.reasons February 25, 2021 1:25 AM

Today during the confirmation hearing for the Director of the CIA, William Burns, I couldn’t help but be struck by the inane and nearly pointless conversations with Senators about supply chain issues. How many years has this topic been beaten to death on this blog–over thousands. Everything from vendor and manufacturer issues and sourcing, design and integration, and implementation, process, and source controls ad infinitum. It is as if I’d taken the wayback machine and Mr. Peabody was instructed to set the dial to 1982.

SpaceLifeForm February 25, 2021 2:06 AM

@ name.withheld.for.obvious.reasons

Got to get fresh blood in Congress. Too many dinosaurs that do not grok technology or supply chains.

My googlefu is failing me right now, but I definitely recall a congress-critter that dissed farmers, and said he would just go to the grocery store.

Clive Robinson February 25, 2021 6:15 AM

@ SpaceLifeForm, JonKnowsNothing, lurker, MrC, ALL,

Worked really well before. 11 or 17 lines? Your call.

The point that the articles miss out on is that the developer of the code had legal rights over his code that he had not in any way signed away and he exercised them.

What is also not mentioned is the person who coded a replacement has stolen the original developer’s intellectual property.

The code “might” be different but for it to work,

1, The name had to be the same.
2, The interface had to be the same.

Both of which have multiple millions of dollars in lawyers fees and court costs already invested in the arguing of them…

NPM’s almost certain “VC led Corporate Response” will be to change the rules to make the “little people” comply by making them give up the “right to remove” and similar…

At which point “free software” stops being “free”.

But the reality is what “doofus” uses such free software that way and why is it almost always “web-tech developers” we hear getting their panties in a wad?

Basically the “tight fisted” / “can not be bothered” / “incapable” / “incompetent” and similar were the ones screaming loudest about their rights… Basically advertising to the world they are “doofuses” and venal ones at that.

I always knew this was going to be a problem way back in the late 1970’s early 1980’s when developing bits for “bulk upload” for BT services. Which is why when I put code up on bulletin boards and on sneaker-nets I had a header in each file that clearly stated “use at your own risk” and putting it unencumbered in the public domain, not even requiring the header with my name etc to be retained.

I’ve nothing against Free or Open source code, and I do use a lot of it. But I know what writable CD/DVDs are, and put everything on them in source form and “put them in the project file”.

Whilst it does not solve quite a few of the problems such as the 16bit to 32bit transition issue, or the more recent 32bit to 64bit issue with underlying hardware, it does allow you to go through and rebuild things if you have a need to in most cases. However a device driver for hardware that’s long since superseded is at best a guide. After all who still uses parallel port dot matrix desktop A4 width printers from Epson any longer and then on MS-Dos not even Win 3.1?

Only two that I know of, one being someone with custom ICS kit, the other being the person who designed it and still supports it out of friendship more than a quarter century later…

me February 25, 2021 6:49 AM

Funny also that github is owned by MS.

Can’t they just simply decide to deliver changed files to specific downloaders?

Master-of Supply-chain.

A choke point.

Would such an action even violate the TOS?
What’s to stop them?
The lack of due diligence on the receiving ends is appalling.

JonKnowsNothing February 25, 2021 10:38 AM

@me @All

re:lack of due diligence on the receiving ends is appalling

Back up the microscope and look at more of the elephant. It’s not appalling it’s deliberate. In countries that have certain economic systems, the consideration and value of such things is based on money.

Think of this as a balance. The more the money side tips towards the desired side then the policy will lean that way. If the desired side is “short term profits” or similar longer term national goals, the balance will be loaded to tip in that direction. Anything that tips the balance the other way will be removed or deleted.

If you do a thought experiment and put any project on the scale you can see which way the scale tips.

The confusion happens because a good number of people WANT to do a GOOD JOB. They WANT the software to WORK. They want the software to BE SECURE. Lots of examples in the blog here.

The scale does not care about “good job, good software, or security”. It cares only about money. That is the only thing on measure.

It’s the same concept as Herd Immunity Policy (HIP) that perpetuates the death counts from COVID-19 by trading the dead for economic advantages. It’s anathema to Health Care Workers and to 50% of the USA. The other 50% of the USA are perfectly fine letting others die for the economy and make decisions to continue the cycle as long as possible.

The same concepts are at play for software and hardware. The value of securing or providing less risk is not on the money side of the scale. As long as this is the only meaningful value, only a very few can slow the slide. It’s easy enough to sweep aside any opposition, you just buy up the competition and shelve the concepts under IP/Trademark/Patent Laws.

iirc(badly) When a now well-known vacuum cleaner first came to the market using a new technology, there were stories about how the company was approached by the Big Dogs in the market place. They wanted to buy up the company, not because it had a better idea but to shelve the technology to prevent it from competing with the existing version.

iirc(badly) Some years ago, California passed a law requiring reduced car emissions and one of the Big Dog car makers made an all electric car. When the car manufacturers obtained a change of ruling, the EV1 was destroyed. Everything. At that time, the company said they burned the blueprints too. The EV1 was too competitive and would have altered the market place. Even though the company would have made money with the EV1, they made a lot more using the old tech.

iirc(badly) IBM did the same thing with their PCs. They made money, lots of it. They made more money from Big Iron. Big Iron won. The IBM PC division was sold.

ht tps://en.wikipedia.org/wiki/Dyson_(company)

ht tps://en.wikipedia.org/wiki/General_Motors_EV1

ht tps://en.wikipedia.org/wiki/IBM_Personal_Computer
(url fractured to prevent autorun)
