Infecting Systems by Typosquatting Programming Language Libraries

Typosquatting is an old trick of registering a domain name a typo away from a popular domain name and using it for various nefarious purposes. Nikolai Philipp Tschacher just published a bachelor’s thesis where he does the same trick with the names of popular code libraries, and tricks 17,000 computers into running arbitrary code.

Ars Technica article.
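
As an illustration of the name confusion described above, here is a minimal sketch of how a tool might flag a requested package name that sits one edit away from a well-known one. The package list, the function names, and the distance threshold are assumptions made for this example; they are not taken from the thesis or from any real package manager.

```python
# Hypothetical sketch: flag package names that are a small edit distance
# away from a well-known package name. The list below is illustrative only.
POPULAR = {"requests", "numpy", "django", "setuptools"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def possible_typosquat(name: str, max_distance: int = 1):
    """Return the popular name that `name` nearly matches, or None."""
    name = name.lower()
    if name in POPULAR:
        return None  # exact match: this is the real package name
    for known in sorted(POPULAR):
        if edit_distance(name, known) <= max_distance:
            return known
    return None


print(possible_typosquat("requets"))   # one deleted letter away from "requests"
print(possible_typosquat("requests"))  # exact match, so nothing is flagged
```

A check like this only catches near-miss spellings; it says nothing about whether the package behind a name is trustworthy.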

Tags: academic papers, DNS, hacking

Posted on June 15, 2016 at 6:47 AM • 23 Comments

Comments

Mike Gerwitz • June 15, 2016 7:37 AM

There is a long-standing, disturbing trend for software authors/packagers/distributors to neglect signing packages or distributions, and for package managers to not provide support for a keyring.

blake • June 15, 2016 7:37 AM

@article

It’s not clear if the experiment broke ethical or even legal boundaries

It’s a good one. Would it be more ethical to suspect that there is a vulnerability and turn a blind eye? Would the most moral position be to somehow not even think of it?

The ToS of the development communities probably also say things like code is obtained “as-is”, and don’t provide warranties of fitness or suitability of hosted code, etc, but probably also forbid uploading malicious code – though the article described his script as “benign”, but it’s in the details…

From the Thesis:

My acknowledgments belong to Donald Stufft, one of the
PyPi administrators, who was very cooperative and
allowed me to continue the typosquatting experiment

and then also

I am grateful for Robert Kerns warning to assign
descriptions to all typo packages to clarify the
intentions of the empirical experiment.

So yeah, IANAL, but I reckon he’s got ethical cover there.

Wm • June 15, 2016 8:10 AM

Agree with blake. Fortunately, the hacker is in Germany, so he is safe from the vicious and ravenous U.S. justice department and its minions.

z • June 15, 2016 8:14 AM

@ Mike Gerwitz

Yes, and another problem is that when signing actually is implemented, it’s frequently done wrong. Many times the public key and detached sig are hosted on the same site as the files, so if an attacker convinces you to go to his typosquatted site, he can just put his own signing key and sig files there. OpenBSD solves this by putting the keys in the base system, so as long as you were running a legitimate install when you started, all future upgrades can be safely verified. That’s a bit harder to do for 3rd party libraries, but illustrates the point.

I also have a bit of a gripe with GPG detached signatures because they make verification optional. I’d rather append/prepend the signature to the file so that the file can’t be used unless the sig goes away, which is easily done through successful verification. This is how signatures in NaCl work.

r • June 15, 2016 8:37 AM

@z,

That ‘same host’ policy you speak of with the detached signature files is something that frequently irritates me.

z • June 15, 2016 8:59 AM

@ r

Me too, and it’s one reason why I am a fan of ECC keys. You can fit an entire Ed25519 key (base64-encoded) in a single Tweet. It’s a cheap and easy way to distribute keys for verification that doesn’t use the same host and that most users are familiar with.
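
The size claim above is easy to check. A minimal sketch, assuming only that an Ed25519 public key is 32 bytes and that a tweet was limited to 140 characters at the time; random bytes stand in for a real key, and the variable names are made up for the example:

```python
import base64
import os

# An Ed25519 public key is 32 bytes. Random bytes are a placeholder here,
# not a real key.
placeholder_key = os.urandom(32)
encoded = base64.b64encode(placeholder_key).decode("ascii")

TWEET_LIMIT = 140  # character limit for a tweet in 2016

print(len(encoded))                 # 44
print(len(encoded) <= TWEET_LIMIT)  # True
```

32 bytes always encode to 44 base64 characters including padding, so there is room left over for a label or a short fingerprint.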

scotty • June 15, 2016 9:37 AM

It’s because everyone is stupid.

Just use open source and you won’t have any problems

It’s all Microsoft’s fault.

Just use a longer password and you won’t have any problems.

Just encrypt everything. Use PGP of course.

It’s all Google’s fault.

Just use Linux and you won’t have any problems.

It’s the NSA.

Robert • June 15, 2016 9:48 AM

@Wm

Surely you’ve heard of extradition? Yes, he may be a bit safer, because there would be a lot more wheels of bureaucracy to turn, but his safety is by no means guaranteed…

absolutely, the ‘same host’ principle is annoying, isn’t it…

Twitter is an improvement over that, admittedly… however, it’s also by no means “the answer”… because surely if certain large global organizations target you they’ll target your twitter account too… Safe distribution of keys is an ongoing issue…

@all

Face it: there are few or perhaps no real absolute guarantees of safety in this world. Even if you get rid of every human right and herd the entire US population into prison (you know, to keep them “safe” from terrorism), there’s always that pesky gas chamber coming next… Or even just plain old lightning that has always killed off more than terrorism anyway. Everyone needs to stop trying to depend on their government to keep them “safe” and just accept that life is inherently risky. The safest thing to do in such an unsafe world is to focus on keeping everyone safe FROM THE GOVERNMENT instead! The way the original founders of the US were doing! They knew what they were doing!

z • June 15, 2016 10:23 AM

@Robert

Oh I have no doubt that Twitter is not “the answer”. But it’s easier to tell my users to copy and paste from a pinned tweet from my account than to run a series of byzantine command line incantations to grab my key off a keyserver, and it’s obviously better than not even trying to keep it separate. I’d better secure my account, of course, and hope that Twitter does not swap in an attacker’s key on their own or get hacked.

One side benefit is that you get to use all kinds of buzzwords that impress your boss like “leveraging existing infrastructure” and “social media” etc., which is scientifically proven to be more important than any cryptographic advantages. 😉

albert • June 15, 2016 11:06 AM

It reminds me of the old Microsoft malware that used an extension that looked like ‘.dll’ but was actually ‘.dII’ or similar.

Who was the idiot who decided to use fonts designed for advertising in computer systems?

And we’re still stuck with sans serif fonts today.

Written in Courier,

. .. . .. — ….

K15 • June 15, 2016 1:59 PM

Is there something like a registry or wikipedia for best practices? (Would it be practical? Or would it just become the next attack target?)

I ask because businesses don’t do things that seem like they’d be obvious security measures. Maybe the businesses don’t know better.

Andrew • June 15, 2016 2:06 PM

I’ve just discovered that Skype on Windows 10 ignores any privacy settings (camera=off, microphone=off, skype=off) and can make beautiful video calls. It’s a “bug”, of course. It’s the normal Win32 version, not the one from the Store. I’m posting it here in case someone has more time to investigate.

Jeremy • June 15, 2016 3:23 PM

@K15

Well, there is OWASP. (I’ve run across them a couple times but don’t know a great deal.)
https://www.owasp.org/index.php/Main_Page

Dirk Praet • June 15, 2016 5:54 PM

Back in the day, Solaris used to have a really cool auditing tool called BART, which you could use to compare MD5s with an online DB called the Solaris Fingerprint Database, which I think is pretty much self-explanatory.

In Solaris 11, this is integrated in the package manifests of IPS (Image Packaging System).
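
The general idea of comparing file fingerprints against a trusted record can be sketched in a few lines. This is not how BART or IPS is implemented; the function names are hypothetical, and SHA-256 is swapped in for the MD5 mentioned above.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256"):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): file_digest(p, algorithm)
            for p in sorted(root.rglob("*")) if p.is_file()}


def compare_manifests(trusted, current):
    """Report files added, removed, or changed relative to the trusted manifest."""
    return {
        "added": sorted(set(current) - set(trusted)),
        "removed": sorted(set(trusted) - set(current)),
        "changed": sorted(k for k in set(trusted) & set(current)
                          if trusted[k] != current[k]),
    }
```

The comparison is only as good as the trusted manifest: it has to be stored or fetched somewhere an attacker who can change the files cannot also change the record.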

Thomas • June 15, 2016 5:56 PM

@Andrew
No wonder Microsoft is so eager to have that skypehost running on Win 10 machines even when Skype is not installed.

Zombie Bachs • June 15, 2016 6:44 PM

@Thomas:

No wonder Microsoft is so eager to SUCK YOUR BRAINS OUT WITH A STRAW.

FTFY.

Clive Robinson • June 16, 2016 2:52 AM

@ Scotty,

It’s a twist on “trusting trust” without testing.

Ask instead, why people are downloading code / libraries that are probably greater in size / lines of code, than they themselves have ever written?

Likewise, why are people just “cut-n-pasting” “code snippets” from the internet?

The answer is of course what economists call “productivity”.

It’s an endless mantra you hear from team leaders, up to government ministers, that “productivity has to improve”, but do people actually think through what it means in reality?

Look at it this way, the faster you turn the handle on the wood chipper the more chips you get in an hour. But just how fast can that handle be turned? What happens to productivity when you get to the speed the chipper keeps breaking down?

But also, as in this case, what happens when the “Feed Stock” gets contaminated? Look on it as the nail in the wood that the chipper blade shatters upon.

The thing about security is whilst it blows the whole system open, you generally don’t get pieces of it flying past your ears, telling you something has gone wrong.

Thus security flaws can hang around for a long time. Take BadTunnel: it’s been around for at least 20 years.

http://www.darkreading.com/vulnerabilities---threats/windows-badtunnel-attack-hijacks-network-traffic/d/d-id/1325875

According to Yang Yu, director of Xuanwu Lab of Tencent in Beijing, who discovered this NetBIOS bug, “It not only can be exploited through many different channels, but also exists in all Windows versions released during the past 20 years. It can be exploited silently with a near perfect success rate.”

By “different channels” he means that, as it’s a protocol attack, it does not need a malware payload; thus “It can be exploited via all versions of Microsoft Office, Edge, Internet Explorer, and via several third-party apps on Windows” as well as via IIS and Apache Web servers, and removable media such as a thumb drive.

So if you still use XP, 2000, etc., as many do, then the fact that Micro$haft is not supporting you with patches might be of concern to you.

As I’ve said many times in the past if I was the NSA, GCHQ et al it would be standards and protocols I’d be looking at attacking, as attacks via them remain good for many many years.

And the upside for the IC is that this “Demand for increased productivity” takes the average coder well away from such attacks, as they are hidden, buried deeply away in some library or code snippet the coder has blindly trusted and has not a clue about.

blake • June 16, 2016 6:31 AM

@Clive

Ask instead, why people are downloading code / libraries that are probably greater in size / lines of code, than they themselves have ever written?

I regularly run code on processes and operating systems for which I have no idea how they were designed or built, and would take more than the rest of my life to reproduce myself. In the best case, it’s the software equivalent of standing on the shoulders of giants. However, real world != best case.

The productivity mantra shouldn’t be taken too far, as you say, there are other concerns we should also balance. But it’s a spectrum, with “stack sort” at one end, the other end being constant reinvention of the (square) wheel and home-brewed cryptographic hash functions.

The trick would seem to be to make sure that we really are standing on the shoulders of giants, rather than just sitting on a crocodile to get across a river, because those two metaphors have quite different conclusions.

Clive Robinson • June 16, 2016 7:49 AM

@ Blake,

The trick would seem to be to make sure that we really are standing on the shoulders of giants, rather than just sitting on a crocodile to get across a river,

And there as they say is the problem, as even giants can have feet of clay.

It was once pointed out to me that “Being a genius, like being a murderer, takes an act recognised by others as such, however in both cases that first act may be your only act of its type”. Thus as they now say in financial adverts “Past performance is no indicator of future performance”.

Security is the same: no matter how loyal, and thus trustworthy, you have been, that does not in any way indicate you will not turn traitor in the future. As you indicate, it can take more than a lifetime to write and ensure the security of the equivalent of a small operating system.

Thus we appear to be stuck with blind trust or trying to find methods by which it is not subverted. Which is a very hard problem to solve on the face of it.

My viewpoint for a while has been “Don’t trust, verify” and how to do it when you can not test the foundations you are building on.

I’ve found that, surprisingly, there are ways that this can be done, and they turn out to be a lot simpler than blindly chasing down the rabbit hole of trying to establish trust at each lower level.

The question then becomes one of achievable “mitigation” not probably impossible “verification”. The mitigation issue then becomes one of resources and how best to utilize them. And oddly we probably got it right in the late 1960’s and early 70’s then kind of forgot about it due to rapidly changing technology producing ever cheaper solutions that were seen as “More Productive” even though they probably were not.

Jesse Thompson • June 16, 2016 2:06 PM

@albert

Who was the idiot who decided to use fonts designed for advertising in computer systems?

I wouldn’t target fonts as any sort of strategic security bottleneck anymore, Unicode has become rather well supported and I guarantee that regardless of the font there exists a codepoint for “looks just like a lowercase letter lima, but isn’t”, and probably to support some benign and rather compelling use case somewhere.

Or, if there isn’t now, then there eventually will be. :/
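
The point about look-alike code points is easy to demonstrate. A minimal sketch, using a made-up name in which the Cyrillic letter U+0430 stands in for the Latin ‘a’; the helper function is hypothetical:

```python
import unicodedata


def non_ascii_chars(name):
    """List (character, code point, Unicode name) for each non-ASCII character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
            for ch in name if ord(ch) > 127]


latin = "pandas"
lookalike = "p\u0430nd\u0430s"  # U+0430 replaces both Latin 'a' characters

print(latin == lookalike)  # False
for entry in non_ascii_chars(lookalike):
    print(entry)
```

In many fonts the two strings render identically, so the difference only shows up when something inspects the code points rather than the glyphs.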

@Robert

Well, few of us in thread are expecting to find absolute security, though security systems that reduce risk by lending themselves less easily to misunderstanding and ambiguity by tired or inexperienced brains are always helpful. Also, few of us in thread are expecting the government to involve itself in the improvements we are brainstorming, though it is true that centralizing of authentication in general (such as trusting repository maintainers or Twitter accounts to host key material and not themselves get pwnt) is merely shifting the weight of the problem into a different vulnerability.

I think that less distributed solutions like DNSSEC may be safer to build software signing systems upon. :3

Chase Johnson • June 20, 2016 9:59 AM

@Clive

Can you elaborate on the verification/mitigation techniques you’ve found effective? This is something I’ve been pondering a lot lately in the context of verifying knowledge/mitigating deception (for use in, e.g., politically controversial topics like global warming, genetic modification) since, as with a large piece of software, no person can know everything there is to know about every important topic area, but it’s still important to know which bodies of knowledge to trust. I’ve come to the conclusion that cheap verification techniques should be possible, but I haven’t figured out quite what they are yet. I’d love to hear more about what you have found on this topic.

hacktivist • June 20, 2016 10:51 AM

He deserves a good bachelor’s grade for this – a very good exposé of how open source package repositories can be used as an attack vector. My typing has improved as a result.

Schneier on Security
