On the Insecurity of Software Bloat

Good essay on software bloat and the insecurities it causes.

The world ships too much code, most of it by third parties, sometimes unintended, most of it uninspected. Because of this, there is a huge attack surface full of mediocre code. Efforts are ongoing to improve the quality of code itself, but many exploits are due to logic fails, and less progress has been made scanning for those. Meanwhile, great strides could be made by paring down just how much code we expose to the world. This will increase time to market for products, but legislation is around the corner that should force vendors to take security more seriously.

Posted on February 15, 2024 at 7:04 AM • 24 Comments

Comments

Adam February 15, 2024 7:27 AM

I remember watching a video with Brian Snow (of NSA) and Dan Geer (of In-Q-Tel), and Brian talked about how they took a standard office package and were able to remove 80-90% of the code and still maintain all the functionality, because of inefficiencies in the code and the poor working structure of the people who wrote it. It is in this video:
https://www.youtube.com/watch?v=vM2pcRtOb6Y

Fazal Majid February 15, 2024 7:40 AM

You only use 10% of the features of any non-trivial software package, but the catch is no two users use the same 10%, which is where the bloat comes from. I would have expected the end of Moore’s Law for single-thread performance to usher in a new era of software optimization, but that just doesn’t seem to have happened. There are economies of scale in bundling a lot of features together, and disincentives to focusing on performance except in narrow niches.

Fun fact: Bert Hubert is a former colleague of mine.

Andrew L Duane February 15, 2024 8:59 AM

I recently bought a new (used) car, a 2014 BMW with no LTE, Cell, or WiFi in it. Because, well, I read this blog. Scrolling through the menus while learning about the car, I found an item called “Software Licenses”. 30 pages of software licenses for dozens of libraries, some of which I’ve heard of, some of which I haven’t. All 10-15 years old, all unpatched. Even a half dozen libraries from my current employer and a former one.

Mind you, this is a car without cell or WiFi, no web browser, no Apple Car Play or Android Auto. Just Bluetooth audio to talk to my phone, CAN bus drivers, and enough connectivity to load software updates at the dealer. I shudder to think how much “stuff” is in a modern car with all that other functionality in it.

Clive Robinson February 15, 2024 9:23 AM

@ Adam, ALL,

Re : Work styles can have reasons.

I used to write code in assembler for microcontrollers and the work process could end up being confused with,

“how they took a standard office package and were able to remove 80-90% of the code and still maintain all the functionality. Because of inefficiencies in the code and poor working structure of the people who wrote it.”

One way to write assembler code is in the simplest and clearest way, and then, once the functionality meets the specification, compress it down to fit in a smaller footprint etc.

If you think about a program for embedded and many other systems it has three distinct basic phases,

1, Startup – Initialise.
2, Runtime – Control loop and subs.
3, Shutdown – clean up and halt.
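A minimal sketch of that three-phase shape in C (all names here are hypothetical stand-ins; a tick counter takes the place of real hardware work and a real shutdown signal):

```c
#include <stdbool.h>

/* Hypothetical three-phase skeleton: a tick counter stands in for
 * real device activity and for a genuine shutdown condition. */
static int ticks;

static void startup(void)            { ticks = 0; }         /* 1. Initialise            */
static void service_io(void)         { ticks++; }           /* 2. One pass of the loop  */
static bool shutdown_requested(void) { return ticks >= 3; } /*    Stand-in signal       */
static void shutdown_phase(void)     { /* flush state, halt peripherals */ }

/* The main control loop that "caps the pyramid". */
int run_lifecycle(void)
{
    startup();
    while (!shutdown_requested())
        service_io();
    shutdown_phase();                                       /* 3. Clean up and halt     */
    return ticks;
}
```

Real firmware would of course never return from the loop until power-off; the bounded counter is only there so the sketch terminates.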

If you consider the main runtime, it in effect forms a pyramid with the main control loop as the capstone, and each layer down adds more subroutines and broadens horizontally till you get down to the I/O device driver level.

If you use a consistent interface type between all levels (a variation on design by contract), you end up with all subs being effectively “plug and play” in a consistent framework. This allows maximum flexibility during development, as they are all effectively isolated and have minimal or no interaction.
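As a sketch of what such a consistent interface could look like in C (the `dev_ops` contract and the byte-counting “sink” device are invented for illustration, not from any real codebase):

```c
#include <stddef.h>
#include <stdint.h>

/* One contract every level's subs conform to: any implementation
 * with this shape can be plugged in without touching its callers. */
typedef struct {
    int (*init)(void);
    int (*write)(const uint8_t *buf, size_t len);
} dev_ops;

/* A hypothetical device that just counts bytes "sent". */
static size_t sink_count;
static int sink_init(void) { sink_count = 0; return 0; }
static int sink_write(const uint8_t *buf, size_t len)
{
    (void)buf;              /* a real driver would push these to hardware */
    sink_count += len;
    return 0;
}
static const dev_ops sink = { sink_init, sink_write };

/* Higher layers talk only through the contract, never to a device. */
int send_all(const dev_ops *dev, const uint8_t *buf, size_t len)
{
    if (dev->init() != 0)
        return -1;
    return dev->write(buf, len);
}
```

Swapping in a different device means supplying another `dev_ops` table; `send_all` and everything above it stay untouched.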

However it is less and less efficient as you go down the pyramid. So the next stage is to integrate horizontally. Many low level functions will be mostly the same, so you can take many subs, extract out what is not common, minimize it, and end up with just a single slightly more complex sub.

The effect of this is to draw the sides of the pyramid in such that it starts to become a diamond.
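A toy C illustration of that horizontal integration (the device registers are plain variables here rather than real memory-mapped hardware):

```c
#include <stdint.h>

/* Stand-ins for memory-mapped device registers. */
static volatile uint8_t uart_data_reg, spi_data_reg;

/* Before drawing in: two near-identical low-level subs, one per device. */
void uart_write(uint8_t b) { uart_data_reg = b; }
void spi_write(uint8_t b)  { spi_data_reg = b; }

/* After: the non-common part (which register to hit) is extracted into
 * a parameter, leaving a single, slightly more complex shared sub. */
void port_write(volatile uint8_t *reg, uint8_t b) { *reg = b; }
```

The memory saving is trivial at this scale, but across dozens of near-duplicate subs it is exactly the “drawing in the sides” effect described above, with the coupling cost that follows.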

This has many advantages: it allows not just reuse of code, and thus reduction of memory footprint, it also encourages a rethink in the way things are done, enabling similar reductions of subs at higher layers by reworking methods etc.

The problem with doing this is the code becomes increasingly interdependent, and thus vastly more difficult to change if required; likewise testing becomes more difficult.

Go too far and things become hellishly entwined and codependent, and increasingly impossible to change without side effects appearing where you might not expect them. Thus code maintenance becomes hellishly difficult and expensive, and testing likewise becomes protracted and expensive.

It also has a less spoken about side effect of carrying issues forward into software yet to be written.

We all get told the major thing to aim for is “code reuse” as an investment in the future. Whilst it’s desirable, any awkwardness in the code has to be accommodated in future code where it is reused.

Thus what is a benefit at one level, close to the metal, becomes a very real problem at higher levels, where at a minimum it can stifle development and promote excessive workarounds that save neither work nor code and usually add unnecessary complexity and vulnerabilities. It also has a habit of “technology locking” in the past, which very significantly derails or hampers the future.

Thus the decision of how high up, and by how much, to draw in the sides of the pyramid can be vexatious, and some look for excuses to keep maximum decoupling, and thus minimal or no interaction, by not drawing in at all.

KeithB February 15, 2024 10:39 AM

Heinlein once suggested that the two houses in a government should perform different functions: One writes and passes laws, while the other repeals them.

Maybe developers should hire people whose job it is to remove code.

Peter A. February 15, 2024 1:28 PM

@KeithB: it’s often done – I have personally removed quite a lot as a part of my job – but still too little. Much more attention, money and effort is put into writing new things than into ‘refactoring’ old ones, and the former process overwhelms the latter. Both in software engineering and lawmaking.

mark February 15, 2024 4:19 PM

Nothing I haven’t been complaining about for decades. OOD is interesting. The closer you get to actually coding, however, the fuzzier it is.

And what actually happens in the coding is you want a clipping of Godzilla’s toenail, and instead you get Godzilla standing there over you with a frame around a toenail.

The functions called are not, like the old idea of Unix, do one thing, and do it well; instead, it’s “we’ll do all of this, and when you invoke us, you only use what you need to”… forgetting that all the rest is still there.

bl5q sw5N February 15, 2024 6:09 PM

Bloat is one kind of design failure. Once something is done about it there may still be other design unhappinesses.

M. A. Jackson’s book Principles of Program Design [1] deals with the design problem in its full generality. The book cover and page 10 show Jackson’s summary lattice of program design, which encapsulates a lot of his treatment. The lattice can be written as follows, where “->” could mean “prerequisite for” and “,” separates lattice elements in a list –

PROBLEM ENVIRONMENT -> DATA STRUCTURES, TASK TO BE PERFORMED

DATA STRUCTURES -> READING AND WRITING, PROGRAM STRUCTURE

READING AND WRITING, TASK TO BE PERFORMED -> EXECUTABLE OPERATIONS

PROGRAM STRUCTURE, EXECUTABLE OPERATIONS -> PROGRAM

  1. https://archive.org/details/principlesofprog00jack

ResearcherZero February 15, 2024 11:55 PM

A lot of IoT devices are running appallingly outdated packages with few protections.

Many vendors do not fix these problems or provide an environment that aids that process.

‘https://eclypsium.com/blog/flatlined-analyzing-pulse-secure-firmware-and-bypassing-integrity-checking/

There are bound to be many more cases like this…

‘https://abcnews.go.com/Politics/us-disrupts-russian-hacking-campaign-infiltrated-home-small/story?id=107258976

Ubiquiti Edge OS routers still using known default administrator passwords:

1 Perform a hardware factory reset to flush the file systems of malicious files;
2 Upgrade to the latest firmware version;
3 Change any default usernames and passwords; and
4 Implement strategic firewall rules to prevent the unwanted exposure of remote management services.

‘https://www.justice.gov/opa/pr/justice-department-conducts-court-authorized-disruption-botnet-controlled-russian

“A factory reset that is not also accompanied by a change of the default administrator password will return the router to its default administrator credentials, leaving the router open to reinfection or similar compromises.”

‘https://www.documentcloud.org/documents/24429108-fbi-apt28-moobot-redacted-warrant-and-affidavit

Clive Robinson February 16, 2024 1:57 AM

@ ResearcherZero, ALL,

Re : Old, stale, and vulnerable.

“A lot of IoT devices are running appallingly outdated packages with few protections.”

There are a couple of general reasons for this, and I’m guilty of them from time to time.

If you are running a test “lash-up” to see if an idea works you,

“Use what you have, know, trust, is simple to get up and running.”

One of the reasons telnet, insecure as it was, got used for what now feels like half a century beyond its “Best Before Date” is that it “ticked all the boxes”.

As a simple protocol it was easy, and you could “read the data” off of the screen of a storage scope if you had a practiced eye.

All too often, once you had it running there was no incentive to make it more secure. After all,

“Just POC right?”

Sadly no. Because you see it as a POC not needing security, you let others have copies etc. without thinking about it. And they don’t see it the way you do: they see it as “something that works” that is useful, so they pass it on, and others then see it as “the standard” or “benchmark”.

Because nobody is apparently getting hurt, it does not get fixed,

“In case you break it.”

And another reason,

Because I’ve some really old bits of tech using 8-bit micros, I need not just “simple” but “low resource”, that is “not bloated”… So I use RS232 (actually V.24/V.28) quite a bit, because that’s what’s on the back. This in turn means the device on the other end of the cable has to be compatible. So: an old 486DX box with four serial ports, a VGA display, 8 Mbyte of RAM, and on a 100 Mbyte IDE drive a copy of Slackware Linux from a CD in the back of a book, going back to when Linus still had acne.

Secure it is not, but it can run four “In Circuit Emulators”(ICE) and talk to a bit of RG58 coax that carries 10 Mbit Ethernet into a little box that converts to Cat5, which talks to a 24-port 10/100 switch that uplinks to a more modern workstation.

One day, some day, I’ll finish making the second enclosure that will hold 8 9pin to 25pin D-Cons to RS232 to USB devices and an 8port USB hub that connects to a PC104 card all sitting in a cardboard box, to make an “8 Port Network to Serial Concentrator”. Just like the one I built back in oh… that sits in the back of the lab rack. But that 104 card will still be running a creaky old copy of Linux from a CD out of the back of a book on Red Hat, unless I bite the proverbial and use say a Raspberry Pi or similar instead.

The point is modern code is too bloated to run on hardware that’s only a decade or two old but probably has a quarter century or more of good working life left in it.

As for M$, many would be surprised how little Win95 needs in terms of resources… And it still looks as crap as Win 10 does now, but for different reasons of “butt ugly”. Oh, and I’ve still got an unopened pack of five Win95 re-seller retail packs sitting in a cardboard box… But it does run the Mirror comms software –that came with an 8086 based “luggable”– which has a nice WordStar-style text editor that plays nicely with MS-DOS debug and Borland C 3. Also Mirror plays nicely with those ICE boards…

You start to see the picture of why some code is still around, gets used and well passed on.

There is a saying about “you deserve the face you get when you are forty”, which is the greybeard behind the observation of,

“You use as preference the tools you learned before you were forty. As it takes too long to learn to use new tools after you are forty, and you don’t want to waste the time.”

echo February 16, 2024 5:25 AM

I’ve made the argument for ages and ages, got fed up making it, and finally quit because I have other things on in life and no longer code. It’s somebody else’s problem now.

The main thing really is good design from the start: keeping things as simple and clean as possible, and abstracted well. One thing almost all people forget is portability. Bake that in from day one or it will bite you later. It helps a lot if this is your approach from the beginning, as everything from specifications, outline designs, R&D, and project management tends to hang off this.

Clear inline documentation is good, as is keeping a grip on straightforward code versus optimised code, debug code, and so on. That’s what flags and switches are for during compilation. So-called “bitrot” is a thing but also not a thing: if you’ve abstracted properly, including taking into account any forwards and/or backwards compatibility issues, “legacy code” shouldn’t be a problem. Again, flags and switches help here for targeted compilation, as does run-time version testing.

I’m sure I’ve left something out of this list, but the basic thing is having a clue, doing the deskwork, and borrowing from an industrial point of view (including asset management systems to store everything for complex projects, which may involve multiple OS install disks, archived installs and VMs, tools, and creative assets across multiple platforms and decades). Keep your code to a good standard and enable a support lifecycle over decades. I don’t consider 20+ years exceptional. Before I stopped coding, I didn’t see why 100+ year support cycles couldn’t be a thing if the underlying base was fine.

You can have rock solid code which runs all the way back to the Win95/NT era without missing a beat. You can also have stuff written back then which runs on the latest hottest thing. A lot of whether it will or not depends on whether you embedded these practices from the start, and on what choices of APIs and scalability you made at the beginning. Assuming you abstracted well, you can bounce around APIs without your core code noticing, while continuing to support legacy or alternative platforms.

Right at the base layer: one thing I used to do for Windows code was dig into the SDK and unwrap functions down to the lowest level that was compatible with other platforms and wrap that in an abstraction, or redefine functions at compile time according to the compiler and SDK being used. Where an optimised assembler routine could be used, I always liked a plain-code alternative: it helps document the problem and provides a fallback to switch to, even if it might be slower. Fun days.
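A rough sketch of that kind of compile-time shim in C (the `HAVE_FAST_COPY` macro and `fast_copy` routine are invented for illustration; only the plain-code fallback path is real here):

```c
/* Portability shim: use an optimised routine when the toolchain
 * provides one, otherwise fall back to plain C that compiles
 * anywhere. HAVE_FAST_COPY and fast_copy() are hypothetical. */
#ifdef HAVE_FAST_COPY
void copy_bytes(char *dst, const char *src, unsigned n)
{
    fast_copy(dst, src, n);   /* e.g. a hand-tuned assembler version */
}
#else
/* Plain-code alternative: slower, but it documents the problem and
 * gives you something known-good to switch back to. */
void copy_bytes(char *dst, const char *src, unsigned n)
{
    while (n--)
        *dst++ = *src++;
}
#endif
```

Callers see only `copy_bytes`; which body they get is decided per compiler and SDK at build time, which is exactly the flags-and-switches approach described above.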

I always avoided external frameworks when I could. Too much NIH. There are times when you have no choice. Like, who in their right mind would write a JPEG parser just because? That said frameworks can be useful – assuming their design decisions are right from the start and they’re not slapped together. But like I said. Somebody else’s problem now.

As people discover new problems or find different and sometimes better ways of approaching them, code will be written and rewritten. It’s not all done for reasons of NIH and bloat, or creating dead code for the sake of it. Established code might be tweaked; other code may need a huge rewrite. That’s just life. And when you stop to think about it, probably less than 1% of the artifacts and the three meters of biosphere we live in will exist in their current state in another 300 years. That’s just life.

Note: some might shout you need to be a “certified engineer” to do this. I’m not and never will be. I just object to appeals to authority being used to rub people’s noses in it. I was taught properly from the start, read books, and learned things from other people. And before anyone screams, most of this was never taught when I was still in education, if it was even a thing. There’s just no need to beat people down, because that is a guaranteed way to kill enthusiasm and creativity and make people choose another industry, and that could cost you your best people.

Adam February 16, 2024 5:33 AM

@Nimmo

1) It wasn’t Dan Geer who said it, it was Brian Snow
2) What is your evidence for what you said about Dan Geer?
3) Even if 2 is true, what does that have to do with software bloat?

Just a wannabe techguy February 16, 2024 9:00 AM

@Adam

Those were the points I was thinking of responding with, until I saw yours. Thanks.

Bob February 16, 2024 11:02 AM

“This will increase time to market for products”

During which time someone else will ship theirs, gobble up this quarter’s profit, and have themselves already entrenched for next quarter.

Our economic model incentivizes almost exclusively irresponsible behavior.

lurker February 16, 2024 12:48 PM

@ResearcherZero
“still using known default administrator passwords”

But isn’t that the point of default passwords? They have to be known…

What’s missing is a law that requires the box to have ugly pictures like cigarette packets, and a warning in big caps:

ALL THE BAD GUYS KNOW YOUR PASSWORD IS PASSWORD
YOU MUST CHANGE IT BEFORE CONNECTING TO THE INTERNET.

And/or have the device phone home so the vendor can check: if the default is still the default, then reset the DNS to 127.0.0.1.

- February 16, 2024 1:24 PM

@Moderator

/#comment-432440 Nimmo • February 16, 2024 12:48 AM

Appears to be a “Poison the well” attack on the blog.

Ismar February 16, 2024 4:33 PM

For some vendors (depending on your software user base), legislation is already in place which forces them to scan all third-party libraries included with their own software for security vulnerabilities. That will change the equation when it comes to deciding how much of those libraries to use in their code.

ResearcherZero February 16, 2024 8:09 PM

The problem is more with ancient libraries from 20 years ago that are not even patched for Shellshock, running on your network. Firewall and security devices that are less secure than the network itself.

Flawed integrity checks, unprotected directories to hide the goods in, and a huge attack surface to take your pick from. Unchanged default passwords are the least of your problems if attackers can hide malware undetected in your edge devices.

No binary protections, and the version of Perl they were running was long past EoL – and that is just the basics. Those Ivanti devices were out of date before they were built, and they were not built well.

It’s one thing to get a build up and running, but you have to at least put some basic measures in place that will keep out small children before marketing it as a security solution. The version of CentOS Ivanti used is more than a decade old, and it also was not even patched.

MDK February 16, 2024 8:35 PM

@ALL

Related to MMS

hxxps://www.securityweek.com/mysterious-mms-fingerprint-hack-used-by-spyware-firm-nso-group-revealed/

Have a great weekend and be safe.

izzlers February 17, 2024 4:14 PM

This was an excellent update topic, thanks!

I am reminded of how, many years ago now, Lubuntu seemed to suddenly become degraded in a way severely contrary to the ethos and methods of the Lubuntu official author.

I ended up trying the imposterware anyhow, and easily noted the displacement.

One of the displacements seems to still be a problem.
Lubuntu, ever since then, seems to be showing bloatware problems.

Lubuntu was always specifically anti-bloatware (in a modest way), up until the blatant invasion of its ISO (or servers, etc).

There hasn’t been much a nobody like me could do about it.
So, thanks for at least this general info which umbrellas a lot of us serendipitously.

Sincerely,

K is for Kindness (izzlers?)

bl5q sw5N February 18, 2024 10:15 AM

Look at the software you have, with its bloat, coupling, etc. whatever, and consider it as if it were the result of a proper design process. Now look in the mirror and ask “Is this the program I intended to write ?”

ResearcherZero February 21, 2024 8:39 PM

What would be of benefit would be thorough logging, man pages and technical documentation.
Keep it simple. Better to stuff the fish later after it has been gutted.

Microsoft has introduced “free” logging, for governments only.

