Improving C++

C++ guru Herb Sutter writes about how we can improve the programming language for better security.

The immediate problem “is” that it’s Too Easy By Default™ to write security and safety vulnerabilities in C++ that would have been caught by stricter enforcement of known rules for type, bounds, initialization, and lifetime language safety.

His conclusion:

We need to improve software security and software safety across the industry, especially by improving programming language safety in C and C++, and in C++ a 98% improvement in the four most common problem areas is achievable in the medium term. But if we focus on programming language safety alone, we may find ourselves fighting yesterday’s war and missing larger past and future security dangers that affect software written in any language.

Posted on March 15, 2024 at 7:05 AM • 36 Comments

Comments

JonKnowsNothing March 15, 2024 8:36 AM

re: if we focus on programming language safety alone, we may find ourselves fighting yesterday’s war

A few thoughts

  • Isn’t this the normal everyday problem of using code to implement human concepts?
  • Is this an I’m Not Going First! issue? (1)

===

1)
In PVP Games the minion swarms line up on each side. Even if they have a designated leader directing the swarm, they have a common behavior. This behavior can be seen in many ancient film documentaries of rudimentary warfare.

Both sides line up facing each other, lots of jumping up and down, waving of banners and flags. Then one player makes a dash forward.

They look behind to see if anyone is following them.

  • If they are out on the battlefield alone, they race back to their side (or attempt to do so).
  • If they see the rest of the swarm following them, it’s Full Charge into the melee.

Hans March 15, 2024 9:14 AM

This behavior can be seen in many ancient film documentaries of rudimentary warfare.

Is that so? Hollywood generally ignores battle lines in favor of the more dramatic image of wild charges. But I would not regard those as documentaries.
I am not objecting to the point about games.

Clive Robinson March 15, 2024 9:30 AM

@ Bruce, ALL,

Re : The paint and the painter.

From the quote,

“The immediate problem “is” that it’s Too Easy By Default™ to write security and safety vulnerabilities in C++ that would have been caught by stricter enforcement of known rules for type, bounds, initialization, and lifetime language safety.”

There are two sides to this issue,

1, Technical
2, Human

You can not fix one with the other and you can not fix the security issue with either alone.

To be frank I’ve had more than a belly full of new languages, new paradigms, new methods. They are mostly moving the deckchairs around on the Titanic.

In fact they can be shown to be majorly detrimental to the human side.

To develop as humans and as programmers takes time and facing up to and managing risk just like any other skill.

Whilst you can quickly learn to make a bicycle go faster just by pedalling more furiously, unless you take the training wheels off you will never learn to ride the bicycle even sensibly, let alone safely, because such constraints restrict your ability to manoeuvre. Whilst riding fast in only a straight line might win some limited races, from a practical perspective there is little or no other use.

Likewise life is not about getting from A to B in a straight line as fast as possible. I know many managers think it is but they are wrong.

So rather than using technology to make the training wheels ever bigger it’s more productive to take them off.

It’s the same with any human skill we learn most from two things,

1, Our curiosity
2, Our mistakes

Big training wheels allow neither to work for our development.

It does not matter how good the paint is, if the artist has their hands tied to a rail, they are not going to be able to paint in a worthwhile way.

Derek Jones March 15, 2024 9:41 AM

It’s an issue of developer culture.

The C/C++ culture is that runtime checks default to off.

In other languages, the culture is for the runtime checks to default to on.

Compilers such as gcc/llvm have options to switch on runtime checking. https://shape-of-code.com/2024/03/03/the-whitehouse-report-on-adopting-memory-safety/

Companies don’t need to rewrite their C/C++ code, they just need to switch on runtime checks (they may have to hassle their compiler vendor to add this functionality).
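As a minimal sketch (the exact flags depend on the toolchain; the ones below assume GCC or Clang with libstdc++, and other toolchains spell them differently), the out-of-bounds write here gets reported at run time once the checks are switched on, instead of silently corrupting memory:

```cpp
// Build (assumed compiler: g++ or clang++):
//   g++ -O2 -fsanitize=address,undefined -D_GLIBCXX_ASSERTIONS demo.cpp -o demo
#include <vector>

int main() {
    std::vector<int> v(4);
    // One past the end: with the flags above this is reported at run time
    // (by the library assertion or by AddressSanitizer) rather than being
    // allowed to scribble over adjacent memory.
    v[4] = 42;
    return 0;
}
```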

Winter March 15, 2024 11:10 AM

Isn’t Rust supposed to be this C++ with better security?

It is a looong time since I have coded in C myself. So anything I say about it is pure hearsay.

What I do hear from those who do code and are into security, the crucial point is that Human Beings are not able to do Memory and Pointer Management.

All this talk that it is possible to hand-code secure memory allocation/deallocation as well as secure pointer management if you only take care has been disproven every single day since the birth of C.

We just should accept that no, we human beings cannot do secure memory and pointer management by hand.

So, I will not hold my breath for a secure C++ derivative.

Jimbo March 15, 2024 11:11 AM

Over 20 years ago Bill Gates and team realized the same issues with C++ and C: pointers aren’t safe, reference counting doesn’t work, and data types and memory management are easy to get wrong. The solution was managed code (C#, Java) with garbage collection and safer data types. They still allow developers to write poor code and implement poor designs. Unfortunately, the OS and MS Office are legacy code written in C++, so we still have buffer overflows and numerous other attacks.

kiwano March 15, 2024 12:08 PM

@Derek Jones:

I wouldn’t blame developer culture for most C++ code running with runtime checking turned off. Generally speaking, the motivation for running C++ code with runtime checking turned off is the exact same motivation for using C++ in the first place: a need to write blazingly fast code to be run in settings where an extra half-millisecond to perform a particular task can put the company running the code at a significant competitive disadvantage (e.g. high-frequency trading, or internet ad auctions).

I wouldn’t rule out culture in general, since the existence of these applications is often a consequence of our broader business culture — but developer culture is not the tree I’d be barking up.

Winter March 15, 2024 12:29 PM

@kiwano, @Derek Jones

a need to write blazingly fast code

What I understood is that C is used when “absolute” control is required over the code paths that are run.

Garbage collection and bounds checking occur outside the control of the coder and can run at any time. That makes them unsuitable, or so they say, for OS kernel code and device drivers.

Rust does special magic to make the security features run (mostly) at compile time and make the run time additions “deterministic”. Or so I understand (I have not written a single line of Rust code).

I would love to hear from someone who actually knows how it works.

Derek Jones March 15, 2024 1:38 PM

@kiwano, @Winter

People use language X because that is the one they know, it is the one the software was originally written in, the one everybody else in the team uses, or even because the customer recommends it.

Fans of Ada say that strong typing helps detect coding mistakes early. I agree, but the experimental evidence for the benefits of strong typing is meagre https://shape-of-code.com/2014/08/27/evidence-for-the-benefits-of-strong-typing-where-is-it/

Fans of Rust make a variety of experimentally unsubstantiated claims (which may be true or false).

The performance hit for runtime checking depends on how well the optimizer can figure out when not to insert checks (because out-of-bounds access cannot occur), and on the style of the developer (‘clever’ code can make it difficult for optimizers to figure out what is going on).
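A hedged sketch of what that looks like in practice (whether a given compiler actually elides the check has to be verified case by case):

```cpp
#include <cstddef>
#include <vector>

// Index provably within bounds: i < v.size() on every iteration, so an
// optimizer may be able to prove the at() check can never fire and drop it.
long sum_all(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.at(i);
    }
    return total;
}

// Index comes from outside: nothing lets the optimizer discharge the
// check, so it stays in the generated code (which is the point).
int read_one(const std::vector<int>& v, std::size_t i) {
    return v.at(i);
}
```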

Winter March 15, 2024 2:10 PM

@Derek

The performance hit for runtime checking

It is not just the performance hit, but more the non-deterministic nature of the intervention of, e.g., garbage collection that is seen as problematic.

Back in my Day March 15, 2024 2:53 PM

The real answer is a strongly typed language. C purists ruffle their feathers around both runtime bounds checking and the incessant need to make everything a pointer. The only justification I’ve seen is “it runs faster”- but in truth that’s a rare deciding factor.

Back when I was programming, Turbo Pascal (and later Delphi) did everything I wanted – including embedding the occasional ASM code. Those few times I used C in production, I was plagued by the same problems everyone has – strange memory leaks, odd crashes from mis-used pointers, strange interactions between variables of different types, etc. I never got why C was so popular with all of its weaknesses.

Alas, that’s a battle that’s already been lost. C and its descendants are too well entrenched.

Gunslinger March 15, 2024 3:18 PM

Since the only provably secure operating system is written in Pascal, I think we know how to solve this problem, just most people don’t want to do it.

Jon (a different Jon) March 15, 2024 4:07 PM

@Hans & @JonKnowsNothing

Prof. Bret Devereaux would like a word with both of you about how pre-medieval armies worked (and didn’t work).

He has also reviewed several elaborate strategy games, both in terms of accuracy and playability, and they are generally good fun.

His blog is at http://www.acoup.com and his only advertising is word of mouth. I think it’s excellent, so if you find his writing interesting, entertaining, or educational, please throw him a bone or two.

I have no affiliation besides being a happy reader. J.

Kenneth March 15, 2024 6:20 PM

@Winter,

Garbage collection and bounds checking occur outside the control of the coder and can run at any time. That makes them unsuitable, or so they say, for OS kernel code and device drivers.

As a category, “OS kernel” often combines many disparate types of code, particularly in monolithic designs like Linux. Much of it could, in principle, be run with garbage collection. A device driver, for instance, usually doesn’t need to run in the kernel at all (cf. QNX), or may only need to run a tiny interrupt handler in “kernel mode”. Experimental operating systems such as MS Singularity have been designed to use managed code almost exclusively.

(My problem with automatic tracing garbage collection, though, is that the proponents tend to treat it as a solved problem. But if anyone complains about performance, suddenly it’s not so easy: they’re referred to all kinds of tuning options, alternate collectors—Java apparently has 7—recent research papers…. This does not apply to things such as reference counting or “RAII”, also sometimes referred to as “garbage collection” and generally suitable for kernel and driver use.)

Bounds checking is different. It’s entirely predictable, and notably not guaranteed to be absent in C or C++ or present in Rust’s machine code. It has some performance impact, unlikely to be noticed in 99% of code (and while that last 1% does matter, there’s a good argument to be made that one shouldn’t resort to “unsafe” tactics—like disabling the checks—without good justification).
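To make that concrete on the C++ side, a minimal sketch: the standard library offers both forms side by side, so the bounds check is a per-call-site choice rather than a property of the language:

```cpp
#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};

    // Unchecked access: an out-of-range index is undefined behaviour,
    // and no bounds check is required to appear in the machine code.
    std::cout << v[2] << '\n';

    // Checked access: at() must test the index and throws on failure,
    // a predictable, deterministic cost at this call site.
    try {
        std::cout << v.at(3) << '\n';
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
    return 0;
}
```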

Clive Robinson March 15, 2024 7:34 PM

@ Winter,

Re : Are myths worse than statistics, thus worse than damn lies?

“What I do hear from those who do code and are into security, the crucial point is that Human Beings are not able to do Memory and Pointer Management.”

Human beings are easily capable of doing “Memory and Pointer Management” and have done so both before and after C/C++/C# etc.

The problem is,

“Either you know how to do it properly or you know how to get the compiler to do it badly for you.”

The reasons briefly,

1, Only the first works in the longterm.
2, Using the second effectively takes more learning effort than the first.
3, You can not reliably use both together.
4, Thus you can not transition from the second to the first.
5, The second constrains you to the limits of the compiler developers.

To do “Memory and Pointer Management” without side effects, and thus consequences, requires a very good understanding of what happens “over the gap”.

There is a gap that is like a bottomless canyon for most programmers (which is why data serialisation is a very major hazard and why some maths libraries barf).

On one side you have the machine ISA; on the other you have a high level language, of which the lowest in common use is the “fake universal assembler” of the “C languages”.

For most programmers that canyon is a place where “Here be Dragons” is just a “friendly warning”… Because in their heads they imagine demons, trolls and all sorts of other things that would make Dante pause, and not laugh divinely.

The first actual problem is effectively,

“All ISA’s are different, many memory “bag of bits” layouts where bits become bytes and beyond are different. With compilers not up to an unassisted translation from high level general to the low level specific.”

The second actual problem is,

“Pointers are closed bags of bits, not textbook integers.”

Whilst you can do a very limited subset of basic maths (the A in ALU) on pointers, you should never do bit masking and the like (the L in ALU). Because the bag of bits meta-data is all too often different from ISA to ISA, so the high level code does not travel.

So to do “Memory and Pointer Management” properly and reliably you need to understand not the high level language but the implications and limitations on the ISA side of that canyon.

The thing is it’s not just “Memory and Pointer Management” which needs this understanding. You need it to do any “off-chip” “Data Communications” and also any maths above very basic unsigned integer addition. You can not “abstract it away” without suffering constraints that are rather more than limitations: they are security risks. Worse, there are subtle edge and corner cases that even compiler writers miss.

In the past I’ve asked people on this blog to consider the deceptive case of the subtraction of two integers that are larger than the native CPU data width. And consider why you might want to use 1’s complement rather than 2’s complement… Back then the replies were interesting. Perhaps we should ask it again.
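As a minimal starting point for that exercise, here is a sketch of subtracting two 128-bit values held as 64-bit limbs on a 64-bit ISA. The interesting detail is the borrow that has to be carried between limbs by hand, which is exactly where 2’s complement and 1’s complement (with its end-around carry) machines differ:

```cpp
#include <cstdint>
#include <cstdio>

// A 128-bit value wider than the native 64-bit data width,
// stored as two unsigned limbs.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// a - b with explicit borrow propagation between the limbs.
// Unsigned wrap-around is well defined in C++, so the low limb
// can be subtracted directly and the borrow detected afterwards.
U128 sub128(U128 a, U128 b) {
    U128 r;
    r.lo = a.lo - b.lo;
    std::uint64_t borrow = (a.lo < b.lo) ? 1u : 0u;
    r.hi = a.hi - b.hi - borrow;
    return r;
}

int main() {
    U128 a{0x0000000000000000ULL, 0x0000000000000001ULL};  // 2^64
    U128 b{0x0000000000000001ULL, 0x0000000000000000ULL};  // 1
    U128 r = sub128(a, b);                                  // 2^64 - 1
    std::printf("hi=%016llx lo=%016llx\n",
                (unsigned long long)r.hi, (unsigned long long)r.lo);
    return 0;
}
```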

MK March 15, 2024 9:00 PM

It all comes down to a choice of which demon you wish to live with. I was recently involved in a very large C# project. What could go wrong? Well, maybe running out of stack by infinite recursion, or running out of processes by misuse of threads, or thread deadlocks, or not realizing that some library code you are using is not managed and not thread-safe.

And it all runs too slow, requiring more investment in run-time cycles with Amazon or Azure.

bl5q sw5N March 15, 2024 9:56 PM

Typical compilers omit checks unless specifically commanded to include them. The C language is particularly unsafe: as its arrays are mere storage addresses, checking their correct usage is impractical. The standard C library includes many procedures that risk corrupting the store; they are given a storage area but not told its size! In consequence, the Unix operating system has many security loopholes. …

ML supports the development of reliable software in many ways. Compilers do not allow checks to be omitted. Appel (1993) cites its safety, automatic storage allocation, and compile-time type checking; these eliminate some major errors altogether, and ensure the early detection of others. Appel shares the view that functional programming is valuable, even in major projects.

Moreover, ML is defined formally. Milner et al. (1990) is not the first formal definition of a programming language, but it is the first one that compiler writers can understand. Because the usual ambiguities are absent, compilers agree to a remarkable extent. The new standard library will strengthen this agreement. A program ought to behave identically regardless of which compiler runs it; ML is close to this ideal.

– ML for the Working Programmer, Larry C. Paulson

JonKnowsNothing March 16, 2024 12:59 AM

@ Hans

re:
@JKN This behavior can be seen in many ancient film documentaries of rudimentary warfare.

@H Is that so? Hollywood generally ignores battle lines in favor of the more dramatic image of wild charges. But I would not regard those as documentaries

The films and documentaries I was referring to were not cinematic movies. These were black and white films taken (in theory) when explorers (aka white people) trundled into distant areas where no one else had been, meaning no white people, ignoring the time frame for the people who already lived there.

ResearcherZero March 16, 2024 3:33 AM

Crazy 128 bit choices which have no explanation and could have been far simpler.

There are all kinds of Linux projects which don’t make a lot of sense as to their existence. Anything that improves the situation for little cost should be an improvement.

Winter March 16, 2024 4:46 AM

@Clive

Human beings are easily capable of doing “Memory and Pointer Management” and have done so both before and after C/C++/C# etc.

After this quote, you write a long essay that implies 99+% of programmers are doing it wrong, i.e., are unable to do it. Conclusion: programmers can do it in theory, but not in practice.

Because the bag of bits meta-data is all too often different from ISA to ISA so the high level code does not travel.

The purpose in life of a computer language is to abstract away the underlying HW and ISA. If the ISA is what you are targeting, write assembler. If you need to understand the ISA, the whole purpose of the computer language is defeated. It instantly makes any program locked to a single ISA.

FA March 16, 2024 7:16 AM

@Clive

Human beings are easily capable of doing “Memory and Pointer Management” and have done so both before and after C/C++/C# etc.

Either you know how to do it properly or you know how to get the compiler to do it badly for you.

I agree 100% with this.

The ‘advantage’ of using ‘unsafe’ languages (assembler, C,…) is that you are forced to be aware of the dangers, and act accordingly.

Some ‘unsafe’ practices may be required only in some specific circumstances (e.g. interacting with HW or the efficient implementation of some algorithms), but those will not go away.

So if you are working in those areas, you have to learn how to do it safely. Just like an electronics engineer may have to deal with high voltages in a safe way, or a kitchen chef will need to learn how to use a very sharp knife without cutting off his fingers.

The problem is that only a very small fraction of all programmers have experience in these fields, and have developed the required attitudes.

Also ‘safe languages’ will not protect you against

  • functional requirements that introduce risks,
  • badly designed algorithms or interfaces,
  • unsafe communication protocols and data formats,
  • unsanitised input from untrusted sources,
  • failing to understand concurrency,
  • failing to understand error recovery,
  • attacks on the underlying hardware.

Clive Robinson March 16, 2024 7:46 AM

@ Winter,

“The purpose of life of a computer language is to abstract away the underlying HW and ISA.”

Actually “NO” the primary purpose of a software development tool is,

“To assist in meeting the needs of the project”

That is such that the development is,

1, To specification
2, To time
3, To budget
4, Within final constraints
5, Using the available tools
6, etc…

Not understanding this from the most senior personnel all the way down is why many development projects go badly awry.

The “abstraction” of high level languages is actually another form of “stabilizer wheel” applying unnecessary constraints. Oft hidden behind the notion of “code portability” thus “code reuse” etc etc.

Whilst the constraints are “always out there a ways” you get the illusion of,

“Doing it better, faster, with less qualified people…”

If you don’t do just toy/hobby projects, you will run into those constraints hard, like an aircraft into a mountain.

It’s not just memory management, and the constraints are hard, thus have to be got around using other tools. Thus complexity rises often unmanageably and we know what that means for ICT security as measured by CVE’s etc.

The real problem can be found with the observation of,

“90% of everything is crap”

Informally called “Sturgeon’s Law”[1] (and as I noted the other day should be “Orwell’s Law”).

And also the Malcolm Gladwell observation often given as,

“It takes 10,000 hours to master a skill”

Or roughly 5 years of full time work as an apprentice under a Master of the trade[2].

Learning is a 10% of 10% skill: the first time period gets you 10% of the way to your target. The next only gets you 9% closer, the next 8.1%… each time period appears to give you less. This is the “natural” order of “growth” and of learning to become a “master”. And ultimately you never become “Master of All”, because every day brings something new to learn; such is the nature of the growth we call “progress”.

Whilst there are short cuts and tricks they are actually counter productive. Which brings us to your,

“After this quote, you write a long essay that implies 99+% of programmers are doing it wrong, ie, are unable to do it. Conclusion, programmers can do it in theory, but not in practice.”

They are doing it by personal choice, not “theory”, or by the choice of employers and the selection of tools, languages, and methodologies etc., thereby handcuffing themselves and their ability not just to learn but to produce.

1, Producing “crap” at high speed is still producing “crap”…
2, Reusing “crap” still results in “crap”…

Not producing crap is a human skill and there is only one way to acquire it as I’ve pointed out above.

You might not like this, but it is nevertheless true.

As you are aware I’ve been saying this for years, especially with those Universities and other teaching organisations that “partner” with employers. And end up really only teaching the tools the employer wants their employees to know, leaving out fundamentals and much else besides and in effect cheating their students out of their potential and future.

I make no apology for this, and I never will apologise for the truth.

And if you think on it you will realise what I am saying might be a significant causal effect in “the skills shortage” employers and thus their paid for politicians are always claiming.

[1] Called “Sturgeon’s Law” it’s the observation by an American SciFi author and editor about not just writing stories and books but as it turns out just about any collection of human endeavour.

Thus taken as a collection,

“Ninety Percent of Everything Is Crap”

It’s also seen as the “10% rule” where 10% is acceptable, and 10% of that is good, and 10% of that 10% of the 10% is excellent, and so on.

As such it forms a “self similar” curve tangent often called a “Power law” that can be scaled to fit.

Thus, whilst “crap” is subjective to the observer’s “Point of View” and the curve will fit any arbitrary percentage just by scaling, as an observation it holds true, as it’s also “the natural growth curve”, and in mathematics it’s given the symbol “e” and is considered just about foundational to everything.

https://effectiviology.com/sturgeons-law/

[2] Human history shows that Malcolm Gladwell’s observation of,

“Researchers have settled on what they believe is the magic number for true expertise: 10,000 hours.”

Holds across many activities. However it needs to be tempered in that you learn faster if you are younger, if it’s built on a skill you already have, and a number of other factors.

One of which is what people mean by “learn”. You can in fact pick up many “starting” skills in around 20 hours in the right conditions. For instance learning to ride a bicycle is a lot faster on a gentle downwards incline than it is on the flat (that observation gave us training-wheels). Once you have that, “Unconstrained Practice” gets you ever closer to whatever skill level “you want”.

The problem is that downhill slope puts some constraints on your abilities just like the even worse constraints training-wheels do. If you don’t stop using them very quickly… then like the old “GOTO considered harmful” you will be forever ruined.

It’s why management should treat the likes of,

https://www.forbes.com/sites/danschawbel/2013/05/30/josh-kaufman-it-takes-20-hours-not-10000-hours-to-learn-a-skill/

With a grain of salt the size of “Lot’s Wife”.

Take a look within and you find,

“When you’re naturally interested in a particular skill, you’ll learn extremely quickly, so follow your interests where they lead, and avoid forcing yourself to grind through topics you’re not really interested in exploring.”

And realise the implications.

Data allocation in memory is fundamental to effective programming all “Abstract Data Types”(ADT’s) are totally dependent on it, thus all programming of any worth (including “Hello World” programs). If you don’t get this learned properly from the earliest stages then it’s like putting handcuffs on yourself and blinkers on your head. Ask yourself the question,

“Do I want to be a slow ass pulling a coal truck on rails in a mine forever?”

As I’ve said above about “Paint and the Painter” constraints by either self limiting or using inappropriate methods/tools will forever limit your abilities thus worth.

Winter March 16, 2024 8:01 AM

@Clive

“To assist in meeting the needs of the project”

Which project? C is a universal language. Historically, project needs were adapted to meet the requirements of C.

Anyhow, C and Unix were quickly adapted to run on any HW, and C was the HW abstraction layer. Anything depending on ISA is a liability to that.

JonKnowsNothing March 16, 2024 9:51 AM

@Winter, @Clive, All

re: project need were adapted to meet the requirements of C

Not needs, but design.

The engineering level design is based on what the HW & SW can actually produce.

If a particular set of chips is selected, that’s what you have to work with. If the big honchos in the corner office with the glass windows declare: We will use X tools, then that’s what you have to work your design around.

It is a rare thing for a single developer to unilaterally declare they are writing their code in SomethingElse, unless that SomethingElse provides leverage into a particular aspect of the project that cannot be achieved with existing tools (as described in above posts).

There are, of course, ongoing round-table divisions about specific tools and their usefulness or hindrance to a project. As described in this topic thread, it’s a robust discussion that rarely has any useful resolution.

The bottom of the heap programmers use whatever their bosses tell them to use. Which also includes what the bosses are willing to pay for, ’cause it ain’t cheap stuff.

iirc(badly)

Steve Jobs, in one of his many incarnations of failed computer businesses (which get brushed aside), started a company that was going to be The Be All and End All of contemporary computers. It was actually a mouth-watering hunk of hardware with everything you could want included.

But, when he set up the development system, he crippled the devs by using his “old company stuff” with limited options (saving $$$). Those limitations caused a huge problem when trying to get code out of the pipeline.

So, he had to retrofit the entire development systems with the needed add-ons (extra $$$$).

He was frustrated that the incredible costs did not produce an improvement in throughput in the pipeline.

His complaint went something like:

* I just bought you all brandy-new hard drives!!!

It was too late in the design cycle for it to make a positive improvement in deliverables; it only kept the timeline from slipping further.

fib March 16, 2024 9:51 AM

Re Pointers

I know they’re just wrappers, but I’ve used smart pointers in non-security projects and they were a really fantastic resource. I would say the same regarding move semantics and features like the range-based for loop.

If any members have a word about whether they are equally useful in security-related projects it would be interesting to hear.
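For anyone who hasn’t used them, a minimal sketch of all three features together; whether they help as much in security-related projects is exactly the open question, but they do remove whole classes of manual delete/copy mistakes:

```cpp
#include <iostream>
#include <memory>
#include <string>
#include <vector>

int main() {
    // Smart pointer: sole ownership, freed automatically when it goes
    // out of scope, so there is no matching delete to forget.
    auto name = std::make_unique<std::string>("sensor-7");

    // Move semantics: ownership is transferred, not copied;
    // 'name' is guaranteed to be null afterwards.
    std::vector<std::unique_ptr<std::string>> owners;
    owners.push_back(std::move(name));

    // Range-based for loop: no index variable to run out of bounds.
    for (const auto& p : owners) {
        std::cout << *p << '\n';
    }
    return 0;
}
```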

Winter March 16, 2024 10:29 AM

@JonKnowsNothing

It was actually a mouth watering hunk of hardware with everything you could want included.

IIRC, this was Lisa -> Macintosh -> NeXT. Which gave us NeXTSTEP, moved the Mac to the Mach kernel and saved Apple.[1]

Just another example of how computer HW and SW projects run overtime and over budget. It also shows you always end up in Unix and C. For better and worse.

[1] ‘https://www.xda-developers.com/on-this-day-next-computer/

MK March 16, 2024 1:05 PM

I give you Multics as an extreme example in abstraction from the underlying hardware. Consider a declaration (I can’t remember the exact syntax):
Bit Array v[][0:32], which was intended to be an array of 32-bit values but was instead an array of 33-bit values (bits 0 through 32 inclusive), which no one noticed was making the system run very slowly.

Clive Robinson March 16, 2024 7:34 PM

@ FA,

I hope you are well?

With regards,

“The problem is that only a very small fraction of all programmers have experience in these fields, and have developed the required attitudes.”

Directly yes, but indirectly?

If we think about the first example program in most “Learn XXX” books it’s a variation on printing “Hello World”.

Is the string “a data object” or “a collection of data objects”, and how are these objects “sent across the wire”?

Back in the days of RS232 serial connections most programmers found out the hard way in a very short period of time. As old discussions about “big endian” and “little endian” and “network order” show.

The problem has not gone away it just hides behind “Data Object Serialisation” which is mostly done by libraries you would not want to venture into.
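As a hedged illustration of the kind of detail those libraries hide, serialising a 32-bit value into an explicit byte order by hand sidesteps the endian question entirely, because the shifts define the byte order no matter which way round the host ISA stores it:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Write a 32-bit value in big-endian ("network") order.
// The shifts pick out each byte explicitly, so the result is the same
// whether the host ISA is little- or big-endian.
std::array<std::uint8_t, 4> to_network_order(std::uint32_t v) {
    return {
        static_cast<std::uint8_t>(v >> 24),
        static_cast<std::uint8_t>(v >> 16),
        static_cast<std::uint8_t>(v >> 8),
        static_cast<std::uint8_t>(v),
    };
}

// The matching read: rebuild the value from the explicit byte order.
std::uint32_t from_network_order(const std::array<std::uint8_t, 4>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
           static_cast<std::uint32_t>(b[3]);
}

int main() {
    const std::uint32_t v = 0x11223344u;
    assert(from_network_order(to_network_order(v)) == v);  // round-trips on any host
    return 0;
}
```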

Think on it this way,

One of the results of “Object Oriented” methods is the view that “everything is an object”, and that means “methods as threads”, which means “Inter Process Communication”(IPC)… Which these days we do as a minimum as “client server”, because “all displays are browser windows” and remote browsers are free CPU power, so we send complex data objects across the “World Wide Web” or global network we call the “Internet”.

Mostly it works… Which when you think about it is really a miracle in its own right, just for “text strings”.

But a lot of programmers don’t think about it and some “Data Objects” crossing the wire these days are in reality what we would have described as “Flat File Databases” with “Embedded complex data objects” in the individual records and “Embedded Report Generation”.

When things occasionally go wrong the entire mess can become a bag of snakes mixed in with a can of worms looking like a ball of living spaghetti.

Such nightmares used to pop up with the use of VPNs, where the length of data packets used to get shortened, thus fragmented, and thus “faults on the wire” resulted in expressions like “The fault’s leaving here OK” and similar.

The reason we don’t see it much these days is the assumption that there are only two ISAs, iAx86 and ARM.

In the sandbox I play in there is MIPS and still 6502 and even 12bit PIC and the like. Others regularly bump into Z80, 6800, 6809 and even 68k etc all tucked away in microcontrollers.

Let’s just say that the expression “full stack programmer” is seen like a curse word in certain embedded computing worlds with reason.

As we move into the disparate world of IoT a sleeping dragon is having its tail tickled. Distributed computing is coming at us through DDoS malware that has given others the realisation there is a lot of “free CPU cycles” out there to be had. So we’ve had/got Bitcoin mining malware. And certain supposedly more respectable eyes are looking out enviously…

At the moment “cloud computing” is all about rack upon rack of high end servers in the likes of big Silicon Valley Corporate Data Centers around the globe. The software that runs on them, whilst often multi-threaded, is still really sequential code in design and lightly disguised as parallel processing[1], thus data object serialisation is still a “pass it down the wire”, “fire and forget” issue without exceptions to consider.

Fairly soon though, to make use of all those IoT CPU cycles etc, data objects really will have to become properly distributed and parallel in use. I can see many having to dig out Leslie Lamport’s distributed systems work[2] from back in the late 1970’s,

https://lamport.azurewebsites.net/pubs/time-clocks.pdf

And “going off the reservation” thinking about how data objects would “live on the net” as effectively “independent entities” but reliably updated and backed up etc. As well as moving them geographically to reduce latency and so processing times… Which might be the wrong thing to do as slow individual processing times but high interleaving of processes will probably lead to more actual processing being done.

Back more than three decades ago, I started work on doing this sort of thing for a PhD. It actually scares me to see how little work has actually been done by others in that time since. Worse, it won’t be long before Leslie Lamport’s initial work on it “turns 50”.

It’s a subject that will unavoidably “come of age” eventually, and probably very fast when it does. But few appear even ready for the notion of data objects living as almost independent entities on the Internet let alone what the implications of that are…

[1] If you look at the way many write threaded code it’s still really a serial process not a parallel process. All they do is write the equivalent of a near balanced tree of non interdependent sequential threads with some kind of data amalgamation process at the end when all threads have finalized.

For many processes such as cryptanalysis or similar where each thread can be fully independent or decoupled this is fine. However as that independence decreases as coupling increases such as with certain types of modeling this becomes problematic and communications rises dramatically. Any issues with serialisation will likewise show up rather dramatically and to many the solutions will appear as “black magic” spells and incantations not rational processes and thoughts.

And I think you can see where that is likely to lead. Thus the question arises as to just what percentage of the world’s CPU cycles do we want devoted to abstracting away serialisation? And worse how few “libraries of code” there will be, making the whole edifice not just fragile but very vulnerable…

[2] Just one relevant paper of many that can be found in,

https://lamport.azurewebsites.net/pubs/pubs.html

Swudu Susuwu March 17, 2024 10:59 PM

Relates to C++: much of Android OS and Linux are based on C++

What’s wrong with Android OS, plus how to fix Android OS
Android is not a cool OS, because Google lacks good Linux developers: you can not launch Google Play Store without being bombarded with infected ads, and neither Google Play Store nor Chrome have options to turn images off.

For Android, all Google had to do was port one of the GPLv2 (or BSD) distributions of Linux to Arm, and all of our smartphones would have access to all of the Linux programs. But what Google did was port just the core of Linux, and it had “Not Invented Here Syndrome” with regards to all of the toolkits/frameworks that Linux programs use, so that none of the Linux programs can run on our smartphones unless their developers port each program to Android OS.

All Google had to do for touch interfaces (after porting one of the normal Linux distributions) was remove parts of the window manager that took up too much space/resources and map touches to mouse actions, or port one of the Linux distributions that supports touch interfaces.

As a result of Google’s failures, billions of dollars were wasted to produce whole new frameworks and to redo Linux programs to use the new frameworks.

Google could solve this for us, plus reduce Google’s own maintenance costs: all Google has to do is use one of the FLOSS Linux OS’s that have Arm ports plus center around touches, replace Google’s junk “not invented here” frameworks with wrappers that call the standard Linux frameworks, and use this as future versions of Android OS.

The wrappers allow apps that use Google’s “not invented here” frameworks to run on the new Android OS, and because Android OS would become just a few wrappers around a normal Linux OS, Google’s maintenance costs are reduced to just small updates to those wrappers.

Chris Becke March 18, 2024 1:31 AM

I love C++.
But it lost its way years ago when, instead of adding things like GC, they added shared_ptr and loads of other template noise that just increased rather than reduced developer mental load and fatigue.
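One concrete example of that extra mental load, as a minimal sketch: reference counting cannot reclaim cycles, so the programmer still has to know where to break them with weak_ptr, something a tracing GC would have handled automatically:

```cpp
#include <memory>

struct Node {
    // A shared_ptr here would create a reference cycle between two nodes,
    // which reference counting alone never reclaims (a leak a tracing GC
    // would collect). weak_ptr breaks the cycle.
    std::weak_ptr<Node> peer;
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->peer = b;
    b->peer = a;
    // Both nodes are destroyed here: weak references do not keep them alive.
    return 0;
}
```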

Objective-C is an ugly language. But it serves as a demonstration of a language that went from explicit memory management, to reference counting, to garbage collection, and shows there was a path C++ could have followed.

But it didn’t. Time to move on.

bl5q sw5N March 20, 2024 10:31 AM

Re: It’s imperative that one not point.

Standard ML’s imperative features include references, arrays and commands for input and output. They support imperative programming in full generality, … References behave differently from Pascal and C pointers; above all, they are secure.
Imperative features are compatible with functional programming. References and arrays can serve in functions and data structures that exhibit purely functional behaviour. … . We shall code functional arrays (where updating creates a new array) with the help of mutable arrays. This representation of functional arrays can be far more efficient than the binary tree approach … . A typical ML program is largely functional. It retains many of the advantages of functional programming, including readability and even efficiency: garbage collection can be faster for immutable objects. Even for imperative programming, ML has advantages over conventional languages.

ML for the Working Programmer
Larry C. Paulson

Vincent Taeger March 20, 2024 4:33 PM

There is little that can be done to improve C++ as long as it has to remain backward compatible. Adding new features just adds additional ways to make mistakes. For the same reason C++ is not much of an improvement over C. Most of the functionality that C++ added to C could have been (and has been) added to C using a library and function calls and doing stuff that a C compiler is allowed to do but most compilers do not (e.g. implement undefined behavior in a particular way).

The problem is programmers and managers. If they decide that bad FORTRAN written in bad C is still good C++ nothing changes. Back in the ’80s in my first job I found a stray pointer because it happened to write to screen memory. I got a pat on the back. I then ran lint and found a couple of hundred places that could write through stray pointers. The pat on the back was lower and harder because I had not “proved” that any of these were actual “bugs”.

Anonymous April 15, 2024 6:02 AM

The Voyager space probes are still out there, running on minimal resources by today’s standards, programmed to do their tasks and reprogrammed over the years.

Yes, humans can use C in a safe manner if they know how – the problem is that with all the abstractions and garbage collection the new kids (from a few decades ago) don’t even learn what RAM is. If they did, they’d use C and other languages more effectively. There’d still be bugs, humans are not perfect, but blaming the language alone is narrow-minded.

As is plugging Rust every time someone mentions C or C++ (they’re not the same language). I wonder why the Rust crowd is so eager to pile on C.

There’s no perfect language and you should use the one that fits your goals, but at the end of the day, as I’ve read elsewhere on the net, if you put a chimpanzee behind the wheel of a Ferrari you’re still gonna run into trouble.
