Comments

jayson November 9, 2015 8:33 AM

Decent read until you get to

A trusted private ledger removes the need for reconciling each transaction with a counterparty, it is fast and it minimises errors.

and

A group of vetted participants within an industry might instead agree to join a private blockchain, say, that needs less security.

At which point there is a “trusted” system, which obviates the need for a blockchain or proof of work; any old database will do the job.

blake November 9, 2015 10:45 AM

@jayson

This is the thick-client / thin-client cycle again, in a different setting.

“Trusting unknown people is hard. I want a distributed proof system where I can trust the math and not the counterparties.”

“Validating math is hard. I want to find some people I trust who have vetted the details of the system for me.”

Oscillate between the two, proportional to the rate of scandals.

daniel November 9, 2015 11:37 AM

@blake

/thread over.

(because mathematicians and coders have transcended the human condition and never ever lie.)

Alien Jerky November 9, 2015 12:48 PM

Even the most trusted, honest, never told a lie, never will tell a lie, always open person, can still be wrong.

albert November 9, 2015 5:09 PM

@jayson,

“…A trusted private ledger removes the need for reconciling each transaction with a counterparty, it is fast and it minimises errors….”

This makes sense. It’s why banks and financial institutions will eventually deploy private block chains.

“…A group of vetted participants within an industry might instead agree to join a private blockchain, say, that needs less security….”

I believe this refers to a “tweaked” blockchain. Members’ computers would handle the hashing, etc., hence “less security”. A limited number of members would also limit the computing power required, proportional to the number of transactions per member.

Not very well explained in the article, though.

. .. . .. _ _ _ ….

jayson November 9, 2015 5:29 PM

@albert

A private blockchain is no different from, say, a sequence of rows in a database maintaining the transaction history. Mature, proven technology. There would be nothing to gain by adopting a new, more complex protocol to achieve what they already do today.

I wouldn’t say they won’t adopt it. Everyone likes to play with shiny, new technology; it looks great on a resume. I’m just saying that it doesn’t add any value to what they already do.

They can tweak the blockchain as much as they like, but to what end? The proof of work, and the entire use of computing resources, exists to prevent a hostile brute-force attack from forcing its own transaction tree upon the rest of the participants. If there are vetted participants, counterparties are known and subject to legal recourse.

Financial institutions already have many ways of managing these disputes. Pointing out the value of a low power proof of work to assist in a financial dispute would be tedious at best.

Michael November 9, 2015 5:59 PM

@ albert

“…A trusted private ledger removes the need for reconciling each transaction with a counterparty, it is fast and it minimises errors….”

This makes sense. It’s why banks and financial institutions will eventually deploy private block chains.

“…A group of vetted participants within an industry might instead agree to join a private blockchain, say, that needs less security….”

It sounds like a trust factor can be used to discount a trust chain, which can be used, with other factors, to price risk. Then the riskier chains will pay an interest rate to keep customers. Then, people start liking it. Just like a bank. lol.

Nate November 9, 2015 6:41 PM

I don’t see any use case for blockchain technology at all, sorry.

It is a protocol designed for one purpose: to achieve probabilistic (assuming less than 50% centralisation) consensus among hostile, mutually non-trusting actors, many of whom are active in organised crime, most of whom are involved in obvious scams, and all of whom are motivated by millions of dollars of potential gain. To try to achieve trust, it not only doesn’t scale, it is designed to anti-scale – to aggressively consume computing and power resources in an attempt to remain decentralised.

But that didn’t work. Economies of scale, an exponential technology race, greed, and the block mining reward designed into the protocol have forced mining to consolidate into an oligopoly. There is no longer any point to the blockchain. It consumes vast resources to generate no trust, because the economics of mining concentrates power in fewer and fewer hands.

In one respect it’s working admirably well. It was designed to emulate the operations of capital markets on a rare commodity like gold, and it’s doing exactly what Marx pointed out over a century ago that capital always does: centralise power and reduce freedom.

This is the opposite of what the blockchain is claimed and believed to do, sure. But a technology is what it does, and it was designed to emulate capital accumulation, and that’s what it’s doing. It’s a great object lesson on why the free market cannot remain free without regulation, but not much more.

A ‘trusted private blockchain’ would be a contradiction in terms. If you have privacy and trust, you don’t need the vast inefficiency of a blockchain – which is just as well, because as observed, it doesn’t work.

tldr: Just use a database instead.

Nick P November 9, 2015 6:46 PM

@ jayson

Exactly. The goals that blockchain spinoffs are chasing have already been achieved in various forms. We’ve seen them in: distributed, strongly-consistent, fault-tolerant databases (or centralized w/ distributed audit); distributed systems for accountability like tamper-evident event stores; Byzantine consensus schemes; digital/electronic money schemes; electronic marketplaces; highly-assured build systems with similar requirements. The list goes on, often with stronger security or anonymity benefits and less CPU use. Might eventually change my mind, but I’m not liking the wasteful blockchain model for now.
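
A minimal sketch (in Python, purely illustrative and not any specific product) of how small the “tamper-evident event store” idea is compared to proof-of-work mining: each record commits to the hash of its predecessor, so any retroactive edit breaks the chain and is caught at audit time.

    import hashlib
    import json

    GENESIS = "0" * 64

    def append_event(log, event):
        # Each record carries the hash of its predecessor; recomputing the
        # chain during an audit exposes any record altered after the fact.
        prev = log[-1]["hash"] if log else GENESIS
        body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
        log.append({"event": event, "prev": prev,
                    "hash": hashlib.sha256(body.encode()).hexdigest()})
        return log

    def verify(log):
        prev = GENESIS
        for record in log:
            body = json.dumps({"event": record["event"], "prev": prev}, sort_keys=True)
            if record["prev"] != prev or record["hash"] != hashlib.sha256(body.encode()).hexdigest():
                return False
            prev = record["hash"]
        return True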

@ All

A few random papers on pre-Bitcoin finance with a distributed focus, with one interesting building block included for crypto-loving readers.

Comparison of Electronic Cash Schemes and Implementations (2001)

Note: In old designs, the currency could simply be counter values with a signature. Decentralizing that just means checking signature values and a Merkle tree. It doesn’t get easier than that, conceptually or in implementation. However, this paper predates all the hardware attacks on chips like Mondex, so ignore the part saying the hardware was secure. Also funny in retrospect is the “disadvantage” that “Customers reluctant to shop via Internet.” Haha.
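
To make the “signatures plus a Merkle tree” remark concrete, here is a toy Merkle-root computation (a sketch in Python, not the scheme from the paper): the root commits to every signed counter value, and a verifier only needs the root plus a short authentication path to check any single entry.

    import hashlib

    def sha256(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def merkle_root(leaves):
        # Hash each leaf, then repeatedly hash adjacent pairs until one node remains.
        level = [sha256(leaf) for leaf in leaves] or [sha256(b"")]
        while len(level) > 1:
            if len(level) % 2:              # duplicate the last node on odd-sized levels
                level.append(level[-1])
            level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0].hex()

    print(merkle_root([b"counter=1;sig=...", b"counter=2;sig=...", b"counter=3;sig=..."]))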

Toward a decentralized, electronic marketplace (2005)

Note: Most of the work around that time recognized that centralization wasn’t as much of an issue as having rules you could automatically implement and the ability to counter bad behavior.

Strong accountability for network storage (2007)

Note: For the people that like crypto schemes. A heavyweight approach that still had acceptable performance.

Fabric: A Platform for Secure, Distributed Computation and Storage (2009)

Note: My kind of work. This is from the Cornell group that gave us the JIF language, the SIF language, and SWIFT auto-partitioning for secure web services. Just one year after the Bitcoin paper, they made a far more powerful system that can leverage very strong technology for arbitrary computation. The blockchain methods are like blunt instruments in comparison.

Josh November 9, 2015 9:07 PM

The most depressing thing about the likes of @jayson, @Nick P etc is their lack of vision.

Assuming that the technology, as it is today, cannot develop into something more refined is just pissing in the wind of innovation.

Cryptocurrencies and distributed trust systems continue to develop. The research pace is very high. Imagine what we’ll have in thirty years. Can’t? Think it’s a dead end? That’s your problem.

Nick P November 9, 2015 9:44 PM

@ Josh

Same could be said of Bitcoin-tech supporters: a vision so narrow that it only sees the blockchain and an entire world’s worth of activity built on it. Given its goals, there has been plenty of tech that did that and more. Jayson mentioned databases. I mentioned quite a few others. One, admittedly clever, got a lot of popularity and support. It’s the least energy efficient, slower than many, possibly a pyramid scheme, harder to secure than some, and has regular failures in its banks.

I have plenty of vision. It started with digital cash by Chaum, methods of the cypherpunks, private/academic advances, and so on. So many directions. Yet, the mainstream usually locks itself into suboptimal choices (e.g. COBOL, MS-DOS/Win3.x/successors, HTML/JS apps). The attributes of these technologies make me think they’re suboptimal. All the articles talking like we’ve never done anything like this make me wonder if these communities are clueless on top of it.

So, I’d say I’m more visionary than vision-less, given I’ve been pushing stuff with the attributes of Bitcoin against the tide since long before it was ever created. Many did good work further expanding on such things. Following the most popular fad of recent years is certainly not a vision: it’s just what the Internet’s sheep always do. No surprise that Bitcoin came along and now we have coin/blockchain everything on forums everywhere. Typical fad or crowd behavior.

Matthew November 9, 2015 10:22 PM

@Josh
The issue for blockchain is not a lack of vision but rather that it faces some tough problems to overcome.

The Bitcoin blockchain itself has already encountered life-threatening problems. Firstly, there was the selfish-miner attack. Secondly, the Ghash mining pool achieved what everybody said was impossible: it gained more than 50% of the network hashing power. Luckily, the bitcoin community pressured Ghash to scale its share back below 50%. Thirdly, the blockchain is growing quite large, and it will grow much faster as more people and businesses use it. Right now the blockchain is threatened with a fork into two separate, incompatible chains because the developers and community cannot agree on how to resolve this problem.

As Jayson and Nick P have mentioned, a blockchain is a distributed database of transactions with safeguards against malicious attacks. As research continues, it is likely that blockchain technology will evolve closer to the research systems Nick P describes in his last link.

Winter November 10, 2015 4:35 AM

The blockchain is a proof-of-work system. There is ample literature on such systems (try “proof-of-work” on Google Scholar).
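
For readers who haven’t seen one spelled out, a proof-of-work puzzle in its simplest form is just “find a nonce whose hash falls below a target”: producing a solution costs roughly 2^difficulty hashes, while checking it costs one. A toy sketch in Python (not Bitcoin’s actual block format):

    import hashlib

    def proof_of_work(block_data: bytes, difficulty_bits: int = 20) -> int:
        # Brute-force a nonce until SHA-256(block_data || nonce) has
        # `difficulty_bits` leading zero bits; expected work is ~2**difficulty_bits hashes.
        target = 1 << (256 - difficulty_bits)
        nonce = 0
        while True:
            digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce
            nonce += 1

    def check(block_data: bytes, nonce: int, difficulty_bits: int = 20) -> bool:
        # Verification is a single hash, which is the whole point of the scheme.
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))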

The Bitcoin blockchain is “special” as it has been shown to work in the most hostile and demanding environment where there were very strong incentives to defeat and corrupt it.

This means that we now have a practical tried-and-tested implementation of a well understood cryptographic protocol. I think it is the best tested system of its kind in existence.

On the other hand, the objections made above are mostly hypothetical or even ignorant. Yes, the block-chain is not a perfect solution to every problem and it can be defeated. But that holds for every cryptographic protocol or system known to man.

We do not need perfect security, we need well understood practical security.

MarkH November 10, 2015 5:53 AM

I already read the article linked by ‘Crypto’ (2nd comment).

Using its estimates of bitcoin block creation energy, and assuming a typical mix of electricity sources, I reckon that the bitcoin system is pumping at least a million tons of CO2 into the atmosphere per year, and perhaps several times that.

Pretty sad, for a “currency” that makes up such a miniscule part of world commerce.

Ribbit November 10, 2015 7:05 AM

Could blockchains provide an alternative to expensive certification authorities? If yes, any ideas or references on how it could be set up to work?

jayson November 10, 2015 7:46 AM

@Josh

I may have a lack of vision, but I did not write anything like you suggest. I was simply commenting on the use of the blockchain illustrated by the article.

To be clear, I have my complaints about the Bitcoin blockchain and more complaints about the integrity of other cryptocurrencies but they are being slowly addressed in the development community.

In my eyes, Bitcoin remains revolutionary. The zeal of its supporters is fully matched by the zeal of its detractors.

BoppingAround November 10, 2015 9:05 AM

[Off-topic] Nick P,
Just out of pure curiosity, what would be your ‘optimal’ alternative to HTML?

Gerard van Vooren November 10, 2015 9:30 AM

@ BoppingAround

Just out of pure curiosity, what would be your ‘optimal’ alternative to HTML?

If you ask me, the answer would be LISP (or a LISP derivative). S-expressions are really simple to parse (much simpler than [HT|X]ML) and with LISP they wouldn’t have had to invent JS. A massive reduction in attack surface and code. Besides that, LISP already existed back when they invented HTML.

Nick P November 10, 2015 10:43 AM

@ BoppingAround

“Just out of pure curiosity, what would be your ‘optimal’ alternative to HTML?”

That’s a hard question to answer, as I used to really enjoy using “DHTML.” 🙂 The language is good if you want amateurs to be able to understand and display static content. It handles dynamic content on a range from acceptable (CSS) to a step below good approaches (JavaScript). The main problem is that HTML was designed for a different world than the one we are in now, and it was inefficient even then. Even in the Web 1.5 days, the DHTML push was showing we needed something more, and it had to be more efficient.

I’ll share my experience first. The first problem was verbosity: much like LISP, with start and end tags, but with a ton of text (even for end tags?!). Also, the header and body tags didn’t seem so useful. Given a 28Kbps connection, my first optimization was a compression scheme that created a (2?) character symbol set to substitute for tags, an end tag, and common words (the, that, are). HTML & HEAD were always first, so I ditched them entirely: just add them to the beginning of the generated HTML. The HEAD end tag always preceded BODY, so I just created a symbol for BODY and auto-inserted a HEAD end tag in front. I had a single symbol for end tags, with scoping logic that figured out which tags and ends belonged to each other, then generated the end tag based on what the opening tag was. The scheme, prototyped in Perl, trimmed HTML down considerably, was readable, and performed fast when compiled through my BASIC system. That was on a 200MHz P2, too. 🙂
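
As a rough illustration of the kind of tag-abbreviation scheme described above (this is not the original Perl code, which isn’t shown; the tag and word tables here are made up), a dictionary substitution pass plus a single end-tag symbol and a scoping stack might look like this in Python:

    import re

    # Hypothetical one-byte codes for a few common tags and words (illustrative only).
    TAGS = {"<p>": "\x01", "<div>": "\x02", "<table>": "\x03", "<tr>": "\x04", "<td>": "\x05"}
    WORDS = {" the ": "\x10", " that ": "\x11", " are ": "\x12"}
    END = "\x0f"   # one symbol stands in for every end tag

    def compress(html: str) -> str:
        html = re.sub(r"</[a-z]+>", END, html)           # collapse all end tags (lowercase assumed)
        for literal, code in {**TAGS, **WORDS}.items():
            html = html.replace(literal, code)
        return html

    def decompress(packed: str) -> str:
        rev = {v: k for k, v in {**TAGS, **WORDS}.items()}
        stack, out = [], []
        for ch in packed:
            if ch in rev and rev[ch] in TAGS:            # start tag: remember it for scoping
                stack.append(rev[ch][1:-1])
                out.append(rev[ch])
            elif ch == END:                              # regenerate the matching end tag
                out.append("</%s>" % stack.pop())
            else:
                out.append(rev.get(ch, ch))              # word code or plain character
        return "".join(out)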

Studying scripting & server-side, I eventually determined HTML had several problems:

  1. Too much verbosity for static content, both for typing & transport.
  2. The layout system, existing and emerging, meant we started with an integrated content & presentation layer that would have newer, different layers bolted on.
  3. JavaScript. Enough said?
  4. Browsers weren’t designed to be apps but had to handle that functionality.
  5. Server-side stuff was lame and had to work narrowly through whatever state/interface the browser supported.

Clearly, we needed a replacement for the whole stack that was integrated, consistent, readable, space efficient, programming grade, and CPU efficient. At the least, we needed an option to use more efficient or better interfaced replacements for any one of these components.

Curl was the first I saw to meet these requirements. It was like LISP, HTML, and Tcl/Tk mixed together. Easy for humans and machines while handling whatever you needed and compiled for speed. Quite an exciting advance. Still exists.

The Juice project was the best JavaScript replacement I saw. A nice alternative to Flash, Formula Graphics, full native, etc. They used a type-safe and memory-safe language (Oberon) with ultra-fast compilation and compressed abstract syntax trees as the transfer medium. A more advanced version of my own scheme: ASTs keep things close to the original representation, so you can run any checks, recreate the original representation, or guide an optimizing compiler to machine code. Very easy to compress, too. Dead now, as the archive.org link indicates.

As with much better tech, we just saw no take-up of the good alternatives. Browser vendors kept HTML/CSS/JS, with the only exceptions being parts of turf wars that went nowhere. Things were bolted onto that until we got modern web applications. The new crowd discovered how horrible that was versus their other tools, which had also improved. Now, there’s a resurgence of replacements for HTML, CSS, JS, etc., which often target HTML and JS via code generators. Too little, too late, it seems.

So, what of modern systems? Markdown is the most successful of the efficient alternatives for static content. Many JS alternatives have emerged that compile to JS. There are a number of CSS alternatives that either compile to CSS/HTML or use JS. I haven’t reviewed them, so no comment, although Layx keeps getting mentioned. HTML/CSS verbosity can still be fixed with S-expressions or language-specific compression. The Opa language is an interesting answer to the non-integrated stack, with Curl still around, too. Haxe is also popular and amazingly cross-platform.

So, I think many pieces of the answer are well established even if the final word isn’t out yet. There will likely be a number of best answers tied to different types of people and their use cases. However, Markdown, Opa, Juice, and Haxe all seem to have many elements of The Right Thing while HTML/CSS/JS/Java.NET.PHP.WTF show dominance of Worse is Better. Have fun building your own answer to the problem. 😉

Nick P November 10, 2015 11:46 AM

@ Winter

“The blockchain is a proof-of-work system. There is ample literature on such systems (try “proof-of-work” on Google Scholar).”

It’s actually several things combined. It was an impressive achievement in terms of its combination of properties and better than proof-of-work before it. Yet, even those papers often debated whether proof-of-work was the best approach or even good for INFOSEC in long term. Existing systems can be supported with small amounts of money (eg subscription, fees) without massive proof-of-work and mining. That’s not a theory: an entire, global economy exists based on alternatives with clear paths for improvement.

“The Bitcoin blockchain is “special” as it has been shown to work in the most hostile and demanding environment where there were very strong incentives to defeat and corrupt it.”

“On the other hand, the objections made above are mostly hypothetical or even ignorant. Yes, the block-chain is not a perfect solution to every problem and it can be defeated. ”

You kind of contradict yourself, there. You first say it works in the most hostile environment then point out it can be defeated. Which is it? We already know the answer:

“Secondly, the Ghash mining pool achieved what everybody said was impossible: it gained more than 50% of the network hashing power. Luckily, the bitcoin community pressured Ghash to scale its share back below 50%.” (Matthew)

There were several security failures but this is my favorite. That statement not only means it failed to operate in hostile conditions: they further asked the opponents to play fair and they did. That apparently means they couldn’t make it happen within the existing scheme. That means the blockchain fails under one of its major, adversarial conditions. People relying on this approach may have to rely on their enemies’ goodwill in the future. That’s a ridiculous requirement for a security scheme and should never be tolerated.

So, far from “hypothetical and ignorant,” the Bitcoin maintainers themselves would happily tell you that they were wrong about their assumption and could’ve had huge problems in the real world. Further, the alternative tech Jayson and I mention continues to advance, to the point of doing 1+ million transactions per second on two racks’ worth of computers with strong consistency. That the blockchain is slower, uses exponentially more resources, and doesn’t live up to its own security promises (unlike competing tech) means it’s provably inferior to alternative setups meeting similar goals. It’s on them to prove otherwise, and they still haven’t, despite a ton of momentum + money going into this stuff.

“We do not need perfect security, we need well understood practical security.”

That’s why, in my post above, I exclusively cited methods from the past that were analyzed, sometimes mathematically proven, prototyped with great security or efficiency gains, deployed in the field for numerous applications, and with relatively few problems stemming from the designs themselves. Then there’s the blockchain movement that asks us to ditch most other stuff, put all our resources into their basket, and can’t even protect/manage that basket. I agree that the well-understood stuff is what we need right now. Hence, I recommend we use and improve that instead of blockchains until they work out their problems.

Totally in favor of Comp Sci and cryptographic research into improvements on it, though. Who knows what will come out of it.

@ MarkH

“Pretty sad, for a “currency” that makes up such a miniscule part of world commerce.”

It’s worse. The alternatives are arithmetic, DB’s, some crypto, protocol exchanges, and servers/mainframes that do transaction processing + distributed verification. If we’re talking IBM TPF & Visa, this model was delivering by 1980 on S/370 mainframes: the first of IBM’s to use integrated circuits instead of discrete components. The 370 on a microprocessor takes about 200,000 transistors: small enough for 1um (1,000nm) node from 1985. Bitcoin technology already has miners developing or using 28nm ASIC’s to support what it does. That’s 20+ iterations of Moore’s Law worth of computing difference! Not even taking into account that custom hardware for transaction processing would take less than a full S/370 CPU. Further, it already takes around 50GB to represent the paltry $4-5 billion worth of currency and activity Bitcoin represents.

Combine the CPU and storage requirements to find that the current blockchain is one of the most inefficient and wasteful approaches ever adopted for processing and verifying transactions. Not only that, this model is expected to eventually handle Visa + Mastercard’s 20+ billion transactions a year, with a collective dollar amount that far exceeds Bitcoin’s above? And with the fast transactions (sub-5 seconds) that people expect for the average case? That doesn’t include forex, etc. Hard to believe that’s possible in the current design.

I’d understand horrible watts or hardware requirements if the security, throughput, flexibility, etc were so much better than current system. However, it’s actually worse in most respects than alternatives that build on prior work or work within the existing model while knocking out issues with design or operational adjustments. So, I think we can say empirically that Bitcoin and its blockchains are a bad approach in terms of resources required for its level of security & performance. Better to invest in alternatives, whether a blockchain or just distributed DB’s/ledgers.

Nate November 10, 2015 3:42 PM

@Nick P: Ooh, thanks for that Fabric link. I’ve been tossing around ideas somewhat similar to this for ages. I’m interested in anyone trying to build a decentralised compute/store cloud.

Basically I see three major, huge design flaws in Bitcoin and the blockchain:

  1. Proof of work as a security principle is FATALLY flawed. It anti-scales – it aggressively fights scaling by consuming resources – and yet it can still be trivially gamed by hostile actors simply by throwing more resources at it until the biggest oligarch (the one with their own chip foundry and power station, or who can steal power) wins. Sorry, but that’s a failure, not a solution.
  2. Storing every transaction in one shared public log file means there’s no privacy, AND that the log file grows continually on the order of the number of transactions per day in the entire world. That’s not scalable, and it eliminates privacy.

As the Silk Road case demonstrated, transactions on the Bitcoin blockchain are ‘prosecution futures’. Don’t use blockchain technology for anything illegal, or anything you would like to remain private. Everything you do is broadcast to everyone. It may be ‘pseudonymous’ but unless you’re absolutely obsessive about operational security, it’s very easy to trace the links.

A much better solution would be one that didn’t require broadcasting an entire ledger to the whole planet and storing it forever.

  3. Sequential verification of transactions across the entire blockchain means a huge bottleneck, and it’s why Bitcoin’s transactions-per-second rate is so slow. Again, this is laughably poor engineering compared to what transaction processing systems were doing even in the 1960s. There needs to be a way of doing most transactions in parallel and only syncing the ones with dependencies that really need to be done in serial (a rough sketch of this idea follows below).
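
A rough sketch of that dependency idea (Python, illustrative only): group transactions that touch disjoint accounts into batches that can be applied in parallel, and serialise only at the batch boundaries where accounts are shared.

    def batch_independent(txs):
        # txs is a list of (sender, receiver, amount) tuples.
        # A transaction joins the current batch only if it shares no account with
        # anything already in the batch; each batch can then be applied in parallel.
        batches, current, touched = [], [], set()
        for sender, receiver, amount in txs:
            accounts = {sender, receiver}
            if accounts & touched:          # dependency: close this batch, start a new one
                batches.append(current)
                current, touched = [], set()
            current.append((sender, receiver, amount))
            touched |= accounts
        if current:
            batches.append(current)
        return batches

    # batch_independent([("alice", "bob", 5), ("carol", "dave", 2), ("bob", "erin", 1)])
    # -> [[("alice", "bob", 5), ("carol", "dave", 2)], [("bob", "erin", 1)]]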

I do think that we need a messaging network over the Internet that is:

  1. distributed (so anyone can publish and receive)
  2. persistent (stores messages, doesn’t discard them immediately)
  3. publish-subscribe based (so you only get the messages you care about)
  4. allows anyone to publish arbitrary data (not limited to 140 char text!)
  5. cryptographically secured (so no-one can spoof or change your messages)
  6. can do computation as well as dumb messaging (so we don’t need Javascript)
  7. is functionally pure (so no side effects, and uses functional reactive programming for I/O)

What we don’t need are the three flaws of Bitcoin’s blockchain: exponential resource consumption as more nodes come online (and its corollary, that anyone with a sufficient hardware outlay can spoof the network), unfiltered public broadcasting, and enforced serialisation.

@Gerard van Vooren:

I agree that S-expressions are a much better base for a fundamental syntax than HTML’s SGML/XML. I like that a fairly simple mapping between them exists: SXML. https://en.wikipedia.org/wiki/SXML

My feeling is that we need not so much new programming languages but new data representation languages. Because we have so much data now that needs to persist through multiple systems.

A data representation language that wants to be used for a long time across multiple systems needs to have as little syntax as possible, because a huge problem for ‘bit-rot’ is that different serialisation syntaxes can corrupt data at the syntax level, and each have very different escaping semantics. Bad handling of syntax escaping is one of the worst security problems to solve – witness SQL’s issues with embedded quotes, which come from a (bad) choice in ASCII to have only a single straight quote character, so there’s no way to detect and reject mismatched quotes.

If we used parens rather than quotes to delimit strings, the SQL quote injection issue would go away. But we can’t use parens to delimit arbitrary strings in most Lisps, because, on top of the minimal S-expression base of (.) they reserve a whole bunch of punctuation characters.
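
A concrete illustration of the escaping point, using nothing beyond Python’s standard sqlite3 module: a value containing the single straight quote breaks naive string assembly, while parameter binding never pushes the value through the quoting syntax at all.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")

    name = "O'Brien"   # an ordinary value that happens to contain the string delimiter

    try:
        # Naive assembly: the embedded quote is parsed as syntax, not data.
        conn.execute("INSERT INTO users VALUES ('%s')" % name)
    except sqlite3.OperationalError as err:
        print("broken by the quote:", err)

    # Parameter binding: the value travels out-of-band, so no escaping is needed.
    conn.execute("INSERT INTO users VALUES (?)", (name,))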

Syntax is important, and I wish we paid a LOT more attention to it than we currently do. We could eliminate whole classes of errors.

Nate November 10, 2015 4:35 PM

To start with, I wish we had a system that was a very simple, distributed hybrid of Usenet, Wiki and Web Forum. Where the basic unit of data was not the ‘page’ but the ‘post’.

Imagine if it were S-expressions, or even SXML; the data for a post might be something like:

(post
(author “Nate”)
(datetime (2015 11 10) (16 00))
(content “To start with…”)
(reply-to ..)
(replaces … )
(signature ….))

And then the system framework would manage aggregating and displaying all these posts. It would be like email, but public, publish-subscribe and persistent.

The structure of the data would resemble both HTML pages and SMTP messages, but I think S-expressions would be a much cleaner base for each.

The biggest initial problems would be managing some kind of central identity repository – this is the problem that Facebook and Twitter solve. But they take huge data centers to solve a problem that email solved years ago. I’m sure we could do much better if we built a system to cache aggressively, only transmit the bare minimum of changed data, and avoid the massive amount of redundant data handling and Web frontend generation (riddled with ‘rich content’ ads) that our current web-and-app infrastructure seems to need.

One feature I think we would need would be transclusion, of the Xanadu kind. The idea is that we should reduce to an absolute minimum the amount of repeated data a post contains. If someone has said it once before (and by ‘someone’ and ‘said’ I include data of Wikidata type), you should be able to quote it and insert a reference to that quote in your post. And any client who already has that quote cached doesn’t have to pull it over the Internet again. This means you get data normalisation as well as reduce Net usage. A double win.

You could do this as easily as

(content “Now see here blah blah” (include some data off the net) “therefore as you can see..”)

Why don’t we already have tags in HTML5? It’s as simple a concept as , but currently we have this insane mess of Javascript:

https://stackoverflow.com/questions/8988855/include-another-html-file-in-a-html-file

It’s all just broken and weird. We keep piling trash on top, but the foundations aren’t built.

Nate November 10, 2015 4:40 PM

Gah. That last part was broken by, you guessed it, syntax escaping issues. I meant to write

Why don’t we already have [include] tags in HTML5?

but of course I used angle brackets and so it vanished, silently, with no error.

This is EXACTLY the kind of foundational syntactic stuff we need to fix, folks, before we can build the tottering Internet tower any higher. Add up all these little data corruption and bitrot issues and it becomes a nightmare, especially for security.

Nate November 10, 2015 4:44 PM

Added because I can’t edit:

Why don’t we already have [include] tags in HTML5? It’s as simple a concept as [a href].

The wider issue I’m harping on is that if we were good programmers we would solve all these issues ONCE, then reuse the solution. Not ‘once per implementation of any kind of Web-based messaging system on every platform’. Once, get it right, stick to it, and move on.

The fact that ‘get it right and stick to it’ is such a controversial idea in programming today – while ‘constant reinvention yay!’ is the new hotness – suggests to me that our chance of ever getting anything right is fairly low.

Nate November 10, 2015 7:18 PM

@anonymouse:

Yes, I’ve tried JSON, and unfortunately it’s got a couple of major problems. Especially if you’re trying to use it (as I was) to implement a domain specific language involving data representation:

  1. ["JSON", "has", "a", "very", "noisy", "list", "syntax", "requiring", "you", "to", "doublequote", "every", "symbolic", "literal", "and", "also", "add", "commas"]

compared to s-exps (so much cleaner!):

(JSON has a very noisy list syntax requiring you to doublequote every symbolic literal and also add commas)

  2. JSON lists aren’t lists, they’re arrays. There is no dotted-pair notation. Unfortunately, dotted pairs in certain situations are the most elegant (and literally correct) way of representing data.

A classic case is Scheme’s dotted variable binding syntax

(lambda (head . tail) (cons head (cons 'blah tail)))

Sometimes you really really need to be able to cleanly split a list into its head and its tail portion. There’s no elegant way to represent that split in JSON arrays. The only way to do it would be to reserve a symbol to represent the dot.

This wouldn’t be so painful if we didn’t have to quote and comma everything in a JSON list.

  3. {"JSON": {"objects": {["can't", "have", "arbitrary", "values"]: ["as", "keys"]}}}

JSON objects – key/value dictionaries – can only have string or number literals as keys. This restriction isn’t inherent in the mathematical definition of a dictionary type, and it isn’t in all other languages which have dictionary structures – such as Lua, which allows arbitrary keys.
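
A quick way to see the restriction bite, using Python’s standard json module (which follows the JSON spec on this point): composite keys are rejected outright, and even numeric keys come back as strings after a round trip.

    import json

    table = {("loves", "bob"): "carol"}       # a composite key: fine as a Python dict
    try:
        json.dumps(table)
    except TypeError as err:
        print(err)                            # keys must be str, int, float, bool or None

    print(json.loads(json.dumps({1: "a"})))   # {'1': 'a'}: the integer key became a string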

Not having arbitrary keys is a severe restriction. Among other things, it means you can’t cleanly represent a list as an object. Eg, you can’t easily round-trip convert:

[[a,b,c],d,e] <-> {{a: {b: {c: {}}}}: {d: {e: {}}}}

Why would you want to do this? Well, suppose you’re wanting (as I did) to represent logical assertions in JSON. Because you had something like RDF, but expressed as arbitrarily long lists – like Prolog assertions – rather than just triples. Obviously, you’d need the ability for any place in that list to be a list itself, because (as in Prolog) logical terms need the ability to reference other logical terms. So something like:

(believes alice (loves bob alice) 50) -> “Alice believes that Bob loves Alice” with 50% probability

needs arbitrarily structured lists. You can’t restrict any place from being a list. Okay?

Okay, you can do that in lists. Why would you want objects?

Well, suppose you want to have a database containing multiple assertions. You want to represent that ‘ALL of these assertions are true’. Yes, you could make a list and append it together

(all-this-is-true (loves bob carol) (loves alice bob) (believes alice (loves bob alice) 50))

Poor Alice! She’s mistaken about Bob. But at least she’s hedging her bet with that 50% at the end, right?

But the most logical thing to do would be to convert this into an object or table structure, like so:

{loves: {bob: carol, alice: bob},
believes: {alice: { {loves: {bob: alice}}: 50} <—- FAIL

Our syntax breaks down here because we can’t have an object as an index.

There is knowledge in our knowledge base – a fairly ordinary kind of thing that in RDF, or an RDF-like database, we’d get all the time – that we just cannot cleanly convert to JSON.

(Note for the keen-eyed: yes, I’m simplifying. This is still not an exact representation of a S-expression cons list as an object; there’s no trailing {} for nil. But you get the idea. The failure remains.)

And so we can’t use JSON as a syntax for knowledge representation or logic. Whoops. Turns out humans use logic A LOT.

This didn’t have to happen. But it did because the designers of JSON didn’t clearly think through all the implications of their arbitrary restrictions of their data model.

Nate November 10, 2015 7:47 PM

An alternate syntax I’ve been playing with to solve this data representation problem is a very slight extension of S-expressions. Take dotted-pair lists, and instead of just one value after the dot, allow multiple values. This can represent a table or dictionary structure with arbitrary keys AND collapses to a simple sexp list in the case of single values.

(. (loves . (alice bob) (bob carol))
(believes alice (loves bob alice) 50))

or, if we rewrote it in object-oriented format:

(. (alice . (loves bob) (believes (bob loves alice) 50))
(bob loves carol))

The underlying data model for all nodes in this structure is a binary relation: ‘a set of cons pairs’. You could represent SQL tables, Prolog predicates, text files, functions, relations. I reckon it would serve as a minimal but universal data structure for almost anything. It would at least be a small step up from S-expressions (adding the idea of logical AND / set union / parallelism), and two small steps up from ‘unstructured sequence of ASCII or Unicode characters’. It could store more than SQL or CSV could. And it wouldn’t be as noisy and restrictive as either JSON or XML.

Binary relations are extremely well studied in mathematics. But an interesting result is that this data model (set of cons pairs, or binary relation) is just slightly MORE expressive than computing’s favourite data structure: ‘dictionary’, ‘table’, ‘object’ or ‘function’. It has the ‘nil’ element – required to properly represent lists – which there’s simply no equivalent for in, e.g., a function or a dictionary. That’s a subtle but VERY important difference. What does it mean? I’m not sure, but it’s definitely not nothing. It may even represent something like ‘logical truth’ (I’m pretty certain that the empty set is a good match for ‘logical falsehood’).

This is the sort of thing I’m talking about when I say the foundations of computing are unclear. There are very deep mathematical structures that, if we followed, would make our lives much easier; and when we ignore them, make our lives hard. Instead we tend to hack things up that almost work and solve some problems but aren’t mathematically solid and 100% correct. And that means they’ll fall apart when we lean on them too hard. And because they fall apart, we end up proliferating a huge array of mutually incompatible partial solutions where one correct solution would be enough.

Remember, a deep problem (maybe THE deep problem) in computing is not just representing data correctly enough to solve one problem and then throw it away. It’s finding a representation for data that can persist for a long time and be useful for a number of DIFFERENT problems. During its entire lifetime this data should be easy for both humans and computers to read, take a minimum of mental effort to parse, and not subtly lose information in each transformation between systems.

I don’t believe we’ve come close to solving that yet, which is why I find it an interesting problem to think about.

anonymouse November 10, 2015 8:53 PM

@ Nate

“(believes alice (loves bob alice) 50) -> “Alice believes that Bob loves Alice” with 50% probability”

I don’t know the language and don’t understand how the 50 before the last closing parenthesis turns into a 50% probability for the dictionary meaning of the first word after the opening parenthesis, but I think I get your logic.

Gerard van Vooren November 11, 2015 9:15 AM

@ Nate,

About SXML, that looks pretty cool! I wasn’t aware of it. Also, the fact that it can be parsed today with ordinary Scheme is a serious benefit. How is it possible that such technology goes unnoticed for a decade when it’s obviously a much neater syntax? Beats me.

By a “data representation language” you are probably talking about 4GL. I can’t talk about 4GL because I have absolutely zero knowledge of the subject.

“To start with, I wish we had a system that was a very simple, distributed hybrid of Usenet, Wiki and Web Forum. Where the basic unit of data was not the ‘page’ but the ‘post’.”

The benefit of LISP is that it is plain text and dynamic. It’s a programming language so it could deal with CRUD, AJAX and other buzz functionality. The only thing you need is libraries and protocols. Well, anyway, the history of WWW shows that being first is much more important than being better.

Nate November 11, 2015 3:49 PM

@anonymouse: I was basically just mocking up a throwaway example of a sentence that might be common in logic and would require a subexpression in a place other than at the very end of the list. This isn’t so much a language as an instance of something approaching Prolog predicates in S-expressions.

If you’re not familiar with Prolog: https://en.wikipedia.org/wiki/Prolog

SWI-Prolog is pretty much the standard reference implementation, and there’s a Web frontend with example databases here: http://swish.swi-prolog.org/

The Prolog syntax for an assertion is

predicate(A,B,C…).

but the Lisp variant PicoLisp has an implementation (Pilog) which uses pure S-expressions. It uses it mostly as a database search tool, but it can be used for general purpose programming or logical inference.

http://software-lab.de/doc/ref.html#pilog

It’s worth taking some time to study Prolog because it’s SO different from the narrow subset of Algol-derived languages that we usually think of as ‘programming’. It combines features of database search and retrieval as well as functional programming, which means you can use it for BOTH data AND code. It’s an entirely different paradigm – ‘logic programming’ – which almost completely died out in the early 1990s.

Oh, and for extra points! MicroProlog, a very nice implementation, including a user-friendly BASIC-like shell language, for the 48K Sinclair Spectrum in 1983! http://www.worldofspectrum.org/infoseekid.cgi?id=0008429 A commercial failure though. This is what we COULD have had in the 1980s if we’d wanted it enough.

However, Prolog is an ancient language, dating from the 1970s, and it doesn’t incorporate all of the thinking that emerged in the Algol or even Lisp worlds since then. It has a very ‘Fortran’ feel to it – a single, flat, namespace; a shared database; serial processing. It doesn’t even have lexical binding! It’s all dynamic binding. It’s like pre-Scheme Lisp – EMACS Lisp, perhaps.

(One modern feature I don’t miss at all, though, is typing. Most modern languages have two sublanguages: a functional/imperative algorithm language, and a ‘type system’ which is actually a cut-down logic language. Prolog provides ONE language which can do BOTH tasks!)

I’m interested in how we could take logic programming into the 21st century – making it more like Scheme. One of the leading contenders for a Prolog successor is miniKanren http://minikanren.org/ – but something still seems missing to me.

Nate November 11, 2015 4:18 PM

Specifically, the thing I think is missing from the Kanren family is Prolog’s very close data-code equivalence – something which is a deep strength of Lisp.

As I understand it, Kanren is an ’embedding’ of logic programming in Scheme – usually done with macros that convert relations into functions – which means that a Kanren relation can’t typically access the raw logic assertions of the database. This is in line with the doctrine of functional programming that data shouldn’t be unnecessarily shared, and that’s good as far as it goes. But it seems to somehow miss the most beautiful thing about Prolog. As Will Byrd (miniKanren) describes:

https://stackoverflow.com/questions/28467011/what-are-the-main-technical-differences-between-prolog-and-minikanren-with-resp

“One very interesting feature of Prolog is that Prolog code is itself stored in the global database of facts, and can be queried against at run time. This makes it trivial to write meta-interpreters that modify the behavior of Prolog code under interpretation. For example, it is possible to encode breadth-first search in Prolog using a meta-interpreter that changes the search order. This is an extremely powerful technique that is not well known outside of the Prolog world. ‘The Art of Prolog’ describes this technique in detail.”

This means a Prolog predicate is not just like a dynamically-bound old-school Lisp function – it’s even more like an old-school Lisp macro (or FEXPR, actually, for the history fans – https://en.wikipedia.org/wiki/Fexpr . I feel like John Shutt is much more correct here than Mitchell Wand, and that a wonderful, deep tool was lost when FEXPRs were unjustly vilified and burned in the 1980s.)

This deep access to the underlying source code makes old-school Prolog absolutely wonderful for someone writing a domain specific language or knowledge representation system, where everything in the source file or knowledge base should be exposed, guts and all.

It’s less wonderful for someone who’s used to the modern functional or object-oriented programming doctrine that nothing should know anything about anything that it doesn’t need to know, that source should irreversibly compile down to an opaque system object, and that decompiling or reverse engineering is morally suspect and a security breach.

I’d love to somehow square the circle between these two very different ideals, which is quite possibly impossible. But they both feel like important required features of a sensible, secure, user-driven operating system: 1. The user should get full access to the system at all levels. 2. Nothing should break unless you intend and have the rights to break it.

Nate November 11, 2015 4:37 PM

Another way of putting it:

Prolog is an obsessively ‘late-binding’ language, even more so than Smalltalk. It stores almost all code as ‘source code’ right up to the moment of execution.

This is very much like Smalltalk! It feels like Prolog and Smalltalk (or rather, logic programming and object oriented programming) should have a LOT of things in common, because their philosophies overlap so much. They’re both in their ways extensions or outgrowths of old-school Lisp; they’re both about late binding, runtime self-modification, and exposing reflection to the programmer. So it should be easy to convert between the two paradigms.

But, surprisingly, they’re not easy to convert. Or at least, Prolog as it exists today isn’t easy to distribute into objects. That big central shared database doesn’t break down cleanly. Robinson resolution with backtracking seems inherently serial. The smartest brains in the 1980s, the Japanese 5th Generation project, failed on the effort to parallelise Prolog – https://en.wikipedia.org/wiki/Fifth_generation_computer

OO systems have since been, with heroic effort, wodged ONTO Prolog – http://logtalk.org/ – but not in an especially distributed way, not down at the core level of ‘function dispatch’, like Smalltalk does so easily.

And that gap intrigues me deeply.

Nick P November 11, 2015 5:21 PM

@ Nate

“This means a Prolog predicate is not just like a dynamically-bound old-school Lisp function – it’s even more like an old-school Lisp macro”

I always thought of them as data processed through an algorithm and conditions. That is interesting, though.

“Prolog is an obsessively ‘late-binding’ language, even more so than Smalltalk. It stores almost all code as ‘source code’ right up to the moment of execution.”

I don’t consider that a strength necessarily. I’d rather have both representations with operations on efficient one being the default for… efficiency obviously.

“Or at least, Prolog as it exists today isn’t easy to distribute into objects. That big central shared database doesn’t break down cleanly. Robinson resolution with backtracking seems inherently serial.”

You mean something like this or this? Two of a few parallel implementations of PROLOG. Or are you getting at something else?

Note: I only did some tutorials on PROLOG plus CLIPS and (OPS5?) back when I studied AI for a while. Clearly I don’t have your background knowledge to evaluate this stuff deeply: just surface judgement calls.

Honestly, I think most gave up on PROLOG when they found first-order logic didn’t cut it for most problems and it wasn’t good for programming either. Just too limited across the board. Might be part of that gap you sense. Most effort is going into HOL, Isabelle, Coq, etc. However, its advantages in mechanical verification keep it alive in various forms like ACL2, Shen LISP (the type system is Sequent Calculus, lol), and verified Jitawa/Milawa. Collectively, these are the most interesting to me in terms of potential, with the Mercury language being notable for strong typing and speed.

“The smartest brains in the 1980s, the Japanese 5th Generation project, failed on the effort to parallelise Prolog – https://en.wikipedia.org/wiki/Fifth_generation_computer”

I remember reading about that and all the competitions to make PROLOG systems. It’s been a long time since I’ve seen it. Fun times. Meanwhile, even though it wasn’t built, here’s a gift for you to track down and maybe play with. Such a design could even fit in a Spartan-6 with room left for accelerators. 🙂

Paul Dodd November 12, 2015 9:13 PM

“It’s an entirely different paradigm”

Despite all the causal mindwork you put into it, if one wants to get anything meaningful out of logic, one must cross over, or connect, to the real world at some point. As we’ve seen in the past and present, rolling out a new soft paradigm and rebooting does not rid us of the underlying legacy hardware, legacy logic, and layers of those on top, bottom, and sideways. Despite all the self-checking and automated analysis, a set of logic will need to be translated into legacy instructions and migration plans, which creates a bigger quagmire, which the smart will invent new solutions for.

codete January 31, 2017 9:59 AM

Thanks for the read – I’m quite new to the field and wanted something clarifying, e.g. about the relationship between bitcoin and blockchain. I’ve come across blockchain mechanisms of voltage production and – I think – also distribution.

regards

Christopher PAINTER July 7, 2017 3:47 AM

For me, a fantastic thread ~ I wrote Prolog, and about it, in the ’80s; I even met O’Keefe. I leave a message for those that pass: Prolog may soon be resurrected as the perfect blockchain engine. Consider Nate’s words from above…

“This deep access to the underlying source code makes old-school Prolog absolutely *wonderful* for someone writing a domain specific language or knowledge representation system, where everything in the source file or knowledge base should be exposed, guts and all.”

The Ethereum blockchain mechanism currently uses Proof of Work, and hopes to move to Proof of Stake. But Prolog is itself a proof engine, so the distinction between the interests of miners and users disappears. This opens the way for co-operative blockchains, where everyone is exposed to the same risks and incentives.

Once again, thanks for the meta-thoughts, it made my day! If anyone’s interested feel free to contact.
