The Problem with Treating Data as a Commodity

Excellent Brookings paper: “Why data ownership is the wrong approach to protecting privacy.”

From the introduction:

Treating data like it is property fails to recognize either the value that varieties of personal information serve or the abiding interest that individuals have in their personal information even if they choose to “sell” it. Data is not a commodity. It is information. Any system of information rights­ — whether patents, copyrights, and other intellectual property, or privacy rights — ­presents some tension with strong interest in the free flow of information that is reflected by the First Amendment. Our personal information is in demand precisely because it has value to others and to society across a myriad of uses.

From the conclusion:

Privacy legislation should empower individuals through more layered and meaningful transparency and individual rights to know, correct, and delete personal information in databases held by others. But relying entirely on individual control will not do enough to change a system that is failing individuals, and trying to reinforce control with a property interest is likely to fail society as well. Rather than trying to resolve whether personal information belongs to individuals or to the companies that collect it, a baseline federal privacy law should directly protect the abiding interest that individuals have in that information and also enable the social benefits that flow from sharing information.

Posted on February 26, 2021 at 6:28 AM • 35 Comments

Comments

jbmartin6 February 26, 2021 7:22 AM

Very interesting. I tend to agree: ownership is entirely the wrong concept to apply to these issues. If I look at you and see you are wearing a green shirt, is that your data or my data? Why should you have the ability to tell me whether or not I can tell someone else that I saw you with a green shirt?

David Rudling February 26, 2021 7:51 AM

I am a hardline privacy protagonist. I therefore support the concept of personal information as property.

Without it, one can write absurdities such as :-

“Rather than trying to resolve whether personal information belongs to individuals or to the companies that collect it, …”

instead of what it should be:-

“Rather than trying to resolve whether personal information belongs to individuals or to the companies that steal it, …”

@jbmartin6
What color shirt I wear is none of your or anyone else’s business.

Delvin Anaris February 26, 2021 8:53 AM

@David Rudling: But where do you draw the line?

If I see you wearing a cool Star Wars shirt in passing, and happen to mention to someone else later that day that I saw a guy wearing a cool Star Wars shirt, is that an invasion of your privacy?

What color or type of shirt you’re wearing is out there in public for everyone to see. I get that there’s been a depressing and distressing level of colonization of the public sphere in recent years, particularly with things like ubiquitous CCTV in London and other cities around the globe, but I don’t think that a reasonable response to that is to say “if you see me in public, any description of me you would make to anyone else is my private data, and you are stealing from me by communicating it in any way”.

Can you articulate clearly a way to tell what’s “talking about something that’s obviously public” and what’s “stealing your private data” without explicitly asking permission for every single conversation from every single person we’ve ever met?

Clive Robinson February 26, 2021 9:55 AM

@ ALL,

In times past “property” was something physical and tangible and effectively unique and unreproducible.

Thus the law relied on the unreproducibility as a way of having unique ownership.

Then someone industrialised the process of making pins and from that point on tangible physical objects became less and less unique.

Sometime thereafter the value in non-tangible works became realised, and rather than come up with new fundamental legislation they tried to reuse existing legislation that did not fit well.

As time has moved on, although we still have tangible physical objects, they are becoming less and less unique. In many cases the average human cannot tell the difference between one consumer object and the next, and even the concept of serial numbers does not help. At the same time the number of non-physical intangible information objects has proliferated rapidly.

The law has in no way kept up, thus existing legislation is entirely inadequate to the task.

Winter February 26, 2021 10:30 AM

The whole “who owns the information” breaks down spectacularly with genetic information. My genetic make-up is shared by my relatives. So when I “sell” my genetic information, I also sell information about my siblings, parents, (grand)-children, uncles, aunts, cousins etc.

Ownership is a flawed concept even in the material world, e.g., who can dispose as he wants of a lake, river, air, soil, a condominium, farmland? With the infamous IP, it becomes even more strained, and with privacy, information ownership becomes poisonous like radioactive waste.

Karanja February 26, 2021 10:32 AM

@David Rudling, if I see you get run over and in need of help, should I be able to share that information with others without your permission?

JonKnowsNothing February 26, 2021 11:22 AM

@All

Added to the difficulties are the various rules as applied and not applied to the data/items/information.

A bullfighter in Spain was denied a copyright on one of his performances. It is rather hard to determine why a performance doesn’t qualify the same as a movie or theater or live performance does but it seems that in Spain the rain falls mainly on Paintings.

There is some dancing about reproducibility and the court found that using “common techniques” known in bullfighting, strung together in a performance does not qualify for copyright.

So.. if you follow that line:

The 3 basics of computer languages (assignment, loops and if-tests) would prohibit all copyrights on all software and all firmware because they are “common”, and it would not matter how different the application is or isn’t: the use of common techniques doesn’t count for protection.

Hardware might count unless the Spanish Courts float down to the Quantum Level and state those are “common” too. It works going up the scale the same way.

Courts in the EU stripped Banksy of his trademark on a work of art, which was taken by a greeting card maker and used extensively in their product line. So, trademarks are not safe legal protections either.

Patents are malleable. They may be assigned to one holder only to be reassigned later to another holder.

ht tps://www.theguardian.com/world/2021/feb/25/spanish-court-rejects-matadors-copyright-claim-over-work-of-art

ht tps://en.wikipedia.org/wiki/Banksy
ht tps://en.wikipedia.org/wiki/Banksy#Trademark_dispute

Banksy said “A greetings cards company is contesting the trademark I hold to my art, and attempting to take custody of my name so they can sell their fake Banksy merchandise legally.”

ht tps://abcnews.go.com/Entertainment/wireStory/banksy-loses-eu-trademark-fight-greeting-card-company-73067827

September 18, 2020
The cancellation division of the EU’s intellectual property office said in a ruling this week that Banksy’s trademark for “Flower Thrower” was filed in bad faith and declared it “invalid in its entirety.”

(url fractured to prevent autorun)

name.withheld.for.obvious.reasons February 26, 2021 2:55 PM

I see this in a contextual and time domain dependent expressed as the kernel of the “real world” environment we live in, that is on planet Earth as I understand it.

The level at which information and data is abstracted into knowledge and insight that has the capability to drive events and actions in meatspace seems not well understood. There is the contextual wrap around in which data derives a meta-info type of expression. Say individuals with specific information cause for events that aligned with both their own meta-info and a larger contextual meta-knowledge domain to form a singular and group based triggered action. Individual, sub-group, group, collectives, and other formations of knowledge domains generated from raw data is potentially exploitable to make or cause one or more elements of a domain or domains to respond or react.

Clive Robinson February 26, 2021 5:45 PM

@ name.withheld…,

The level at which information and data is abstracted into knowledge and insight that has the capability to drive events and actions in meatspace seems not well understood.

It’s getting on for midnight here, and although I’ve read your post twice, my brain kind of stops after that sentence.

I’ll give it another try in the morning.

ResearcherZero February 27, 2021 12:22 AM

@Clive Robinson

I’ll have a stab at it:

Why have a system that can detect campaigns launched within the nation, when instead you can have a $1.2B bureaucratic money making machine. You could name it, for example, TrailBlazer, and everyone in your circle could have a stake in it and make a sizable profit from the contract. You could cannibalize ideas from previous projects. Strip out any functions to encrypt or anonymize domestic traffic, which of course would then make it illegal to operate domestically, but it could sweep up everything else, and you could wrap it in a shiny package and market it everywhere as a total surveillance solution.

It would be a bureaucratic wet dream come true, at least from what I can gather from the articles posted around in the crevices of Blogland. Occasionally it might crash the network due to the huge volume of data, but it would tick all the boxes, especially the all important funny handshake boxes (you know when you forget you’re a Freemason and you think they are molesting your hand or something).

or

Those certification courses are boring, repetitive, and offer very limited real world experience.

or

They should teach network engineering and pentesting in primary schools everywhere.

or

That some people understand some things but not everything because the world is an increasingly complicated space.

or

Cloud infrastructure authentication is too complicated and inevitably leads to security compromise due to errors in configuration. However, it was first to market so won all the lucrative contracts due to a large advertising spend that came out of the security and development budget. Also someone tore down the ‘Keep It Simple Stupid’ poster on the wall because the project was over complicated and under resourced due to the large advertising spend and large executive bonuses.

or

“privatizing profits and socializing losses”
hxxps://www.nytimes.com/2021/02/23/opinion/solarwinds-hack.html

?

ResearcherZero February 27, 2021 12:37 AM

@Clive Robinson

Perhaps Cybernetics?

The ‘radical’ idea that the natural world can be described using electrical networks and feedback loops, until you get lost in the woods because your phone doesn’t work.
You planned on watching that survival course on the web next week, and as a result you die of exposure.

ResearcherZero February 27, 2021 12:51 AM

Back to the subject at hand.

If I were, hypothetically, a protected witness, data ownership would not protect my privacy. All problems for protected witnesses revolve around data access.

It is very easy for anyone to get access to or alter your data if they have departmental clearance, even if it is restricted. Bribery and intimidation are rife, and so are lapses in security.

If a file is restricted to a departmental manager, the departmental manager will at some time go on leave, or be transferred/retired. There are also other points of access, such as for medical reasons, via doctors or someone in the health system.
There are also legal reasons, and the police and justice systems are not that flash as far as security or auditing is concerned.

These systems are also repeatedly subjected to cyber intrusion attempts.

ResearcherZero February 27, 2021 1:20 AM

Hypothetically speaking, there may also be instances where I would have to give evidence that ‘someone was not me’, where they were framed in a crime, where my ID was dropped at the scene of the crime near their home.

Just sharing the same first name is enough information for someone to find themselves in a spot of bother. Identity theft is also very common.

Mark Zuckerberg is not bothered by these implications. Apparently when we are all connected in an open society, serious crimes like rape and murder will simply vanish, and there will be no need to protect the identity or locations of children in danger. I presume in this scenario adults would fend for themselves; as all forms of corruption will have evaporated, this would be incredibly easy.

From what I can gather, open and connected societies are free from the bothersome troubles of such things as nuance. It’s a perfectly structured Pantone world where all the rough edges have been smoothly routed within the margins of 403 Forbidden tolerances.

ResearcherZero February 27, 2021 5:35 AM

Probably one of the best examples though of treating data as a commodity is strangely enough related to oil. After all they do say that data is the new oil.

What can the collection of private information potentially achieve, and who is interested in it?

Shell executives played a hand in the huge corruption scheme, which reached the highest echelons of the government.
hxxps://qz.com/africa/955409/shell-knew-about-opl245-bribes-to-etetes-malabu-and-other-nigerian-politicians-show-emails/

After the deal was struck, most of that money mysteriously went missing from public coffers.
hxxps://foreignpolicy.com/2017/04/11/emails-show-shells-complicity-in-biggest-oil-corruption-scandal-in-history-nigeria-resource-curse-etete-eni/

Money then allegedly went to former Nigerian oil minister Dan Etete and was “intended for payment to President [Goodluck] Jonathan, members of the government, and other Nigerian public officials”
hxxps://www.independent.co.uk/news/business/news/shell-nigerian-oil-field-deal-funded-alleged-bribery-scheme-global-witness-a8891676.html

But what has that got to do with private data?

Their testimony paints an extraordinary picture of how far a western company would contemplate going in an effort to undermine the democratic process in a country that already struggles to provide free and fair election.
hxxps://www.theguardian.com/uk-news/2018/mar/21/cambridge-analyticas-ruthless-bid-to-sway-the-vote-in-nigeria

sources familiar with the campaign described how the company was looking to collect “kompromat” – compromising material or information – on opposition leader Muhammadu Buhari
hxxps://www.bbc.com/news/world-43476762

A Nigerian government committee is looking into claims that Strategic Communication Laboratories (SCL), which is linked to UK-based firm Cambridge Analytica, organized anti-election rallies to dissuade opposition supporters from voting in 2007
hxxps://www.dw.com/en/nigeria-to-launch-probe-into-2007-2015-elections-over-scl-cambridge-analytica/a-43228067

I’m not saying that Shell was in any way connected to SCL Elections, just that there are a lot of interests at play during an election, some who might want to keep the incumbent in position for personal benefit, and personal data may help sway the odds in the favor of one entity or another.

In that regard data can be fashioned into a very powerful weapon that can undermine the democratic process itself. I would suppose that is why export controls were placed upon such technology in the first place.

Joe K February 27, 2021 10:36 AM

The article reminded me of a recent (17 Feb 2021) short piece about google streetview in Germany:

Google ‘Removes’ German Residences From Street View By Request (2010)
https://www.techdirt.com/articles/20210217/15054846264/content-moderation-case-study-google-removes-german-residences-street-view-request-2010.shtml

I find google maps streetview unnerving. I have never seen a clear articulation of all my concerns about it. I’m unsure I myself possess a conceptual understanding sufficient to fully articulate them.

I do think this comment raises some interesting questions, though:
https://www.techdirt.com/articles/20210217/15054846264/content-moderation-case-study-google-removes-german-residences-street-view-request-2010.shtml?threaded=true&sp=1#c334

One thing I wonder: Does my relatively mild discomfort with publicly available resources like google maps streetview mask more potentially disturbing (and probably more important) thoughts about the uses/abuses of similar resources not publicly shared (keeping in mind institutions like US National Geospatial-Intelligence Agency, and the US government’s record of contempt for civil society)?

Clive Robinson February 27, 2021 10:38 AM

@ ResearcherZero,

I’m not saying that Shell was in any way connected to SCL Elections, just that there are a lot of interests at play

I’m happy to say that a certain well known US Hedge Fund owner and his daughter who favoured Trumps to play, were very much in on the relationship you describe.

And whilst the Father has slowed down and pulled back a bit, the Daughter is running guns ablazing through certain parts of the US body politic…

Just remember when GWB was supposedly in charge, which oil company his “puppeteer” had significant interests in. Then later how NSA intercepts over a certain South American country’s oil drilling rights auction got shanghai’d and ended up in the hands of certain US oil interests.

It’s a dirty business and it’s in no way cleaning up its act. Having seen both the top and bottom sides in action under the same US oil company pulling dangerous stunts in UK waters I can say that I personally would not have any involvement with them again, if I could possibly avoid them.

Joe K February 27, 2021 11:01 AM

@name.withheld.for.obvious.reasons

The level at which information and data is abstracted into knowledge and insight that has the capability to drive events and actions in meatspace seems not well understood.

I wonder if you could rephrase this more simply: “The conversion of data to information is poorly understood.”

name.withheld.for.obvious.reasons February 27, 2021 4:26 PM

@ Joe K
Thank you for the suggestion, it really is a bit complex. Data and information can be processed, Knowledge can only be expressed. Knowledge processing, at least from a human non-machine learning perspective, is to my thinking the ability to take data and information and apply other knowledge to a problem to compute a solution. There is no direct path from the data and information layer to the knowledge layer. I see them as complete domains; now, that does not mean that one cannot use knowledge to acquire data and information: there are feedback mechanisms in cognition (forward and back).

I think of Prolog, Smalltalk, Lisp, Forth, and the like (some might suggest C++ or Ada but not me) as pseudo examples in formal computing languages. Hadoop systems are combinations of data processing models such as Kalman filters and analysis processes based on Bayesian networks under big data and unstructured data modeling, but I fail to see any “intelligence” in these systems.

In fact, I see an example of AI everyday when I awake…

It is looking back at me in the mirror.

I don’t understand that humanity has yet to discover intelligence, our operable mode of intellectual thinking is questionable given history. Humanity is still in the process of becoming intelligent, as in not yet intelligent.

name.withheld.for.obvious.reasons February 27, 2021 4:59 PM

@ ResearcherZero
Have you been in conversation with Thomas Drake and William Binney–it sure seems like it.

If I didn’t know any better (and I probably don’t), I’d say you are an agent of the deep steak.

SpaceLifeForm February 27, 2021 6:31 PM

@ name.withheld.for.obvious.reasons

I’ve found that Deep Steak may require a lot of A1 sauce.

SpaceLifeForm February 27, 2021 7:14 PM

@ Clive, ALL

Please note that Bitcoin and other CryptoCurrency is actually Data treated as a Commodity.

It may be that it actually is a Comm and an Oddity at the same time.

Think. Think outside the box.

Clive Robinson February 27, 2021 11:48 PM

@ SpaceLifeForm, name.withheld…,

I’ve found that Deep Steak may require a lot of A1 sauce.

What colour is that sauce 😉

@ name.withheld…, Joe K,

Data and information can be processed, Knowledge can only be expressed.

Three terms that get muddled in their meanings almost every time they are uttered. Worse they are used by different “fields of knowledge” or “Research domains” differently. I do not know just how many times I’ve seen “information” warped this way and that, “data” slightly less so, but “knowledge” is now so ill defined it hardly fits the criteria for nebulous.

@ SpaceLifeForm,

Please note that Bitcoin and other CryptoCurrency is actually Data treated as a Commodity.

No it’s treated the same way as a serial number on a bank note or certificate of authentication.

That is, it is an “abstraction that is traded” instead of trading the commodity (have a look at the way the precious metal bullion markets actually work). It’s the only practical way to have a “futures market”, and as we know a futures market is actually about “fraud” because the contracts that are traded are derived from certificates that do not actually exist or are owned by the contract issuer… A revelation “Game Stop” brought to the world in general, which is probably the real reason the person alleged to have started it is being sued by those whose dirty laundry he exposed so publicly…

Winter February 28, 2021 9:07 AM

@name.
“Data and information can be processed, Knowledge can only be expressed.”

We can look at it from a behaviorist standpoint. Data and information are measurable as negative entropy. Knowledge can be observed as “goal directed actions”.

Information is like a map of the terrain. Knowledge is the route to your destination.

AI can express knowledge in that it can navigate you to your destination (real or metaphorically) using the information of a map (real or metaphorically).

In the infernal world of IP, knowledge is the patent, and the map is the copyright.

In the modern world (since 1900 or so), both copyright and patents are there to prevent progress. Especially, progress of poor people.

vas pup February 28, 2021 5:14 PM

@Bruce:
“a baseline federal privacy law should directly protect the abiding interest that individuals have in that information and also enable the social benefits that flow from sharing information.”

1. The most important thing is that such a law should have clear content understandable to the average person, because it applies to each and every person.
2. What is the priority? Either ‘protect the abiding interest that individuals’ or ‘social benefits that flow from sharing information’? In Europe the former is the priority; here it is the latter, unfortunately.
When you deal with any business, it should be you and only you who decides the level and form of sharing, not a corporation such as Amazon or Google. Meaning ‘opt out’ should be the default, not ‘opt in’.
3. Privacy policy in clear English and not legalese. Exceptions to the protection of citizens’ interests should ALL be established by ‘federal privacy law’, not by the law office of the company.
4. In public areas you may have a very limited expectation of privacy, but not where privacy is expected by any reasonable person: toilet, dressing room, medical office, you name it.
5. Cross-referencing of data on a person between private company and government databases should have very strict rules established in the law and known by the public: who? when? how deep? etc.

Those are just some thoughts.

SpaceLifeForm March 1, 2021 1:04 AM

@ Clive, name.withheld.for.obvious.reasons

Tulips

https://finance.yahoo.com/news/crypto-long-short-coinbase-going-220229942.html

The figures are indeed eye-opening: in the fourth quarter of 2020, the number of verified users on Coinbase’s platform reached 43 million after adding almost 45,000 new users a day. The average number of monthly transacting users grew by over 30% in the fourth quarter alone, to 2.8 million.

Also eye-opening is the inflow of institutional investors, something that we’ve talked about often in this column. Over the fourth quarter, institutional trading volume grew over 110% to $57 billion, while retail trading volume grew by almost 80%. The company services 7,000 institutional accounts.

Cassandra March 1, 2021 3:42 AM

@Clive Robinson

Re: Banksy trademark

I think (and I could be wrong), that at least one of the issues with the failed trademark claim was that Banksy was unwilling to let the intellectual property office know his real identity. I would guess that the office didn’t want to get into the process of anonymous authentication, preferring government issued identification.

Cassandra

Clive Robinson March 1, 2021 5:46 AM

@ Cassandra, JonKnowsNothing

The Banksy issue was raised above by @JonKnowsNothing.

However the point you raise of,

I think (and I could be wrong), that at least one of the issues with the failed trademark claim was that Banksy was unwilling to let the intellectual property office know his real identity.

Arises because people create legislation without sufficient thought and “The law of unintended consequences” dog piles in on it.

In essence there is a very stupid assumption that if you are or wish to be anonymous you can not legally have rights.

Taken to a logical conclusion it would be OK to take somebody’s life away because they did not have their name tattooed on their forehead…

Thus again logically the notion of “privacy” can not exist either.

Society is not a goldfish bowl, nor should it be forced to be to keep imbecilic bureaucrats happy.

As has been pointed out in the past,

“Rules are for the obedience of fools and the guidance of wise men”

When supposedly wise men make judgments of fools, then others who are wise men know where things are going to go or at least see the pathways trod before and where they led and thus why they should not be travelled again…

Cassandra March 1, 2021 8:48 AM

@JonKnowsNothing, @Clive Robinson

Apologies JonKnowsNothing – I did not intend to misattribute your work to Clive Robinson.

Clive Robinson, as for your thoughts on the rights of the anonymous, I am in substantial agreement. Governments are not very good at separating identity and authority. The current model is:

(1) You can do X because you are Fred.

What it should be is:

(a) The holder of unique token ‘alpha’ can do X
(b) Fred owns/controls/is connected to token ‘alpha’

By adding one or more levels of indirection, you can add anonymity, and if token ‘alpha’ is misused/compromised in some way, it can be stripped of its rights and a new token (‘beta’) issued, so Fred is less susceptible to identity theft. It is difficult to change your name, date-of-birth, fingerprints, etc, but providing a new token and invalidating the old one is easier.
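The token scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TokenAuthority` class, its method names, and the use of random hex strings as tokens are all hypothetical and do not describe any real identity system.

```python
import secrets


class TokenAuthority:
    """Hypothetical registry that separates identity from authority."""

    def __init__(self):
        self._rights = {}  # token -> set of permitted actions
        self._holder = {}  # token -> real identity, known only to the registry

    def issue(self, holder, rights):
        """Create a fresh random token for `holder` carrying `rights`."""
        token = secrets.token_hex(16)
        self._rights[token] = set(rights)
        self._holder[token] = holder
        return token

    def can(self, token, action):
        """Check authority using the token alone; no name is consulted."""
        return action in self._rights.get(token, set())

    def reissue(self, old_token):
        """Strip a compromised token of its rights and hand out a new one."""
        holder = self._holder.pop(old_token)
        rights = self._rights.pop(old_token)
        return self.issue(holder, rights)


authority = TokenAuthority()
alpha = authority.issue("Fred", {"X"})
print(authority.can(alpha, "X"))   # token 'alpha' may do X
beta = authority.reissue(alpha)    # 'alpha' was misused; Fred gets 'beta'
print(authority.can(alpha, "X"), authority.can(beta, "X"))
```

The party checking `can()` never sees the name "Fred", and revoking the token leaves Fred's underlying identity untouched.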

To a certain extent, websites that use your email address as user-id have partially implemented this: the more savvy Internet users supply a different email address for each web-site, so have multiple ’email identity’ tokens hanging off their real identity. The ‘Social Media’ organisations attempt to prevent you from doing this by tying identities to (mobile) phone numbers and not allowing multiple identities, or anonymous identities (which has caused (some) operational problems for (some) intelligence operatives).

To be sure, adding layers of indirection does add layers of complexity: but there is an old programming saw (stated (but not necessarily originated) by Butler Lampson*) which states that: “All problems in computer science can be solved by another level of indirection” (it has a corollary, “…except for the problem of too many layers of indirection.”)

I take the view that adding layers of indirection to systems from the beginning is a good thing: they can be removed later quite easily. What is difficult is shoehorning indirection into systems not originally set up for it. Much like starting a programming session on ICL VME SCL with a handful of BEGIN statements because if you entered one END too many, you got logged out losing all your session context.

Mapping out systems interactions/dependencies/relations to work out where best to add indirection is a non-trivial exercise. A bit like analysing a data set to generate a schema in 3rd Normal Form/BCNF/Elementary Key Normal Form. As a general problem it is probably NP-complete, if not NP-hard.

Time to stop wittering.

Cassandra

*Lampson states it was not he who originated the phrase, but David Wheeler.

JonKnowsNothing March 1, 2021 10:45 AM

@Cassandra @Clive

re: Attribution of Who is Who

Current world governments are very adept at knowing Who is Who yet somehow this system completely fails when it comes to something like:

  • Who owns the cattle stuck on the ship stuck in port in Italy?

The similar ID problem happened with the explosion in Beirut. There they sort of know who owned stuff but there is a legal limbo.

The same limbo zone happens with orphaned ships stuck at sea with sailors on board but the ship owners are no longer interested or directing operations. Primarily whatever the ship was carrying is now of no economic value, so the ships are left with “no direction, no money and stranded sailors”.

These identity problems follow a similar path as Off Shored Money.

  A:   “Who owns this?”
  B:   “I know who owned it yesterday…”

ht tps://www.theguardian.com/environment/2021/feb/20/calls-for-vets-to-be-sent-to-cattle-ships-stranded-at-sea-since-december-italy-cyprus

ht tps://www.theguardian.com/world/2021/feb/27/cattle-stranded-on-ship-in-mediterranean-must-be-destroyed-say-vets

ht tps://www.theguardian.com/environment/2021/feb/17/crew-of-oil-tanker-beached-off-uae-to-go-home-after-four-years-stranded-at-sea

xcv March 2, 2021 11:03 PM

The argument for “treating data as a commodity” is that the transport of it via “common carrier” on an “internet neutrality” platform — as opposed to NSA-style “traffic shaping” for “content delivery” — is a commodity measured in bits and bytes — units of information, where

1 bit of information = 9.56992629 × 10^-24 m^2 kg s^-2 K^-1

if it is measured as a physical constant in the “SI” or metric system.
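The figure above appears to be the Boltzmann constant multiplied by ln 2, that is, the thermodynamic entropy associated with one bit, in joules per kelvin (m^2 kg s^-2 K^-1). A quick check, assuming that reading and using the exact 2019 SI value of the constant:

```python
import math

# Boltzmann constant in J/K (exact by definition in the 2019 SI).
BOLTZMANN_K = 1.380649e-23

# Entropy of one bit: k_B * ln(2), in J/K.
entropy_per_bit = BOLTZMANN_K * math.log(2)

print(f"{entropy_per_bit:.5e} J/K")
```

The result agrees with the quoted value to about six significant figures; the remaining difference is presumably down to the quoted figure using an earlier measured value of the constant.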

I recently bought a pocket scale that weighs in several different units of measure, including grams, carats, pennyweight, ounces (both troy and avoirdupois), grains, and a Taiwanese unit of measure (臺兩), where

1 臺兩 = 1/16 臺斤 = 37.5g

Ollie Jones March 3, 2021 8:01 AM

Thanks for posting this, Dr. S.

The authors have a point: treating records of data as having monetary value has its limits as a way to build privacy. Individual records’ values are highly variable. Many kinds of records are far more valuable when aggregated with others.

Obvious example: my street address is a low-value easy-to-obtain record. But, when combined with records of me viewing web sites describing new cars, my address becomes dramatically more valuable to some people (those who want to sell me a car).

Another example: the old rob-the-bereaved scam. Enterprising thieves can read obituaries to find out names and times for wakes and funerals. Low value data. They can then obtain street addresses for the relatives of the newly deceased. Low value data. They can combine the two and housebreak during funerals. Higher value data. (Police have been alert to this scam for centuries, so don’t try it.)

It’s the aggregation that creates the value. Automated aggregation creates even more value. (We know that.)
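A minimal sketch of the street-address example, to show how little work the aggregation step takes. All records, identifiers, and the `car_sales_leads` function are made up for illustration; real data brokers will of course work differently.

```python
from collections import defaultdict

# Two hypothetical low-value datasets keyed by the same person identifier.
addresses = {"p1": "12 Example Street", "p2": "34 Sample Road"}
page_views = [
    ("p1", "new-car-reviews"),
    ("p1", "car-loan-rates"),
    ("p2", "weather"),
]


def car_sales_leads(addresses, page_views, keyword="car"):
    """Join the two datasets: addresses of people who viewed car pages."""
    topics = defaultdict(list)
    for person_id, topic in page_views:
        if keyword in topic:
            topics[person_id].append(topic)
    return {
        addresses[person_id]: seen
        for person_id, seen in topics.items()
        if person_id in addresses
    }


leads = car_sales_leads(addresses, page_views)
print(leads)
```

Neither input is worth much alone; the joined output is a mailing list for car dealers, which is where the value (and the risk) appears.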

So, how about this? Treat caches of data as if they were inherently dangerous, like water in ponds behind private dams. The common-law idea of “strict liability” applies. If a farmer’s dam breaks and floods a downstream house, that farmer is strictly liable for repairs. She must pay for repairs, without any need to prove negligence or malice first.

We already use this legal concept for workers’ compensation and for vaccine injuries. Why not for data breaches? If I have a cache of 145 megarecords and it leaks, I’m responsible for making all injured people whole. Full stop. (I would be further liable for punitive damages if I were negligent, but that takes lawsuits and proof.)

This approach would put the responsibility for protecting caches of secrets squarely on the people who keep those caches. It would incentivize us (I work at a place that keeps some records) to store less data, and to store it more securely.

xcv March 3, 2021 8:53 PM

@Ollie Jones

Another example: the old rob-the-bereaved scam. Enterprising thieves can read obituaries to find out names and times for wakes and funerals. Low value data. They can then obtain street addresses for the relatives of the newly deceased. Low value data. They can combine the two and housebreak during funerals. Higher value data. (Police have been alert to this scam for centuries, so don’t try it.)

It’s the aggregation that creates the value. Automated aggregation creates even more value. (We know that.)

Use of interstate commerce facilities in the commission of murder-for-hire
18 U.S. Code § 1958 ¶¶(a), (b)(2)

(a) Whoever travels in or causes another (including the intended victim) to travel in interstate or foreign commerce, or uses or causes another (including the intended victim) to use the mail or any facility of interstate or foreign commerce, with intent that a murder be committed in violation of the laws of any State or the United States as consideration for the receipt of, or as consideration for a promise or agreement to pay, anything of pecuniary value, or who conspires to do so, shall be fined under this title or imprisoned for not more than ten years, or both; and if personal injury results, shall be fined under this title or imprisoned for not more than twenty years, or both; and if death results, shall be punished by death or life imprisonment, or shall be fined not more than $250,000, or both.

(b) As used in this section and section 1959— … (2) “facility of interstate or foreign commerce” includes means of transportation and communication;

Obviously it is the goal of the statute to cover the use of encrypted chats and the like on the internet to arrange, plan, and coordinate mob hits, assassinations, and other homicides.

But there is a larger issue with queasy judges who hem and haw at the reliance on the interstate commerce clause of the Constitution to establish the jurisdiction for a possible federal death penalty case.

That’s the same clause of the Constitution on which the feds rely to impose gun control, with the same punishments for mere possession of an otherwise legal firearm or ammunition by a non-professional, when the person is not a criminal and the firearm itself is securely stored or carried for self-defense, lawful hunting, etc., and clearly not part of a plan to commit murder, especially on a professional basis. The lazy prosecutor’s attitude in such cases is then to impose a lifetime cheap copper’s gun ban on “prohibited persons,” social undesirables, mental defectives, etc., just to “get the guns off the street” with a wink and a nod to the real professional killers and hit men who “work” on behalf of city hall and the Establishment to eliminate and do away with the same social undesirables and mental defectives who are placed on their NICS hit list for general spite.

And the very same doctors who so often go to court with a mental health diagnosis to revoke the gun rights of their patients, and so aggressively pull the teeth of poor people, are now prescribing and administering shots for COVID-19 vaccination on a government-mandated basis, and there are already reports of healthy people becoming sick and dying after receiving compulsory COVID-19 immunization shots.

xcv March 8, 2021 1:51 AM

Regarding previous comment:

1 bit of information = 9.56992629 × 10^-24 m^2 kg s^-2 K^-1

This is based on the notion — a rather irrefutable notion at that — that one bit of information essentially represents a doubling of the state space of the universe, because there are effectively two universes: one universe in which the bit is cleared to 0, and the other universe in which the bit is set to 1, all other things being equal.

Boltzmann’s constant kB measures a unit increase in the natural logarithm of the thermodynamic probability of the state space of the universe. Taking logarithms to base 2 rather than base e, Landauer’s principle yields the equivalence of 1 bit of information entropy to kB×ln(2) thermodynamic entropy.
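
For anyone who wants to check the arithmetic, here is a minimal sketch of the kB×ln(2) conversion. The function names are illustrative only, and the constant used is the exact value fixed by the 2019 SI redefinition, which is an assumption about which value to use. It agrees with the figure quoted above to about six significant figures; the small difference in the later digits appears consistent with an earlier measured value of the constant. The second function applies the usual statement of Landauer's principle (minimum energy to erase a bit is kB×T×ln(2)) at a temperature you supply.

```python
import math

# Boltzmann constant in J/K (exact by definition since the 2019 SI redefinition).
# Assumption: the figure quoted in the comment may have used an older value.
K_B = 1.380649e-23


def entropy_of_bits(n_bits: float) -> float:
    """Thermodynamic entropy in J/K equivalent to n_bits of information."""
    return n_bits * K_B * math.log(2)


def landauer_limit_joules(n_bits: float, temperature_kelvin: float) -> float:
    """Minimum energy in joules to erase n_bits at the given temperature."""
    return entropy_of_bits(n_bits) * temperature_kelvin


if __name__ == "__main__":
    # One bit, in J/K (same units as m^2 kg s^-2 K^-1): roughly 9.57e-24
    print(f"1 bit = {entropy_of_bits(1):.6e} J/K")
    # Erasing one bit at 300 K (an arbitrary room-temperature choice)
    print(f"Landauer limit at 300 K = {landauer_limit_joules(1, 300.0):.3e} J")
```

The point of the sketch is only that the quoted constant is kB×ln(2) and nothing more exotic; it says nothing about whether a bit should be treated as a commodity.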

While in principle any irreversible computation may be reversed by storing all the intermediate steps and then unrolling the entire computation to its true state of beginning, save only the output, Landauer’s principle still establishes hard universal limits on the total space complexity of any computer program or algorithm.

If P=NP, then the polynomial hierarchy of bounded alternation (PH) collapses to P, the class of problems solvable in polynomial time on a deterministic Turing machine. It is known that PSPACE is equivalent to the class of problems solvable in polynomial time on a non-deterministic Turing machine with unbounded alternation between universal and existential quantification of non-deterministic branching.

Unbounded alternation is sufficiently powerful to reduce polynomial time to polynomial space, but it still isn’t clear that even PSPACE, let alone NP, properly contains P.

1&1~=Umm March 8, 2021 12:31 PM

@xcv:

“This is based on the notion — a rather irrefutable notion at that”

Have you considered the effect of Georg Cantor’s diagonal argument/proof of 1891 in your reasoning about the “set” of bits that go into making this universe?


Sidebar photo of Bruce Schneier by Joe MacInnis.