Google Releases Basic Homomorphic Encryption Tool

Google has released an open-source cryptographic tool: Private Join and Compute. From a Wired article:

Private Join and Compute uses a 1970s methodology known as "commutative encryption" to allow data in the data sets to be encrypted with multiple keys, without it mattering which order the keys are used in. This is helpful for multiparty computation, where you need to apply and later peel away multiple layers of encryption without affecting the computations performed on the encrypted data. Crucially, Private Join and Compute also uses methods first developed in the '90s that enable a system to combine two encrypted data sets, determine what they have in common, and then perform mathematical computations directly on this encrypted, unreadable data through a technique called homomorphic encryption.

True homomorphic encryption isn't possible, and my guess is that it will never be feasible for most applications. But limited application tricks like this have been around for decades, and sometimes they're useful.

Boing Boing article.

Posted on July 2, 2019 at 6:24 AM • 16 Comments

Comments

Jiri StaryJuly 2, 2019 8:04 AM

Would it be possible to use this for e-voting ? To ensure that everybody voted max once and count the votes without revealing their value ?

HumdeeJuly 2, 2019 8:30 AM

@bruce

"Sometimes they are useful." Can you privide some examples?

BrianJuly 2, 2019 10:34 AM

> True homomorphic encryption isn't possible

In the sense of not currently possible or mathematically impossible? If the latter, can you point me towards a proof? (curious, not doubting)

Clive RobinsonJuly 2, 2019 10:43 AM

@ Bruce,

Funny I mentioned a simple way how to do this over on the DB thread a day or so ago.

@ ALL,

From the wired article,

    It facilitates the process of joining numeric columns from different data sets to calculate a sum, count, or average on data that is encrypted and unreadable during its entire mathematical journey.

Note that "compare" or "equal to zero" operations are not mentioned. These are kind of important if you want to "search data" rather than just "fiddle with figures" like trying to find an average salary for a department or at a pay spine grade.

Whilst you can go and look up Pohlig-Hellman encryption[1] to see a real communative cipher, the idea can be explained in a simpler way.

As we should all know we can do as many additions as we like in any order to the integer result of a data number + secret key, and we can then subtract those numbers and the secret key in any order to get back the data number. At each stage the number added can be "data number + secret key" for each party. As long as the total of only the secret keys are removed you get the sum of the individual "data number's".

One example that is oft quoted is the cryptographas sitting around a dining table trying to work out their average salary without revealing their actual salary. They are mulling it over supposadly comming up with complex scheams when one of their number's daughter who was also there says "I know how to do this it's easy" much surprised they ask. So she writes a number down on a piece of paper and hands it to her father and tells him, add your salaty in cents and give the result only to the next cryptographer. So he does and each in turn adds their salary but only passes the result around. Eventually the final result comes back to the young lady, who subtracts the original secret number, and divides the result by the number of cryptographers and reads out to all the avarage salary.

Obviously the cryptographas could also have added a personal secret number as well. As long as all the secret numbers get subtracted from the final sum the average will be correct.

Most people can with a good hot cup of the brown stuff and a note pad and pencil can work out other ways to do various things in different ways, but the basic idea remains the same.

Oh and the other thing about commuative ciphers proper is their ability to have side channels and work as deniable cryptography.

[1] As Bruce notes the idea is not exactly original, Steve Belovin and Bill Cellovic came up with a version back in 2004,

https://www.academia.edu/26772830/Privacy-Enhanced_Searches_Using_Encrypted_Bloom_Filters

ZackJuly 2, 2019 12:12 PM

@Brian: I believe the post intends to claim that "True homomorphic encryption is not possible [using this library]." We have had fully homomorphic encryption (mostly under some assumptions about the hardness of various lattice problems) since Craig Gentry's 2009 STOC paper. Wikipedia has a decent summary, and Gentry's PhD thesis is pretty readable as far as these things go. The state-of-the-art has moved on since then, with a lot of work by IBM, Microsoft Research, and many other institutions.

AdrianJuly 2, 2019 1:01 PM

Is it possible to ensure the other party is bringing valid (or at least) plausible data to the party?

I'm concerned by the examples involving public-private partnerships. If my local jurisdiction is going to try to use data to make policy decisions (like whether to add train service), I would expect the data to be available to any constituent who wants to try to scrutinize the analysis.

DouglasJuly 2, 2019 9:01 PM

Homomorphic Encryption is one of the most popular topic in cryptography now, and there is already some constructions. Truly, the so call Full Homomorphic Encryption still has many questions.

Moti YungJuly 3, 2019 4:32 AM

The scientific article is available with details, github has the code. It is very easy to verify that it uses "additive homomorphic encryption" that is used for affine and linear function and it is very much possible (I am an author on the scientific article... commenting on what a trade press article write about a blog that refers to a scientific paper is tricky).

Bruce, you are simply wrong to not look at the original code and scientific writing and refer to a derivative press article!

(The above is my personal opinion).

GregWJuly 3, 2019 4:48 AM

Combining datasets without exposing the details of them is not just useful for cross-firm analysis but also within-a-firm scenarios where one of the sets of data involved GDPR or sensitive PII data which you don't want to cross some security boundary.

IsmarJuly 3, 2019 7:15 AM

In a way I feel good to know that Google is funding these types of projects where statistical analysis is more important than individual data. Maybe, we can acknowledge these positive trends regardless of where they might be coming from and have some useful debate about this type of data analysis and its pros and cons as opposed to the more often talked about personal tracking aspects of data collection?

Moti YungJuly 3, 2019 11:57 AM

@ Clive.
Yes, this is the paper. It has the business/ deployment background and the technical feasibility, and the protocols considered under a paradigm (before there was a patent in early 2013 and some protocol details published but this is the most comprehensive work on this).
Hopefully the business/ deployment people will get the technical difficulties and the technical/ crypto people will understand the deployment puzzles presented).
It is far from the generic criticism on FHE that it is infeasible, this one is very much feasible and now opened, and carries the water in the real world if you choose the right bucket to carry [metaphorically speaking]... as the paper explains!).

I am sorry but I am not responsible to what is covered by the trade press, I am with my co-authors responsible for this and the associated code. It is based on hard and lengthy work leading to actual working functional delivering systems!

(personal opinion as well).

Clive RobinsonJuly 3, 2019 5:12 PM

@ Moti Yung,

Thanks for the conformation.

With regards,

I am sorry but I am not responsible to what is covered by the trade press

As a number of longterm readers of this blog know I've got a somewhat jaded view of journalists, in some cases from unfortunate experiance.

Oh by the way, I'll take the opportunity to thank you and Adam Young for the Cryprovirology book, I'd just wish a few more people would read it, as it's an easy read for all levels of security practitioner and software developer. Hopefully it might just give a few of them a much needed reminder of just what can be done.

It's not just journalists that cause me concern, I keep seeing attacks that worked twenty or thirty years ago being marginally changed or repurposed and working anew. I don't think it's "forgetting" as much as "not learning" which does not bode well for the industry.

65535July 4, 2019 7:36 AM

As Bruce S. notes,

"True homomorphic encryption isn't possible..."

I have to agree with that statement.

I have yet to see "Peer Reviewed" or empirical evidence of a real world true homomorphic encryption digital systems that are hardened against attackers. Until that time, I would not recommend giving large data mining corporations access to personal medical or private data sets. I think they have enough as it is.

As I started to read the Wired article Bruce S. links to and I saw the sentence:

"As the cryptographer Phil Rogaway puts it, privacy-preserving surveillance is still surveillance," he says..." -Wired

ht tps://www.wired.com/story/google-private-join-compute-database-encryption/

[Links broken for safety]

Or, take a look at Phillip Rogaway's 2015 paper The Moral Character of Cryptographic Work

ht tps://web.cs.ucdavis.edu/~rogaway/papers/moral.html

The full version:

ht tps://web.cs.ucdavis.edu/~rogaway/papers/moral-fn.pdf

These is truth to the PP-Surveillance is still surveillance idea.

@ Clive Robinson

"...I keep seeing attacks that worked twenty or thirty years ago being marginally changed or repurposed and working anew..." -Clive Robinson

Yes, that's a true statement. On paper various encryption systems appear to be safe and sound. When used in the wild with current digital equipment things can change drastically. These items are shown throughout this web log. Poor implementation of crypto systems, side channel attacks and a multitude of other problem crop up. As they say, things are more easily said than done.

There are a number of things that could go wrong - not the least of which is the problem of unintended consequences. Or, as Wikipedia notes:

"Perverse result: A perverse effect contrary to what was originally intended (when an intended solution makes a problem worse). This is sometimes referred to as 'backfire'..."-Wikipedia

ht tps://en.wikipedia.org/wiki/Unintended_consequences

That would include the over-reliance on cell phones and digital transmission of credit card data which ends up on the black market to be used by criminals with damaging effects to banks and individuals. Other problems include using large data sets in digital transmission or at rest that are intercepted, spilled or data minded for corporate monetary gain.

An interesting example is Google's VirusTotal which now seems to be a favored method of hackers "repurposing" old digital virus code and checking it against VirusTotal for the ability to slip by anti-virus products - only be reused on victims.

Clive, you point to Wikipedia which has a fairly good section on the subject. Sure, it is decades old and has gone through at least three major versions. But, Wikipedia points to the problem of "Malleability" and attack vectors.

"Homomorphic encryption schemes are inherently malleable. In terms of malleability, homomorphic encryption schemes have weaker security properties than non-homomorphic schemes."-Wikipedia

ht tps://en.wikipedia.org/wiki/Homomorphic_encryption

"Malleability is often an undesirable property in a general-purpose cryptosystem, since it allows an attacker to modify the contents of a message. For example, suppose that a bank uses a stream cipher to hide its financial information, and a user sends an encrypted message containing, say, "TRANSFER $0000100.00 TO ACCOUNT #199." If an attacker can modify the message on the wire, and can guess the format of the unencrypted message..."- Wikipedia

ht tps://en.wikipedia.org/wiki/Malleability_(cryptography)

The above is not good and even worse when used on a large scale of private data or cumulative private databases including school lunch menus and medical data.

Next,is the problem of Data Capitalism or Corporate Data Capitalism including the intractable interaction between the corporate sector and the military sector which should be addressed.

Clive you have alluded to the problem of large for-profit corporations like Google who grab personal data and monetize it for themselves and their shareholders with no respect for privacy - it's clear the situation has to be balanced.

Google has a number of highly paid employees, expensive offices and corporate jets that need to be financed. Google has venture capitalist that must be repaid to ensure it is allowed access into the capital formation process successfully - to keep its cash flow gears well oiled [stock markets, bond markets and bank loans]. It is clearly at cross-purposes with personal privacy - like MS, Amazon, Apple, Facebook and so on [Google is not alone].

"Google's headquarters in Mountain View, California is referred to as "the Googleplex", a play on words on the number googolplex and the headquarters itself being a complex of buildings. Internationally, Google has over 78 offices in more than 50 countries" -Wikipedia

ht tps://en.wikipedia.org/wiki/Google

As you can see with its many buildings, personnel, payroll, and equipment, it needs a lot of revenue to support its expenses. It has to come from somewhere. That is a person or government. It reassuring to hear Google has aggressive tax shelters to off-set its many expenses.

"Schmidt has claimed that Google's use of artificial distinctions to avoid paying billions of pounds in corporation tax owed by its UK operations is "capitalism" and that he was "very proud of it..."- Wikipedia

ht tps://en.wikipedia.org/wiki/Eric_Schmidt

Clive, I am sure that if you were the head of Google that you may have done things differently. Well, that is water under the bridge as they say.

@ Moti Yung

I want to thank you on your interesting and well written paper on homomorphic encryption and Google's Private Join and Compute. You put considerable work into it including the lines of code at the end. Please post a link to said code on GitHub.

Combining true homomorphic with huge private data sets to solve social problems is a noble goal. I hope that in the future your project will be peer reviewed and be used for socially good uses.

I did see you and your team had Google address:
Mion, benkreuter, anergiz, sarvar, marianar, shobhitsaxena, karn, dshanahan, moti @google[.]com with an address of Google, LLC 1600 Amphitheatre Pkwy, Mountain View, CA 94043.

Your post is educational and provocative. You have energized this web log. I hope you will continue to post on this web log.

[Excuse all of the mistakes I had to bang this out]

TonyJuly 7, 2019 2:26 PM

"Crucially, Private Join and Compute also uses methods first developed in the '90s that enable a system to combine two encrypted data sets, determine what they have in common, and then perform mathematical computations directly on this encrypted, unreadable data through a technique called homomorphic encryption."

Would this be any use for securing DNA data?

Leave a comment

Allowed HTML: <a href="URL"> • <em> <cite> <i> • <strong> <b> • <sub> <sup> • <ul> <ol> <li> • <blockquote> <pre>

Sidebar photo of Bruce Schneier by Joe MacInnis.