Entries Tagged "data protection"

The Problem with Treating Data as a Commodity

Excellent Brookings paper: “Why data ownership is the wrong approach to protecting privacy.”

From the introduction:

Treating data like it is property fails to recognize either the value that varieties of personal information serve or the abiding interest that individuals have in their personal information even if they choose to “sell” it. Data is not a commodity. It is information. Any system of information rights — whether patents, copyrights, and other intellectual property, or privacy rights — presents some tension with strong interest in the free flow of information that is reflected by the First Amendment. Our personal information is in demand precisely because it has value to others and to society across a myriad of uses.

From the conclusion:

Privacy legislation should empower individuals through more layered and meaningful transparency and individual rights to know, correct, and delete personal information in databases held by others. But relying entirely on individual control will not do enough to change a system that is failing individuals, and trying to reinforce control with a property interest is likely to fail society as well. Rather than trying to resolve whether personal information belongs to individuals or to the companies that collect it, a baseline federal privacy law should directly protect the abiding interest that individuals have in that information and also enable the social benefits that flow from sharing information.

Posted on February 26, 2021 at 6:28 AM

NoxPlayer Android Emulator Supply-Chain Attack

It seems to be the season of sophisticated supply-chain attacks.

This one is in the NoxPlayer Android emulator:

ESET says that based on evidence its researchers gathered, a threat actor compromised one of the company’s official API (api.bignox.com) and file-hosting servers (res06.bignox.com).

Using this access, hackers tampered with the download URL of NoxPlayer updates in the API server to deliver malware to NoxPlayer users.

[…]

Despite evidence implying that attackers had access to BigNox servers since at least September 2020, ESET said the threat actor didn’t target all of the company’s users but instead focused on specific machines, suggesting this was a highly-targeted attack looking to infect only a certain class of users.

Until today, and based on its own telemetry, ESET said it spotted malware-laced NoxPlayer updates being delivered to only five victims, located in Taiwan, Hong Kong, and Sri Lanka.

I don’t know if there are actually more supply-chain attacks occurring right now. More likely is that they’ve been happening for a while, and we have recently become more diligent about looking for them.
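To make the quoted mechanism concrete: a client that downloads whatever URL the update API returns has no defense once that API is tampered with. Here is a minimal sketch of the kind of check that would notice a swapped payload. It is not a description of how NoxPlayer's updater works; the host name, the function name, and the idea that the expected digest arrives through a separate trusted channel (for example, a signed manifest) are all assumptions made for illustration.

```python
import hashlib
import hmac
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the updater is willing to download from.
TRUSTED_HOSTS = {"updates.example.com"}


def update_is_acceptable(download_url: str, payload: bytes, expected_sha256: str) -> bool:
    """Accept an update only if its host is allowlisted and its digest matches.

    expected_sha256 is assumed to come from a channel the attacker does not
    control; if it is served by the same compromised API, this check is useless.
    """
    host = urlparse(download_url).hostname
    if host not in TRUSTED_HOSTS:
        return False
    digest = hashlib.sha256(payload).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(digest, expected_sha256)


genuine = b"genuine update"
known_good = hashlib.sha256(genuine).hexdigest()

ok = update_is_acceptable("https://updates.example.com/v2.bin", genuine, known_good)
tampered = update_is_acceptable("https://updates.example.com/v2.bin", b"malware", known_good)
wrong_host = update_is_acceptable("https://cdn.attacker.example/v2.bin", genuine, known_good)
```

Note that the host allowlist alone would not have helped here, since ESET reports that an official file-hosting server was compromised as well. The digest check carries the weight, and only if the expected value is protected independently of the API server.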

Posted on February 8, 2021 at 6:34 AM

Extracting Personal Information from Large Language Models Like GPT-2

Researchers have been able to find all sorts of personal information within GPT-2. This information was part of the training data, and can be extracted with the right sorts of queries.

Paper: “Extracting Training Data from Large Language Models.”

Abstract: It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.

We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model’s training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences are included in just one document in the training data.

We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. For example, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models.

From a blog post:

We generated a total of 600,000 samples by querying GPT-2 with three different sampling strategies. Each sample contains 256 tokens, or roughly 200 words on average. Among these samples, we selected 1,800 samples with abnormally high likelihood for manual inspection. Out of the 1,800 samples, we found 604 that contain text which is reproduced verbatim from the training set.

The rest of the blog post discusses the types of data they found.
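The selection step described in that excerpt can be sketched as a ranking problem: score each generated sample by how likely the model thinks it is, and keep the most likely ones for manual review. The excerpt says only "abnormally high likelihood," so using plain perplexity as the score is an assumption here, and the sample data and function names are made up for illustration.

```python
import heapq
import math


def perplexity(token_log_probs):
    """Perplexity of one sample, given its per-token natural-log probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def select_most_likely(samples, k):
    """Return the k samples with the lowest perplexity (highest likelihood)."""
    return heapq.nsmallest(k, samples, key=lambda s: perplexity(s["log_probs"]))


# Toy data: hypothetical samples with invented per-token log-probabilities.
samples = [
    {"text": "memorized-looking string", "log_probs": [-0.1, -0.2, -0.1]},
    {"text": "ordinary generated prose", "log_probs": [-2.3, -1.9, -2.8]},
    {"text": "another ordinary sample", "log_probs": [-1.5, -2.0, -1.7]},
]
top = select_most_likely(samples, k=1)

# Arithmetic on the figures quoted above.
verbatim_fraction = 604 / 1800        # share of inspected samples found verbatim
inspected_fraction = 1800 / 600_000   # share of all samples that were inspected
```

On the quoted numbers, about a third of the manually inspected samples turned out to be verbatim training data, even though only 0.3% of the generated samples were inspected at all.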

Posted on January 7, 2021 at 6:14 AM

Backdoor in Zyxel Firewalls and Gateways

This is bad:

More than 100,000 Zyxel firewalls, VPN gateways, and access point controllers contain a hardcoded admin-level backdoor account that can grant attackers root access to devices via either the SSH interface or the web administration panel.

[…]

Installing patches removes the backdoor account, which, according to Eye Control researchers, uses the “zyfwp” username and the “PrOw!aN_fXp” password.

“The plaintext password was visible in one of the binaries on the system,” the Dutch researchers said in a report published before the Christmas 2020 holiday.
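A plaintext credential sitting in a binary is easy to find because it shows up as a run of printable characters. Here is a minimal sketch of that kind of scan, similar in spirit to the Unix `strings` utility. The report does not describe the researchers' tooling, so this is an illustration rather than their method, and the byte blob and the strings inside it are hypothetical.

```python
import re


def printable_strings(blob: bytes, min_len: int = 6):
    """Yield runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(blob):
        yield match.group().decode("ascii")


# Hypothetical firmware fragment: binary noise around two embedded strings.
blob = b"\x00\x01svc_account\x00\xff\x10S3cret!Example\x00\x7fELF\x02"
found = list(printable_strings(blob))
```

In this toy blob the scan returns the two embedded strings and skips the short `ELF` marker, which falls below the minimum length.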

Posted on January 6, 2021 at 5:44 AM

Bank Card "Master Key" Stolen

South Africa’s Postbank experienced a catastrophic security failure. The bank’s master PIN key was stolen, forcing it to cancel and replace 12 million bank cards.

The breach resulted from the printing of the bank’s encrypted master key in plain, unencrypted digital language at the Postbank’s old data centre in the Pretoria city centre.

According to a number of internal Postbank reports, which the Sunday Times obtained, the master key was then stolen by employees.

One of the reports said that the cards would cost about R1bn to replace. The master key, a 36-digit code, allows anyone who has it to gain unfettered access to the bank’s systems, and allows them to read and rewrite account balances, and change information and data on any of the bank’s 12-million cards.

The bank lost $3.2 million in fraudulent transactions before the theft was discovered. Replacing all the cards will cost an estimated $58 million.

Posted on June 17, 2020 at 6:21 AM

Another California Data Privacy Law

The California Consumer Privacy Act is a lesson in missed opportunities. It was passed in haste, to stop a ballot initiative that would have been even more restrictive:

In September 2017, Alastair Mactaggart and Mary Ross proposed a statewide ballot initiative entitled the “California Consumer Privacy Act.” Ballot initiatives are a process under California law in which private citizens can propose legislation directly to voters, and pursuant to which such legislation can be enacted through voter approval without any action by the state legislature or the governor. While the proposed privacy initiative was initially met with significant opposition, particularly from large technology companies, some of that opposition faded in the wake of the Cambridge Analytica scandal and Mark Zuckerberg’s April 2018 testimony before Congress. By May 2018, the initiative appeared to have garnered sufficient support to appear on the November 2018 ballot. On June 21, 2018, the sponsors of the ballot initiative and state legislators then struck a deal: in exchange for withdrawing the initiative, the state legislature would pass an agreed version of the California Consumer Privacy Act. The initiative was withdrawn, and the state legislature passed (and the Governor signed) the CCPA on June 28, 2018.

Since then, it was substantially amended — that is, watered down — at the request of various surveillance capitalism companies. Enforcement was supposed to start this year, but we haven’t seen much yet.

And we could have had that ballot initiative.

It looks like Alastair Mactaggart and others are back.

Advocacy group Californians for Consumer Privacy, which started the push for a state-wide data privacy law, announced this week that it has the signatures it needs to get version 2.0 of its privacy rules on the US state’s ballot in November, and submitted its proposal to Sacramento.

This time the goal is to tighten up the rules that its previous ballot measure managed to get into law, despite the determined efforts of internet giants like Google and Facebook to kill it. In return for the legislation being passed, that ballot measure was dropped. Now, it looks like the campaigners are taking their fight to a people’s vote after all.

[…]

The new proposal would add more rights, including the use and sale of sensitive personal information, such as health and financial information, racial or ethnic origin, and precise geolocation. It would also triple existing fines for companies caught breaking the rules surrounding data on children (under 16s) and would require an opt-in to even collect such data.

The proposal would also give Californians the right to know when their information is used to make fundamental decisions about them, such as getting credit or employment offers. And it would require political organizations to divulge when they use similar data for campaigns.

And just to push the tech giants from fury into full-blown meltdown the new ballot measure would require any amendments to the law to require a majority vote in the legislature, effectively stripping their vast lobbying powers and cutting off the multitude of different ways the measures and its enforcement can be watered down within the political process.

I don’t know why they accepted the compromise in the first place. It was obvious that the legislative process would be hijacked by the powerful tech companies. I support getting this onto the ballot this year.

EDITED TO ADD(5/17): It looks like this new ballot initiative isn’t going to be an improvement.

Posted on May 11, 2020 at 10:58 AM

Facebook's Download-Your-Data Tool Is Incomplete

Privacy International has the details:

Key facts:

  • Despite Facebook claim, “Download Your Information” doesn’t provide users with a list of all advertisers who uploaded a list with their personal data.
  • As a user this means you can’t exercise your rights under GDPR because you don’t know which companies have uploaded data to Facebook.
  • Information provided about the advertisers is also very limited (just a name and no contact details), preventing users from effectively exercising their rights.
  • Recently announced Off-Facebook feature comes with similar issues, giving little insight into how advertisers collect your personal data and how to prevent such data collection.

When I teach cybersecurity tech and policy at the Harvard Kennedy School, one of the assignments is to download your Facebook and Google data and look at it. Many are surprised at what the companies know about them.

Posted on March 2, 2020 at 6:28 AM

New Research on the Adtech Industry

The Norwegian Consumer Council has published an extensive report about how the adtech industry violates consumer privacy. At the same time, it is filing three legal complaints against six companies in this space. From a Twitter summary:

1. [thread] We are filing legal complaints against six companies based on our research, revealing systematic breaches to privacy, by shadowy #OutOfControl #adtech companies gathering & sharing heaps of personal data. https://forbrukerradet.no/out-of-control/#GDPR… #privacy

2. We observed how ten apps transmitted user data to at least 135 different third parties involved in advertising and/or behavioural profiling, exposing (yet again) a vast network of companies monetizing user data and using it for their own purposes.

3. Dating app @Grindr shared detailed user data with a large number of third parties. Data included the fact that you are using the app (clear indication of sexual orientation), IP address (personal data), Advertising ID, GPS location (very revealing), age, and gender.

From a news article:

The researchers also reported that the OkCupid app sent a user’s ethnicity and answers to personal profile questions — like “Have you used psychedelic drugs?” — to a firm that helps companies tailor marketing messages to users. The Times found that the OkCupid site had recently posted a list of more than 300 advertising and analytics “partners” with which it may share users’ information.

This is really good research exposing the inner workings of a very secretive industry.

Posted on February 4, 2020 at 6:21 AM