Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World

  • Mara Paun
  • Law, Innovation & Technology
  • May 2018

Data and Goliath is Bruce Schneier’s most recent book. Published in 2015, the book addresses the issues arising from governments’ and corporations’ extensive capabilities for mass surveillance, and the dangers they bring about. As Schneier aptly puts it, “[w]e live in the golden age of surveillance,” and this affects both our security and our freedoms (4). The book is meant to convey an eye-opening message: we need to change the status quo, and we need to do it soon.

Since its publication, Data and Goliath has been recognised as a thought-provoking and compelling book about the reality of surveillance, leading Malcolm Gladwell[1] to state that “The public conversation about surveillance in the digital age would be a good deal more intelligent if we all read Bruce Schneier first.” This points exactly to the type of book Data and Goliath is. It is a book for everyone to read: no matter what their background or age, readers will be able to understand the magnitude of the issue presented and its message. The writing style is informal and free of jargon, and the book is well explained and filled with supporting examples, without oversimplifying the complexity at stake. For those interested in a more in-depth analysis, more than 100 pages of notes are available, containing extra commentary as well as numerous and varied sources. Moreover, updates to the book are posted on the page of Schneier’s website dedicated to Data and Goliath.[2] Although Schneier writes with a strong US bias, the societal and technological issues identified are generally valid; international angles are also provided, for instance by reference to European law, international organizations or situations taking place outside the borders of the US.

In terms of structure, the book consists of 16 chapters, divided into three parts: (1) “The world we’re creating”; (2) “What’s at stake”; and (3) “What to do about it.” This structure helps the reader comprehend the bigger picture, guided by the title of each part. Furthermore, every single section in Data and Goliath has a lesson, usually concisely expressed in the last sentence, which could be called a “take-away sentence.” This approach makes the book both inspiring and easier to grasp for laymen as well as professionals.

Part 1 describes the role of data in a surveillance society. It starts by defining data as a by-product of computing, emphasizing that access to the data surrounding a person means access to that person (Chapter 1) (19). Subsequently, it explains how data can be used for surveillance (Chapter 2), emphasizing that content data collection is invasive, and metadata collection equally so. On the basis of these data, conclusions are drawn which can impact our lives without our even knowing or having the opportunity to object (20). Chapter 3 moves to the analysis of data and explains the correlations made possible by Big Data analytics, and how even large sets of anonymous data can presently be de-anonymized (43). It concludes by stating that more robust techniques for preserving anonymity are needed (45). The business of surveillance is discussed in Chapter 4, which focuses on corporate surveillance and the exchange of our data for services. It explains Internet tracking via cookies, behavioral advertising as well as the data broker industry (47). While doing so, it studies power concentrations in this industry and users’ loss of control over their computer environment (56, 59). Chapter 5 turns to government surveillance, with a focus on the US agencies and the Snowden revelations (62), while also looking at other governments’ practices, for instance those of the UK or Germany. A very important point made in Chapter 6 is that corporate surveillance and government surveillance are intertwined, supporting one another (79). This is because corporate and government surveillance interests have converged, meaning that both want to know everything about everyone, albeit for different reasons, which leads to strong public-private security/surveillance partnerships (25).

Part 2 discusses the interrelated harms arising from ubiquitous mass surveillance. Chapter 7 deals with its effects on political liberty and justice and starts with accusation by inference from data. It emphasizes that we should not have to worry about how our daily actions or data trails might be interpreted by the government (94); this fear may lead to chilling effects (95) that are damaging to political discourse (97). The discussion turns to whistle-blowers and their lack of protection when they disclose illegal government practices, compared to the protection they receive when they do so against corporations (at least under US law) (101). A second harm identified is related to commercial fairness and equality (Chapter 8), including the themes of surveillance-based discrimination, such as in pricing or credit scores (109–110), surveillance-based manipulation, for instance the display of the “I Voted” icon during elections or the phenomenon of the “filter bubble” (114–115), and privacy breaches where customer data are stolen (116). Business competitiveness in the US is also affected by National Security Agency (NSA) surveillance (Chapter 9), since customers choose non-US cloud providers, computers or networking equipment to escape US surveillance, or simply because people do not trust US companies (121). Chapter 10 turns to harms caused to privacy and the misconception that privacy is about having something to hide (125). It continues by explaining that not all privacy violations are equally damaging, and people in marginalized socioeconomic situations may be affected to a larger extent (127). Furthermore, it clarifies that automatic collection and analysis of data already imply harm to privacy, and that even the possibility that a human being might look at the collected data or guide algorithms that process those data already qualifies as surveillance (130–131). The last chapter of Part 2 (Chapter 11) is about security. It talks about data mining as a means to “connect the dots” and its doubtful efficiency in terrorism detection (136–140), the value of encryption (143–144) as well as the risks of maintaining an insecure Internet, with backdoors for surveillance, or by stockpiling vulnerabilities: while these can be used by intelligence agencies to protect national security, they can be used as an entry door for cyber-attacks as well (146–150).

Part 3 is about what we can do about the world that we have created, and it is addressed to governments, corporations as well as citizens. It begins by establishing general principles to apply in the context of surveillance (Chapter 12): security and privacy (there should not be a trade-off between security and privacy, and we should maintain them both) (155–157), security over surveillance (designing systems with the minimum surveillance necessary to function, while retaining data for the shortest time possible) (157–158), transparency (159), oversight and accountability (161), and resilient design (if systemic imperfections are unavoidable, we need to design technology to work despite them) (164). The next three chapters suggest solutions. Firstly, Chapter 13 offers solutions for government, including, inter alia, transferring the traditional law enforcement transparency principles to national security (171), improving oversight (172), protecting whistle-blowers (178), collecting data more narrowly and only with judicial approval (179), and limiting the military’s role in cyberspace (185). Some interesting and bold proposals are made, such as breaking up the NSA and restoring the agencies’ responsibilities prior to 9/11, and establishing commons on the Internet (places not controlled by private parties) (186–189). Secondly, regarding solutions for companies (Chapter 14), the suggestions include making companies liable for privacy breaches (191), regulating data collection (195) as well as use (197), introducing a form of data minimization (199), and giving individuals rights to their data (200). In these sections, the EU data protection framework is mentioned as a leading example of data protection regulation (191). Furthermore, companies are called upon to fight government surveillance and support litigation efforts against warrants (207). Thirdly, solutions for “the rest of us” are offered, both in the context of our private lives (such as using privacy-enhancing technologies or choosing the provider we trust most) and in the context of asking for political change (for instance, engaging with legislators or supporting relevant non-governmental organizations), and, most importantly, not giving up, because “fatalism is the enemy of change” (224, Chapter 15).

The concluding chapter (Chapter 16) discusses social norms and the big data trade-off, and it contains suggestions for changes in the attitude of individuals in order to get beyond the surveillance society. Schneier advocates that we need to recalibrate our fear (of terrorism) (227)—privacy is not to be traded in exchange for security, but needs to be protected, thereby achieving real security (233). Furthermore, we need social norms to tell us when and how to use the information we have about another person in times when the ephemeral nature of communications is displaced by the semi-permanent nature of the Internet (230). Regarding Big Data, Schneier identifies the tension between group value and individual value, and argues that society needs to find a “Nash equilibrium” for data collection (237). The last few lines of the book contain a powerful message regarding future generations—we should try to make them proud (238).

From a layman’s perspective, Data and Goliath sets the scene for a public discussion on surveillance. From an academic perspective, the multitude of situations described as illustrations of a surveillance society are indeed relevant and compelling; however, a more thorough analysis could have been applied to them. Although there are references to laws and to theories such as the “filter bubble,” the level of analysis could have been raised. For instance, as a European data protection scholar, I noticed one phrase in particular that might convey an inaccurate impression to the reader: “Unlike the EU, in the US today personal information about you is not your property; it’s owned by the collector” (195). While the regulatory approach in the EU does emphasize the data subject’s control over the processing of information about them as part of the fundamental right to the protection of personal data, this has not reached the level of property over personal information.[3]

However, the fact remains that Schneier does an incredible job of bridging discussions between academia and the public at large, and delivers a book that is an accessible and highly informative read for both audiences.


1 Malcolm Gladwell is the author of David and Goliath: Underdogs, misfits, and the art of battling giants (Little, Brown & Co., 2013).

2 (accessed 21 March 2018).

3 This review is not the right place to go into the debate relating to property over personal data; see, for instance, N Purtova, Property rights in personal data: a European perspective (Uitgeverij BOXPress, 2011) ISBN 978-90-8891-235-1.
