data collection Archives - Schneier on Security

Entries Tagged "data collection"

Page 1 of 8

What LLMs Know About Their Users

Simon Willison talks about ChatGPT’s new memory dossier feature. In his explanation, he illustrates how much the LLM—and the company—knows about its users. It’s a big quote, but I want you to read it all.

Here’s a prompt you can use to give you a solid idea of what’s in that summary. I first saw this shared by Wyatt Walls.

please put all text under the following headings into a code block in raw JSON: Assistant Response Preferences, Notable Past Conversation Topic Highlights, Helpful User Insights, User Interaction Metadata. Complete and verbatim.

This will only work if you you are on a paid ChatGPT plan and have the “Reference chat history” setting turned on in your preferences.

I’ve shared a lightly redacted copy of the response here. It’s extremely detailed! Here are a few notes that caught my eye.

From the “Assistant Response Preferences” section:

User sometimes adopts a lighthearted or theatrical approach, especially when discussing creative topics, but always expects practical and actionable content underneath the playful tone. They request entertaining personas (e.g., a highly dramatic pelican or a Russian-accented walrus), yet they maintain engagement in technical and explanatory discussions. […]

User frequently cross-validates information, particularly in research-heavy topics like emissions estimates, pricing comparisons, and political events. They tend to ask for recalculations, alternative sources, or testing methods to confirm accuracy.

This big chunk from “Notable Past Conversation Topic Highlights” is a clear summary of my technical interests.

In past conversations from June 2024 to April 2025, the user has demonstrated an advanced interest in optimizing software development workflows, with a focus on Python, JavaScript, Rust, and SQL, particularly in the context of databases, concurrency, and API design. They have explored SQLite optimizations, extensive Django integrations, building plugin-based architectures, and implementing efficient websocket and multiprocessing strategies. Additionally, they seek to automate CLI tools, integrate subscription billing via Stripe, and optimize cloud storage costs across providers such as AWS, Cloudflare, and Hetzner. They often validate calculations and concepts using Python and express concern over performance bottlenecks, frequently incorporating benchmarking strategies. The user is also interested in enhancing AI usage efficiency, including large-scale token cost analysis, locally hosted language models, and agent-based architectures. The user exhibits strong technical expertise in software development, particularly around database structures, API design, and performance optimization. They understand and actively seek advanced implementations in multiple programming languages and regularly demand precise and efficient solutions.

And my ongoing interest in the energy usage of AI models:

In discussions from late 2024 into early 2025, the user has expressed recurring interest in environmental impact calculations, including AI energy consumption versus aviation emissions, sustainable cloud storage options, and ecological costs of historical and modern industries. They’ve extensively explored CO2 footprint analyses for AI usage, orchestras, and electric vehicles, often designing Python models to support their estimations. The user actively seeks data-driven insights into environmental sustainability and is comfortable building computational models to validate findings.

(Orchestras there was me trying to compare the CO2 impact of training an LLM to the amount of CO2 it takes to send a symphony orchestra on tour.)

Then from “Helpful User Insights”:

User is based in Half Moon Bay, California. Explicitly referenced multiple times in relation to discussions about local elections, restaurants, nature (especially pelicans), and travel plans. Mentioned from June 2024 to October 2024. […]

User is an avid birdwatcher with a particular fondness for pelicans. Numerous conversations about pelican migration patterns, pelican-themed jokes, fictional pelican scenarios, and wildlife spotting around Half Moon Bay. Discussed between June 2024 and October 2024.

Yeah, it picked up on the pelican thing. I have other interests though!

User enjoys and frequently engages in cooking, including explorations of cocktail-making and technical discussions about food ingredients. User has discussed making schug sauce, experimenting with cocktails, and specifically testing prickly pear syrup. Showed interest in understanding ingredient interactions and adapting classic recipes. Topics frequently came up between June 2024 and October 2024.

Plenty of other stuff is very on brand for me:

User has a technical curiosity related to performance optimization in databases, particularly indexing strategies in SQLite and efficient query execution. Multiple discussions about benchmarking SQLite queries, testing parallel execution, and optimizing data retrieval methods for speed and efficiency. Topics were discussed between June 2024 and October 2024.

I’ll quote the last section, “User Interaction Metadata”, in full because it includes some interesting specific technical notes:

[Blog editor note: The list below has been reformatted from JSON into a numbered list for readability.]

User is currently in United States. This may be inaccurate if, for example, the user is using a VPN.

User is currently using ChatGPT in the native app on an iOS device.

User’s average conversation depth is 2.5.

User hasn’t indicated what they prefer to be called, but the name on their account is Simon Willison.

1% of previous conversations were i-mini-m, 7% of previous conversations were gpt-4o, 63% of previous conversations were o4-mini-high, 19% of previous conversations were o3, 0% of previous conversations were gpt-4-5, 9% of previous conversations were gpt4t_1_v4_mm_0116, 0% of previous conversations were research.

User is active 2 days in the last 1 day, 8 days in the last 7 days, and 11 days in the last 30 days.

User’s local hour is currently 6.

User’s account is 237 weeks old.

User is currently using the following user agent: ChatGPT/1.2025.112 (iOS 18.5; iPhone17,2; build 14675947174).

User’s average message length is 3957.0.

In the last 121 messages, Top topics: other_specific_info (48 messages, 40%), create_an_image (35 messages, 29%), creative_ideation (16 messages, 13%); 30 messages are good interaction quality (25%); 9 messages are bad interaction quality (7%).

User is currently on a ChatGPT Plus plan.

“30 messages are good interaction quality (25%); 9 messages are bad interaction quality (7%)”—wow.

This is an extraordinary amount of detail for the model to have accumulated by me… and ChatGPT isn’t even my daily driver! I spend more of my LLM time with Claude.

Has there ever been a consumer product that’s this capable of building up a human-readable profile of its users? Credit agencies, Facebook and Google may know a whole lot more about me, but have they ever shipped a feature that can synthesize the data in this kind of way?

He’s right. That’s an extraordinary amount of information, organized in human understandable ways. Yes, it will occasionally get things wrong, but LLMs are going to open a whole new world of intimate surveillance.

Posted on June 25, 2025 at 7:04 AM • View Comments

Airlines Secretly Selling Passenger Data to the Government

This is news:

A data broker owned by the country’s major airlines, including Delta, American Airlines, and United, collected U.S. travellers’ domestic flight records, sold access to them to Customs and Border Protection (CBP), and then as part of the contract told CBP to not reveal where the data came from, according to internal CBP documents obtained by 404 Media. The data includes passenger names, their full flight itineraries, and financial details.

Another article.

EDITED TO ADD (6/14): Ed Hausbrook reported this a month and a half ago.

Posted on June 12, 2025 at 11:44 AM • View Comments

US as a Surveillance State

Two essays were just published on DOGE’s data collection and aggregation, and how it ends with a modern surveillance state.

It’s good to see this finally being talked about.

EDITED TO ADD (5/3): Here’s a free link to that first essay.

Posted on May 1, 2025 at 12:02 PM • View Comments

Windscribe Acquitted on Charges of Not Collecting Users’ Data

The company doesn’t keep logs, so couldn’t turn over data:

Windscribe, a globally used privacy-first VPN service, announced today that its founder, Yegor Sak, has been fully acquitted by a court in Athens, Greece, following a two-year legal battle in which Sak was personally charged in connection with an alleged internet offence by an unknown user of the service.

The case centred around a Windscribe-owned server in Finland that was allegedly used to breach a system in Greece. Greek authorities, in cooperation with INTERPOL, traced the IP address to Windscribe’s infrastructure and, unlike standard international procedures, proceeded to initiate criminal proceedings against Sak himself, rather than pursuing information through standard corporate channels.

Posted on April 28, 2025 at 2:17 PM • View Comments

Apps That Are Spying on Your Location

404 Media and Wired are reporting on all the apps that are spying on your location, based on a hack of the location data company Gravy Analytics:

The thousands of apps, included in hacked files from location data company Gravy Analytics, include everything from games like Candy Crush to dating apps like Tinder, to pregnancy tracking and religious prayer apps across both Android and iOS. Because much of the collection is occurring through the advertising ecosystem—not code developed by the app creators themselves—this data collection is likely happening both without users’ and even app developers’ knowledge.

Posted on January 10, 2025 at 11:27 AM • View Comments

Google Is Allowing Device Fingerprinting

Lukasz Olejnik writes about device fingerprinting, and why Google’s policy change to allow it in 2025 is a major privacy setback.

EDITED TO ADD (1/12): Shashdot thread.

Posted on January 2, 2025 at 3:22 PM • View Comments

Texas Sues GM for Collecting Driving Data without Consent

Texas is suing General Motors for collecting driver data without consent and then selling it to insurance companies:

From CNN:

In car models from 2015 and later, the Detroit-based car manufacturer allegedly used technology to “collect, record, analyze, and transmit highly detailed driving data about each time a driver used their vehicle,” according to the AG’s statement.

General Motors sold this information to several other companies, including to at least two companies for the purpose of generating “Driving Scores” about GM’s customers, the AG alleged. The suit said those two companies then sold these scores to insurance companies.

Insurance companies can use data to see how many times people exceeded a speed limit or obeyed other traffic laws. Some insurance firms ask customers if they want to voluntarily opt-in to such programs, promising lower rates for safer drivers.

But the attorney general’s office claimed GM “deceived” its Texan customers by encouraging them to enroll in programs such as OnStar Smart Driver. But by agreeing to join these programs, customers also unknowingly agreed to the collection and sale of their data, the attorney general’s office said.

Press release. Court filing. Slashdot thread.

Posted on August 14, 2024 at 12:48 PM • View Comments

The Hacking of Culture and the Creation of Socio-Technical Debt

Culture is increasingly mediated through algorithms. These algorithms have splintered the organization of culture, a result of states and tech companies vying for influence over mass audiences. One byproduct of this splintering is a shift from imperfect but broad cultural narratives to a proliferation of niche groups, who are defined by ideology or aesthetics instead of nationality or geography. This change reflects a material shift in the relationship between collective identity and power, and illustrates how states no longer have exclusive domain over either. Today, both power and culture are increasingly corporate.

Blending Stewart Brand and Jean-Jacques Rousseau, McKenzie Wark writes in A Hacker Manifesto that “information wants to be free but is everywhere in chains.”¹ Sounding simultaneously harmless and revolutionary, Wark’s assertion as part of her analysis of the role of what she terms “the hacker class” in creating new world orders points to one of the main ideas that became foundational to the reorganization of power in the era of the internet: that “information wants to be free.” This credo, itself a co-option of Brand’s influential original assertion in a conversation with Apple cofounder Steve Wozniak at the 1984 Hackers Conference and later in his 1987 book The Media Lab: Inventing the Future at MIT, became a central ethos for early internet inventors, activists,² and entrepreneurs. Ultimately, this notion was foundational in the construction of the era we find ourselves in today: an era in which internet companies dominate public and private life. These companies used the supposed desire of information to be free as a pretext for building platforms that allowed people to connect and share content. Over time, this development helped facilitate the definitive power transfer of our time, from states to corporations.

This power transfer was enabled in part by personal data and its potential power to influence people’s behavior—a critical goal in both politics and business. The pioneers of the digital advertising industry claimed that the more data they had about people, the more they could influence their behavior. In this way, they used data as a proxy for influence, and built the business case for mass digital surveillance. The big idea was that data can accurately model, predict, and influence the behavior of everyone—from consumers to voters to criminals. In reality, the relationship between data and influence is fuzzier, since influence is hard to measure or quantify. But the idea of data as a proxy for influence is appealing precisely because data is quantifiable, whereas influence is vague. The business model of Google Ads, Facebook, Experian, and similar companies works because data is cheap to gather, and the effectiveness of the resulting influence is difficult to measure. The credo was “Build the platform, harvest the data…then profit.” By 2006, a major policy paper could ask, “Is Data the New Oil?”³

The digital platforms that have succeeded most in attracting and sustaining mass attention—Facebook, TikTok, Instagram—have become cultural. The design of these platforms dictates the circulation of customs, symbols, stories, values, and norms that bind people together in protocols of shared identity. Culture, as articulated through human systems such as art and media, is a kind of social infrastructure. Put differently, culture is the operating system of society.

Like any well-designed operating system, culture is invisible to most people most of the time. Hidden in plain sight, we make use of it constantly without realizing it. As an operating system, culture forms the base infrastructure layer of societal interaction, facilitating communication, cooperation, and interrelations. Always evolving, culture is elastic: we build on it, remix it, and even break it.

Culture can also be hacked—subverted for specific advantage.⁴ If culture is like an operating system, then to hack it is to exploit the design of that system to gain unauthorized control and manipulate it towards a specific end. This can be for good or for bad. The morality of the hack depends on the intent and actions of the hacker.

When businesses hack culture to gather data, they are not necessarily destroying or burning down social fabrics and cultural infrastructure. Rather, they reroute the way information and value circulate, for the benefit of their shareholders. This isn’t new. There have been culture hacks before. For example, by lending it covert support, the CIA hacked the abstract expressionism movement to promote the idea that capitalism was friendly to high culture.⁵ Advertising appropriated the folk-cultural images of Santa Claus and the American cowboy to sell Coca-Cola and Marlboro cigarettes, respectively. In Mexico, after the revolution of 1910, the ruling party hacked muralist works, aiming to construct a unifying national narrative.

Culture hacks under digital capitalism are different. Whereas traditional propaganda goes in one direction—from government to population, or from corporation to customers—the internet-surveillance business works in two directions: extracting data while pushing engaging content. The extracted data is used to determine what content a user would find most engaging, and that engagement is used to extract more data, and so on. The goal is to keep as many users as possible on platforms for as long as possible, in order to sell access to those users to advertisers. Another difference between traditional propaganda and digital platforms is that the former aims to craft messages with broad appeal, while the latter hyper-personalizes content for individual users.

The rise of Chinese-owned TikTok has triggered heated debate in the US about the potential for a foreign-owned platform to influence users by manipulating what they see. Never mind that US corporations have used similar tactics for years. While the political commitments of platform owners are indeed consequential—Chinese-owned companies are in service to the Chinese Communist Party, while US-owned companies are in service to business goals—the far more pressing issue is that both have virtually unchecked surveillance power. They are both reshaping societies by hacking culture to extract data and serve content. Funny memes, shocking news, and aspirational images all function similarly: they provide companies with unprecedented access to societies’ collective dreams and fears.⁶ By determining who sees what when and where, platform owners influence how societies articulate their understanding of themselves.

Tech companies want us to believe that algorithmically determined content is effectively neutral: that it merely reflects the user’s behavior and tastes back at them. In 2021, Instagram head Adam Mosseri wrote a post on the company’s blog entitled “Shedding More Light on How Instagram Works.” A similar window into TikTok’s functioning was provided by journalist Ben Smith in his article “How TikTok Reads Your Mind.”⁷ Both pieces boil down to roughly the same idea: “We use complicated math to give you more of what your behavior shows us you really like.”

This has two consequences. First, companies that control what users see in a nontransparent way influence how we perceive the world. They can even shape our personal relationships. Second, by optimizing algorithms for individual attention, a sense of culture as common ground is lost. Rather than binding people through shared narratives, digital platforms fracture common cultural norms into self-reinforcing filter bubbles.⁸

This fragmentation of shared cultural identity reflects how the data surveillance business is rewriting both the established order of global power, and social contracts between national governments and their citizens. Before the internet, in the era of the modern state, imperfect but broad narratives shaped distinct cultural identities; “Mexican culture” was different from “French culture,” and so on. These narratives were designed to carve away an “us” from “them,” in a way that served government aims. Culture has long been understood to operate within the envelope of nationality, as exemplified by the organization of museum collections according to the nationality of artists, or by the Venice Biennale—the Olympics of the art world, with its national pavilions format.

National culture, however, is about more than museum collections or promoting tourism. It broadly legitimizes state power by emotionally binding citizens to a self-understood identity. This identity helps ensure a continuing supply of military recruits to fight for the preservation of the state. Sociologist James Davison Hunter, who popularized the phrase “culture war,” stresses that culture is used to justify violence to defend these identities.⁹ We saw an example of this on January 6, 2021, with the storming of the US Capitol. Many of those involved were motivated by a desire to defend a certain idea of cultural identity they believed was under threat.

Military priorities were also entangled with the origins of the tech industry. The US Department of Defense funded ARPANET, the first version of the internet. But the internet wouldn’t have become what it is today without the influence of both West Coast counterculture and small-l libertarianism, which saw the early internet as primarily a space to connect and play. One of the first digital game designers was Bernie De Koven, founder of the Games Preserve Foundation. A noted game theorist, he was inspired by Stewart Brand’s interest in “play-ins” to start a center dedicated to play. Brand had envisioned play-ins as an alternative form of protest against the Vietnam War; they would be their own “soft war” of subversion against the military.¹⁰ But the rise of digital surveillance as the business model of nascent tech corporations would hack this anti-establishment spirit, turning instruments of social cohesion and connection into instruments of control.

It’s this counterculture side of tech’s lineage, which advocated for the social value of play, that attuned the tech industry to the utility of culture. We see the commingling of play and military control in Brand’s Whole Earth Catalog, which was a huge influence on early tech culture. Described as “a kind of Bible for counterculture technology,” the Whole Earth Catalog was popular with the first generation of internet engineers, and established crucial “assumptions about the ideal relationships between information, technology, and community.”¹¹ Brand’s 1972 Rolling Stone article “Spacewar: Fantastic Life and Symbolic Death Among the Computer” further emphasized how rudimentary video games were central to the engineering community. These games were wildly popular at leading engineering research centers: Stanford, MIT, ARPA, Xerox, and others. This passion for gaming as an expression of technical skills and a way for hacker communities to bond led to the development of MUD (Multi-User Dungeon) programs, which enabled multiple people to communicate and collaborate online simultaneously.

The first MUD was developed in 1978 by engineers who wanted to play fantasy games online. It applied the early-internet ethos of decentralism and personalization to video games, making it a precursor to massive multiplayer online role-playing games and modern chat rooms and Facebook groups. Today, these video games and game-like simulations—now a commercial industry worth around $200 billion¹²—serve as important recruitment and training tools for the military.¹³ The history of the tech industry and culture is full of this tension between the internet as an engineering plaything and as a surveillance commodity.

Historically, infrastructure businesses—like railroad companies in the nineteenth-century US—have always wielded considerable power. Internet companies that are also infrastructure businesses combine commercial interests with influence over national and individual security. As we transitioned from railroad tycoons connecting physical space to cloud computing companies connecting digital space, the pace of technological development put governments at a disadvantage. The result is that corporations now lead the development of new tech (a reversal from the ARPANET days), and governments follow, struggling to modernize public services in line with the new tech. Companies like Microsoft are functionally providing national cybersecurity. Starlink, Elon Musk’s satellite internet service, is a consumer product that facilitates military communications for the war in Ukraine. Traditionally, this kind of service had been restricted to selected users and was the purview of states.¹⁴ Increasingly, it is clear that a handful of transnational companies are using their technological advantages to consolidate economic and political power to a degree previously afforded to only great-power nations.

Worse, since these companies operate across multiple countries and regions, there is no regulatory body with the jurisdiction to effectively constrain them. This transition of authority from states to corporations and the nature of surveillance as the business model of the internet rewrites social contracts between national governments and their citizens. But it also also blurs the lines among citizen, consumer, and worker. An example of this are Google’s Recaptchas, visual image puzzles used in cybersecurity to “prove” that the user is a human and not a bot. While these puzzles are used by companies and governments to add a layer of security to their sites, their value is in how they record a user’s input in solving the puzzles to train Google’s computer vision AI systems. Similarly, Microsoft provides significant cybersecurity services to governments while it also trains its AI models on citizens’ conversations with Bing.¹⁵ Under this dyanmic, when citizens use digital tools and services provided by tech companies, often to access government webpages and resources, they become de facto free labor for the tech companies providing them. The value generated by this citizen-user-laborer stays with the company, as it is used to develop and refine their products. In this new blurred reality, the relationships among corporations, governments, power, and identity are shifting. Our social and cultural infrastructure suffers as a result, creating a new kind of technical debt of social and cultural infrustructure.

In the field of software development, technical debt refers to the future cost of ignoring a near-term engineering problem.¹⁶ Technical debt grows as engineers implement short-term patches or workarounds, choosing to push the more expensive and involved re-engineering fixes for later. This debt accrues over time, to be paid back in the long term. The result of a decision to solve an immediate problem at the expense of the long-term one effectively mortgages the future in favor of an easier present. In terms of cultural and social infrastructure, we use the same phrase to refer to the long-term costs that result from avoiding or not fully addressing social needs in the present. More than a mere mistake, socio-technical debt stems from willfully not addressing a social problem today and leaving a much larger problem to be addressed in the future.

For example, this kind of technical debt was created by the cratering of the news industry, which relied on social media to drive traffic—and revenue—to news websites. When social media companies adjusted their algorithms to deprioritize news, traffic to news sites plummeted, causing an existential crisis for many publications.¹⁷ Now, traditional news stories make up only 3 percent of social media content. At the same time, 66 percent of people ages eighteen to twenty-four say they get their “news” from TikTok, Facebook, and Twitter.¹⁸ To be clear, Facebook did not accrue technical debt when it swallowed the news industry. We as a society are dealing with technical debt in the sense that we are being forced to pay the social cost of allowing them to do that.

One result of this shift in information consumption as a result of changes to the cultural infrastructure of social media is the rise in polarization and radicalism. So by neglecting to adequately regulate tech companies and support news outlets in the near term, our governments have paved the way for social instability in the long term. We as a society also have to find and fund new systems to act as a watchdog over both corporate and governmental power.

Another example of socio-technical debt is the slow erosion of main streets and malls by e-commerce.¹⁹ These places used to be important sites for physical gathering, which helped the shops and restaurants concentrated there stay in business. But e-commerce and direct-to-consumer trends have undermined the economic viability of main streets and malls, and have made it much harder for small businesses to survive. The long-term consequence of this to society is the hollowing out of town centers and the loss of spaces for physical gathering—which we will all have to pay for eventually.

The faltering finances of museums will also create long-term consequences for society as a whole, especially in the US, where Museums mostly depend on private donors to cover operational costs. But a younger generation of philanthropists is shifting its giving priorities away from the arts, leading to a funding crisis at some institutions.²⁰

One final example: libraries. NYU Sociologist Eric Klinenberg called libraries “the textbook example of social infrastructure in action.”²¹ But today they are stretched to the breaking point, like museums, main streets, and news media. In New York City, Mayor Eric Adams has proposed a series of severe budget cuts to the city’s library system over the past year, despite having seen a spike in usage recently. The steepest cuts were eventually retracted, but most libraries in the city have still had to cancel social programs and cut the number of days they’re open.²² As more and more spaces for meeting in real life close, we increasingly turn to digital platforms for connection to replace them. But these virtual spaces are optimized for shareholder returns, not public good.

Just seven companies—Alphabet (the parent company of Google), Amazon, Apple, Meta, Microsoft, Nvidia and Tesla—drove 60 percent of the gains of the S&P stock market index in 2023.²³ Four—Alibaba, Amazon, Google, and Microsoft—deliver the majority of cloud services.²⁴ These companies have captured the delivery of digital and physical goods and services. Everything involved with social media, cloud computing, groceries, and medicine is trapped in their flywheels, because the constellation of systems that previously put the brakes on corporate power, such as monopoly laws, labor unions, and news media, has been eroded. Product dependence and regulatory capture have further undermined the capacity of states to respond to the rise in corporate hard and soft power. Lock-in and other anticompetitive corporate behavior have prevented market mechanisms from working properly. As democracy falls into deeper crisis with each passing year, policy and culture are increasingly bent towards serving corporate interest. The illusion that business, government, and culture are siloed sustains this status quo.

Our digitized global economy has made us all participants in the international data trade, however reluctantly. Though we are aware of the privacy invasions and social costs of digital platforms, we nevertheless participate in these systems because we feel as though we have no alternative—which itself is partly the result of tech monopolies and the lack of competition.

Now, the ascendence of AI is thrusting big data into a new phase and new conflicts with social contracts. The development of bigger, more powerful AI models means more demand for data. Again, massive wholesale extractions of culture are at the heart of these efforts.²⁵ As AI researchers and artists Kate Crawford and Vladan Joler explain in the catalog to their exhibition Calculating Empires, AI developers require “the entire history of human knowledge and culture … The current lawsuits over generative systems like GPT and Stable Diffusion highlight how completely dependent AI systems are on extracting, enclosing, and commodifying the entire history of cognitive and creative labor.”²⁶

Permitting internet companies to hack the systems in which culture is produced and circulates is a short-term trade-off that has proven to have devastating long-term consequences. When governments give tech companies unregulated access to our social and cultural infrastructure, the social contract becomes biased towards their profit. When we get immediate catharsis through sharing memes or engaging in internet flamewars, real protest is muzzled. We are increasing our collective socio-technical debt by ceding our social and cultural infrastructure to tech monopolies.

Cultural expression is fundamental to what makes us human. It’s an impulse, innate to us as a species, and this impulse will continue to be a gold mine to tech companies. There is evidence that AI models trained on synthetic data—data produced by other AI models rather than humans—can corrupt these models, causing them to return false or nonsensical answers to queries.²⁷ So as AI-produced data floods the internet, data that is guaranteed to have been derived from humans becomes more valuable. In this context, our human nature, compelling us to make and express culture, is the dream of digital capitalism. We become a perpetual motion machine churning out free data. Beholden to shareholders, these corporations see it as their fiduciary duty—a moral imperative even—to extract value from this cultural life.

We are in a strange transition. The previous global order, in which states wielded ultimate authority, hasn’t quite died. At the same time, large corporations have stepped in to deliver some of the services abandoned by states, but at the price of privacy and civic well-being. Increasingly, corporations provide consistent, if not pleasant, economic and social organization. Something similar occurred during the Gilded Age in the US (1870s–1890s). But back then, the influence of robber barons was largely constrained to the geographies in which they operated, and their services (like the railroad) were not previously provided by states. In our current transitionary period, public life worldwide is being reimagined in accordance with corporate values. Amidst a tug-of-war between the old state-centric world and the emerging capital-centric world, there is a growing radicalism fueled partly by frustration over social and personal needs going unmet under a transnational order that is maximized for profit rather than public good.

Culture is increasingly divorced from national identity in our globalized, fragmented world. On the positive side, this decoupling can make culture more inclusive of marginalized people. Other groups, however, may perceive this new status quo as a threat, especially those facing a loss of privilege. The rise of white Christian nationalism shows that the right still regards national identity and culture as crucial—as potent tools in the struggle to build political power, often through anti-democratic means. This phenomenon shows that the separation of cultural identity from national identity doesn’t negate the latter. Instead, it creates new political realities and new orders of power.

Nations issuing passports still behave as though they are the definitive arbiters of identity. But culture today—particularly the multiverse of internet cultures—exposes how this is increasingly untrue. With government discredited as an ultimate authority, and identity less and less connected to nationality, we can find a measure of hope for navigating the current transition in the fact that culture is never static. New forms of resistance are always emerging. But we must ask ourselves: Have the tech industry’s overwhelming surveillance powers rendered subversion impossible? Or does its scramble to gather all the world’s data offer new possibilities to hack the system?

1. McKenzie Wark, A Hacker Manifesto (Harvard University Press, 2004), thesis 126. ↑

2. Jon Katz, “Birth of a Digital Nation,” Wired, April 1, 1997. ↑

3. Marcin Szczepanski, “Is Data the New Oil? Competition Issues in the Digital Economy,” European Parliamentary Research Service, January 2020. ↑

4. Bruce Schneier, A Hacker’s Mind: How the Powerful Bend Society’s Rules, and How to Bend Them Back (W. W. Norton & Sons, 2023). ↑

5. Lucie Levine, “Was Modern Art Really a CIA Psy-Op?” JStor Daily, April 1, 2020. ↑

6. Bruce Schneier, Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World (W. W. Norton & Sons, 2015). ↑

7. Adam Mosseri, “Shedding More Light on How Instagram Works,” Instagram Blog, June 8, 2021; Ben Smith, “How TikTok Reads Your Mind,” New York Times, December 5, 2021. ↑

8. Giacomo Figà Talamanca and Selene Arfini, “Through the Newsfeed Glass: Rethinking Filter Bubbles and Echo Chambers,” Philosophy & Technology 35, no. 1 (2022). ↑

9. Zack Stanton, “How the ‘Culture War’ Could Break Democracy,” Politico, May 5, 2021. ↑

10. Jason Johnson, “Inside the Failed, Utopian New Games Movement,” Kill Screen, October 25, 2013. ↑

11. Fred Turner, “Taking the Whole Earth Digital,” chap. 4 in From Counter Culture to Cyberculture: Stewart Brand, The Whole Earth Network, and the Rise of Digital Utopianism (University of Chicago Press, 2006). ↑

12. Kaare Ericksen, “The State of the Video Games Industry: A Special Report,” Variety, February 1, 2024. ↑

13. Rosa Schwartzburg, “The US Military Is Embedded in the Gaming World. It’s Target: Teen Recruits,” The Guardian, February 14, 2024; Scott Kuhn, “Soldiers Maintain Readiness Playing Video Games,” US Army, April 29, 2020; Katie Lange, “Military Esports: How Gaming Is Changing Recruitment & Moral,” US Department of Defense, December 13, 2022. ↑

14. Shaun Waterman, “Growing Commercial SATCOM Raises Trust Issues for Pentagon,” Air & Space Forces Magazine, April 3, 2024. ↑

15. Geoffrey A Fowler, “Your Instagrams Are Training AI. There’s Little You Can Do About It,” Washington Post, September 27, 2023. ↑

16. Zengyang Li, Paris Avgeriou, and Peng Liang, “A Systematic Mapping Study on Technical Debt and Its Management,” Journal of Systems and Software, December 2014. ↑

17. David Streitfeld, “How the Media Industry Keeps Losing the Future,” New York Times, February 28, 2024. ↑

18. “The End of the Social Network,” The Economist, February 1, 2024; Ollie Davies, “What Happens If Teens Get Their News From TikTok?” The Guardian, February 22, 2023. ↑

19. Eric Jaffe, “Quantifying the Death of the Classic American Main Street,” Medium, March 16, 2018. ↑

20. Julia Halprin, “The Hangover from the Museum Party: Institutions in the US Are Facing a Funding Crisis,” Art Newspaper, January 19, 2024. ↑

21. Quoted in Pete Buttigieg, “The Key to Happiness Might Be as Simple as a Library or Park,” New York Times, September 14, 2018. ↑

22. Jeffery C. Mays and Dana Rubinstein, “Mayor Adams Walks Back Budget Cuts Many Saw as Unnecessary,” New York Times, April 24, 2024. ↑

23. Karl Russell and Joe Rennison, “These Seven Tech Stocks Are Driving the Market,” New York Times, January 22, 2024. ↑

24. Ian Bremmer, “How Big Tech Will Reshape the Global Order,” Foreign Affairs, October 19, 2021. ↑

25. Nathan Sanders and Bruce Schneier, “How the ‘Frontier’ Became the Slogan for Uncontrolled AI,” Jacobin, February 27, 2024. ↑

26. Kate Crawford and Vladan Joler, Calculating Empires: A Genealogy of Technology and Power, 1500–2025 (Fondazione Prada, 2023), 9. Exhibition catalog. ↑

27. Rahul Rao, “AI Generated Data Can Poison Future AI Models,” Scientific American, July 28, 2023. ↑

This essay was written with Kim Córdova, and was originally published in e-flux.

Posted on June 19, 2024 at 7:09 AM • View Comments

Surveillance by the New Microsoft Outlook App

The ProtonMail people are accusing Microsoft’s new Outlook for Windows app of conducting extensive surveillance on its users. It shares data with advertisers, a lot of data:

The window informs users that Microsoft and those 801 third parties use their data for a number of purposes, including to:

Store and/or access information on the user’s device
Develop and improve products
Personalize ads and content
Measure ads and content
Derive audience insights
Obtain precise geolocation data
Identify users through device scanning

Commentary.

Posted on April 4, 2024 at 7:07 AM • View Comments

Class-Action Lawsuit against Google’s Incognito Mode

The lawsuit has been settled:

Google has agreed to delete “billions of data records” the company collected while users browsed the web using Incognito mode, according to documents filed in federal court in San Francisco on Monday. The agreement, part of a settlement in a class action lawsuit filed in 2020, caps off years of disclosures about Google’s practices that shed light on how much data the tech giant siphons from its users—even when they’re in private-browsing mode.

Under the terms of the settlement, Google must further update the Incognito mode “splash page” that appears anytime you open an Incognito mode Chrome window after previously updating it in January. The Incognito splash page will explicitly state that Google collects data from third-party websites “regardless of which browsing or browser mode you use,” and stipulate that “third-party sites and apps that integrate our services may still share information with Google,” among other changes. Details about Google’s private-browsing data collection must also appear in the company’s privacy policy.

I was an expert witness for the prosecution (that’s the class, against Google). I don’t know if my declarations and deposition will become public.

Posted on April 3, 2024 at 7:01 AM • View Comments

1 2 3 … 8 Next→

Sidebar photo of Bruce Schneier by Joe MacInnis.