Data, Surveillance, and the AI Arms Race

According to foreign policy experts and the defense establishment, the United States is caught in an artificial intelligence arms race with China—one with serious implications for national security. The conventional version of this story suggests that the United States is at a disadvantage because of self-imposed restraints on the collection of data and the privacy of its citizens, while China, an unrestrained surveillance state, is at an advantage. In this vision, the data that China collects will be fed into its systems, leading to more powerful AI with capabilities we can only imagine today. Since Western countries can’t or won’t reap such a comprehensive harvest of data from their citizens, China will win the AI arms race and dominate the next century.

This idea makes for a compelling narrative, especially for those trying to justify surveillance—whether government- or corporate-run. But it ignores some fundamental realities about how AI works and how AI research is conducted.

Thanks to advances in machine learning, AI has flipped from theoretical to practical in recent years, and successes dominate public understanding of how it works. Machine learning systems can now diagnose pneumonia from X-rays, play the games of go and poker, and read human lips, all better than humans. They’re increasingly watching surveillance video. They are at the core of self-driving car technology and are playing roles in both intelligence-gathering and military operations. These systems monitor our networks to detect intrusions and look for spam and malware in our email.

And it’s true that there are differences in the way each country collects data. The United States pioneered “surveillance capitalism,” to use the Harvard University professor Shoshana Zuboff’s term, where data about the population is collected by hundreds of large and small companies for corporate advantage, and mutually shared or sold for profit. The state picks up on that data, in cases such as the Centers for Disease Control and Prevention’s use of Google search data to map epidemics and evidence shared by alleged criminals on Facebook, but it isn’t the primary user.

China, on the other hand, is far more centralized. Internet companies collect the same sort of data, but it is shared with the government, combined with government-collected data, and used for social control. Every Chinese citizen has a national ID number that is demanded by most services and allows data to easily be tied together. In the western region of Xinjiang, ubiquitous surveillance is used to oppress the Uighur ethnic minority—although at this point there is still a lot of human labor making it all work. Everyone expects that this is a test bed for the entire country.

Data is increasingly becoming a part of control for the Chinese government. While many of these plans are aspirational at the moment—there isn’t, as some have claimed, a single “social credit score,” but instead future plans to link up a wide variety of systems—data collection is universally pushed as essential to the future of Chinese AI. One executive at search firm Baidu predicted that the country’s connected population will provide them with the raw data necessary to become the world’s preeminent tech power. China’s official goal is to become the world AI leader by 2030, aided in part by all of this massive data collection and correlation.

This all sounds impressive, but turning massive databases into AI capabilities doesn’t match technological reality. Current machine learning techniques aren’t all that sophisticated. All modern AI systems follow the same basic methods. Using lots of computing power, different machine learning models are tried, altered, and tried again. These systems use a large amount of data (the training set) and an evaluation function to distinguish between those models and variations that work well and those that work less well. After trying a lot of models and variations, the system picks the one that works best. This iterative improvement continues even after the system has been fielded and is in use.
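The try-evaluate-pick loop just described can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the training set, the threshold "models," and the accuracy-based evaluation function are all hypothetical stand-ins, and real systems search far larger model families.

```python
# Toy training set of (feature, label) pairs. Hypothetical data for illustration.
TRAINING_SET = [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]

def make_model(threshold):
    """A candidate model: predict 1 when the feature exceeds the threshold."""
    return lambda x: 1 if x > threshold else 0

def evaluate(model, data):
    """Evaluation function: fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

def select_best(thresholds, data):
    """Try every candidate and keep the one the evaluation function scores highest."""
    return max(thresholds, key=lambda t: evaluate(make_model(t), data))

candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
best = select_best(candidates, TRAINING_SET)
print(best, evaluate(make_model(best), TRAINING_SET))  # prints: 0.5 1.0
```

Nothing in the loop is clever: it scores each variation and keeps the winner, which is why the choice of candidates and of the evaluation function matters so much.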

So, for example, a deep learning system trying to do facial recognition will have multiple layers (hence the notion of “deep”) trying to do different parts of the facial recognition task. One layer will try to find features in the raw data of a picture that will help find a face, such as changes in color that will indicate an edge. The next layer might try to combine these lower layers into features like shapes, looking for round shapes inside of ovals that indicate eyes on a face. The different layers will try different features and will be compared by the evaluation function until the one that is able to give the best results is found, in a process that is only slightly more refined than trial and error.
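To make the layering idea concrete, here is a deliberately simplified Python sketch. It is an assumption-laden illustration, not how a real deep network works: in practice the features at each layer are learned from data, whereas the two functions below (`edge_layer` and `segment_layer`, both hypothetical names) are written by hand and operate on a single row of pixel brightness values.

```python
def edge_layer(row, threshold=50):
    """First layer: mark positions where brightness changes sharply."""
    return [abs(b - a) > threshold for a, b in zip(row, row[1:])]

def segment_layer(edges):
    """Second layer: combine neighboring edges into (start, end) spans."""
    positions = [i for i, is_edge in enumerate(edges) if is_edge]
    return list(zip(positions, positions[1:]))

# One row of a grayscale image: dark, then a bright object, then dark again.
row = [10, 12, 11, 200, 205, 198, 15, 14]
edges = edge_layer(row)
print(edges)                 # [False, False, True, False, False, True, False]
print(segment_layer(edges))  # [(2, 5)]
```

The second function never sees raw pixels, only the output of the first, which is the sense in which each layer builds on the one below it.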

Large data sets are essential to making this work, but that doesn’t mean that more data is automatically better or that the system with the most data is automatically the best system. Train a facial recognition algorithm on a set that contains only faces of white men, and the algorithm will have trouble with any other kind of face. Use an evaluation function that is based on historical decisions, and any past bias is learned by the algorithm. For example, mortgage loan algorithms trained on historic decisions of human loan officers have been found to implement redlining. Similarly, hiring algorithms trained on historical data manifest the same sexism as human staff often have. Scientists are constantly learning about how to train machine learning systems, and while throwing a large amount of data and computing power at the problem can work, more subtle techniques are often more successful. All data isn’t created equal, and for effective machine learning, data has to be both relevant and diverse in the right ways.

Future research advances in machine learning are focused on two areas. The first is in enhancing how these systems distinguish between variations of an algorithm. As different versions of an algorithm are run over the training data, there needs to be some way of deciding which version is “better.” These evaluation functions need to balance the recognition of an improvement with not over-fitting to the particular training data. Getting functions that can automatically and accurately distinguish between two algorithms based on minor differences in the outputs is an art form that no amount of data can improve.
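A small Python sketch shows why the evaluation function has to guard against over-fitting. The data and both models are hypothetical: one model simply memorizes the training examples (including a noisy one), the other applies a simple rule. Scoring on the training data alone prefers the memorizer; scoring on held-out data reverses the ranking.

```python
# Hypothetical data: the label is 1 when x >= 5, with one noisy training example.
train = [(1, 0), (2, 0), (3, 1), (6, 1), (7, 1), (9, 1)]  # (3, 1) is noise
held_out = [(0, 0), (4, 0), (5, 1), (8, 1)]

def memorizer(data):
    """Over-fit model: memorizes training labels, guesses 0 for unseen inputs."""
    table = dict(data)
    return lambda x: table.get(x, 0)

def threshold_rule(cut):
    """Simple model: predict 1 when x is at or above the cut."""
    return lambda x: 1 if x >= cut else 0

def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

models = {"memorizer": memorizer(train), "threshold": threshold_rule(5)}
for name, model in models.items():
    print(name, accuracy(model, train), accuracy(model, held_out))
```

The memorizer scores 1.0 on the training data but only 0.5 on the held-out examples, while the threshold rule misses the noisy training point yet gets every held-out example right. Adding more training data of the same kind would not fix an evaluation function that only looks at the training set.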

The second is in the machine learning algorithms themselves. While much of machine learning depends on trying different variations of an algorithm on large amounts of data to see which is most successful, the initial formulation of the algorithm is still vitally important. The way the algorithms interact, the types of variations attempted, and the mechanisms used to test and redirect the algorithms are all areas of active research. (An overview of some of this work can be found here; even trying to limit the research to 20 papers oversimplifies the work being done in the field.) None of these problems can be solved by throwing more data at the problem.

The British AI company DeepMind’s success in teaching a computer to play the Chinese board game go is illustrative. Its AlphaGo computer program became a grandmaster in two steps. First, it was fed an enormous number of human-played games. Then, the program played itself an enormous number of times, improving its own play along the way. In 2016, AlphaGo beat the grandmaster Lee Sedol four games to one.

While the training data in this case, the human-played games, was valuable, even more important was the machine learning algorithm used and the function that evaluated the relative merits of different game positions. Just one year later, DeepMind was back with a follow-on system: AlphaZero. This go-playing computer dispensed entirely with the human-played games and just learned by playing against itself over and over again. It plays like an alien. (It also became a grandmaster in chess and shogi.)
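The idea of learning from self-play alone can be illustrated with a far simpler technique than anything DeepMind used. The Python sketch below is not AlphaZero's method (which combines neural networks with tree search); it is a plain table of running averages, applied to a hypothetical toy game in which players alternately remove one to three stones and whoever takes the last stone wins. All names and parameters here are assumptions chosen for illustration.

```python
import random

MAX_TAKE = 3  # a player may remove 1, 2, or 3 stones; taking the last stone wins

def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)

def choose(values, stones, rng, explore):
    """Pick a move: usually the best-known one, sometimes a random one."""
    moves = list(legal_moves(stones))
    if rng.random() < explore:
        return rng.choice(moves)
    return max(moves, key=lambda m: values.get((stones, m), 0.0))

def self_play(games=20000, start=10, explore=0.2, seed=0):
    """Learn move values purely from games the program plays against itself."""
    rng = random.Random(seed)
    values, counts = {}, {}
    for _ in range(games):
        stones, history = rng.randint(1, start), []
        while stones > 0:
            move = choose(values, stones, rng, explore)
            history.append((stones, move))
            stones -= move
        # Whoever moved last won. Walk backwards through the game, crediting
        # +1 to the winner's moves and -1 to the loser's, as running averages.
        outcome = 1.0
        for key in reversed(history):
            counts[key] = counts.get(key, 0) + 1
            old = values.get(key, 0.0)
            values[key] = old + (outcome - old) / counts[key]
            outcome = -outcome
    return values

def best_move(values, stones):
    """The move with the highest learned value from this pile size."""
    return max(legal_moves(stones), key=lambda m: values.get((stones, m), 0.0))

values = self_play()
print({s: best_move(values, s) for s in range(1, 11)})
```

No human games are supplied at any point. With enough self-play the table should tend toward the known winning strategy for this game (leave the opponent a multiple of four stones), though a sketch this crude carries no guarantee for every pile size.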

These are abstract games, so it makes sense that a more abstract training process works well. But even something as visceral as facial recognition needs more than just a huge database of identified faces in order to work successfully. It needs the ability to separate a face from the background in a two-dimensional photo or video and to recognize the same face in spite of changes in angle, lighting, or shadows. Just adding more data may help, but not nearly as much as added research into what to do with the data once we have it.

Meanwhile, foreign-policy and defense experts are talking about AI as if it were the next nuclear arms race, with the country that figures it out best or first becoming the dominant superpower for the next century. But that didn’t happen with nuclear weapons, despite research only being conducted by governments and in secret. It certainly won’t happen with AI, no matter how much data different nations or companies scoop up.

It is true that China is investing a lot of money into artificial intelligence research: The Chinese government believes this will allow it to leapfrog other countries (and companies in those countries) and become a major force in this new and transformative area of computing—and it may be right. On the other hand, much of this seems to be a wasteful boondoggle. Slapping “AI” on pretty much anything is how to get funding. The Chinese Ministry of Education, for instance, promises to produce “50 world-class AI textbooks,” with no explanation of what that means.

In the democratic world, the government is neither the leading researcher nor the leading consumer of AI technologies. AI research is much more decentralized and academic, and it is conducted primarily in the public eye. Research teams keep their training data and models proprietary but freely publish their machine learning algorithms. If you wanted to work on machine learning right now, you could download Microsoft’s Cognitive Toolkit, Google’s TensorFlow, or Facebook’s PyTorch. These aren’t toy systems; these are the state-of-the-art machine learning platforms.

AI is not analogous to the big science projects of the previous century that brought us the atom bomb and the moon landing. AI is a science that can be conducted by many different groups with a variety of different resources, making it closer to computer design than the space race or nuclear competition. It doesn’t take a massive government-funded lab for AI research, nor the secrecy of the Manhattan Project. The research conducted in the open science literature will trump research done in secret because of the benefits of collaboration and the free exchange of ideas.

While the United States should certainly increase funding for AI research, it should continue to treat it as an open scientific endeavor. Surveillance is not justified by the needs of machine learning, and real progress in AI doesn’t need it.

This essay was written with Jim Waldo, and previously appeared in Foreign Policy.

Posted on June 17, 2019 at 5:52 AM • 37 Comments

Comments

AlanS June 17, 2019 7:25 AM

The term “surveillance capitalism” is a redundancy and misleading. It makes it seem as if this is a new thing. It’s not. And it’s not just capitalism. You can trace the development and the use of technology to track and control human behavior (for good and ill) back centuries. What’s ‘new’ is only that we’ve hit the point in the curve where the exponential increase in the ability to collect and process data is obvious.

Majid Hosseini June 17, 2019 9:37 AM

Unfortunately this is just a word salad, not a coherent articulation of where machine learning is and where it is going. The author clearly lacks any knowledge beyond buzzwords and very shallow understanding. Not sure how it gets published by FP and gets featured by Bruce Schneier.

Alex A June 17, 2019 11:13 AM

@Majid Hosseini 100% agree. I was initially intrigued, but this author fails to tie this into a coherent argument about how the AI Arms Race will either be won or lost, or what the implications of that would even be. Definitely not up to par with the usual articles that Bruce posts.

Sergey Babkin June 17, 2019 11:38 AM

Perhaps a better analogy would be the Japanese “5th generation computers” project, which also revolved around that time’s iteration of AI. That project failed to deliver anything useful, but much the same goal actually worked out in the US, without a national project.

On the other hand, some major progress has been made in China in recent years. At Microsoft, for example, the Chinese office holds the most advanced position in ML.

Andrew June 17, 2019 11:41 AM

@alex a
AI arms race:
The Pentagon will soon deploy drones as wingmen for fighters. In future fighter generations AI will replace pilots; in simulators AI has already performed better than humans. There will be autonomous tanks, fighting machines, and killer drones, all with target identification, instant reaction, no need for sleep, etc. It will make the difference in a new type of conventional war.

Eric Johnson June 17, 2019 11:54 AM

Also worth keeping in mind the trends in computer capabilities. What took $1,000 to compute around five years ago probably costs a lot less than that now. Following Moore’s “Law” it may be as little as 1/8th of that – around $125. In other words, the economics here are extremely important in controlling where money flows, and what is “possible”. The computing industry recognizes that accomplishing the same thing with less energy cost is now a critical factor, whether it is $$ spent on cloud computing or battery life for mobile devices. We’ve got quite a number of strong economic incentives pointing in this direction.
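The 1/8th figure works out if price-performance doubles three times in five years, which means a doubling period of 20 months. That period is an assumption, not something stated above; a minimal Python sketch of the arithmetic, with a hypothetical helper `cost_after`:

```python
def cost_after(initial_cost, years, doubling_months):
    """Cost of the same computation after `years`, if price-performance
    doubles every `doubling_months` months (an assumed, idealized rate)."""
    doublings = years * 12 / doubling_months
    return initial_cost / 2 ** doublings

print(cost_after(1000, 5, 20))            # 125.0 (three doublings in five years)
print(round(cost_after(1000, 5, 24), 2))  # 176.78 (a slower, two-year doubling)
```

The result is quite sensitive to the assumed doubling period, so the $125 figure is best read as an order-of-magnitude estimate.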

The net effect of this pushes in multiple directions. Advantage will accrue to the companies and countries that: find lower-cost-to-execute algorithms, find algorithms that require less external data to train (like Alpha Go!), and create hardware that can perform these AI operations using less energy. None of these characteristics relate to having more surveillance data that might benefit China.

For the same reason that trying to find terrorists based on large data sets of behavioral data doesn’t work – as you’ve outlined before – AI will hit the analogous limits with large data sets.

vas pup June 17, 2019 2:17 PM

I agree with most of @AlanS’s points.
And just want to add that super profits from such activity are new.

Petre Peter June 17, 2019 3:08 PM

Yes, access to larger training sets makes a huge difference in the AI race. But it would seem that countries should collaborate rather than compete since it would be difficult to judge the winners.

Alyer Babtu June 17, 2019 6:23 PM

“It plays like an alien.”

Johann Wolfgang von Goethe – ‘Mathematicians are like Frenchmen: whatever you say to them they translate into their own language and forthwith it is something entirely different.’ Applies to computer scientists too. The full implications of the algorithm may be surprising, although you were in potency to knowing them already in understanding the algorithm.

“Just adding more data may help, but not nearly as much as added research into what to do with the data once we have it.”

Richard Hamming – ‘The purpose of computing is insight, not numbers.’ The work of Stephen Grossberg is one place in AI and ML where insight is had. Those deep layered networks are a snare and a confusion.

Faustus June 17, 2019 8:20 PM

This is a very nice article about AI. Although AI uses a lot of data, the essence of AI is not the data. It is the techniques, which really can be perfected with computer-generated data or non-specific data sets. Harvesting real data is only needed after the techniques are perfected. Real data adds little to the theory.

Anything learned specifically from surveillance data is only going to apply to surveillance. We are losing very little if we don’t have the surveillance data that China has (which I find unlikely anyhow), and all we are losing is authoritarianism-specific. The whole “China is getting ahead of us” argument is simply another excuse to spy on us. And spend money faster and faster.

@Alyer Babtu is, in my opinion, very correct. The neural net/statistical approach to AI (i.e. “machine learning”) is teaching us very little about thinking or intelligence. The systems are functional, but they hide their insights in opaque arrays of numbers. The whole ML field is one big snore unless you are only interested in money.

This is why I have been working for years on an AI that creates subconcepts and human-understandable explanations of what it has learned. That is not to say that they are always interpretable. But a good proportion of the time they provide human-level insight into the structure of the problem and the AI’s approach to solving it.

name.withheld.for.obvious.reasons June 17, 2019 11:32 PM

Regarding an arms race, here I diverge from Bruce. The future from this vantage point (mine, not the author’s) seems to be a type of “Logan’s Run”.

If I understand Bruce, it was not necessarily the quality and quantity of the argument but several of the key elements. Here Bruce has some strong contradictions with the Author.

J. Tomkowitz June 17, 2019 11:39 PM

A very interesting report about “Mapping Regulatory Proposals for Artificial Intelligence in Europe” by accessnow.org

https://www.accessnow.org/cms/assets/uploads/2018/11/mapping_regulatory_proposals_for_AI_in_EU.pdf

and how the EU

https://ec.europa.eu/digital-single-market/en/news/communication-artificial-intelligence-europe

tries to build trust in human-centric AI (The Ethics Guidelines for Trustworthy Artificial Intelligence (AI))

https://ec.europa.eu/futurium/en/ai-alliance-consultation/guidelines#Top

As you can see with this excellent essay of Bruce Schneier and Jim Waldo and the links of this comment there are different ways to handle AI in China, the US, and the EU.

65535 June 18, 2019 12:50 AM

“…the data that China collects will be fed into its systems, leading to more powerful AI with capabilities we can only imagine today. Since Western countries can’t or won’t reap such a comprehensive harvest of data from their citizens, China will win the AI arms race and dominate the next century. This idea makes for a compelling narrative, especially for those trying to justify surveillance… Surveillance is not justified by the needs of machine learning, and real progress in AI doesn’t need it.” -Bruce S.

That is a fairly good conclusion.

I am guessing the end result will be the “need to increase” the NSA’s budget [or the entire military complex budget].

If you need to expand your organization or your budget just use a confidence game of complex Artificial Intelligence, “China” and a dose of fear.

“Surveillance capitalism” should be interpreted as “Con-artist capitalism”. I and others in the have been fairly well scammed by a combination of Cyber junk, Data brokers, and fancy cell phones. It should be stopped. But, I doubt it will.

Let’s follow Bruce’s links to the people selling this need for AI arms and counter-AI arms wars. Are these sources unbiased? I doubt it.

First up is a paper [pdf] from PwC. Who is that? Let’s Giggle it.

“AI is going to be big: $15.7 trillion big by 2030”-PwC

ht tps://www.pwc.com/us/en/services/consulting/library/artificial-intelligence-predictions/ai-arms-race.html

[links broken for safety]

At the bottom of PwC’s page I see they sell: “Audit and assurance, Consulting, Tax services, Newsroom…” services

[and]

“What is ‘PwC’? PwC is the brand under which the member firms of PricewaterhouseCoopers International Limited (PwCIL) operate and provide professional services. Together, these firms form the PwC network. ‘PwC’ is often used to refer either to individual firms within the PwC network or to several or all of them collectively.”- PwC

ht tps://www.pwc.com/gx/en/about/corporate-governance/network-structure.html

[Then]

“PricewaterhouseCoopers (doing business as PwC) is a multinational professional services network with headquarters in London, United Kingdom. PwC ranks as the second largest professional services firm in the world and is one of the Big Four auditors, along with Deloitte, EY and KPMG.” – Wikipedia

ht tps://en.wikipedia.org/wiki/PricewaterhouseCoopers

Ah, a huge accounting/newsroom firm. Maybe they are unbiased or maybe not.

[Moving on to the next AI arms race promoter link -Jane’s]

Jane’s Defence Weekly:

“Jane’s Defence Weekly gained worldwide attention after printing several images from an American spy satellite of the Nikolaiev 444 shipyard in the Black Sea, showing a Kiev-class aircraft carrier under construction.”- Wikipedia

ht tps://en.wikipedia.org/wiki/Jane%27s_Defence_Weekly

and

“Jane’s Information Group (often referred to as Jane’s) is a British publishing company specializing in military, aerospace and transportation topics.”- Wikipedia

ht tps://en.wikipedia.org/wiki/Jane%27s_Information_Group

The second one is directly connected to the military industrial complex. It is not exactly an unbiased reporting service. They are biased.

What’s with the boogeyman China deal? Why not expand the list to Russia, or even to every other “military style” country?

I see there are over 60 non-democratic [or authoritarian] countries that could be named as AI arms adversaries. Why not go after those also?

[List of authoritarian countries]

“The Democracy Index is an index compiled by the Economist Intelligence Unit (EIU), a UK-based company. Its intention is to measure the state of democracy in 167 countries…”- Wikipedia

ht tps://en.wikipedia.org/wiki/Democracy_Index

I note the US Cyber Command is entrenched in the US government along with the NSA and other TLAs. I will also note that the head of US Cyber Command, Gen. Nakasone, is of Japanese descent: “Nakasone’s father is a second-generation Japanese American..”-Wikipedia

ht tps://en.wikipedia.org/wiki/Paul_M._Nakasone

What is with Gen. Nakasone? Did he or his father have a dislike for China?

I read about the Japanese destroying Nanking and probably a number of its citizens in WW2. The Japanese did not like the Chinese at that time. Could this play a role in this whole Boogieman China deal? Who knows?

“The event remains a contentious political issue and a stumbling block in Sino-Japanese relations.”- Wikipedia

ht tps://en.wikipedia.org/wiki/Nanjing_Massacre
[links broken for safety]

This doesn’t prove Gen. Nakasone has a dislike for China – but, it doesn’t disprove it.

In the big picture, AI is an interesting, complex, and possibly good thing. But throw in some fear and you have a nice con-game. Giggle, Facecrook, and cell phone companies have all used it.

In summation, the so-called AI arms race could be just a scam to take money [and data] from the average Jane/Joe and give it to the military industrial complex. That would enrich the rich and impoverish the poor.

No One/ Ex Cathedra June 18, 2019 2:41 AM

Let’s get clear about China

China is not North Korea. It’s a modern country full of people who work hard, many of whom take vacations all over the world, and a common dream is to send their kid or kids to Harvard. The Chinese word for America is “Mei Guo”, which means “beautiful country”. Generally, they really respect America, and one big thing on our side is that we helped them smash the murderous Japanese racist juggernaut during W.W. II. It is a normal place where Joe Wang works all day and goes home to drink beer (Harbin) and watch TV (the NBA). It’s a very prosperous country that focuses on family, work, and making money. In China, money is the name of the game, no joke, and that is not bad. “Money, money, money” has many positive effects such as most people having an incredible work ethic.

Surveillance state? Is this a joke? If you live in Great Britain, then you are truly one of the most watched and collected upon human beings in the history of human life on earth. America is the same, but with some limits. Most of the tools that now plague us were invented and first used in the good ‘ol USA, especially in the wake of 9-11. The Chinese play second fiddle to the U.S. and Great Britain in eavesdropping on their populations, and pointing the finger at China will not change that. Now let’s add in Facebook, Google, the IoT, etc. America and Great Britain are both becoming panopticons, and everyone knows they cannot stop it.

If you ask a Chinese person about government surveillance, you get the same answer: the government does a good job protecting me and my family. That is what they really believe. Are you old enough to remember when Americans talked about things and really believed them? Especially about their own government? Chinese people have real faith in their leaders, which is hard for us to even picture having. We Americans do not want to hear about how another country, especially China, which is so foreign to most of us, is a place of well-grounded optimism, hard work, and people who still have a lot of basic trust for each other, things that America is losing or has already lost.

Even worse, the taboo topic in America is this: China is overall less intrusive than the U.S. because the people are mostly the same race; they trust one another. In America, things are different. We have tried to expunge race from the conversation, but it will not go away. All of those immigrants from the Middle East who raise their right hands and become citizens are going to be spied upon, ladies and gentlemen, as will any racial group which is seen as a threat.

Which brings me to a related point: why don’t the armed Chinese police go around shooting people like the police do in America? Answer: because they are Chinese.

I am not so sure that China’s work in Xinjiang is wrong in any way whatsoever. From what I have seen, it looks like heavy investment in modernization and a zero-tolerance policy towards radicalization. Great Britain and their failed social experiment do not have enough guts to protect their own citizens from knife crime by certain people, so they have accepted a certain level of violence. Great Britain is doomed. They are losing their country. China, on the other hand, has sensible policies which function in a robust legal framework, some of which are backed up by intelligence, so that their country can reach real goals: making money, pulling people out of poverty, and growing their country. It is all very sensible, and it certainly has a future.

And let’s be very frank about it: the U.S. and Great Britain, along with others, are responsible for several disastrous military actions in which a lot of innocent people–at least 200,000–to include children, died early or were maimed, which was also driven by intelligence, much of it bad. China’s last foreign war was in 1975, if I remember correctly.

One wonders who is the real threat to the world’s security. If the U.S. followed its own laws, then we would not need to doubt anything, but after Snowden it seems clear that a parallel, well-paid structure exists inside the U.S. which violates its own laws with impunity and gusto (while losing wars). So, between the U.S. and China, who is the cowboy in the white hat and who is in the black hat?

Otter June 18, 2019 6:06 AM

AlphaZero does NOT play “like an alien”.

It plays like a master of the game of chess. That is, like somebody who (something which) has mastered the rules of the game, and has enough playing experience to have learned how those rules work together.

An early chess program simply, by brute force, tried every possible move until its timing algorithm decided it had to post a move or risk violating the time rule. It evaluated the board after each possible move, assigning (more or less) a numerical score, playing the highest-scored move when stopped by the timing algorithm. Success depended on how cleverly the programmers programmed the evaluation function, how fast the engineers made the hardware, and (later) how fast and clever the programmers made data-bases of historical game records.
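The scheme described above (score the position after each move, play the best one seen when the clock runs out) can be sketched in Python. This is a one-move-deep illustration under stated assumptions, not a reconstruction of any real chess program, which would search many moves ahead; the function `pick_move` and the toy number game used to exercise it are hypothetical.

```python
import time

def pick_move(position, legal_moves, apply_move, evaluate, time_budget):
    """Score the position reached by each legal move and return the best one
    found before the time budget (in seconds) runs out."""
    deadline = time.monotonic() + time_budget
    best_move, best_score = None, float("-inf")
    for move in legal_moves(position):
        score = evaluate(apply_move(position, move))
        if score > best_score:
            best_move, best_score = move, score
        if time.monotonic() >= deadline:
            break  # out of time: play the best move seen so far
    return best_move

# Toy stand-in for a board game: the position is a number, a move adds to it,
# and the hand-written evaluation function prefers positions close to 12.
moves = lambda position: [-3, -1, 2, 5]
step = lambda position, move: position + move
closeness = lambda position: -abs(position - 12)

print(pick_move(10, moves, step, closeness, time_budget=1.0))  # 2
```

Everything the program "knows" about the game lives in the evaluation function passed in; the search itself is indifferent to what it is searching.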

Evaluation functions and game data-bases are literally “the last war”. Every general knows, and some understand, fighting the last war today can only lead to defeat, unless, we hope, the other general is also fighting the last war.

Evaluation functions are attempts to summarize the real or hypothetical game history database, PLUS statements by successful players about what they THINK they did to win. People don’t always know what they did. Also, in my opinion, players actually imagine that pawns are ill-armed, untrained peasants, knights are big guys wrapped in steel riding horses, and so on … in addition to, or subtraction from, what the chess rules say.

AlphaZero does not know about human game history, peasants or horses. It knows only the game rules, and its own games against a copy of itself, but none of the learned human commentary. On a decent computer it can play a LOT of games in a few hours.

One illustration. All beginner, intermediate, and advanced chess books teach that doubled pawns, two pawns on the same file, are very weak, and games are won by attacking them. Chess players, without thinking, automatically avoid doubling pawns. AlphaZero does not avoid them, per se. AlphaZero doubles and even triples pawns, if it happens, or if it wins.

AlphaZero plays some moves which humans, famous humans, heretofore have been taught not to play.

Lee Sedol, if I understand his thought, translated by him to verbal Korean, translated by somebody else to verbal English, thinks AlphaGo made some interesting moves, which Lee had never seen before, and Lee intends to study and understand them. I am not Lee Sedol, but I think Go players are like Chess players: they have a history of games and strategies and a literature, a liturgy, which can be questioned.

AlphaZero is not alien. It is merely more patient, and more focused, than humans.

Ah, crap. I know that somewhere up there, but I cannot find it now, @Bruce said something about the data-base and AI strategy may be less important than the outcome evaluation … both the programmed evaluation code, and the human stated or unstated evaluation of the ultimate result.

And that is what I had in mind as I wrote the above paragraphs. Perception of AlphaZero varies enormously : this is alien; what is this crap I don’t understand it; what is it thinking how can I understand it.

How about these two: I hope that drone up there knows I am only a poor mother going to market to buy some food for my children, two left at home and one, praise Allah, in my belly. I hope we get this guy, nobody gives a damn about the collaterals as long as we get this terrorist, please God don’t make me ship home any more American boys and girls in bodybags.

The attacked has very different evaluation functions from the attacker.

Winter June 18, 2019 6:27 AM

@No One
“If you ask a Chinese person about government surveillance, you get the same answer: the government does a good job protecting me and my family.”

Given “Living Memory”, aka China in the previous century, they are right.

What the Chinese fear most is disorder. With good reason. The horrors from the Boxer Rebellion, warlord era, Japanese occupation, Civil War, Great Leap Forward, and Cultural Revolution are still retold as “recent history”. The biggest fear is yet another civil war.

I think that only when they start trusting their compatriots more than their government will they start demanding more liberties.

Winter June 18, 2019 6:31 AM

@No One
“If the U.S. followed its own laws, then we would not need to doubt anything, but after Snowden it seems clear that a parallel, well-paid structure exists inside the U.S. which violates its own laws with impunity and gusto (while losing wars).”

What was most remarkable in the Snowden affair was that people from the USA did not care an iota for what was done to “foreigners”. They only cared, exclusively, about what was done to US citizens. Whether foreigners lived or died was not considered interesting in the US media.

Michael Turnbull June 18, 2019 9:21 AM

I think the biggest issue is that most people don’t know or understand how they’re being watched. Surveillance is coming from government agencies and from private companies (which is scarier, imo).

This article (https://choosetoencrypt.com/news/what-is-surveillance-capitalism-how-does-it-work/) makes a good point that more touch points between companies and consumers open the door to more data collection. Now that every company sees the value in user data, it’s almost impossible to avoid sharing some information with the companies you interact with online.

On one hand, this gold rush towards data is driving companies to innovate and improve their products. On the other hand, it means that every company, even if they don’t have proper security in place, is collecting, storing and analyzing your data.

Denton Scratch June 18, 2019 10:23 AM

Predictably, Denton mentions his usual gripe about the loose usage of the term “AI”. I’ll just let it pass, this time.

“As different versions of an algorithm are run over the training data”

This time my gripe is about the term “algorithm”. I can’t be sure, but I think the kinds of “algorithm” that are being compared to determine which ones are successful are machine-generated procedures, perhaps generated genetically, perhaps by some other kind of random process.

That is, they are not algorithms in the sense that anyone knows how they work, or whether an alternative implementation of the same algorithm might be possible, or even if they are susceptible to algorithmic analysis. An algorithm is a precise and specific definition of a computational procedure, for which any implementation should produce exactly the same outcome as any other implementation. If it’s not one of those, then it’s not an algorithm – maybe it’s a program, maybe it’s a function, or maybe just some machine-generated table of parameters that nobody can have a hope of understanding.

With regard to international AI contests, I do not think the team with the biggest dataset is the winner. I think the team that came up with a type of AI that can explain its reasoning to a human is the winning team, at the end of the day.

General: “So, AI, you are saying that the enemy’s forces are deployed in such a way that our attack at point X is sure to fail. Our HUMINT says different. Can you explain this discrepancy?”

Present AI: “mumble”

Future AI: “Your HUMINT is not considering the large body of statistical data that I have been able to glean from SIGINT, which is simply way too large for your human staff to sift through. Specifically, they have missed the fact that you will face large numbers of insurgents that your HUMINT analysers thought were on your side, but who will in fact side with the enemy”.

No One/ Ex Cathedra June 18, 2019 11:15 AM

@ Winter

I agree entirely with both of your excellent points.

“What the Chinese fear most is disorder.” Exactly. The Japanese were able to divide them, manipulate them, and get Chinese people to stab other Chinese people in the back. If we take an unblinking look at the sad history of human cruelty and sickness, our Japanese “friends” would take a top ten spot for what they did in Korea and China in the first half of the twentieth century. I like to do my own research on this, and it is often chilling. What the Japanese did in China was so bad, so horrific, that even now many people want to erase it from memory. Nanjing, 1937–38: we are talking about mentally ill monsters on a rampage. An American lady who was there, Minnie Vautrin, saw it all. It was so bad that after she went home to the United States she killed herself. The Chinese are determined that such disorder and division are not going to happen again.

China is a profoundly disorderly and lively place. If you don’t believe that, go to Hefei and take a taxi. The governmental order and centralization fits them. They need it.

And you are right about how Americans like to forget about the deaths our country causes.

What I am trying to do here is get some realistic perspective. China runs vocational training schools and lets people who are becoming radicalized know that it is not going to be tolerated by sending them to some kind of reform center. The U.S. builds secret prisons and kills people like flies. And the U.S. “tortured some folks.” Who is the bad guy here? All of the lies and bad actions, all of the killing and imprisoning, are going to come back and haunt the U.S. and damage its national security. We would do best to acknowledge previous mistakes and try our very best not to repeat them. And China, it is time to treat them as an equal because they already are. It is better to try and get along instead of vilifying them, especially since it would be the pot calling the kettle black.

smh June 18, 2019 11:40 AM

@No One/ Ex Cathedra The difference is that in China, you’d go to jail for that post.

China’s big problem is that no one wants to live in China. They will be geopolitically irrelevant by mid-century. The big question for the CCP is whether the country can get rich before it gets old. If not, they are in big trouble.

Personally, I think the CCP is screwed. And when it’s gone, China breaks up or at least loses control over its rebellious territories like Hong Kong, Tibet, and certainly Taiwan.

Denton Scratch June 18, 2019 11:52 AM

NoOne/ExCathedra

It may be true that “Mei Guo” does indeed translate into English as “beautiful country”, for some dialect of Chinese. But it also happens to be a phrase that for many Chinese is about as close as they can get to a phrase that sounds like “America”. So you get “mei-guo gen” (where ‘gen’ is people, meaning “American people”).

Compare “In guo” for “England”, and “in guo gen” for English people, and “in guo hua” for English language (“hua” means speech). “Ye hui bu hui in guo hua” means, literally translated, “You speak/not speak england speak” (i.e. “Do you speak English?”).

I think these terms are more to do with phonetics than fanciful notions of America The Beautiful.

(Apologies for the limitations of my transliteration skills – I learned this stuff 50 years ago, and never practised or revised it since)

David Leppik June 18, 2019 2:01 PM

There’s an old AI saying: “there’s no data like mo data.” That is to say, large datasets rule.

That said, Bruce is right that just having a big dataset (and lots of money and computer time) isn’t enough. Face recognition famously does best with white males. China has two major disadvantages when it comes to developing face recognition. First is the relative homogeneity of the faces. Second is that they have much more control over their visual environment: if face-recognizing cameras don’t work, they can either get better AI or they can get better pictures (e.g. better lighting, visible or infrared, or require people to stand in a particular way to get locked doors to open.) Their ubiquitous surveillance does not need to be secret if it provides convenience, such as using face recognition to board trains.

For better or worse, Western researchers, whether the NSA or Facebook, have more data and motivation to develop covert surveillance AI technology that works against people who are actively trying to evade it.

This reminds me of the 1980s, when the US was losing the “arms race” in computer technology. It was assumed that because Japan had become the world’s best at manufacturing RAM chips, it would soon dominate all of computing. Japan had a big AI research push, which put terror in the hearts of US policy makers. It failed. It was a big bureaucratic mess, with the politicians demanding something they couldn’t articulate. It seems to me that China is similarly making bold but vague claims that imply a disconnect between the politicians and researchers.

Otter June 18, 2019 2:02 PM

@Denton Scratch • June 18, 2019 10:23 AM sez

”Future AI: “Your HUMINT is not considering […] side with the enemy”.”

That is not the end of it.

General: “No problem. We’ll wait a few years. Sabotage their economy. Buy some thought leaders. Hire a bunch of demonstrators. Smuggle in a few tons of weapons. Maybe even shoot some videos. In 5 years, they’ll be fighting the government. They don’t need to know they are on our side.”

Future AI: “OK. We’ll wait then.”

Future AI: “By the way… What are these nuclear things?”

Years later, General: “The computer made me do it.”

Winter June 18, 2019 2:52 PM

With big data AI it is just as with the army: Fight as you Train, Train as you Fight.

If you train on surveillance data, all you learn is surveillance. If you want to use AI for something else, you will need different data.

Faustus June 18, 2019 3:44 PM

@ David Leppik

“China has two major disadvantages when it comes to developing face recognition. First is the relative homogeneity of the faces.”

Is this an objective fact, or an expression of ethnocentrism? Do you have any reference for the assertion that Chinese faces are relatively homogeneous?

There is a long-documented social psychological effect whereby people can identify faces of their own race more easily than those of other races: https://en.wikipedia.org/wiki/Cross-race_effect

And there does seem to be a theoretical problem in current facial recognition systems that makes darker face identification more error prone, regardless of the frequency of darker faces in test data: https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html

But I see no indication online that Chinese faces are more homogeneous or harder to recognize with AI. Are you sure you are not generalizing from your experience of the Cross-race effect?

No One / Ex Cathedra June 18, 2019 7:50 PM

@smh

"No one wants to live in China"

Tell that to the 100,000+ Americans who live there.

Your interesting comments get to the point of something very important: China does not export its culture. And very few foreigners go to China and become Chinese and stay there for the rest of their lives. Staying there is usually done on a year-to-year basis.

Time will tell. No one knows the future. I would be careful when talking about China as “old”: it is definitely old, but it is young too. It is the crowds of PhD engineers and the sea of young technologists that should give folks elsewhere pause.

In my judgment, the United States has one foot in the grave, and China is on its way to becoming the most powerful country that the world has ever beheld. I don’t want to see the U.S. wane, to see the experiment fail. It is good to travel if you really want to know what is going on.

It would be interesting to come up with metrics which would help us make solid statements about the directions in which both countries are going.

No One / Ex Cathedra June 18, 2019 7:55 PM

“Every Chinese citizen has a national ID number that is demanded by most services and allows data to easily be tied together.”

Did you ever hear of the Social Security Number in the U.S.?

That was the first time this website made me laugh.

n June 18, 2019 8:41 PM

@ Denton Scratch

You got me on that one. You are definitely right to some degree: it is partly about phonetics. I should not have used that as an example of them having a respectful view of the U.S. They do, or used to, have such a view.

On the other side of the coin, I know that when President Obama went to Hangzhou, the Chinese view of him and all of his people went like this:
amused contempt.

But one of their favorite human beings is Ivanka (Yiwanka) Trump– a star, a goddess, a human wonder they adore–for her heady mix of beauty, motherhood, and political power. When she came to China it caused an absolute sensation, but the Washington Post portrayed her visit as some kind of boring failure. That is when I knew the Washington Post was not worth my time reading anymore. Fake stories do not help me understand the world much–or perhaps they do.

In short, to sum up my comments, if you want to know anything about China, you need to find out for yourself. The prejudice against it in the U.S. media is staggering in its shallowness and hypocrisy.

No One / Ex Cathedra June 19, 2019 1:16 AM

If you want to get the high-level Chinese view of what they are doing in their country, to include information security, check out the journal Qiushi, which is published under the guidance of the Central Committee of the Communist Party of China.

The website is also fascinating. It is the kind of material that diplomats and academics read.

You get the real Chinese view, and they often talk about technology. I find that it is always worth reading.

Winter June 19, 2019 6:37 AM

@Denton Scratch

All the Chinese country names have polite meanings: England – Yīngguó also means “Brave land”, France – Fàguó also means “Law land”, Germany – Déguó also means “Virtue land” etc.

This is a general rule in Chinese name transliterations. If the phonetic transcription of a name results in an impolite transliteration in Chinese, then this is intentional.

It is like personal names. People are called “Rose” or “Marigold”, not “Nettle” or “Thistle”.

Denton Scratch June 19, 2019 6:55 AM

@Winter

Thank you for drawing my attention to that fact, of which I was unaware. As I said, my introduction to Chinese was a long time ago; it lasted one year of school lessons, and I have never used Chinese since.

JPA June 19, 2019 12:42 PM

Regarding playing Go or chess as a measure of intelligence:

Playing complex games well certainly measures a form of intelligence, but another form of intelligence is the ability to adapt to a situation in which the rules are changing.

For example, I would like to see a computer play a human in a game in which the rules change without either player being informed of the change.

Jerry June 20, 2019 3:37 AM

As far as I know, AI needs a lot of training data, which is most effective when “labelled”, so it requires a human workforce to label that data for training. This is an advantage for the Chinese because they have a lot of very cheap educated labour available that will work for peanuts compared to the dollar. In the US of A, our boys take the “crowd source” approach, which means they set up machine learning platforms for anyone to tap into. Those who use such a platform voluntarily share their own trained data sets with the platform providers. This is “free labour” in some eyes but many see it as fair trade.

The east and the west are apparently taking separate approaches to tackle this problem. The West however has a big head start because these studies have been around much longer in the western circles.

A Nonny Bunny June 22, 2019 3:33 PM

“The different layers will try different features and will be compared by the evaluation function until the one that is able to give the best results is found, in a process that is only slightly more refined than trial and error.”

That sounds more like a description of genetic algorithms than of what’s used in deep neural networks. (And even then it would be way off base.)
In deep neural networks the learning algorithm can tell from the gradient of the error landscape how to alter its parameters to improve its performance. Frankly, calling this “only slightly more refined than trial and error” is like calling Newton’s method for finding roots “only slightly more refined than trial and error”.
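
To make the contrast in the comment above concrete, here is a minimal sketch, not anything taken from the article or from a real deep learning system. It assumes a toy quadratic error landscape in 20 dimensions and gives both methods the same number of steps. The function names (loss, gradient_descent, trial_and_error), the step sizes, and the random seed are all illustrative assumptions.

```python
import random

def loss(params, target):
    # Toy "error landscape": squared distance from the target point.
    return sum((p - t) ** 2 for p, t in zip(params, target))

def gradient(params, target):
    # Exact gradient of the quadratic loss above.
    return [2 * (p - t) for p, t in zip(params, target)]

def gradient_descent(start, target, steps=100, lr=0.1):
    # Each step moves every parameter downhill along the gradient.
    params = list(start)
    for _ in range(steps):
        grad = gradient(params, target)
        params = [p - lr * g for p, g in zip(params, grad)]
    return loss(params, target)

def trial_and_error(start, target, steps=100, scale=0.1, seed=0):
    # Each step tries a random nudge and keeps it only if the loss drops.
    rng = random.Random(seed)
    params = list(start)
    best = loss(params, target)
    for _ in range(steps):
        candidate = [p + rng.gauss(0, scale) for p in params]
        candidate_loss = loss(candidate, target)
        if candidate_loss < best:
            params, best = candidate, candidate_loss
    return best

DIM = 20  # hypothetical number of parameters
start = [0.0] * DIM
target = [1.0] * DIM

gd_loss = gradient_descent(start, target)
te_loss = trial_and_error(start, target)
print("gradient descent loss:", gd_loss)
print("trial-and-error loss: ", te_loss)
```

On this toy problem the gradient tells every parameter which way to move at every step, while the random nudges have to stumble onto a good direction in all 20 dimensions at once, so the gap widens as the number of parameters grows. Real networks have far messier error landscapes, so this only illustrates the direction of the argument.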

David Wall July 15, 2019 1:44 PM

This will depend on how much money is made by tracking Chinese people. Their massive dataset surely will be a treasure trove for the police and marketers in China, but that dataset won’t port easily to the west, or even Asian nations that aren’t all about surveillance.
Most useful AI isn’t about marketing or tracking criminals.


Sidebar photo of Bruce Schneier by Joe MacInnis.