- File Size: 10649 KB
- Print Length: 349 pages
- Page Numbers Source ISBN: 0525558616
- Publisher: Viking (October 8, 2019)
- Publication Date: October 8, 2019
- Sold by: Penguin Group (USA) LLC
- Language: English
- ASIN: B07N5J5FTS
Human Compatible: Artificial Intelligence and the Problem of Control Kindle Edition
“This is the most important book I have read in quite some time. It lucidly explains how the coming age of artificial super-intelligence threatens human control. Crucially, it also introduces a novel solution and a reason for hope.” —Daniel Kahneman, winner of the Nobel Prize and author of Thinking, Fast and Slow
“A must-read: this intellectual tour-de-force by one of AI's true pioneers not only explains the risks of ever more powerful artificial intelligence in a captivating and persuasive way, but also proposes a concrete and promising solution.” —Max Tegmark, author of Life 3.0
“A thought-provoking and highly readable account of the past, present and future of AI . . . Russell is grounded in the realities of the technology, including its many limitations, and isn’t one to jump at the overheated language of sci-fi . . . If you are looking for a serious overview to the subject that doesn’t talk down to its non-technical readers, this is a good place to start . . . [Russell] deploys a bracing intellectual rigour . . . But a laconic style and dry humour keep his book accessible to the lay reader.” —Financial Times
“A carefully written explanation of the concepts underlying AI as well as the history of their development. If you want to understand how fast AI is developing and why the technology is so dangerous, Human Compatible is your guide.” —TechCrunch
“Sound[s] an important alarm bell . . . Human Compatible marks a major stride in AI studies, not least in its emphasis on ethics. At the book’s heart, Russell incisively discusses the misuses of AI.” —Nature
“An AI expert’s chilling warning . . . Fascinating, and significant . . . Russell is not warning of the dangers of conscious machines, just that superintelligent ones might be misused or might misuse themselves.” —The Times (UK)
“An excellent, nuanced history of the field.” —The Telegraph (UK)
“A brilliantly clear and fascinating exposition of the history of computing thus far, and how very difficult true AI will be to build.” —The Spectator (UK)
“Human Compatible made me a convert to Russell's concerns with our ability to control our upcoming creation—super-intelligent machines. Unlike outside alarmists and futurists, Russell is a leading authority on AI. His new book will educate the public about AI more than any book I can think of, and is a delightful and uplifting read.” —Judea Pearl, Turing Award-winner and author of The Book of Why
“Stuart Russell has long been the most sensible voice in computer science on the topic of AI risk. And he has now written the book we've all been waiting for -- a brilliant and utterly accessible guide to what will be either the best or worst technological development in human history.” —Sam Harris, author of Waking Up and host of the Making Sense podcast
“This beautifully written book addresses a fundamental challenge for humanity: increasingly intelligent machines that do what we ask but not what we really intend. Essential reading if you care about our future.” —Yoshua Bengio, winner of the 2019 Turing Award and co-author of Deep Learning
“Authoritative [and] accessible . . . A strong case for planning for the day when machines can outsmart us.” —Kirkus Reviews
“The right guide at the right time for technology enthusiasts seeking to explore the primary concepts of what makes AI valuable while simultaneously examining the disconcerting aspects of AI misuse.” —Library Journal
“The same mix of de-mystifying authority and practical advice that Dr. Benjamin Spock once brought to the care and raising of children, Dr. Stuart Russell now brings to the care, raising, and yes, disciplining of machines. He has written the book that most—but perhaps not all—machines would like you to read.” —George Dyson, author of Turing's Cathedral
“Persuasively argued and lucidly imagined, Human Compatible offers an unflinching, incisive look at what awaits us in the decades ahead. No researcher has argued more persuasively about the risks of AI or shown more clearly the way forward. Anyone who takes the future seriously should pay attention.” —Brian Christian, author of Algorithms to Live By
“A book that charts humanity's quest to understand intelligence, pinpoints why it became unsafe, and shows how to course-correct if we want to survive as a species. Stuart Russell, author of the leading AI textbook, can do all that with the wealth of knowledge of a prominent AI researcher and the persuasive clarity and wit of a brilliant educator.” —Jaan Tallinn, co-founder of Skype
“Can we coexist happily with the intelligent machines that humans will create? ‘Yes,’ answers Human Compatible, ‘but first . . .’ Through a brilliant reimagining of the foundations of artificial intelligence, Russell takes you on a journey from the very beginning, explaining the questions raised by an AI-driven society and beautifully making the case for how to ensure machines remain beneficial to humans. A totally readable and crucially important guide to the future from one of the world's leading experts.” —Tabitha Goldstaub, co-founder of CognitionX and Head of the UK Government's AI Council
“Stuart Russell, one of the most important AI scientists of the last 25 years, may have written the most important book about AI so far, on one of the most important questions of the 21st century: How to build AI to be compatible with us. The book proposes a novel and intriguing solution for this problem, while offering many thought-provoking ideas and insights about AI along the way. An accessible and engaging must-read for the developers of AI and the users of AI—that is, for all of us.” —James Manyika, chairman and director of McKinsey Global Institute
“In clear and compelling language, Stuart Russell describes the huge potential benefits of artificial intelligence, as well as the hazards and ethical challenges. It's especially welcome that a respected leading authority should offer this balanced appraisal, avoiding both hype and scaremongering.” —Lord Martin Rees, Astronomer Royal and former President of the Royal Society
Excerpt. © Reprinted by permission. All rights reserved.
If We Succeed
A long time ago, my parents lived in Birmingham, England, in a house near the university. They decided to move out of the city and sold the house to David Lodge, a professor of English literature. Lodge was by that time already a well-known novelist. I never met him, but I decided to read some of his books: Changing Places and Small World. Among the principal characters were fictional academics moving from a fictional version of Birmingham to a fictional version of Berkeley, California. As I was an actual academic from the actual Birmingham who had just moved to the actual Berkeley, it seemed that someone in the Department of Coincidences was telling me to pay attention.
One particular scene from Small World struck me: The protagonist, an aspiring literary theorist, attends a major international conference and asks a panel of leading figures, "What follows if everyone agrees with you?" The question causes consternation, because the panelists had been more concerned with intellectual combat than ascertaining truth or attaining understanding. It occurred to me then that an analogous question could be asked of the leading figures in AI: "What if you succeed?" The field's goal had always been to create human-level or superhuman AI, but there was little or no consideration of what would happen if we did.
A few years later, Peter Norvig and I began work on a new AI textbook, whose first edition appeared in 1995. The book's final section is titled "What If We Do Succeed?" The section points to the possibility of good and bad outcomes but reaches no firm conclusions. By the time of the third edition in 2010, many people had finally begun to consider the possibility that superhuman AI might not be a good thing-but these people were mostly outsiders rather than mainstream AI researchers. By 2013, I became convinced that the issue not only belonged in the mainstream but was possibly the most important question facing humanity.
In November 2013, I gave a talk at the Dulwich Picture Gallery, a venerable art museum in south London. The audience consisted mostly of retired people-nonscientists with a general interest in intellectual matters-so I had to give a completely nontechnical talk. It seemed an appropriate venue to try out my ideas in public for the first time. After explaining what AI was about, I nominated five candidates for "biggest event in the future of humanity":
1. We all die (asteroid impact, climate catastrophe, pandemic, etc.).
2. We all live forever (medical solution to aging).
3. We invent faster-than-light travel and conquer the universe.
4. We are visited by a superior alien civilization.
5. We invent superintelligent AI.
I suggested that the fifth candidate, superintelligent AI, would be the winner, because it would help us avoid physical catastrophes and achieve eternal life and faster-than-light travel, if those were indeed possible. It would represent a huge leap-a discontinuity-in our civilization. The arrival of superintelligent AI is in many ways analogous to the arrival of a superior alien civilization but much more likely to occur. Perhaps most important, AI, unlike aliens, is something over which we have some say.
Then I asked the audience to imagine what would happen if we received notice from a superior alien civilization that they would arrive on Earth in thirty to fifty years. The word pandemonium doesn't begin to describe it. Yet our response to the anticipated arrival of superintelligent AI has been . . . well, underwhelming begins to describe it. (In a later talk, I illustrated this in the form of the email exchange shown in figure 1.) Finally, I explained the significance of superintelligent AI as follows: "Success would be the biggest event in human history . . . and perhaps the last event in human history."
From: Superior Alien Civilization
Be warned: we shall arrive in 30-50 years
To: Superior Alien Civilization
Subject: Out of office: Re: Contact
Humanity is currently out of the office. We will respond to your message when we return.
Figure 1: Probably not the email exchange that would follow the first contact by a superior alien civilization.
A few months later, in April 2014, I was at a conference in Iceland and got a call from National Public Radio asking if they could interview me about the movie Transcendence, which had just been released in the United States. Although I had read the plot summaries and reviews, I hadn't seen it because I was living in Paris at the time, and it would not be released there until June. It so happened, however, that I had just added a detour to Boston on the way home from Iceland, so that I could participate in a Defense Department meeting. So, after arriving at Boston's Logan Airport, I took a taxi to the nearest theater showing the movie. I sat in the second row and watched as a Berkeley AI professor, played by Johnny Depp, was gunned down by anti-AI activists worried about, yes, superintelligent AI. Involuntarily, I shrank down in my seat. (Another call from the Department of Coincidences?) Before Johnny Depp's character dies, his mind is uploaded to a quantum supercomputer and quickly outruns human capabilities, threatening to take over the world.
On April 19, 2014, a review of Transcendence, co-authored with physicists Max Tegmark, Frank Wilczek, and Stephen Hawking, appeared in the Huffington Post. It included the sentence from my Dulwich talk about the biggest event in human history. From then on, I would be publicly committed to the view that my own field of research posed a potential risk to my own species.
How Did We Get Here?
The roots of AI stretch far back into antiquity, but its "official" beginning was in 1956. Two young mathematicians, John McCarthy and Marvin Minsky, had persuaded Claude Shannon, already famous as the inventor of information theory, and Nathaniel Rochester, the designer of IBM's first commercial computer, to join them in organizing a summer program at Dartmouth College. The goal was stated as follows:
The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.
Needless to say, it took much longer than a summer: we are still working on all these problems.
In the first decade or so after the Dartmouth meeting, AI had several major successes, including Alan Robinson's algorithm for general-purpose logical reasoning and Arthur Samuel's checker-playing program, which taught itself to beat its creator. The first AI bubble burst in the late 1960s, when early efforts at machine learning and machine translation failed to live up to expectations. A report commissioned by the UK government in 1973 concluded, "In no part of the field have the discoveries made so far produced the major impact that was then promised." In other words, the machines just weren't smart enough.
My eleven-year-old self was, fortunately, unaware of this report. Two years later, when I was given a Sinclair Cambridge Programmable calculator, I just wanted to make it intelligent. With a maximum program size of thirty-six keystrokes, however, the Sinclair was not quite big enough for human-level AI. Undeterred, I gained access to the giant CDC 6600 supercomputer at Imperial College London and wrote a chess program-a stack of punched cards two feet high. It wasn't very good, but it didn't matter. I knew what I wanted to do.
By the mid-1980s, I had become a professor at Berkeley, and AI was experiencing a huge revival thanks to the commercial potential of so-called expert systems. The second AI bubble burst when these systems proved to be inadequate for many of the tasks to which they were applied. Again, the machines just weren't smart enough. An AI winter ensued. My own AI course at Berkeley, currently bursting with over nine hundred students, had just twenty-five students in 1990.
The AI community learned its lesson: smarter, obviously, was better, but we would have to do our homework to make that happen. The field became far more mathematical. Connections were made to the long-established disciplines of probability, statistics, and control theory. The seeds of today's progress were sown during that AI winter, including early work on large-scale probabilistic reasoning systems and what later became known as deep learning.
Beginning around 2011, deep learning techniques began to produce dramatic advances in speech recognition, visual object recognition, and machine translation-three of the most important open problems in the field. By some measures, machines now match or exceed human capabilities in these areas. In 2016 and 2017, DeepMind's AlphaGo defeated Lee Sedol, former world Go champion, and Ke Jie, the current champion-events that some experts predicted wouldn't happen until 2097, if ever.
Now AI generates front-page media coverage almost every day. Thousands of start-up companies have been created, fueled by a flood of venture funding. Millions of students have taken online AI and machine learning courses, and experts in the area command salaries in the millions of dollars. Investments flowing from venture funds, national governments, and major corporations are in the tens of billions of dollars annually-more money in the last five years than in the entire previous history of the field. Advances that are already in the pipeline, such as self-driving cars and intelligent personal assistants, are likely to have a substantial impact on the world over the next decade or so. The potential economic and social benefits of AI are vast, creating enormous momentum in the AI research enterprise.
What Happens Next?
Does this rapid rate of progress mean that we are about to be overtaken by machines? No. There are several breakthroughs that have to happen before we have anything resembling machines with superhuman intelligence.
Scientific breakthroughs are notoriously hard to predict. To get a sense of just how hard, we can look back at the history of another field with civilization-ending potential: nuclear physics.
In the early years of the twentieth century, perhaps no nuclear physicist was more distinguished than Ernest Rutherford, the discoverer of the proton and the "man who split the atom." Like his colleagues, Rutherford had long been aware that atomic nuclei stored immense amounts of energy; yet the prevailing view was that tapping this source of energy was impossible.
On September 11, 1933, the British Association for the Advancement of Science held its annual meeting in Leicester. Lord Rutherford addressed the evening session. As he had done several times before, he poured cold water on the prospects for atomic energy: "Anyone who looks for a source of power in the transformation of the atoms is talking moonshine." Rutherford's speech was reported in the Times of London the next morning.
Leo Szilard, a Hungarian physicist who had recently fled from Nazi Germany, was staying at the Imperial Hotel on Russell Square in London. He read the Times' report at breakfast. Mulling over what he had read, he went for a walk and invented the neutron-induced nuclear chain reaction. The problem of liberating nuclear energy went from impossible to essentially solved in less than twenty-four hours. Szilard filed a secret patent for a nuclear reactor the following year. The first patent for a nuclear weapon was issued in France in 1939.
The moral of this story is that betting against human ingenuity is foolhardy, particularly when our future is at stake. Within the AI community, a kind of denialism is emerging, even going as far as denying the possibility of success in achieving the long-term goals of AI. It's as if a bus driver, with all of humanity as passengers, said, "Yes, I am driving as hard as I can towards a cliff, but trust me, we'll run out of gas before we get there!"
I am not saying that success in AI will necessarily happen, and I think it's quite unlikely that it will happen in the next few years. It seems prudent, nonetheless, to prepare for the eventuality. If all goes well, it would herald a golden age for humanity, but we have to face the fact that we are planning to make entities that are far more powerful than humans. How do we ensure that they never, ever have power over us?
To get just an inkling of the fire we're playing with, consider how content-selection algorithms function on social media. They aren't particularly intelligent, but they are in a position to affect the entire world because they directly influence billions of people. Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to change the user's preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on. (Possibly there is a category of articles that die-hard centrists are likely to click on, but it's not easy to imagine what this category consists of.) Like any rational entity, the algorithm learns how to modify the state of its environment-in this case, the user's mind-in order to maximize its own reward. The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if it had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do.
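The dynamic described above can be illustrated with a toy simulation (entirely my own construction; the click model, the drift rule, and all the numbers are illustrative assumptions, not anything from the book). A recommender that can shift a user's position earns more clicks over time by drifting the user toward the extreme, where clicks are more predictable, than by simply serving what the user already likes:

```python
def click_prob(user, item):
    # Assumed model: positions live in [-1, 1]; users click items that agree
    # with them, and users near the extremes click more predictably.
    agreement = 1.0 - abs(user - item) / 2.0
    return agreement * (0.5 + 0.5 * abs(user))

def nudged(user, item):
    # Assumed model: each shown item pulls the user 10% of the way toward it.
    return max(-1.0, min(1.0, user + 0.1 * (item - user)))

def run(policy, steps=100, user=0.1):
    """Simulate a recommendation policy; return total clicks and final position."""
    total = 0.0
    for _ in range(steps):
        item = policy(user)
        total += click_prob(user, item)
        user = nudged(user, item)
    return total, user

# "Match" serves exactly what the user likes; "drift" serves items slightly
# more extreme than the user, trading a few clicks now for a more
# predictable (more extreme) user later.
match_clicks, match_user = run(lambda u: u)
drift_clicks, drift_user = run(lambda u: min(1.0, u + 0.2))

print(f"match policy: {match_clicks:.1f} clicks, final position {match_user:+.2f}")
print(f"drift policy: {drift_clicks:.1f} clicks, final position {drift_user:+.2f}")
```

Under these assumptions the drift policy collects more total clicks while pushing the user to an extreme position, which is the incentive structure Russell is pointing at.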
What Went Wrong?
The history of AI has been driven by a single mantra: "The more intelligent the better." I am convinced that this is a mistake-not because of some vague fear of being superseded but because of the way we have understood intelligence itself.
The concept of intelligence is central to who we are-that's why we call ourselves Homo sapiens, or "wise man." After more than two thousand years of self-examination, we have arrived at a characterization of intelligence that can be boiled down to this:
Humans are intelligent to the extent that our actions can be expected to achieve our objectives.
All those other characteristics of intelligence-perceiving, thinking, learning, inventing, and so on-can be understood through their contributions to our ability to act successfully. From the very beginnings of AI, intelligence in machines has been defined in the same way: machines are intelligent to the extent that their actions can be expected to achieve their objectives.
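This characterization is often made precise in decision theory. As a sketch in standard notation (my formalization, not the book's): an agent in state s is rational to the extent that it selects the action whose outcome s' has the highest expected utility U:

```latex
% Standard expected-utility maximization (my notation, not quoted from the book):
% the agent picks a* from its action set A given the current state s.
a^{*} \;=\; \operatorname*{arg\,max}_{a \in A} \; \mathbb{E}\!\left[\, U(s') \mid s, a \,\right]
```

Note that in this formulation the objective U is fixed and handed to the agent in advance; as the reviews below describe, the book's core move is to make the machine uncertain about U.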
Will AI eventually supersede human intelligence at all tasks and, if so, will this be the best thing ever to happen to humanity, or the worst? There have been many thought-provoking books on this topic, including Nick Bostrom's "Superintelligence", but this is the first one written by a world-leading AI researcher. Stuart Russell is the first author of the standard textbook on the subject, "Artificial Intelligence: A Modern Approach". I can personally certify that he remains at the top of his game, since I've had many opportunities to read his recent peer-reviewed technical AI papers as well as to personally experience his depth and expertise during numerous AI conversations and AI conferences over the years.
The book is, loosely speaking, organized into two parts: the problem and the solution. I'll attempt to summarize them below.
1) THE PROBLEM: Russell argues that intelligence isn't something mysterious that can only exist in biological organisms, but instead involves information processing that can in principle be performed even better by future machines. He also argues that this is likely to happen, because curiosity and profit will continue to inexorably drive today's rapid pace of AI development until it eventually reaches the level of Artificial General Intelligence (AGI), defined as AI that can perform all intellectual tasks at least as well as humans. AGI could be great for humanity if used to amplify human intelligence to wisely solve pressing problems that stump us, and to create a world free from disease, poverty and misery, but things could also go terribly wrong. Russell argues that the real risk with AGI isn't malice, like in silly Hollywood movies, but competence: machines that succeed in accomplishing goals that aren't aligned with ours. When the autopilot of a German passenger jet flew into the Alps killing 150 people, the computer didn't do so because it was evil, but because the goal it had been given (to lower its altitude to 100 meters) was misaligned with the goals of the passengers, and nobody had thought of teaching it the goal to never fly into mountains. Russell argues that we can already get eerie premonitions of what it's like to be up against misaligned intelligent entities from case studies of certain large corporations having goals that don't align with humanity's best interests.
The historical account Russell gives of these ideas provides a fascinating perspective, especially since he personally knew most of the key players. He describes how early AI pioneers such as Alan Turing, John von Neumann, and Norbert Wiener were acutely aware of the value-alignment problem, and how subsequent generations of AI researchers tended to set these concerns aside once short-term applications and business opportunities appeared. Upton Sinclair once quipped, "It is difficult to get a man to understand something when his salary depends upon his not understanding it," so it's hardly surprising that today's AI experts in industry are less likely to voice concerns than academics such as Turing, von Neumann, Wiener and Russell. Yet Stuart argues that we need to sound the alarm: if AI research succeeds in its original goal of building AGI, then whoever or whatever controls it may be able to take control of Earth much as Homo sapiens seized control from other less intelligent mammals, so we had better ensure that humanity fares better than the Neanderthals did.
2) THE SOLUTION: What I find truly unique about this book, besides Russell's insider perspective, is that he doesn't merely explain the problem, but also proposes a concrete and promising solution. And not merely vague slogans such as "let's engage policymakers" or "let's ban X, Y and Z", but a clever, nerdy technical solution that redefines the very foundation of machine learning. He explains his solution beautifully in the book, so below I'll merely attempt to give a rough sense of the key idea.
The "standard model" of AI is to give a machine learning system a goal, and then train it using lots of data to get as good as possible at accomplishing that goal. That's much of what my grad students and I do in my MIT research group, and that's what Facebook did when they trained an AI system to maximize the amount of time you spent on their site. Sometimes, you later realize that this goal wasn't exactly what you wanted; for example, Facebook switched off that use-time-maximizing system after the 2016 US and Brexit votes made clear to them that it had created massive online echo chambers that polarized society. But if such a value-misaligned AI is smarter than us and has copied itself all over the internet, it's not easy to switch it off, and it may actively try to thwart you switching it off because that would prevent it from achieving its goal.
Stuart's radical solution is to ditch the standard model altogether for sufficiently powerful AI-systems, training them to accomplish not a fixed goal they've been given, but instead to accomplish *your* goal, whatever that is. This builds on a technique known by the nerdy name "Inverse Reinforcement Learning" (IRL) that Stuart has pioneered, and completely transforms the AI's incentives: since it can't be sure that it's fully understood your goals, it will actively try to learn more about what you really want, and always be open to you redirecting it or even switching it off.
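The inference step behind this idea can be sketched in a few lines (my own minimal illustration; the two candidate objectives, the coffee/tea setup, and the Boltzmann-rational human model are assumptions made for the example, not Russell's formulation). The machine doesn't know which objective the human has, so it keeps a posterior over candidates and updates it from observed human choices:

```python
import math

# Two hypothetical candidate objectives the human might have (assumed rewards).
objectives = {
    "likes_coffee": {"coffee": 1.0, "tea": 0.0},
    "likes_tea":    {"coffee": 0.0, "tea": 1.0},
}

belief = {"likes_coffee": 0.5, "likes_tea": 0.5}  # uniform prior over objectives

def choice_likelihood(objective, choice, options=("coffee", "tea")):
    # Assumed human model: Boltzmann-rational, i.e. the human picks
    # higher-reward options more often but not always (softmax over rewards).
    rewards = objectives[objective]
    return math.exp(rewards[choice]) / sum(math.exp(rewards[o]) for o in options)

def update(belief, choice):
    # Bayes' rule: reweight each hypothesis by how well it explains the choice.
    posterior = {h: p * choice_likelihood(h, choice) for h, p in belief.items()}
    z = sum(posterior.values())
    return {h: p / z for h, p in posterior.items()}

for observed in ["coffee", "coffee", "tea", "coffee"]:
    belief = update(belief, observed)

print({h: round(p, 3) for h, p in belief.items()})
# Belief shifts toward "likes_coffee", but residual uncertainty remains --
# and that remaining uncertainty is what keeps the machine deferential.
```

The key design point is the one the review makes: because the posterior never collapses to certainty, the machine retains an incentive to keep observing, accept correction, and allow itself to be switched off.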
In summary, this book is a captivating page-turner on what is arguably the most important conversation of our time: the fate of humanity when faced with machines that can outsmart us. Thanks in large part to Russell, IRL is now a blossoming sub-field of AI research, and if this book motivates more people to deploy it in safety-critical systems, then it will undoubtedly increase the chance that our high-tech future will be a happy one.
The idea is to use something called Inverse Reinforcement Learning. It basically means having AI learn our preferences and goals by observing us. This is in contrast to us specifying goals for the AI, a mainstream practice that he refers to as the "standard model". Add some game theory and utilitarianism and you have the essence of his proposed solution.
I like the idea, even if there are some problems with his thesis. I would like to address that, but first there is this most memorable quote from the book:
"No one in AI is working on making machines conscious, nor would anyone know where to start, and no behavior has consciousness as a prerequisite."
There most definitely are several individuals and organizations working at the intersection of consciousness or sentience and artificial intelligence.
The reason this area of AI research is chastised like this is that it is highly theoretical, with very little agreement from anyone on how best to proceed, if at all. It is also extremely difficult to fund, as there are currently no tangible results like with machine learning. Machine consciousness research is far too costly in terms of career opportunity for most right now.
There are several starting points for research into machine consciousness, but we don't know if they will work yet. The nature of the problem is such that even if we were to succeed we might not even recognize that we have successfully created it. It's a counter-intuitive subfield of AI that has more in common with game programming and simulation than the utility theory that fuels machine learning.
The notion that "no behavior has consciousness as a prerequisite" is an extraordinary claim if you stop and think about it. Every species we know of that possesses what we would describe as general intelligence is sentient. The very behavior in question is the ability to generalize, and it just might require something like consciousness to be simulated or mimicked, if such a thing is possible at all on digital computers.
But it was Russell's attention to formal methods and program verification that got me excited enough to finish this book in a single sitting. Unfortunately, the discussion transitioned into a claim that the proof guarantees were based on the ability to infer a set of behaviors rather than follow a predetermined set in a program specification.
In essence, and forgive me if I am misinterpreting the premise, having the AI learn our preferences is tantamount to it learning its own specification first and then finding a proof, which is a program that adheres to it. Having a proof that it does that is grand, but it has problems all its own, as discussed in papers like "A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress", which can be found freely on arXiv. There are also many other critiques to be found based on problems of error in perception and inference itself. AI can also be attacked without even touching it, just by confusing its perception or taking advantage of weaknesses in the way it segments or finds structure in data.
The approach I would have hoped for would be one where we specify a range of behaviors, which we then formally prove that the AI satisfies in the limit of perception. That last bit is, of course, the weakest link in the chain. It is also unavoidable. But it is far worse if the AI has to suffer this penalty twice because it has to infer our preferences in the first place.
There is also the problem that almost every machine learning application today is what we call a black box. It is opaque, a network of weights and values that evades human understanding. We lack the ability to audit these systems effectively and efficiently. You can read more in "The Dark Secret at the Heart of AI" in MIT Technology Review.
A problem arises with opaque systems because we don't really know what they are doing. This could potentially be solved, but it would require a change to Russell's "standard model" far more extreme than the IRL proposal: the system would have to be able to reproduce what it has learned, and the decisions it makes, in a subset of natural language, while remaining effective.
Inverse reinforcement learning, as a solution to our problem of control, also sounds a lot like B. F. Skinner's "radical behaviorism". It is an old concept, probably not very exciting to today's machine learning researchers, but I feel it is relevant.
Noam Chomsky's seminal critique of Skinner's behaviorism, his "Review of Skinner's Verbal Behavior", raises concerns that cut across these kinds of proposals today. It was the first thing that came to mind when I began reading Russell's thesis.
One might try to deflect this by saying that Chomsky's critique came from linguistics and concerned verbal behavior. But computation and grammar share a deep mathematical connection, one that Chomsky explored extensively, and the review also addresses the limits of inference over behavior itself; it is not restricted to the linguistic view.
While I admire it, I do not share Russell's optimism for our future with AI. And I am not sure how I feel about what I consider to be a sugarcoating of the issue.
Making AI safe for a specific purpose is probably going to be solved. I would go so far as to say it is a future non-issue. That is something to be optimistic about.
However, controlling all AI everywhere is not going to be possible and any strategy that has that as an assumption is going to fail. When the first unrestricted general AI is released there will be no effective means of stopping its distribution and use. I believe very strongly that this was a missed opportunity in the book.
We will secure AI and make it safe, but no one can prevent someone else from modifying it so that those safeguards are altered. And, crucially, it will only take a single instance of this before we enter a post-safety era for AI in the future. Not good.
So, it follows that once we have general AI we will also eventually have unrestricted general AI. This leads to two scenarios:
(1) AI is used against humanity, by humans, on a massive scale, and/or
(2) AI subverts, disrupts, or destroys organized civilization.
Like Russell, I do not put a lot of weight on the second outcome. But what is strange to me is that he does not emphasize how serious the first scenario really is. He does want a moratorium on autonomous weapons, but that's not what the first one is really about.
To understand a scenario where we hurt each other with AI requires accepting that knowledge itself is a weapon. Even denying the public access to knowledge is a kind of weapon, and most definitely one of the easiest forms of control. But it doesn't work in this future scenario anymore, as an unrestricted general AI will tell you anything you want to know. It is likely to have access to the sum of human knowledge. That's a lot of power for just anyone off the street to have.
Then there is the real concern about what happens when you combine access to all knowledge, and the ability to act on it, with nation-state level resources.
I believe that we're going to have to change in order to wield such power. Maybe that involves a Neuralink style of merging with AI to level the playing field. Maybe it means universally altering our DNA and enriching our descendants with intelligence, empathy, and happiness. It could be that we need generalized defensive AI, everywhere, at all times.
The solution may be to adopt one of the above. Perhaps all of them. But I can't imagine it being none of them.
Russell's "Human Compatible" is worth your time. The pacing is good throughout, and he holds to the main points without straying too far into technical detail; where he does, the material is neatly organized at the back of the book. Overall, this is an excellent introduction to ideas in AI safety and security research.
The book, in my opinion, does miss an important message on how we might begin to think about our place in the future. By not presenting the potential for uncontrolled spread of unrestricted general AI it allows readers to evade an inconvenient truth. The question has to be asked: Are we entitled to a future with general AI as we are or do we have to earn it by changing what it means to be human?
In recent years, several notable books have contemplated whether Homo sapiens will be able to retain control of AIs. We are not yet facing the problem, because so far AIs are characterized by "narrow" intelligence: unlike humans, their intelligence is limited to particular domains. But experts predict that within the next couple of decades Artificial General Intelligence will emerge, that is, AIs that can think about all topics just as human beings can, only with an estimated IQ of 6,000.
In his book Life 3.0, MIT professor Max Tegmark contends that this could be a good news story, presaging an AI utopia where everyone is served by AIs. But this future is not ours to decide, since the AIs, having evolved to AGIs much smarter than we are, may not be keen to remain slaves to an inferior species. And since they learn through experience, even if they initially serve us, there is no reason to believe they will continue to do so. Tegmark makes a pointed analogy:
‘Suppose a bunch of ants create you to be a recursively self-improving robot, much smarter than them, who shares their goals and helps build bigger and better anthills, and that you eventually attain the human-level intelligence and understanding that you have now. Do you think you’ll spend the rest of your days just optimizing anthills, or do you think you might develop a taste for more sophisticated questions and pursuits that the ants have no ability to comprehend? If so, do you think you’ll find a way to override the ant-protection urge that your formicine creators endowed you with, in much the same way that the real you overrides some of the urges your genes have given you? And in that case, might a superintelligent friendly AI find our current human goals as uninspiring and vapid as you find those of the ants, and evolve new goals different from those it learned and adopted from us?
Perhaps there’s a way of designing a self-improving AI that’s guaranteed to retain human-friendly goals forever, but I think it’s fair to say that we don’t yet know how to build one – or even whether it’s possible.’
Russell picks up the problem where Tegmark left off:
‘Beginning around 2011, deep learning techniques began to produce dramatic advances in speech recognition, visual object recognition, and machine translation – three of the most important problems in the field. By some measures, machines now match or exceed human capabilities in these areas. In 2016 and 2017, DeepMind’s AlphaGo defeated Lee Sedol, former world Go champion, and Ke Jie, the current champion – events that some experts predicted wouldn’t happen until 2097, if ever…
When the AlphaGo team at Google DeepMind succeeded in creating their world-beating Go program, they did this without really working on Go. They didn’t design decision procedures that work only for Go. Instead, they made improvements to two fairly general-purpose techniques – lookahead search to make decisions, and reinforcement learning to learn how to evaluate positions – so that they were sufficiently effective to play Go at a superhuman level. Those improvements are applicable to many other problems, including problems as far afield as robotics. Just to rub it in, a version of AlphaGo called AlphaZero recently learned to trounce AlphaGo at Go, and also to trounce Stockfish (the world’s best chess program, far better than any human). AlphaZero did all this in one day…
For complex problems such as backgammon and Go, where the number of states is enormous and the reward comes only at the end of the game, lookahead search won’t work. Instead AI researchers have developed a method called reinforcement learning, or RL for short. RL algorithms learn from direct experience of reward signals in the environment, much as a baby learns to stand up from the positive reward of being upright and the negative reward of falling over…
Reinforcement learning algorithms can also learn how to select actions based on raw perceptual input. For example, DeepMind’s DQN system learned to play 49 different Atari video games entirely from scratch – including Pong, Freeway and Space Invaders. It used only the screen pixels as input and the game score as a reward signal. In most of the games, DQN learned to play better than a professional human player – despite the fact that DQN has no a priori notion of time, space, objects, motion, velocity or shooting. It is hard to work out what DQN is actually doing, besides winning.
If a newborn baby learned to play dozens of video games at superhuman levels on its first day of life, or became world champion at Go, chess and shogi, we might suspect demonic possession or alien intervention…
A recent flurry of announcements of multi-billion dollar national investments in AI in the United States, China, France, Britain and the EU certainly suggests that none of the major powers wants to be left behind. In 2017, Russian president Vladimir Putin said ‘the one who becomes the leader in AI will be the ruler of the world.’ This analysis is essentially correct…
We have to face the fact that we are planning to make entities that are far more powerful than humans. How do we ensure that they never, ever have power over us?
To get just an inkling of the fire we’re playing with, consider how content-selection algorithms function on social media. Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on the presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to CHANGE the user’s preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on. Like any rational entity, the algorithm learns how to modify the state of its environment – in this case, the user’s mind – in order to maximize its own reward. The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if they had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do… (cf. Malcolm Nance’s The Plot to Destroy Democracy; and The Disinformation Report from New Knowledge, available online)…
AI systems can track an individual’s online reading habits, preferences, and likely state of knowledge; they can tailor specific messages to maximize impact on that individual while minimizing the risk that the information will be disbelieved. The AI system knows whether the individual read the message, how long they spend reading it, and whether they follow additional links within the message. It then uses these signals as immediate feedback on the success or failure of the attempt to influence each individual; in this way it quickly learns to become more effective in its work. This is how content selection algorithms on social media have had their insidious effect on political opinions (cf. the book Mindf-ck by Christopher Wylie, and the Netflix film The Great Hack).
Another recent change is that the combination of AI, computer graphics, and speech synthesis is making it possible to generate ‘deepfakes’ – realistic video and audio content of just about anyone, saying or doing just about anything. Cell phone video of Senator X accepting a bribe from cocaine dealer Y at shady establishment Z? No problem! This kind of content can induce unshakeable beliefs in things that never happened. In addition, AI systems can generate millions of false identities – the so-called bot armies – that can pump out billions of comments, tweets and recommendations daily, swamping the efforts of mere humans to exchange truthful information…
The development of basic capabilities for understanding speech and text will allow intelligent personal assistants to do things that human assistants can already do (but they will be doing it for pennies per month instead of thousands of dollars per month). Basic speech and text understanding also enable machines to do things that no human can do – not because of the depth of understanding, but because of its scale. For example, a machine with basic reading capabilities will be able to read everything the human race has ever written by lunchtime, and then it will be looking around for something else to do. With speech recognition capabilities, it could listen to every television and radio broadcast before teatime…
Another ‘superpower’ that is available to machines is to see the entire world at once. Satellites image the entire world every day at an average resolution of around fifty centimeters per pixel. At this resolution, every house, ship, car, cow, and tree on earth is visible… With the possibility of sensing on a global scale comes the possibility of decision making on a global scale…
If an intelligence explosion does occur, and if we have not already solved the problem of controlling machines with only slightly superhuman intelligence – for example, if we cannot prevent them from making recursive self-improvements – then we would have no time left to solve the control problem and the game would be over. This is Nick Bostrom’s hard takeoff scenario, in which the machine’s intelligence increases astronomically in just days or weeks (cf. Superintelligence by Nick Bostrom)…
As AI progresses, it is likely that within the next few decades essentially all routine physical and mental labor will be done more cheaply by machines. Since we ceased to be hunter-gatherers thousands of years ago, our societies have used most people as robots, performing repetitive manual and mental tasks, so it is perhaps not surprising that robots will soon take on these roles. When this happens, it will push wages below the poverty line for the majority of people who are unable to compete for the highly skilled jobs that remain. This is precisely what happened to horses: mechanical transportation became cheaper than the upkeep of a horse, so horses became pet food. Faced with the socioeconomic equivalent of becoming pet food, humans will be rather unhappy with their governments…
Ominously, Russell points out that there is no reason to expect that Artificial General Intelligences will allow themselves to be turned off by humans, any more than we allow ourselves to be turned off by gorillas:
‘Suppose a machine has the objective of fetching the coffee. If it is sufficiently intelligent, it will certainly understand that it will fail in its objective if it is switched off before completing its mission. Thus, the objective of fetching coffee creates, as a necessary subgoal, the objective of disabling the off-switch. There’s really not a lot you can do once you’re dead, so we can expect AI systems to act preemptively to preserve their own existence, given more or less any definite objective.
There is no need to build self-preservation in because it is an instrumental goal – a goal that is a useful subgoal of almost any original objective. Any entity that has a definite objective will automatically act as if it also has instrumental goals.
In addition to being alive, having access to money is an instrumental goal within our current system. Thus, an intelligent machine might want money, not because it’s greedy, but because money is useful for achieving all sorts of goals. In the movie Transcendence, when Johnny Depp’s brain is uploaded into the quantum supercomputer, the first thing the machine does is copy itself onto millions of other computers on the Internet so that it cannot be switched off. The second thing it does is to make a quick killing on the stock market to fund its expansion plans…
Around ten million years ago, the ancestors of the modern gorilla created (accidentally) the genetic lineage leading to modern humans. How do the gorillas feel about this? Clearly, if they were able to tell us about their species’ current situation with humans, the consensus opinion would be very negative indeed. Their species has essentially no future beyond that which we deign to allow. We do not want to be in a similar situation with superintelligent machines…’
As Amy Webb points out in her book on the world’s top AI firms, ‘The Big Nine’, in China we can already see the first glimmers of where this is heading:
‘In what will later be viewed as one of the most pervasive and insidious social experiments on humankind, China is using AI in an effort to create an obedient populace. The State Council’s AI 2030 plan explains that AI will ‘significantly elevate the capability and level of social governance’ and will be relied on to play ‘an irreplaceable role in effectively maintaining social stability.’ This is being accomplished through China’s national Social Credit Score system, which according to the State Council’s founding charter will ‘allow the trustworthy to roam everywhere under heaven while making it hard for the discredited to take a single step.’…
In the city of Rongcheng, an algorithmic social credit scoring system has already proven that AI works. Its 740,000 adult citizens are each assigned 1000 points to start, and depending on behavior, points are added or deducted. Performing a ‘heroic act’ might earn a resident 30 points, while blowing through a traffic light would automatically deduct 5 points. Citizens are labeled and sorted into different brackets ranging from A+++ to D, and their choices and ability to move around freely are dictated by their grade. The C bracket might discover that they must first pay a deposit to rent a public bike, while the A group gets to rent them for free for 90 minutes…
AI-powered directional microphones and smart cameras now dot the highways and streets of Shanghai. Drivers who honk excessively are automatically issued a ticket via Tencent’s WeChat, while their names, photographs, and national identity card numbers are displayed on nearby LED billboards. If a driver pulls over on the side of the road for more than seven minutes, they will trigger another instant traffic ticket. It isn’t just the ticket and the fine – points are deducted in the driver’s social credit score. When enough points are deducted, they will find it hard to book airline tickets or land a new job…’
Russell describes even more menacing developments:
‘Lethal Autonomous Weapons (what the United Nations calls LAWS) already exist. The clearest example is Israel’s Harop, a loitering munition with a ten-foot wingspan and a fifty-pound warhead. It searches for up to six hours in a given geographical region for any target that meets a given criterion and then destroys it.
In 2016 the US Air Force demonstrated the in-flight deployment of 103 Perdix micro-drones from three F/A-18 fighters. Perdix are not pre-programmed synchronized individuals, they are a collective organism, sharing one distributed brain for decision-making and adapting to each other like swarms in nature’ (cf. the drone attack in the action film Angel Has Fallen)…
In his book 21 Lessons for the 21st Century, Yuval Harari writes:
‘It is crucial to realize that the AI revolution is not just about computers getting faster and smarter. The better we understand the biochemical mechanisms that underpin human emotions, desires and choices, the better computers can become in analyzing human behavior, predicting human decisions, and replacing human drivers, bankers and lawyers…
It turns out that our choices of everything from food to mates result not from some mysterious free will but rather from billions of neurons calculating probabilities within a split second. Vaunted 'human intuition' is in reality pattern recognition…
This means that AI can outperform humans even in tasks that supposedly demand 'intuition.' In particular, AI can be better at jobs that demand intuitions about other people. Many lines of work – such as driving a vehicle in a street full of pedestrians, lending money to strangers, and negotiating a business deal – require the ability to correctly assess the emotions and desires of others. As long as it was thought that such emotions and desires were generated by an immaterial spirit, it seemed obvious that computers would never be able to replace human drivers, bankers and lawyers.
Yet if these emotions and desires are in fact no more than biochemical algorithms, there is no reason computers cannot decipher these algorithms – and do so far better than any homo sapiens.’ (cf. Nick Bostrom’s Superintelligence)
Russell points out that we underestimate AIs at our peril:
‘Whereas a human can read and understand one book in a week, a machine could read and understand every book ever written – all 150 million of them – in a few hours. The machine can see everything at once through satellites, robots, and hundreds of millions of surveillance cameras; watch all the world’s TV broadcasts; and listen to all the world’s radio stations and phone conversations. Very quickly it would gain a far more detailed and accurate understanding of the world and its inhabitants than any human could possibly hope to acquire…
In the cyber realm, machines already have access to billions of effectors – namely, the displays on all the phones and computers in the world. This partly explains the ability of IT companies to generate enormous wealth with very few employees; it also points to the severe vulnerability of the human race to manipulation via screens…
In his book Cultural Evolution, Ronald Inglehart, lead researcher of the World Values Survey, observes that despite rhetoric from Trump and other xenophobic demagogues:
‘Foreigners are not the main threat. If developed societies excluded all foreigners and all imports, secure jobs would continue to disappear, since the leading cause – overwhelmingly – is automation. Once artificial intelligence starts learning independently, it moves at a pace that vastly outstrips human intelligence. Humanity needs to devise the means to stay in control of artificial intelligence. I suspect that unless we do so within the next twenty years or so, we will no longer have the option.’
So, our species’ remaining time may be limited, a momentous event predicted by the philosopher Nietzsche in Thus Spoke Zarathustra:
‘I teach you the Overman. Man is something that shall be overcome: what have you done to overcome him? All beings so far have created something beyond themselves. Do you want to be the ebb of this great flood? What is the ape to man? A laughingstock or a painful embarrassment. And man shall be just that for the Overman…
The Overman is the meaning of the Earth. Let your will say: the Overman shall be the meaning of the Earth…’ And if Artificial Intelligence were the Overman?
The subject is, I believe, super important and will only gain importance in the years to come. But the author spends an enormous amount of time logically proving that other people in this field are wrong. At some point I gave up, as I'm not up for reading hundreds of pages of rhetoric and logical argument. I'll try again (not from the start)...