New Go-playing program teaches itself, beating previous program 100-0

A potentially momentous milestone has been reached in the decades-old battle between human intelligence and artificial intelligence.

Go playing board

Until 18 months ago, the ancient Chinese game of Go had firmly resisted attempts to apply computer technology: the best human players were substantially better than the best computer programs. This changed abruptly in March 2016, when a Google computer program named “AlphaGo” defeated the reigning world champion 4-1, a result that shocked many observers, who had not expected to see this for many years.

Now a new computer program, called “AlphaGo Zero,” which taught itself the game from scratch without any human input, has defeated the previous program 100 games to zero.

What does this mean? First, some background:

IBM’s Deep Blue defeats Kasparov

Many are familiar with the 1997 defeat of Garry Kasparov, then the world’s reigning chess champion, by IBM’s “Deep Blue” computer. Commenting on his experience, Kasparov later reflected, “It was my luck (perhaps my bad luck) to be the world chess champion during the critical years in which computers challenged, then surpassed, human chess players.”

Deep Blue employed some new techniques, but for the most part it simply applied enormous computing power, so that it could store many openings, look ahead many moves, and — except when the programmers erred — never make mistakes.

Watson defeats humans in Jeopardy!

An even more impressive performance was the 2011 defeat of two champion contestants on the American quiz show Jeopardy! by an IBM-developed computer system named “Watson.” The Watson achievement involved natural language understanding, namely the intelligent “understanding,” in some sense, of ordinary (often tricky) English text.

For example, in “Final Jeopardy” culminating the Jeopardy! match, in the category “19th century novelists,” the clue was “William Wilkinson’s ‘An Account of the Principalities of Wallachia and Moldavia’ inspired this author’s most famous novel.” Watson correctly responded “Who is Bram Stoker?” [the author of Dracula], thus sealing the victory. Legendary Jeopardy! champ Ken Jennings conceded by writing on his tablet, “I for one welcome our new computer overlords.”

In the years since the Jeopardy! demonstration, IBM has deployed its Watson AI technology in the health care field. In a recent test of its cancer-diagnosing facility, Watson recommended treatment plans that matched recommendations by oncologists in 99 percent of the cases, and offered options that doctors had missed in 30 percent of them.

The latest Go-playing achievement

In the ancient Chinese game of Go, players place black and white stones on a 19×19 grid. The game is notoriously complicated, with strategies that can only be described in vague, subjective terms. For these reasons, in spite of advances such as the Deep Blue chess-playing program and even the Watson Jeopardy!-playing program, many observers did not expect Go-playing computer programs to be able to beat the best human players for many years.

Thus it came as a considerable surprise when, in March 2016, a computer program named AlphaGo, developed by researchers working for DeepMind, a subsidiary of Alphabet (Google’s parent company), defeated Lee Se-dol, a South Korean Go master, 4-1 in a five-game match. Not resting on their laurels, the DeepMind researchers further enhanced their program, and then, on 23 May 2017, it defeated Ke Jie, a 19-year-old Chinese Go master thought to be the world’s best human Go player.

In developing the program that defeated Lee and Ke, DeepMind researchers fed their program 100,000 top amateur games and “taught” it to imitate what it observed. Then they had the program play itself and learn from the results, slowly increasing its skill.

In the latest development, the new program, called AlphaGo Zero, bypassed the first step. The DeepMind researchers merely programmed the rules of Go, with a simple reward function that rewarded games won, and had it play games against itself. Initially, the program merely scattered pieces seemingly at random across the board. But it quickly got better at evaluating board positions, and substantially increased its level of skill.

Interestingly, along the way the program rediscovered many basic elements of Go strategies used by human players, including anticipating its opponent’s probable next moves. But unshackled from the experience of humans, it then developed new complex strategies never before seen in human Go games.

After just three days of training and 4.9 million training games (with the program playing against itself), the AlphaGo Zero program had advanced to the point that it defeated the earlier version of the program 100 games to zero.
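
To make the self-play idea concrete, here is a minimal sketch in Python. It is not DeepMind's method: AlphaGo Zero combines a deep neural network with tree search, whereas this toy uses a plain lookup table. The game is also a stand-in chosen for brevity (a single pile of stones, each player removes 1 to 3, and whoever takes the last stone wins), and all function names and parameters are hypothetical. What it shares with the description above is that the program is given only the rules and a reward for winning, and improves by playing against itself.

```python
import random


def legal_moves(pile):
    """Rules of the toy game: remove 1, 2 or 3 stones, never more than remain."""
    return [m for m in (1, 2, 3) if m <= pile]


def train_self_play(episodes=20000, max_pile=10, epsilon=0.2, seed=0):
    """Learn move values purely from self-play (illustrative assumption,
    not AlphaGo Zero's algorithm). Returns {(pile, move): average reward}."""
    rng = random.Random(seed)
    totals = {}   # summed rewards per (pile, move)
    visits = {}   # number of times each (pile, move) was played

    def value(pile, move):
        n = visits.get((pile, move), 0)
        return totals[(pile, move)] / n if n else 0.0

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)   # random starting position
        history = []                      # moves of both sides, in order
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < epsilon:    # occasionally try a random move
                move = rng.choice(moves)
            else:                         # otherwise play the best known move
                move = max(moves, key=lambda m: value(pile, m))
            history.append((pile, move))
            pile -= move
        # The player who made the last move won: reward +1 for the winner's
        # moves and -1 for the loser's, alternating backwards through the game.
        reward = 1.0
        for key in reversed(history):
            totals[key] = totals.get(key, 0.0) + reward
            visits[key] = visits.get(key, 0) + 1
            reward = -reward
    return {key: totals[key] / visits[key] for key in visits}


def best_move(q, pile):
    """Pick the move with the highest learned value for this pile size."""
    return max(legal_moves(pile), key=lambda m: q.get((pile, m), 0.0))
```

Calling `q = train_self_play()` and then `best_move(q, pile)` for various pile sizes shows which moves the table has come to prefer. Early games are essentially random, much like the scattered stones described above, and the learned values sharpen as more games are played.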

Skill at Go (and several other games) is quantified by a player’s Elo rating, which is based on the record of that player’s past games. Lee’s rating is 3526, while Ke’s rating is 3661. After 40 days of training, AlphaGo Zero’s Elo rating was over 5000. Thus AlphaGo Zero was as far ahead of Ke as Ke is ahead of a good amateur player.
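
To give a feel for what such a rating gap means, the sketch below applies the conventional Elo expected-score formula, in which a 400-point difference corresponds to 10-to-1 odds. This is an assumption: the article does not say which Elo variant produced the ratings quoted above, and "over 5000" is only a lower bound, so the result is illustrative rather than a reported figure. The function name is hypothetical.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) of player A against player B
    under the conventional Elo formula with a 400-point scale (assumed here)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings taken from the text; 5000 is a lower bound for AlphaGo Zero.
zero_vs_ke = elo_expected_score(5000, 3661)
ke_vs_lee = elo_expected_score(3661, 3526)
print(f"AlphaGo Zero vs. Ke: {zero_vs_ke:.4f}")
print(f"Ke vs. Lee: {ke_vs_lee:.4f}")
```

Under this formula, two equally rated players each have an expected score of 0.5, while a gap of more than 1300 points puts the stronger player's expected score very close to 1.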

Additional details are available in a Scientific American article and in a Nature article.

What will the future hold?

So where is all this heading? A recent Time article features an interview with futurist Ray Kurzweil, who predicts an era, roughly in 2045, when machine intelligence will meet, then transcend, human intelligence. Such future intelligent systems will then design even more powerful technology, resulting in a dizzying advance that we can only dimly foresee at the present time. Kurzweil outlines this vision in his book The Singularity Is Near.

Futurists such as Kurzweil certainly have their skeptics and detractors. Sun Microsystems co-founder Bill Joy is concerned that humans could be relegated to minor players in the future, and that out-of-control, nanotech-produced “grey goo” could destroy life on our fragile planet. But even setting aside such fears, there is considerable concern about the societal, legal, financial and ethical challenges of such technologies, as exhibited by the increasingly strident social backlash against technology, science and “elites” that we see today.

As a single example, the technology of self-driving vehicles, long thought to be the stuff of science fiction, has dramatically advanced in the past two years. Prototype vehicles are already plying the streets of major U.S. and European cities. Some observers now estimate that self-driving vehicles could replace 1.7 million truckers in the next decade. Even drivers of delivery vehicles could see their jobs replaced, for example by Amazon drones.

One implication of all this is that education programs in engineering, finance, medicine, law and other fields will need to change dramatically to train students in the use of emerging technology. Even the educational system itself will need to change, as evidenced by the rise of massive open online courses (MOOCs). Along this line, the big tech firms are aggressively luring top AI talent, including university faculty, with huge salaries. But clearly the field cannot eat its seed corn in this way; some solution is needed to permit faculty to continue teaching while still participating in commercial R&D work.

But one way or the other, intelligent computers are coming. Society must find a way to accommodate this technology, and to deal respectfully with the many people whose lives will be affected.

Welcome to the Brave New World!
