AlphaGo versus Lee Sedol


AlphaGo versus Lee Sedol, also known as the Google DeepMind Challenge Match, was a five-game Go match between 18-time world champion Lee Sedol and AlphaGo, a computer Go program developed by Google DeepMind, played in Seoul, South Korea between the 9th and 15th of March 2016. AlphaGo won all but the fourth game; all games were won by resignation. The match has been compared with the historic chess match between Deep Blue and Garry Kasparov in 1997.
The winner of the match was slated to win $1 million. Since AlphaGo won, Google DeepMind stated that the prize would be donated to charities, including UNICEF, and to Go organisations. Lee received $170,000.
After the match, the Korea Baduk Association awarded AlphaGo the highest Go grandmaster rank, an "honorary 9 dan", in recognition of AlphaGo's "sincere efforts" to master Go. The match was chosen by Science as one of the Breakthrough of the Year runners-up on 22 December 2016.

Background

Difficult challenge in artificial intelligence

Go is a complex board game that requires intuition as well as creative and strategic thinking. It has long been considered a difficult challenge in the field of artificial intelligence and is considerably more difficult to solve than chess. Many in the field of artificial intelligence consider Go to require more elements that mimic human thought than chess. Mathematician I. J. Good wrote in 1965:
Prior to 2015, the best Go programs only managed to reach amateur dan level. On the small 9×9 board, the computer fared better, and some programs managed to win a fraction of their 9×9 games against professional players. Prior to AlphaGo, some researchers had claimed that computers would never defeat top humans at Go. Elon Musk, an early investor in DeepMind, said in 2016 that experts in the field had thought AI was 10 years away from achieving a victory against a top professional Go player.
The match between AlphaGo and Lee Sedol is comparable to the 1997 chess match between Deep Blue and Garry Kasparov. The defeat of reigning champion Kasparov by IBM's Deep Blue computer in that match is seen as the symbolic point at which computers became better than humans at chess.
AlphaGo differs most significantly from previous AI efforts in that it applies neural networks, in which evaluation heuristics are not hard-coded by human beings but are instead to a large extent learned by the program itself, from tens of millions of past Go matches as well as from its own matches against itself. Not even AlphaGo's developer team are able to point out how AlphaGo evaluates the game position and picks its next move. These networks guide a Monte Carlo tree search which explores many moves into the future.
Related research results are being applied to fields such as cognitive science, pattern recognition and machine learning.

Match against Fan Hui

AlphaGo defeated European champion Fan Hui, a 2 dan professional, 5–0 in October 2015, the first time an AI had beaten a human professional player at the game on a full-sized board without a handicap. Some commentators stressed the gulf between Fan and Lee, who is ranked 9 dan professional. Computer programs Zen and Crazy Stone had previously defeated human players ranked 9 dan professional with handicaps of four or five stones. Canadian AI specialist Jonathan Schaeffer, commenting after the win against Fan, compared AlphaGo with a "child prodigy" that lacked experience, and considered, "the real achievement will be when the program plays a player in the true top echelon." At that time he believed that Lee would win the match in March 2016. Hajin Lee, a professional Go player and the International Go Federation's secretary-general, commented that she was "very excited" at the prospect of an AI challenging Lee, and thought the two players had an equal chance of winning.
In the aftermath of his match against AlphaGo, Fan Hui noted that the game had taught him to be a better player, and to see things he had not previously seen. By March 2016, Wired reported that his ranking had risen from 633 in the world to around 300.

Preparation

Go experts found errors in AlphaGo's play against Fan, in particular relating to a lack of awareness of the entire board. Before the game against Lee, it was unknown how much the program had improved its game since its October match. AlphaGo's original training dataset started with games of strong amateur players from internet Go servers, after which AlphaGo trained by playing against itself for tens of millions of games.

Players

AlphaGo

AlphaGo is a computer program developed by Google DeepMind to play the board game Go. AlphaGo's algorithm uses a combination of machine learning and tree search techniques, together with extensive training on both human and computer play. The system's neural networks were initially bootstrapped from human game-play expertise. AlphaGo was initially trained to mimic human play by attempting to match the moves of expert players from recorded historical games, using a KGS Go Server database of around 30 million moves from 160,000 games by KGS 6 to 9 dan human players. Once it had reached a certain degree of proficiency, it was trained further by being set to play large numbers of games against other instances of itself, using reinforcement learning to improve its play. The system does not use a "database" of moves to play. As one of the creators of AlphaGo explained:
In the match against Lee, AlphaGo used about the same computing power as it had in the match against Fan Hui, where it used 1,202 CPUs and 176 GPUs. The Economist reported that it used 1,920 CPUs and 280 GPUs. Google has also stated that its proprietary tensor processing units were used in the match against Lee Sedol.

Lee Sedol

Lee Sedol is a professional Go player of 9 dan rank and is one of the strongest players in the history of Go. He began his career in 1996 and has since won 18 world championships. He is a "national hero" in his native South Korea, known for his unconventional and creative play. Lee Sedol initially predicted he would defeat AlphaGo in a "landslide". Some weeks before the match he won the Korean Myungin title, a major championship.

Games

The match consisted of five games, with one million US dollars as the grand prize, and was played under Chinese rules with a 7.5-point komi. In each game, each player had a main time limit of two hours, followed by three 60-second byo-yomi overtime periods. Each game started at 13:00 KST.
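The time control described above can be illustrated with a minimal sketch. The class name ByoYomiClock and its method are hypothetical, and the sketch assumes the common byo-yomi convention that a move completed within one period costs nothing, while each full period consumed is lost permanently; the exact clock rules used at the match are not stated here.

```python
from dataclasses import dataclass


@dataclass
class ByoYomiClock:
    """Illustrative clock for one player (names and conventions are assumptions)."""

    main_time: float = 2 * 60 * 60   # two hours of main time, in seconds
    periods: int = 3                 # three byo-yomi periods
    period_length: float = 60.0      # each period lasts 60 seconds

    def spend(self, seconds: float) -> bool:
        """Record the thinking time for one move.

        Returns False if the player has run out of periods (a loss on time).
        """
        # Main time is consumed first.
        if self.main_time > 0:
            used = min(seconds, self.main_time)
            self.main_time -= used
            seconds -= used
        # Any remaining time falls into byo-yomi: every full period
        # consumed by this move is lost; a partial period is not.
        self.periods -= int(seconds // self.period_length)
        return self.periods > 0


clock = ByoYomiClock()
clock.spend(7190)   # 10 seconds of main time remain
clock.spend(40)     # main time runs out; 30 seconds into a period, none lost
clock.spend(61)     # exceeds one period: two periods remain
```

Under these assumptions a player in byo-yomi can play indefinitely as long as every move takes less than 60 seconds.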
The match was played at the Four Seasons Hotel in Seoul, South Korea in March 2016 and was video-streamed live with commentary by Michael Redmond and Chris Garlock. Aja Huang, a DeepMind team member and amateur 6-dan Go player, placed stones on the Go board for AlphaGo, which ran through the Google Cloud Platform with its server located in the United States.

Summary

Game 1

AlphaGo won the first game. Lee appeared to be in control throughout much of the game, but AlphaGo gained the advantage in the final 20 minutes and Lee resigned. Lee stated afterwards that he had made a critical error at the beginning of the game; he said that the computer's strategy in the early part of the game was "excellent" and that the AI had made one unusual move that no human Go player would have made. David Ormerod, commenting on the game at Go Game Guru, described Lee's seventh stone as "a strange move to test AlphaGo's strength in the opening", characterising the move as a mistake and AlphaGo's response as "accurate and efficient". He described AlphaGo's position as favourable in the first part of the game, considering that Lee started to come back with move 81, before making "questionable" moves at 119 and 123, followed by a "losing" move at 129. Professional Go player Cho Hanseung commented that AlphaGo's game had greatly improved from when it beat Fan Hui in October 2015. Michael Redmond described the computer's game as being more aggressive than against Fan.
According to 9-dan Go grandmaster Kim Seong-ryong, Lee seemed stunned by AlphaGo's strong play on the 102nd stone. After watching AlphaGo make the game's 102nd move, Lee mulled over his options for more than 10 minutes.

Game 2

AlphaGo won the second game. Lee stated afterwards that "AlphaGo played a nearly perfect game", "from very beginning of the game I did not feel like there was a point that I was leading". One of the creators of AlphaGo, Demis Hassabis, said that the system was confident of victory from the midway point of the game, even though the professional commentators could not tell which player was ahead.
Michael Redmond noted that AlphaGo's 19th stone was "creative" and "unique". Lee took an unusually long time to respond to the move. An Younggil called AlphaGo's move 37 "a rare and intriguing shoulder hit" but said Lee's counter was "exquisite". He stated that control passed between the players several times before the endgame, and especially praised AlphaGo's moves 151, 157, and 159, calling them "brilliant".
AlphaGo showed anomalies and moves from a broader perspective, which professional Go players described as looking like mistakes at first sight but like an intentional strategy in hindsight. As one of the creators of the system explained, AlphaGo does not attempt to maximize its points or its margin of victory, but tries to maximize its probability of winning. If AlphaGo must choose between a scenario where it will win by 20 points with 80 percent probability and another where it will win by one and a half points with 99 percent probability, it will choose the latter, even if it must give up points to achieve it. In particular, move 167 by AlphaGo seemed to give Lee a fighting chance and was declared by commentators to look like an obvious mistake. An Younggil stated "So when AlphaGo plays a slack looking move, we may regard it as a mistake, but perhaps it should more accurately be viewed as a declaration of victory?"
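The difference between the two objectives can be shown with a small sketch using the numbers from the example above. The Scenario type and both selection functions are hypothetical illustrations of the decision rule, not AlphaGo's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    """A candidate line of play (hypothetical structure for illustration)."""

    label: str
    win_probability: float   # estimated chance of winning the game
    margin_if_won: float     # points by which the game would be won


def pick_by_win_probability(options: List[Scenario]) -> Scenario:
    # The rule described in the text: prefer the most likely win,
    # regardless of how large the winning margin is.
    return max(options, key=lambda s: s.win_probability)


def pick_by_margin(options: List[Scenario]) -> Scenario:
    # A margin-maximizing rule, shown only for contrast.
    return max(options, key=lambda s: s.margin_if_won)


options = [
    Scenario("big win, riskier", win_probability=0.80, margin_if_won=20.0),
    Scenario("narrow win, safer", win_probability=0.99, margin_if_won=1.5),
]

chosen = pick_by_win_probability(options)   # the narrow but safer win
contrast = pick_by_margin(options)          # the larger but riskier win
```

A player following the first rule will readily give up points when doing so reduces risk, which is consistent with moves that look slack to human observers.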

Game 3

AlphaGo won the third game.
After the second game, there had still been strong doubts among players whether AlphaGo was truly a strong player in the sense that a human might be. The third game was described as removing that doubt, with analysts commenting that:
According to An Younggil and David Ormerod, the game showed that "AlphaGo is simply stronger than any known human Go player." AlphaGo was seen to capably navigate tricky situations known as ko that did not come up in the previous two games. An and Ormerod consider move 148 to be particularly notable: in the middle of a complex ko fight, AlphaGo displayed sufficient "confidence" that it was winning the fight to play a large move elsewhere.
Lee, playing black, opened with a High Chinese formation and generated a large area of black influence, which AlphaGo invaded at move 12. This required the program to defend a weak group, which it did successfully. An Younggil described Lee's move 31 as possibly the "losing move" and Andy Jackson of the American Go Association considered that the outcome had already been decided by move 35. AlphaGo had gained control of the game by move 48, and forced Lee onto the defensive. Lee counterattacked at moves 77/79, but AlphaGo's response was effective and its move 90 succeeded in simplifying the position. It then gained a large area of control at the bottom of the board, strengthening its position with moves from 102 to 112 described by An as "sophisticated". Lee attacked again at moves 115 and 125, but AlphaGo's responses were again effective. Lee eventually attempted a complex ko from move 131, without forcing an error from the program, and he resigned at move 176.

Game 4

Lee won the fourth game. Lee chose to play an extreme strategy known as amashi, in response to AlphaGo's apparent preference for Souba Go, taking territory at the perimeter rather than the center. By doing so, his apparent aim was to force an "all or nothing" situation, a possible weakness for an opponent strong at negotiation-type play, and one which might make AlphaGo's ability to judge slim advantages largely irrelevant.
The first 11 moves were identical to the second game, where Lee also played white. In the early game, Lee concentrated on taking territory in the edges and corners of the board, allowing AlphaGo to gain influence in the top and centre. Lee then invaded AlphaGo's region of influence at the top with moves 40 to 48, following the amashi strategy. AlphaGo responded with a shoulder hit at move 47, subsequently sacrificing four stones elsewhere, and gaining the initiative with moves 47 to 53 and 69. Lee tested AlphaGo with moves 72 to 76 without provoking an error, and by this point in the game commentators had begun to feel Lee's play was a lost cause. However, an unexpected play at white 78, described as "a brilliant tesuji", turned the game around. The move developed a white wedge at the centre, and increased the game's complexity. Gu Li described it as a "divine move" and stated that the move had been completely unforeseen by him.
AlphaGo responded poorly on move 79, at which time it estimated it had a 70% chance to win the game. Lee followed up with a strong move at white 82. AlphaGo's initial response in moves 83 to 85 was appropriate, but at move 87, its estimate of its chances to win suddenly plummeted, provoking it to make a series of very bad moves from black 87 to 101. David Ormerod characterised moves 87 to 101 as typical of Monte Carlo-based program mistakes. Lee took the lead by white 92, and An Younggil described black 105 as the final losing move. Despite good tactics during moves 131 to 141, AlphaGo proved unable to recover during the endgame and resigned. AlphaGo's resignation was triggered when it evaluated its chance of winning to be less than 20%; this is intended to match the decision of professionals who resign rather than play to the end when their position is felt to be irrecoverable.
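The resignation rule described above amounts to a simple threshold test. The following is a minimal sketch; the function name and the way the estimate is supplied are assumptions, and only the 20% threshold comes from the account above.

```python
RESIGN_THRESHOLD = 0.20  # resign when the estimated win chance is below 20%


def should_resign(estimated_win_probability: float,
                  threshold: float = RESIGN_THRESHOLD) -> bool:
    """Return True when the position is judged irrecoverable.

    The estimate would come from the program's own evaluation; here it is
    simply passed in as a number between 0 and 1.
    """
    return estimated_win_probability < threshold


# At move 79 the program's estimate was reported as about 70%,
# well above the threshold, so it played on.
plays_on = not should_resign(0.70)
```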
An Younggil at Go Game Guru concluded that the game was "a masterpiece for Lee Sedol and will almost certainly become a famous game in the history of Go". Lee commented after the match that he considered AlphaGo was strongest when playing white. For this reason, he requested that he play black in the fifth game, which is considered more risky.
David Ormerod of Go Game Guru stated that although an analysis of AlphaGo's play around 79–87 was not yet available, he believed it was a result of a known weakness in play algorithms which use Monte Carlo tree search. In essence, the search attempts to prune sequences which are less relevant. In some cases, a play can lead to a very specific line of play which is significant, but which is overlooked when the tree is pruned, and this outcome is therefore "off the search radar".
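The weakness described above can be illustrated with a toy example. This is not AlphaGo's search: it is a simplified sketch in which a search considers only the opponent replies it rates as most likely, so a rare but decisive reply is never examined. All names, priors and values below are hypothetical.

```python
from typing import List, NamedTuple


class Reply(NamedTuple):
    """One possible opponent reply (all numbers are made up for illustration)."""

    label: str
    prior: float              # how likely the search thinks this reply is
    value_for_searcher: float  # searcher's winning chance if it is played


def value_after_pruning(replies: List[Reply], keep: int) -> float:
    """Evaluate a position while examining only the `keep` likeliest replies.

    The opponent is assumed to pick the reply that is worst for the searcher
    among those that were actually examined.
    """
    considered = sorted(replies, key=lambda r: r.prior, reverse=True)[:keep]
    return min(r.value_for_searcher for r in considered)


replies = [
    Reply("ordinary reply A", prior=0.50, value_for_searcher=0.70),
    Reply("ordinary reply B", prior=0.30, value_for_searcher=0.75),
    Reply("ordinary reply C", prior=0.19, value_for_searcher=0.72),
    Reply("unexpected wedge", prior=0.01, value_for_searcher=0.20),
]

pruned_estimate = value_after_pruning(replies, keep=3)           # wedge is never examined
full_estimate = value_after_pruning(replies, keep=len(replies))  # wedge is found
```

With pruning, the position looks comfortable because the decisive reply is "off the search radar"; examining every reply reveals the much lower true value.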

Game 5

AlphaGo won the fifth game. The game was described as being close. Hassabis stated that the result came after the program made a "bad mistake" early in the game.
Lee, playing black, opened in a similar fashion to the first game and then began to stake out territory in the right and top left corners – a similar strategy to the one he employed successfully in game 4 – while AlphaGo gained influence in the centre of the board. The game remained even until white moves 48 to 58, which AlphaGo played in the bottom right. These moves unnecessarily lost ko threats and aji, allowing Lee to take the lead. Michael Redmond speculated that perhaps AlphaGo had missed black's "tombstone squeeze" tesuji. Humans are taught to recognize the specific pattern, but it is a long sequence of moves if it has to be computed from scratch.
AlphaGo then started to develop the top of the board as well as the centre, and defended successfully against an attack by Lee in moves 69 to 81 that David Ormerod characterised as over-cautious. By white 90, AlphaGo had regained equality, and then played a series of moves described by Ormerod as "unusual... but subtly impressive" which gained a small advantage. Lee tried a Hail Mary pass with moves 167 and 169 but AlphaGo's defence was successful. An Younggil noted white moves 154, 186 and 194 as being particularly strong, and the program played an impeccable endgame, maintaining its lead until Lee resigned.

Coverage

Live video of the games and associated commentary was broadcast in Korean, Chinese, Japanese, and English. Korean-language coverage was made available through Baduk TV. Chinese-language coverage of game 1, with commentary by 9-dan players Gu Li and Ke Jie, was provided by Tencent and LeTV respectively, reaching about 60 million viewers. Online English-language coverage presented by US 9-dan Michael Redmond and Chris Garlock, a vice-president of the American Go Association, reached an average of 80 thousand viewers, with a peak of 100 thousand viewers near the end of game 1.

Responses

AI community

AlphaGo's victory was a major milestone in artificial intelligence research. Go had previously been regarded as a hard problem in machine learning that was expected to be out of reach for the technology of the time. Most experts thought a Go program as powerful as AlphaGo was at least five years away; some experts thought that it would take at least another decade before computers would beat Go champions. Most observers at the beginning of the 2016 match expected Lee to beat AlphaGo.
With games such as checkers, chess, and now Go won by computer players, victories at popular board games can no longer serve as major milestones for artificial intelligence in the way that they used to. Deep Blue's Murray Campbell called AlphaGo's victory "the end of an era... board games are more or less done and it's time to move on."
When compared with Deep Blue or with Watson, AlphaGo's underlying algorithms are potentially more general-purpose, and may be evidence that the scientific community is making progress toward artificial general intelligence. Some commentators believe AlphaGo's victory makes for a good opportunity for society to start discussing preparations for the possible future impact of machines with general-purpose intelligence. In March 2016, AI researcher Stuart Russell stated that "AI methods are progressing much faster than expected, [which] makes the question of the long-term outcome more urgent," adding that "in order to ensure that increasingly powerful AI systems remain completely under human control... there is a lot of work to do." Some scholars, such as physicist Stephen Hawking, warn that some future self-improving AI could gain actual general intelligence, leading to an unexpected AI takeover; other scholars disagree: AI expert Jean-Gabriel Ganascia believes that "Things like 'common sense'... may never be reproducible", and says "I don't see why we would speak about fears. On the contrary, this raises hopes in many domains such as health and space exploration." Richard Sutton said "I don't think people should be scared... but I do think people should be paying attention."
The DeepMind AlphaGo team received the inaugural IJCAI Marvin Minsky Medal for Outstanding Achievements in AI. “AlphaGo is a wonderful achievement, and a perfect example of what the Minsky Medal was initiated to recognise”, said Professor Michael Wooldridge, Chair of the IJCAI Awards Committee. “What particularly impressed IJCAI was that AlphaGo achieves what it does through a brilliant combination of classic AI techniques as well as the state-of-the-art machine learning techniques that DeepMind is so closely associated with. It’s a breathtaking demonstration of contemporary AI, and we are delighted to be able to recognise it with this award.”

Go community

Go is a popular game in South Korea, China and Japan, and this match was watched and analyzed by millions of people worldwide. Many top Go players characterized AlphaGo's unorthodox plays as seemingly questionable moves that initially befuddled onlookers, but made sense in hindsight: "All but the very best Go players craft their style by imitating top players. AlphaGo seems to have totally original moves it creates itself." AlphaGo appeared to have unexpectedly become much stronger, even when compared with its October 2015 match against Fan Hui, in which a computer had beaten a Go professional for the first time ever without the advantage of a handicap.
China's number one player, Ke Jie, who was at the time the top-ranked player worldwide, initially claimed that he would be able to beat AlphaGo, but declined to play against it for fear that it would "copy my style". As the match progressed, Ke Jie went back and forth, stating that "it is highly likely that I lose" after analyzing the first three games, but regaining confidence after the fourth game.
Toby Manning, the referee of AlphaGo's match against Fan Hui, and Hajin Lee, secretary general of the International Go Federation, both reason that in the future, Go players will get help from computers to learn what they have done wrong in games and improve their skills.
Lee apologized for his losses, stating after game three that "I misjudged the capabilities of AlphaGo and felt powerless." He emphasized that the defeat was "Lee Se-dol's defeat" and "not a defeat of mankind". Lee said his eventual loss to a machine was "inevitable" but stated that "robots will never understand the beauty of the game the same way that we humans do." Lee called his game four victory a "priceless win that I [would] not exchange for anything."

Government

In response to the match the South Korean government announced on 17 March 2016 that it would invest $863 million in artificial-intelligence research over the next five years.

Documentary film

A documentary film about the match, called AlphaGo, was made. On March 13, 2020, the film was made freely available online on the DeepMind YouTube channel.

Official match commentary

Official match commentary by Michael Redmond and Chris Garlock on Google DeepMind's YouTube channel: