AlphaGo


AlphaGo is a computer program developed by Google DeepMind in London to play the board game Go.^[1] In October 2015, it became the first computer Go program to beat a professional human Go player without handicaps on a full-sized 19×19 board.^[2]^[3] In March 2016, it beat Lee Sedol in a five-game match, the first time a computer Go program had beaten a 9-dan professional without handicaps.^[4] AlphaGo lost the fourth game, but Lee resigned the final game, giving a final score of 4 games to 1 in favour of AlphaGo. In recognition of the victory, the Korea Baduk Association awarded AlphaGo an honorary 9-dan title.

AlphaGo's algorithm uses Monte Carlo tree search to choose its moves, guided by knowledge previously "learned" through machine learning: artificial neural networks (a deep learning method) trained extensively on both human and computer play.

History and competitions

Go is considered much more difficult for computers to win than other games such as chess, because its much larger branching factor makes it prohibitively difficult to use traditional AI methods such as alpha–beta pruning, tree traversal and heuristic search.^[2]^[5]

Almost two decades after IBM's computer Deep Blue beat world chess champion Garry Kasparov in their 1997 match, the strongest Go programs using artificial intelligence techniques had only reached about amateur 5-dan level,^[6] and still could not beat a professional Go player without handicaps.^[2]^[3]^[7] In 2012, the software program Zen, running on a four-PC cluster, beat Masaki Takemiya (9p) twice, at five-stone and four-stone handicaps.^[8] In 2013, Crazy Stone beat Yoshio Ishida (9p) at a four-stone handicap.^[9]

According to AlphaGo's David Silver, the AlphaGo research project was formed around 2014 to test how well a neural network using deep learning could compete at Go.^[10] AlphaGo represents a significant improvement over previous Go programs. In 500 games against other available Go programs, including Crazy Stone and Zen,^[11] AlphaGo running on a single computer won all but one.^[12] In a similar matchup, AlphaGo running on multiple computers won all 500 games played against other Go programs, and 77% of games played against AlphaGo running on a single computer. In October 2015, the distributed version used 1,202 CPUs and 176 GPUs.^[6]

Match against Fan Hui

In October 2015, the distributed version of AlphaGo defeated the European Go champion Fan Hui,^[13] a 2-dan (out of 9 dan possible) professional, five to zero.^[3]^[14] This was the first time a computer Go program had beaten a professional human player on a full-sized board without handicap.^[15] The announcement of the news was delayed until 27 January 2016 to coincide with the publication of a paper in the journal Nature^[6] describing the algorithms used.^[3]

Match against Lee Sedol

AlphaGo played South Korean professional Go player Lee Sedol, ranked 9-dan and one of the best players at Go,^[7]^{[dated info]} in a five-game match at the Four Seasons Hotel in Seoul, South Korea, on 9, 10, 12, 13, and 15 March 2016.^[16]^[17] The games were video-streamed live.^[18] Aja Huang, a DeepMind team member and amateur 6-dan Go player, placed stones on the Go board for AlphaGo, which ran on Google's cloud computing servers located in the United States.^[19] The match used Chinese rules with a 7.5-point komi, and each side had two hours of thinking time plus three 60-second byoyomi periods.^[20] The version of AlphaGo playing against Lee used a similar amount of computing power to that used in the Fan Hui match.^[21]

At the time of play, Lee Sedol had the second-highest number of Go international championship victories in the world.^[22] While there is no single official method of ranking in international Go, some sources ranked Lee Sedol as the fourth-best player in the world at the time.^[23]^[24] AlphaGo was not specifically trained to face Lee.^[25]

AlphaGo won the first three games following resignations by Lee Sedol.^[26]^[27] Lee Sedol then beat AlphaGo in the fourth game, winning by resignation at move 180. AlphaGo went on to achieve a fourth win, taking the fifth game by resignation.^[28]

The prize was US$1 million. Since AlphaGo won four out of five games and thus the series, the prize will be donated to charities, including UNICEF.^[29] Lee Sedol received $150,000 for participating in all five games and an additional $20,000 for his win.^[20]

Hardware

An early version of AlphaGo was tested on hardware with various numbers of CPUs and GPUs, running in asynchronous or distributed mode. Each move was given two seconds of thinking time. The resulting Elo ratings are listed below.^[6] Higher ratings are achieved in matches with more time per move.

Configuration and performance^[6] (pp. 10-11)
Configuration	Search threads	No. of CPUs	No. of GPUs	Elo rating
Single	40	48	1	2,151
Single	40	48	2	2,738
Single	40	48	4	2,850
Single	40	48	8	2,890
Distributed	12	428	64	2,937
Distributed	24	764	112	3,079
Distributed	40	1,202	176	3,140
Distributed	64	1,920	280	3,168
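
To give a feel for what the rating gaps in the table mean, the following is a minimal sketch that converts an Elo difference into an expected score, and a win rate back into an implied Elo gap. It assumes the standard logistic Elo model with a 400-point scale; the ratings in the source may have been computed on a different scale, so the numbers are illustrative only. The function names `expected_score` and `rating_gap` are hypothetical.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_gap(win_rate: float) -> float:
    """Elo gap implied by a win rate strictly between 0 and 1 (inverse of expected_score)."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# Ratings copied from the table above (assumed to be on the 400-point scale).
single_8_gpu = 2890
distributed_176_gpu = 3140

p = expected_score(distributed_176_gpu, single_8_gpu)
print(f"expected score of distributed vs. single: {p:.3f}")
print(f"gap implied by a 77% win rate: {rating_gap(0.77):.0f} Elo")
```

Under this assumption, a gap of a few hundred points already corresponds to a large majority of wins, which is consistent in spirit with the distributed version winning most of its games against the single-machine version.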

Algorithm

As of 2016, AlphaGo's algorithm uses a combination of machine learning and tree search techniques, together with extensive training on both human and computer play. It uses Monte Carlo tree search, guided by a "value network" and a "policy network", both implemented using deep neural network technology.^[2]^[6] A limited amount of game-specific feature detection pre-processing (for example, to highlight whether a move matches a nakade pattern) is applied to the input before it is sent to the neural networks.^[6]
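
How a policy prior and value estimates can jointly steer a tree search is sketched below. This is not AlphaGo's implementation: it is a minimal illustration of one selection step in which each candidate move is scored by its mean backed-up value plus an exploration bonus weighted by a prior probability. The `Edge` class, the `select_move` and `backup` helpers, the constant `C_PUCT`, and the exact form of the bonus are all assumptions made for this example; in a real system the priors would come from a policy network and the backed-up values from a value network or rollouts.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    """Search statistics for one candidate move from a position."""
    prior: float            # probability from a (hypothetical) policy network
    visits: int = 0         # number of simulations that chose this move
    value_sum: float = 0.0  # sum of evaluations backed up through this move

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


C_PUCT = 1.5  # exploration constant; an arbitrary illustrative choice


def select_move(edges: Dict[str, Edge]) -> str:
    """Pick the move with the best mean value plus prior-weighted bonus."""
    total_visits = sum(e.visits for e in edges.values())

    def score(e: Edge) -> float:
        # The bonus shrinks as a move is visited more often, so moves with
        # high priors are tried first but poor results steer the search away.
        bonus = C_PUCT * e.prior * math.sqrt(total_visits + 1) / (1 + e.visits)
        return e.mean_value() + bonus

    return max(edges, key=lambda move: score(edges[move]))


def backup(edge: Edge, value: float) -> None:
    """Record one simulation result (value in [-1, 1]) on an edge."""
    edge.visits += 1
    edge.value_sum += value


# Toy position with three candidate moves and made-up priors.
edges = {"A": Edge(prior=0.6), "B": Edge(prior=0.3), "C": Edge(prior=0.1)}
first = select_move(edges)   # the highest prior is explored first
backup(edges[first], -1.0)   # pretend the evaluation came back as a loss
second = select_move(edges)  # the search now prefers a different move
print(first, second)
```

The design point the sketch illustrates is the division of labour: the prior narrows which moves are worth examining, while accumulated evaluations decide which of them the search keeps investing in.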

The system's neural networks were initially bootstrapped from human gameplay expertise. AlphaGo was first trained to mimic human play by attempting to match the moves of expert players from recorded historical games, using a database of around 30 million moves.^[13] Once it had reached a certain degree of proficiency, it was trained further by playing large numbers of games against other instances of itself, using reinforcement learning to improve its play.^[2] To avoid "disrespectfully" wasting its opponent's time, the program is designed to resign if its assessment of win probability falls beneath a certain threshold; for the March 2016 match against Lee, the resignation threshold was set to 20%.^[30]
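
The resignation rule described above amounts to a simple threshold test. The sketch below assumes the program has a win-probability estimate between 0 and 1 available at each move; the function name `should_resign` and the constant name are hypothetical, and only the 20% figure comes from the text.

```python
RESIGN_THRESHOLD = 0.20  # value reported for the March 2016 match


def should_resign(win_probability: float,
                  threshold: float = RESIGN_THRESHOLD) -> bool:
    """Return True when the estimated win probability is below the threshold."""
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    return win_probability < threshold


print(should_resign(0.35))  # still worth playing on
print(should_resign(0.12))  # below the threshold, so resign
```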

Style of play

Toby Manning, the match referee for AlphaGo vs. Fan Hui, has described the program's style as "conservative".^[31] During AlphaGo's match against Lee Sedol, Korean commentators remarked that the AI's playing style greatly resembled that of the legendary player Lee Changho.^{[citation needed]} This similarity can be attributed to the fact that, like Lee Changho,^{[citation needed]} AlphaGo strongly favors a greater probability of winning by fewer points over a lesser probability of winning by more points.^[10]
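
The preference for win probability over margin can be made concrete with a toy comparison. In the sketch below, the two candidate moves and all of their numbers are invented for illustration, and the `Candidate` type and both selection functions are hypothetical; the point is only that the two objectives can disagree about which move is best.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    move: str
    win_probability: float  # estimated chance of winning after this move
    expected_margin: float  # estimated final score difference in points


def pick_by_win_probability(candidates: List[Candidate]) -> Candidate:
    """Prefer the move most likely to win, however small the margin."""
    return max(candidates, key=lambda c: c.win_probability)


def pick_by_margin(candidates: List[Candidate]) -> Candidate:
    """Prefer the move with the largest expected score difference."""
    return max(candidates, key=lambda c: c.expected_margin)


# Invented example: a solid move versus a riskier, higher-scoring one.
candidates = [
    Candidate("solid", win_probability=0.90, expected_margin=1.5),
    Candidate("greedy", win_probability=0.75, expected_margin=12.0),
]

print(pick_by_win_probability(candidates).move)
print(pick_by_margin(candidates).move)
```

A player optimising for win probability takes the solid move and accepts a narrow victory, which is the kind of behaviour that can look "conservative" to human observers.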

Responses to 2016 victory against Lee Sedol

AI community

AlphaGo's March 2016 victory was a major milestone in artificial intelligence research.^[32] Go had previously been regarded as a hard problem in machine learning that was expected to be out of reach for the technology of the time.^[32]^[33]^[34] Most experts thought a Go program as powerful as AlphaGo was at least five years away;^[35] some experts thought that it would take at least another decade before computers would beat Go champions.^[6]^[36]^[37] Most observers at the beginning of the 2016 matches expected Lee to beat AlphaGo.^[32]

With games such as checkers (which has been "solved" by the Chinook team), chess, and now Go won by computers, victories at popular board games can no longer serve as major milestones for artificial intelligence in the way that they used to. Deep Blue's Murray Campbell called AlphaGo's victory "the end of an era... board games are more or less done and it's time to move on."^[32]

When compared with Deep Blue or with Watson, AlphaGo's underlying algorithms are potentially more general-purpose, and may be evidence that the scientific community is making progress toward artificial general intelligence.^[10]^[38] Some commentators believe AlphaGo's victory is a good opportunity for society to start discussing preparations for the possible future impact of machines with general-purpose intelligence. Entrepreneur Guy Suter noted that AlphaGo itself only knows how to play Go and does not possess general-purpose intelligence: "[It] couldn't just wake up one morning and decide it wants to learn how to use firearms."^[32] In March 2016, AI researcher Stuart Russell stated that "AI methods are progressing much faster than expected, (which) makes the question of the long-term outcome more urgent," adding that "in order to ensure that increasingly powerful AI systems remain completely under human control... there is a lot of work to do."^[39] Some scholars, such as Stephen Hawking, warned (in May 2015, before the matches) that some future self-improving AI could gain actual general intelligence, leading to an unexpected AI takeover. Other scholars disagree: AI expert Jean-Gabriel Ganascia believes that "Things like 'common sense'... may never be reproducible",^[40] and says "I don't see why we would speak about fears. On the contrary, this raises hopes in many domains such as health and space exploration."^[39] Computer scientist Richard Sutton said, "I don't think people should be scared... but I do think people should be paying attention."^[41]

Go community

Go is a popular game in China, Japan and Korea, and the 2016 matches were watched by perhaps a hundred million people worldwide.^[32]^[42] Many top Go players characterized AlphaGo's unorthodox plays as seemingly questionable moves that initially befuddled onlookers but made sense in hindsight:^[36] "All but the very best Go players craft their style by imitating top players. AlphaGo seems to have totally original moves it creates itself."^[32] AlphaGo appeared to have unexpectedly become much stronger, even when compared with its October 2015 match,^[43] in which a computer had beaten a Go professional for the first time without the advantage of a handicap.^[44] The day after Lee's first defeat, Jeong Ahram, the lead Go correspondent for one of South Korea's biggest daily newspapers, said "Last night was very gloomy... Many people drank alcohol."^[45] The Korea Baduk Association, the organization that oversees Go professionals in South Korea, awarded AlphaGo an honorary 9-dan title for exhibiting creative skills and pushing forward the game's progress.^[46]

China's Ke Jie, an 18-year-old generally recognized as the world's best Go player,^[23]^[47] initially claimed that he would be able to beat AlphaGo, but declined to play against it for fear that it would "copy my style".^[47] As the match progressed, Ke Jie went back and forth, stating that "it is highly likely that I (could) lose" after analyzing the first three games,^[48] but regaining confidence after AlphaGo displayed flaws in the fourth game.^[49]

Toby Manning, the referee of AlphaGo's match against Fan Hui, and Hajin Lee, secretary general of the International Go Federation, both reason that in the future, Go players will get help from computers to learn what they have done wrong in games and improve their skills.^[44]

After game two, Lee said he felt "speechless": "From the very beginning of the match, I could never manage an upper hand for one single move. It was AlphaGo's total victory."^[50] Lee apologized for his losses, stating after game three that "I misjudged the capabilities of AlphaGo and felt powerless."^[32] He emphasized that the defeat was "Lee Se-dol's defeat" and "not a defeat of mankind".^[25]^[40] Lee said his eventual loss to a machine was "inevitable" but stated that "robots will never understand the beauty of the game the same way that we humans do."^[40] Lee called his game four victory a "priceless win that I (would) not exchange for anything."^[25]

Similar systems

Facebook has also been working on its own Go-playing system, darkforest, which is likewise based on combining machine learning and tree search.^[31]^[51] Although a strong player against other computer Go programs, as of early 2016 it had not yet defeated a professional human player.^[52] darkforest has lost to CrazyStone and Zen, and is estimated to be of similar strength to them.^[53]

On 1 March 2016, a "Deep Zen Go Project" was announced by the developers of the computer Go program Zen (Yoji Ojima and Hideki Kato), the telecommunications and media company Dwango, and a deep learning research team at Tokyo University (developers of Ponanza, a shogi AI that has beaten human professional players). The Japanese Go Association has also pledged its support. The project's goal is to beat AlphaGo within six months to one year.^[54]