
An interview with the "retired" ELF OpenGo: the excitement will continue

  • Published by Yike Weiqi in Board Games
  • 2022-02-28
Summary: Our aim has always been to look for new ideas and new discoveries from a research perspective, rather than to pursue a stronger AI on top of existing work; this is also the driving force behind open-sourcing ELF OpenGo.



1. What level has ELF reached now? Compared with AlphaGo Zero, which is stronger?

We believe ELF OpenGo has roughly reached the level of the 3-day version of AlphaGo Zero. Beyond that it is hard to say without a direct comparison.

2. Many new AIs have made great progress by training on ELF's weights and game records. As the source of this open data, what do you think about that?

This is exactly the purpose of open-sourcing: AI should benefit the community and the world. Moreover, for an algorithm as good as AlphaGo Zero, we wanted to provide a reliable and reproducible implementation, so that we can use it for other research and researchers around the world have a baseline to build on and improve.

3. Why did you choose a 20-block neural network rather than a deeper one? What was the main consideration behind this choice?

Our primary consideration was that the model should run well on a single GPU, so that our work is accessible to a wide audience; a deeper network demands more from the hardware. By starting with an inexpensive architecture, people with consumer hardware can take advantage of ELF. In the future we may also release deeper models.
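
As a rough illustration of why block count is the main cost knob, here is a minimal sketch of an AlphaGo Zero style residual tower in PyTorch; the channel count and layer details are illustrative assumptions, not the exact ELF OpenGo architecture.

```python
import torch.nn as nn

# Minimal sketch of an AlphaGo Zero style residual block; sizes are illustrative.
class ResBlock(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # residual connection

# A 20-block tower fits comfortably on a single GPU; doubling the block count
# roughly doubles both memory use and per-move inference time.
tower_20 = nn.Sequential(*[ResBlock() for _ in range(20)])
tower_40 = nn.Sequential(*[ResBlock() for _ in range(40)])
```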

4. It is said that during training ELF balances its input so that games won by Black and games won by White each make up about half. If this is true, what is the reasoning behind this "balance"?

This is by design: it prevents the model, in the early stage of training, from overfitting to a local solution in which one color always crushes the other. Such overfitting would keep the model from improving.
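
A minimal sketch of what such winner-balanced sampling could look like; the helper below is hypothetical and not ELF's actual data pipeline.

```python
import random

def balanced_batch(games, batch_size):
    """Sample a training batch with ~50% Black wins and ~50% White wins.

    games: list of self-play records, each with a 'winner' field of 'B' or 'W'.
    (Hypothetical helper for illustration only.)
    """
    black_wins = [g for g in games if g["winner"] == "B"]
    white_wins = [g for g in games if g["winner"] == "W"]
    half = batch_size // 2
    # Drawing half the batch from each pool keeps either colour from dominating
    # the gradient signal early in training.
    return random.sample(black_wins, half) + random.sample(white_wins, half)
```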

5. ELF's latest weights show that Black has a higher winning rate at the start of the game, which differs from other AIs' judgment. This surprised us; can you talk about the reason?

This is closely tied to #4; we think that balancing is the main reason.

6. After its retirement from competitive Go, will ELF's research results be applied in other areas?

ELF is a general reinforcement learning platform that implements a number of common baseline algorithms. We will do more research with ELF on other competitive and strategy games.

7. Compared with other AIs, ELF's win-rate estimate seems to swing more sharply. What is the reason for this?

This makes the AI more sensitive to the quality differences between moves, so it can pick out the better one. In theory, the closer a model is to perfect play, the larger the gap between different moves in the same position should be.
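
A toy numeric illustration of that point, with made-up win rates: the sharper the value estimates, the larger the swing between candidate moves in the same position.

```python
# Toy illustration with made-up numbers: a "sharper" (stronger) value function
# separates candidate moves more, so its reported win rate swings harder when
# play shifts from one candidate to another.
weak_model   = {"move_a": 0.52, "move_b": 0.48}   # near 50% for everything
strong_model = {"move_a": 0.95, "move_b": 0.10}   # closer to a win/loss verdict

for name, values in [("weak", weak_model), ("strong", strong_model)]:
    swing = max(values.values()) - min(values.values())
    print(f"{name} model: win-rate gap between candidate moves = {swing:.2f}")
```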

8. Today's AIs are far stronger than human players, yet they still make elementary mistakes such as misreading ladders. What is the technical difficulty here?

A ladder is a long, exact sequence of moves, each of which must be played precisely, and its outcome depends on details at the other end of the board. The Monte Carlo tree search used by today's AIs, however, is a randomized algorithm. Before the ladder appears, the AI considers moves over the whole board, so only a small fraction of rollouts enter the ladder branch. If the ladder situation is complicated, those rollouts may not read out a sequence long enough to detect it. The problem is more pronounced when the rollout count is low.
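
A back-of-the-envelope sketch of that effect, with illustrative numbers: it treats rollouts as independent samples from the policy prior, which real MCTS does not do (visits re-concentrate on promising branches), so it only shows why long exact sequences are hard to read with few rollouts.

```python
# If the policy prior gives each correct ladder-continuation move probability p,
# and the ladder is k moves long, a rollout sampled from the prior follows the
# whole ladder with probability about p**k.  Numbers below are illustrative.
p, k, rollouts = 0.3, 30, 1600
per_rollout = p ** k
print(f"P(one rollout reads the full ladder) ~ {per_rollout:.2e}")
print(f"Expected ladder-reading rollouts out of {rollouts}: {rollouts * per_rollout:.2e}")
```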

9. Three years ago, Facebook's DarkForest appeared on the scene but did not reach the top level at the time; now ELF has made a splash. Over these three years the team must have been through a lot. Please share the story of this hard development work with Yike's Go fans.

First of all, DarkForest and ELF OpenGo are not one continuous project; in terms of scientific novelty, the original DarkForest was certainly ahead of OpenGo. We started working on Go AI in May 2015. At that time nobody thought computer Go was a promising direction, and it was even something of a joke: in the 2015 human-versus-machine matches, Lian Xiao could still beat the top AI while giving six handicap stones, and the prevailing opinion was that it would take decades, if not a century, for AI to beat top humans. However, FAIR's open research culture allowed us to explore, and the DarkForest project grew out of that. It is worth stressing that DarkForest exceeded our expectations at the time: we published the paper in November 2015, gave some media interviews, and the AI was about as strong as Zen, which had taken a team ten years to build. It was only when AlphaGo appeared in January 2016 that DarkForest was overshadowed. Afterwards, because of limited computing resources and shifting research interests, we decided not to pursue the project further.

Since 2016 we have published work in other areas of reinforcement learning: we won one track of the 2016 Doom AI (FPS) competition, worked on semantics-based navigation in indoor 3D environments, designed the ELF platform, experimented with training full-game real-time-strategy AI, and studied the theory of deep learning. When the AlphaGo Zero paper came out in October 2017, we found it fascinating that a game as complex as Go could be learned from scratch, and reproducing it seemed scientifically valuable. The ELF OpenGo project only started in January 2018; for efficiency, OpenGo reused a lot of code from DarkForest.

FAIR is a research lab and has never had a dedicated, long-term engineering team maintaining the Go project. Our goal has always been to look for new ideas and discoveries from a research perspective rather than to chase a stronger AI on top of existing work; this is also the driving force behind open-sourcing ELF OpenGo.

10. The name ELF (Extensive, Lightweight, Flexible) is interesting and reflects the design philosophy. Can you explain in detail how each of these three aspects is realized?

ELF was published as a paper at NIPS 2017, a top AI conference. It was originally built for real-time-strategy games, but because it is a general platform we also used it to implement the Go algorithm. Paper link: https://arxiv.org/pdf/1707.01067.pdf

Extensive: the platform supports rich settings such as imperfect information, long-term rewards, concurrency, and simulation of the real world.

Lightweight: the framework is heavily optimized and very fast, collecting thousands of experiences per minute.

Flexible: environments are easy to customize, and parameters and model architectures are easy to adjust, which accelerates research on new algorithms.
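
For a flavor of what such a batched experience-collection loop looks like, here is a minimal sketch; every name in it (GoEnv, run_batched, the policy callable) is hypothetical and does not correspond to the real ELF API.

```python
import numpy as np

# Hypothetical stand-in for an environment; not the ELF API.
class GoEnv:
    def reset(self):
        return np.zeros((19, 19), dtype=np.float32)

    def step(self, move):
        # Stubbed out: a real environment would apply the move and score the game.
        return np.zeros((19, 19), dtype=np.float32), 0.0, False

def run_batched(policy, batch_size=128, steps=1000):
    envs = [GoEnv() for _ in range(batch_size)]
    states = np.stack([env.reset() for env in envs])
    for _ in range(steps):
        moves = policy(states)  # one batched forward pass serves every game
        results = [env.step(m) for env, m in zip(envs, moves)]
        states = np.stack([next_state for next_state, _, _ in results])

# Example: a dummy policy that always plays point 0 in every game of the batch.
run_batched(lambda batch: np.zeros(len(batch), dtype=int), batch_size=8, steps=10)
```

Batching many games through a single network call is what keeps the experience throughput at thousands of samples per minute.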

Before the interview we were uneasy: rumor had it that ELF had hung up its sword and was leaving the scene for good, which would have been a real loss. Through this interview we learned that the excitement will continue. We look forward to ELF releasing deeper networks in the near future; the world of open-source Go AI is more exciting with you in it!
