
Berkeley Visiting Professor: AlphaGo Zero and Deep Learning

Talk Overview

In this GAIR Lecture Hall session, the guest speaker will explain how AlphaGo Zero combines tabula rasa (blank-slate) learning, ResNet, and MCTS, using self-play within a joint Policy Network and Value Network framework to learn entirely without prior experience. He will also discuss how the latest deep learning methods are driving the evolution from machine perception toward machine cognition, and share the newest research directions of Dr. Wang Qiang's team in applying deep learning. Having served for many years on the editorial boards of SCI-indexed journals, the speaker will also share his experience in writing academic papers.

Recommended Pre-reading

《Mastering the game of Go without human knowledge》

Paper link: http://t.cn/RWkV1B6

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo's own move selections and also the winner of AlphaGo's games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
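For reference, the training objective in the paper couples the two prediction tasks above. Given a position s, the network f_θ(s) = (p, v) outputs move probabilities p and a scalar value v; the loss drives p toward the MCTS-improved policy π and v toward the final game outcome z, with L2 regularization on the parameters θ:

l = (z − v)² − πᵀ log p + c‖θ‖²

Here c sets the strength of the weight regularization. Minimizing this loss is what lets each generation of self-play produce stronger training targets for the next.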

《How to Escape Saddle Points Efficiently》

Paper link: https://arxiv.org/abs/1703.00887

This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension (i.e., it is almost "dimension-free"). The convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. When all saddle points are non-degenerate, all second-order stationary points are local minima, and our result thus shows that perturbed gradient descent can escape saddle points almost for free. Our results can be directly applied to many machine learning applications, including deep learning. As a particular concrete example of such an application, we show that our results can be used directly to establish sharp global convergence rates for matrix factorization. Our results rely on a novel characterization of the geometry around saddle points, which may be of independent interest to the non-convex optimization community.
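To make the procedure concrete, below is a minimal, illustrative Python sketch (not the authors' code): run ordinary gradient descent, and whenever the gradient norm falls below a threshold (a candidate saddle point or minimum), add one small perturbation sampled uniformly from a ball, at most once per window of iterations. All function names and hyperparameter values here are our own illustrative choices; the paper pins down the exact constants and the analysis.

import numpy as np

def perturbed_gd(grad, x0, eta=0.01, g_thresh=1e-3, radius=1e-2,
                 t_thresh=50, n_iter=10000, seed=0):
    """Illustrative sketch of perturbed gradient descent.

    grad: callable returning the gradient at a point.
    When the gradient norm drops below g_thresh (a candidate saddle
    or minimum), add one perturbation drawn uniformly from a ball of
    the given radius, at most once every t_thresh iterations.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    last_perturb = -t_thresh
    for t in range(n_iter):
        g = grad(x)
        if np.linalg.norm(g) <= g_thresh and t - last_perturb >= t_thresh:
            d = rng.standard_normal(x.shape)
            # Uniform sample from a ball: random direction, radius r * U^(1/d).
            d *= radius * rng.random() ** (1.0 / x.size) / np.linalg.norm(d)
            x = x + d
            last_perturb = t
            g = grad(x)  # recompute the gradient at the perturbed point
        x = x - eta * g
    return x

# f(x, y) = x^2 + y^4/4 - y^2/2 has a saddle at the origin and minima at
# (0, +1) and (0, -1); plain gradient descent started at the origin never
# moves, while the perturbed variant escapes to one of the minima.
grad_f = lambda p: np.array([2.0 * p[0], p[1] ** 3 - p[1]])
print(perturbed_gd(grad_f, [0.0, 0.0]))  # ≈ [0, ±1]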

《Benchmarking State-of-the-Art Deep Learning Software Tools》

Paper link: https://arxiv.org/abs/1608.07249

Deep learning has been shown as a successful machine learning method for a variety of tasks, and its popularity results in numerous open-source deep learning software tools. Training a deep network is usually a very time-consuming process. To address the computational challenge in deep learning, many tools exploit hardware features such as multi-core CPUs and many-core GPUs to shorten the training time. However, different tools exhibit different features and running performance when training different types of deep networks on different hardware platforms, which makes it difficult for end users to select an appropriate pair of software and hardware. In this paper, we aim to make a comparative study of the state-of-the-art GPU-accelerated deep learning software tools, including Caffe, CNTK, MXNet, TensorFlow, and Torch. We first benchmark the running performance of these tools with three popular types of neural networks on two CPU platforms and three GPU platforms. We then benchmark some distributed versions on multiple GPUs. Our contribution is two-fold. First, for end users of deep learning tools, our benchmarking results can serve as a guide to selecting appropriate hardware platforms and software tools. Second, for software developers of deep learning tools, our in-depth analysis points out possible future directions to further optimize the running performance.
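As a rough illustration of the measurement methodology such a comparison depends on (not code from the paper), the sketch below times one training step with warmup in framework-agnostic Python. step_fn is a hypothetical user-supplied callable that runs one full iteration (forward, backward, parameter update) in whichever tool is under test; with GPU-accelerated tools the step must block until the device finishes, for example by fetching a result back to the host, or the host-side timer captures only kernel-launch overhead.

import statistics
import time

def benchmark_step(step_fn, warmup=10, runs=50):
    """Time one training step with warmup; returns seconds per step.

    step_fn: hypothetical user-supplied callable running one full
    training iteration. Warmup runs are discarded so one-time costs
    (graph construction, kernel autotuning, memory allocation) do not
    skew the measured numbers.
    """
    for _ in range(warmup):
        step_fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        step_fn()
        times.append(time.perf_counter() - t0)
    return {
        "mean_s": statistics.mean(times),
        "stdev_s": statistics.stdev(times),
        "min_s": min(times),
    }

# Hypothetical usage: compare two tools on the same network and batch size.
# print(benchmark_step(lambda: tool_a_train_one_batch(batch)))
# print(benchmark_step(lambda: tool_b_train_one_batch(batch)))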

Talk Outline

1. Key topics in AI and deep learning

2. Applications of deep learning in AI

3. Introduction to and comparison of AlphaGo and AlphaGo Zero

4. From machine perception to machine cognition

5. The team's latest research directions

Talk Topic

AlphaGo Zero and Deep Learning: From Machine Perception to Machine Cognition

Speaker Bio

Dr. Wang Qiang received his bachelor's degree in Computer Science and Technology from Xi'an Jiaotong University, and later earned a master's degree in Software Engineering and a PhD in Robotics from Carnegie Mellon University. He is a member of the audit expert pool of the US Office of the Comptroller of the Currency (OCC), a fellow of the IBM Institute for Business Value, and a principal researcher at the Thomas J. Watson Research Center in New York. An IEEE Senior Member, he served as a paper reviewer for CVPR in 2008, 2009, 2013, and the upcoming 2018, and sits on the editorial boards of PAMI and TIP, two of the field's top journals. Dr. Wang has published more than 90 papers in leading international journals and has repeatedly presented papers at conferences such as ICCV and CVPR. His main research areas include image understanding, machine learning, intelligent trading, financial anti-fraud, and risk prediction.

Time

8:00 PM, Monday, November 6 (Beijing time)

How to Participate

Scan the QR code on the poster to add the community organizer on WeChat, with the note "Wang Qiang".

If the event sounds good to you, click to sign up!



