[Code Collection] PyTorch Implementations of Deep Reinforcement Learning

Knowledge 10-25

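This post shares a set of high-quality implementations of deep reinforcement learning algorithms written with PyTorch. The IPython notebooks are mainly intended to help with practicing and understanding the papers, so in some cases I will choose readability over efficiency. I will first upload the implementation of each paper, and then add markup that explains each part of the code.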

Related papers

Human Level Control Through Deep Reinforcement Learning

[Publication]
https://deepmind.com/research/publications/human-level-control-through-deep-reinforcement-learning/

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/01.DQN.ipynb

Multi-Step Learning (from Reinforcement Learning: An Introduction, Chapter 7)

[Publication]
http://incompleteideas.net/book/the-book-2nd.html

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/02.NStep_DQN.ipynb
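
As a rough illustration of the multi-step idea, the sketch below computes an n-step return target in plain Python. The function name `n_step_return`, the `bootstrap_value` argument, and the sample numbers are assumptions made for this example and are not taken from the linked notebook.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Return r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value.

    `rewards` holds the n rewards observed after the current state, and
    `bootstrap_value` is the value estimate of the state reached after n steps
    (hypothetical inputs, chosen for illustration).
    """
    g = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three observed rewards, then bootstrap from an estimated value of 10.
target = n_step_return([1.0, 0.0, 2.0], bootstrap_value=10.0, gamma=0.5)
```

With an empty reward list the function simply returns the bootstrap value, which makes the one-step and n-step cases share the same code path.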

Deep Reinforcement Learning with Double Q-learning

[Publication]
https://arxiv.org/abs/1509.06461

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/03.Double_DQN.ipynb
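
The core idea of Double Q-learning can be sketched in a few lines: the online network picks the next action, and the target network evaluates it. The version below uses plain Python lists in place of network outputs; the names `double_q_target`, `next_q_online`, and `next_q_target` are hypothetical and only serve this example.

```python
def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """One-step Double Q-learning target for a single transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    Both are plain lists here (an assumption for illustration).
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    # Action selection uses the online estimates...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while action evaluation uses the target estimates.
    return reward + gamma * next_q_target[best_action]


y = double_q_target(1.0, False, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5)
```

In this example the online estimates prefer action 1, so the target uses the target network's value for action 1 rather than its own maximum, which is the mechanism intended to reduce overestimation.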

Dueling Network Architectures for Deep Reinforcement Learning

[Publication]
https://arxiv.org/abs/1511.06581

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/04.Dueling_DQN.ipynb

Noisy Networks for Exploration

[Publication]
https://arxiv.org/abs/1706.10295

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/05.DQN-NoisyNets.ipynb

Prioritized Experience Replay

[Publication]
https://arxiv.org/abs/1511.05952?context=cs

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/06.DQN_PriorityReplay.ipynb

A Distributional Perspective on Reinforcement Learning

[Publication]
https://arxiv.org/abs/1707.06887

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/07.Categorical-DQN.ipynb

Rainbow: Combining Improvements in Deep Reinforcement Learning

[Publication]
https://arxiv.org/abs/1710.02298

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/08.Rainbow.ipynb

Distributional Reinforcement Learning with Quantile Regression

[Publication]
https://arxiv.org/abs/1710.10044

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/09.QuantileRegression-DQN.ipynb

Rainbow with Quantile Regression

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/10.Quantile-Rainbow.ipynb

Deep Recurrent Q-Learning for Partially Observable MDPs

[Publication]
https://arxiv.org/abs/1507.06527

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/11.DRQN.ipynb

Advantage Actor Critic (A2C)

[Publication1]
https://arxiv.org/abs/1602.01783

[Publication2]
https://blog.openai.com/baselines-acktr-a2c/

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/12.A2C.ipynb

High-Dimensional Continuous Control Using Generalized Advantage Estimation

[Publication]
https://arxiv.org/abs/1506.02438

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/13.GAE.ipynb
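
For orientation, generalized advantage estimation can be sketched as a backward pass over one rollout segment. The code below is a minimal plain-Python version that assumes no episode ends inside the segment; the function name `gae_advantages` and its arguments are illustrative and not taken from the linked notebook.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one rollout segment.

    rewards[t] and values[t] belong to step t; last_value is the value
    estimate of the state after the final step. Assumes the segment
    contains no terminal state (a simplification for this sketch).
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at step t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of the current and later TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


adv = gae_advantages([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], last_value=0.0,
                     gamma=1.0, lam=1.0)
```

Setting `lam=0.0` reduces each advantage to the one-step TD error, while `lam=1.0` gives the discounted return minus the value baseline.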

Proximal Policy Optimization Algorithms

[Publication]
https://arxiv.org/abs/1707.06347

[code]
https://github.com/qfettes/DeepRL-Tutorials/blob/master/14.PPO.ipynb
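
The clipped surrogate objective from the PPO paper can be illustrated for a single sample as follows. This is a sketch in plain Python, not the notebook's code; `ppo_clip_objective`, `ratio`, and `clip_eps` are names chosen for this example.

```python
def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Per-sample clipped surrogate objective (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s) and advantage is an advantage
    estimate for the same state-action pair.
    """
    # Keep the probability ratio inside [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


# A large ratio with a positive advantage is capped by the clip range.
value = ppo_clip_objective(ratio=1.5, advantage=2.0, clip_eps=0.25)
```

Taking the minimum means the objective stops rewarding a policy update once the ratio moves outside the clip range in the direction that the advantage favors.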

