深度强化学习实践（影印版英文版）-预付365

商品详情

首页 > 图书> 计算机与互联网> 编程语言与程序设计 > 深度强化学习实践（影印版英文版）

深度强化学习实践（影印版英文版）

商品价格： ¥92.00 [定价 ~~￥109.00~~]

配送至：

无货

商品编号： 12607924

服务：由图书负责发货并提供售后服务

商品运费：全站满99包邮，不满收10元，实际运费以支付页面金额为准。

温馨提示：不支持7天无理由退货

购买数量： - +

加入购物车立即购买

365商城不参加品牌方的满减优惠及赠品活动

热门推荐商品

神奇的逻辑思维游戏书 5-13岁提升孩子逻辑思维训练

¥22.50
正面管教修订版如何不惩罚不娇纵有效管教孩子育儿百科最温柔的教养樊登早教书

¥19.00
一本书读懂中国茶

¥24.90
中国国家地理：最好的时光在路上

¥24.90
正面管教儿童行为心理学

¥19.00
正面管教男孩100招（养育男孩全书）父母的语言话术

¥18.00
陪孩子度过7～9岁叛逆期（7-9岁关键养育叛逆不是孩子的错男孩女孩自驱型成长）

¥16.30
女生呵护指南

¥39.00
西尔斯怀孕百科

¥41.50
协和医院专家教你吃对不生病：糖尿病吃什么宜忌速查

¥14.90

商品介绍

规格与包装

商品名称：深度强化学习实践（影印版英文版）
商品编号：12607924

内容简介

　　强化学习（RL）的新发展结合深度学习（DL），在训练代理以类似人的方式解决复杂问题方面取得了未有的进步。Google使用算法在著名的Atari街机游戏中获胜将该领域推至高峰，研究人员也在源源不断地产生新的想法。
　　《深度强化学习实践（影印版英文版）》介绍了RL的基础知识，为你提供了编写智能学习代理所需的原理，以承担一系列艰巨的实际任务。让你了解如何在“网格世界”环境中实现Q-learning，教你的代理购买和交易股票，发现自然语言模型如何推动了聊天机器人的火爆。

作者简介

　　Maxim Lapan，is a deep learning enthusiast and independent researcher. His background and 15 years' work expertise as a software developer and a systems architect lays from low-level Linux kernel driver development to performance optimization and design of distributed applications working on thousands of servers. With vast work experiences in big data，Machine Learning， and large parallel distributed HPC and nonHPC systems， he has a talent to explain a gist of complicated things in simple words and vivid examples.His current areas of interest lie in practical applications of Deep Learning， such as Deep Natural Language Processing and Deep Reinforcement Learning.
　　Maxim lives in Moscow， Russian Federation， with his family， and he works for an Israeli start-up as a Senior NLP developer.

Preface
Chapter 1： What is Reinforcement Learning?
Learning - supervised， unsupervised， and reinforcement
RL formalisms and relations
Reward
The agent
The environment
Actions
Observations
Markov decision processes
Markov process
Markov reward process
Markov decision process
Summary

Chapter 2： OpenAI Gym
The anatomy of the agent
Hardware and software requirements
OpenAI Gym API
Action space
Observation space
The environment
Creation of the environment
The CartPole session
The random CartPole agent
The extra Gym functionality - wrappers and monitors
Wrappers
Monitor
Summary

Chapter 3： Deep Learning with PyTorch
Tensors
Creation of tensors
Scalar tensors
Tensor operations
GPU tensors
Gradients
Tensors and gradients
NN building blocks
Custom layers
Final glue - loss functions and optimizers
Loss functions
Optimizers
Monitoring with TensorBoard
TensorBoard 101
Plotting stuff
Example -GAN on Atari images
Summary

Chapter 4： The Cross-Entropy Method
Taxonomy of RL methods
Practical cross-entropy
Cross-entropy on CartPole
Cross-entropy on FrozenLake
Theoretical background of the cross-entropy method
Summary

Chapter 5： Tabular Learning and the Bellman Equation
Value， state， and optimality
The Bellman equation of optimality
Value of action
The value iteration method
Value iteration in practice
Q-learning for FrozenLake
Summary

Chapter 6： Deep Q-Networks
Chapter 7： DQN Extensions
Chapter 8： Stocks Trading Using RL
Chapter 9： Policy Gradients - An Alternative
Chapter 10： The Actor-Critic Method
Chapter 11： Asynchronous Advantaqe Actor-Critic
Chapter 12： Chatbots Training with RL
Chapter 13： Web Navigation
Chapter 14： Continuous Action Space
Chapter 15： Trust Regions - TRPO， PPO， and ACKTR
Chapter 16： Black-Box Optimization in RL
Chapter 17： Beyond Model-Free - Imagination
Chapter 18： AlphaGo Zero
Other Books You May Enjoy
Index

神奇的逻辑思维游戏书 5-13岁提升孩子逻辑思维训练

正面管教修订版如何不惩罚不娇纵有效管教孩子育儿百科最温柔的教养樊登早教书

一本书读懂中国茶

中国国家地理：最好的时光在路上

正面管教儿童行为心理学

正面管教男孩100招（养育男孩全书）父母的语言话术

陪孩子度过7～9岁叛逆期（7-9岁关键养育叛逆不是孩子的错男孩女孩自驱型成长）

女生呵护指南

西尔斯怀孕百科

协和医院专家教你吃对不生病：糖尿病吃什么宜忌速查

温馨提示

移动端专享

关于预付365

新手指南

支付方式

商务合作