Dedication
About Packt
Why subscribe?
Packt.com
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Up and Running with Reinforcement Learning
Why RL?
Formulating the RL problem
The relationship between an agent and its environment
Defining the states of the agent
Defining the actions of the agent
Understanding policy, value, and advantage functions
Identifying episodes
Identifying reward functions and the concept of discounted rewards
Rewards
Learning the Markov decision process
Defining the Bellman equation
On-policy versus off-policy learning
On-policy method
Off-policy method
Model-free and model-based training
Algorithms covered in this book
Summary
Questions
Further reading
Temporal Difference, SARSA, and Q-Learning
Technical requirements
Understanding TD learning
Relation between the value function and state
Understanding SARSA and Q-Learning
Learning SARSA
Understanding Q-learning
Cliff walking and grid world problems
Cliff walking with SARSA
Cliff walking with Q-learning
Grid world with SARSA
Summary
Further reading
Deep Q-Network
Technical requirements
Learning the theory behind a DQN
Understanding target networks
Learning about the replay buffer
Getting introduced to the Atari environment
Summary of Atari games
Pong
Breakout
Space Invaders
LunarLander
The Arcade Learning Environment
Coding a DQN in TensorFlow
Using the model.py file
Using the funcs.py file
Using the dqn.py file
Evaluating the performance of the DQN on Atari Breakout
Summary
Questions
Further reading
Double DQN, Dueling Architectures, and Rainbow
Technical requirements
Understanding Double DQN
Updating the Bellman equation
Coding DDQN and training to play Atari Breakout
Evaluating the performance of DDQN on Atari Breakout
Understanding dueling network architectures
Coding the dueling network architecture and training it to play Atari Breakout
Combining V and A to obtain Q
Evaluating the performance of dueling architectures on Atari Breakout
Understanding Rainbow networks
DQN improvements
Prioritized experience replay
Multi-step learning
Distributional RL
Noisy nets
Running a Rainbow network on Dopamine
Rainbow using Dopamine
Summary
Questions
Further reading
Deep Deterministic Policy Gradient
Technical requirements
Actor-Critic algorithms and policy gradients
Policy gradient
Deep Deterministic Policy Gradient
Coding ddpg.py
Coding AandC.py
Coding TrainOrTest.py
Coding replay_buffer.py
Training and testing the DDPG on Pendulum-v0
Summary
Questions
Further reading
Asynchronous Methods - A3C and A2C
Technical requirements
The A3C algorithm
Loss functions
CartPole and LunarLander
CartPole
LunarLander
The A3C algorithm applied to CartPole
Coding cartpole.py
Coding a3c.py
The AC class
The Worker() class
Coding utils.py
Training on CartPole
The A3C algorithm applied to LunarLander
Coding lunar.py
Training on LunarLander
The A2C algorithm
Summary
Questions
Further reading
Trust Region Policy Optimization and Proximal Policy Optimization
Technical requirements
Learning TRPO
TRPO equations
Learning PPO
PPO loss functions
Using PPO to solve the MountainCar problem
Coding the class_ppo.py file
Coding the train_test.py file
Evaluating the performance
Full throttle
Random throttle
Summary
Questions
Further reading
Deep RL Applied to Autonomous Driving
Technical requirements
Car driving simulators
Learning to use TORCS
State space
Support files
Training a DDPG agent to learn to drive
Coding ddpg.py
Coding AandC.py
Coding TrainOrTest.py
Training a PPO agent
Summary
Questions
Further reading
Assessment
Chapter 1
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Other Books You May Enjoy
Leave a review - let other readers know what you think