PyTorch 1.x Reinforcement Learning Cookbook (e-book)


Author: Yuxi (Hayden) Liu

Publisher: Packt Publishing

Publication date: 2019-10-31

Word count: approx. 381,000

Implement reinforcement learning techniques and algorithms with the help of real-world examples and recipes.

Key Features

* Use PyTorch 1.x to design and build self-learning artificial intelligence (AI) models
* Implement RL algorithms to solve control and optimization challenges faced by data scientists today
* Apply modern RL libraries to simulate a controlled environment for your projects

Book Description

Reinforcement learning (RL) is a branch of machine learning that has gained popularity in recent times. It allows you to train AI models that learn from their own actions and optimize their behavior. PyTorch has also emerged as the preferred tool for training RL models because of its efficiency and ease of use.

With this book, you'll explore the important RL concepts and the implementation of algorithms in PyTorch 1.x. The recipes in the book, along with real-world examples, will help you master various RL techniques, such as dynamic programming, Monte Carlo simulations, temporal difference, and Q-learning. You'll also gain insights into industry-specific applications of these techniques. Later chapters will guide you through solving problems such as the multi-armed bandit problem and the cartpole problem using the multi-armed bandit algorithm and function approximation. You'll also learn how to use Deep Q-Networks to complete Atari games, along with how to effectively implement policy gradients. Finally, you'll discover how RL techniques are applied to Blackjack, Gridworld environments, internet advertising, and the Flappy Bird game.

By the end of this book, you'll have developed the skills you need to implement popular RL algorithms and use RL techniques to solve real-world problems.

What you will learn

* Use Q-learning and the state–action–reward–state–action (SARSA) algorithm to solve various Gridworld problems
* Develop a multi-armed bandit algorithm to optimize display advertising
* Scale up learning and control processes using Deep Q-Networks
* Simulate Markov Decision Processes, OpenAI Gym environments, and other common control problems
* Select and build RL models, evaluate their performance, and optimize and deploy them
* Use policy gradient methods to solve continuous RL problems

Who this book is for

Machine learning engineers, data scientists, and AI researchers looking for quick solutions to different reinforcement learning problems will find this book useful. Although prior knowledge of machine learning concepts is required, experience with PyTorch will be useful but not necessary.
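
To give a flavor of the recipe style, below is a minimal sketch of tabular Q-learning on Gym's FrozenLake environment. It is not code taken from the book; it assumes the classic OpenAI Gym 0.x API (env.reset() returns a state index, env.step() returns a (state, reward, done, info) tuple), and the hyperparameter values are illustrative only.

    # A minimal sketch of tabular Q-learning on FrozenLake (illustrative,
    # not code from the book). It assumes the classic OpenAI Gym 0.x API:
    # env.reset() returns a state index, env.step() returns (state, reward, done, info).
    import gym
    import torch

    env = gym.make('FrozenLake-v0')
    n_states = env.observation_space.n
    n_actions = env.action_space.n

    Q = torch.zeros(n_states, n_actions)    # action-value table
    gamma, alpha, epsilon = 0.99, 0.4, 0.1  # discount, step size, exploration rate (illustrative)

    for episode in range(1000):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection
            if torch.rand(1).item() < epsilon:
                action = env.action_space.sample()
            else:
                action = torch.argmax(Q[state]).item()
            next_state, reward, done, _ = env.step(action)
            # Q-learning update: bootstrap from the greedy value of the next state
            td_target = reward + gamma * torch.max(Q[next_state]).item()
            Q[state, action] += alpha * (td_target - Q[state, action])
            state = next_state

Later recipes in the book replace a Q-table like this with linear and neural-network function approximators to handle larger state spaces.
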
Table of Contents

About Packt

Why subscribe?

Contributors

About the author

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Sections

Getting ready

How to do it...

How it works...

There's more...

See also

Get in touch

Reviews

Getting Started with Reinforcement Learning and PyTorch

Setting up the working environment

How to do it...

How it works...

There's more...

See also

Installing OpenAI Gym

How to do it...

How it works...

There's more...

See also

Simulating Atari environments

How to do it...

How it works...

There's more...

See also

Simulating the CartPole environment

How to do it...

How it works...

There's more...

Reviewing the fundamentals of PyTorch

How to do it...

There's more...

See also

Implementing and evaluating a random search policy

How to do it...

How it works...

There's more...

Developing the hill-climbing algorithm

How to do it...

How it works...

There's more...

See also

Developing a policy gradient algorithm

How to do it...

How it works...

There's more...

See also

Markov Decision Processes and Dynamic Programming

Technical requirements

Creating a Markov chain

How to do it...

How it works...

There's more...

See also

Creating an MDP

How to do it...

How it works...

There's more...

See also

Performing policy evaluation

How to do it...

How it works...

There's more...

Simulating the FrozenLake environment

Getting ready

How to do it...

How it works...

There's more...

Solving an MDP with a value iteration algorithm

How to do it...

How it works...

There's more...

Solving an MDP with a policy iteration algorithm

How to do it...

How it works...

There's more...

See also

Solving the coin-flipping gamble problem

How to do it...

How it works...

There's more...

Monte Carlo Methods for Making Numerical Estimations

Calculating Pi using the Monte Carlo method

How to do it...

How it works...

There's more...

See also

Performing Monte Carlo policy evaluation

How to do it...

How it works...

There's more...

Playing Blackjack with Monte Carlo prediction

How to do it...

How it works...

There's more...

See also

Performing on-policy Monte Carlo control

How to do it...

How it works...

There's more...

Developing MC control with epsilon-greedy policy

How to do it...

How it works...

Performing off-policy Monte Carlo control

How to do it...

How it works...

There's more...

See also

Developing MC control with weighted importance sampling

How to do it...

How it works...

There's more...

See also

Temporal Difference and Q-Learning

Setting up the Cliff Walking environment playground

Getting ready

How to do it...

How it works...

Developing the Q-learning algorithm

How to do it...

How it works...

There's more...

Setting up the Windy Gridworld environment playground

How to do it...

How it works...

Developing the SARSA algorithm

How to do it...

How it works...

There's more...

Solving the Taxi problem with Q-learning

Getting ready

How to do it...

How it works...

Solving the Taxi problem with SARSA

How to do it...

How it works...

There's more...

Developing the Double Q-learning algorithm

How to do it...

How it works...

See also

Solving Multi-armed Bandit Problems

Creating a multi-armed bandit environment

How to do it...

How it works...

Solving multi-armed bandit problems with the epsilon-greedy policy

How to do it...

How it works...

There's more...

Solving multi-armed bandit problems with the softmax exploration

How to do it...

How it works...

Solving multi-armed bandit problems with the upper confidence bound algorithm

How to do it...

How it works...

There's more...

See also

Solving internet advertising problems with a multi-armed bandit

How to do it...

How it works...

Solving multi-armed bandit problems with the Thompson sampling algorithm

How to do it...

How it works...

See also

Solving internet advertising problems with contextual bandits

How to do it...

How it works...

Scaling Up Learning with Function Approximation

Setting up the Mountain Car environment playground

Getting ready

How to do it...

How it works...

Estimating Q-functions with gradient descent approximation

How to do it...

How it works...

See also

Developing Q-learning with linear function approximation

How to do it...

How it works...

Developing SARSA with linear function approximation

How to do it...

How it works...

Incorporating batching using experience replay

How to do it...

How it works...

Developing Q-learning with neural network function approximation

How to do it...

How it works...

See also

Solving the CartPole problem with function approximation

How to do it...

How it works...

Deep Q-Networks in Action

Developing deep Q-networks

How to do it...

How it works...

See also

Improving DQNs with experience replay

How to do it...

How it works...

Developing double deep Q-Networks

How to do it...

How it works...

Tuning double DQN hyperparameters for CartPole

How to do it...

How it works...

Developing Dueling deep Q-Networks

How to do it...

How it works...

Applying Deep Q-Networks to Atari games

How to do it...

How it works...

Using convolutional neural networks for Atari games

How to do it...

How it works...

See also

Implementing Policy Gradients and Policy Optimization

Implementing the REINFORCE algorithm

How to do it...

How it works...

See also

Developing the REINFORCE algorithm with baseline

How to do it...

How it works...

Implementing the actor-critic algorithm

How to do it...

How it works...

Solving Cliff Walking with the actor-critic algorithm

How to do it...

How it works...

Setting up the continuous Mountain Car environment

How to do it...

How it works...

Solving the continuous Mountain Car environment with the advantage actor-critic network

How to do it...

How it works...

There's more...

See also

Playing CartPole through the cross-entropy method

How to do it...

How it works...

Capstone Project – Playing Flappy Bird with DQN

Setting up the game environment

Getting ready

How to do it...

How it works...

Building a Deep Q-Network to play Flappy Bird

How to do it...

How it works...

Training and tuning the network

How to do it...

How it works...

Deploying the model and playing the game

How to do it...

How it works...

Other Books You May Enjoy

Leave a review - let other readers know what you think
