Reinforcement Learning with TensorFlow (e-book)

Author: Sayon Dutta

Publisher: Packt Publishing

Publication date: 2018-04-24

Word count: 385,000

Category: Imported books > Foreign-language originals > Computers/Internet

Book description

Leverage the power of reinforcement learning techniques to develop self-learning systems using TensorFlow.

About This Book

• Learn reinforcement learning concepts and their implementation using TensorFlow
• Discover different problem-solving methods for reinforcement learning
• Apply reinforcement learning to autonomous driving cars, robobrokers, and more

Who This Book Is For

If you want to get started with reinforcement learning using TensorFlow in the most practical way, this book will be a useful resource. The book assumes prior knowledge of machine learning and neural network programming concepts, as well as some understanding of the TensorFlow framework. No previous experience with reinforcement learning is required.

What You Will Learn

• Implement state-of-the-art reinforcement learning algorithms from the basics
• Discover various reinforcement learning techniques such as MDPs, Q-learning, and more
• Learn the applications of reinforcement learning in advertising, image processing, and NLP
• Teach a reinforcement learning model to play a game using TensorFlow and OpenAI Gym
• Understand how reinforcement learning applications are used in robotics

In Detail

Reinforcement learning (RL) allows you to develop smart, quick, self-learning systems in your business surroundings. It is an effective method for training learning agents and solving a variety of problems in artificial intelligence, from games, self-driving cars, and robots to enterprise applications that range from data center energy saving (cooling data centers) to smart warehousing solutions.

The book covers the major advancements and successes achieved in deep reinforcement learning by combining deep neural network architectures with reinforcement learning. It also introduces readers to the concept of reinforcement learning, its advantages, and why it is gaining so much popularity. It discusses MDPs, Monte Carlo tree search, dynamic programming methods such as policy and value iteration, and temporal difference learning methods such as Q-learning and SARSA. You will use TensorFlow and OpenAI Gym to build simple neural network models that learn from their own actions. You will also see how reinforcement learning algorithms play a role in games, image processing, and NLP.

By the end of this book, you will have a firm understanding of what reinforcement learning is and how to put your knowledge to practical use by leveraging the power of TensorFlow and OpenAI Gym.

Style and approach

An easy-to-follow, step-by-step guide to help you get to grips with real-world applications of reinforcement learning with TensorFlow.

Table of contents

Title Page

Reinforcement Learning with TensorFlow

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the author

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Deep Learning – Architectures and Frameworks

Deep learning

Activation functions for deep learning

The sigmoid function

The tanh function

The softmax function

The rectified linear unit function

How to choose the right activation function

Logistic regression as a neural network

Notation

Objective

The cost function

The gradient descent algorithm

The computational graph

Steps to solve logistic regression using gradient descent

What is Xavier initialization?

Why do we use Xavier initialization?

The neural network model

Recurrent neural networks

Long Short Term Memory Networks

Convolutional neural networks

The LeNet-5 convolutional neural network

The AlexNet model

The VGG-Net model

The Inception model

Limitations of deep learning

The vanishing gradient problem

The exploding gradient problem

Overcoming the limitations of deep learning

Reinforcement learning

Basic terminologies and conventions

Optimality criteria

The value function for optimality

The policy model for optimality

The Q-learning approach to reinforcement learning

Asynchronous advantage actor-critic

Introduction to TensorFlow and OpenAI Gym

Basic computations in TensorFlow

An introduction to OpenAI Gym

The pioneers and breakthroughs in reinforcement learning

David Silver

Pieter Abbeel

Google DeepMind

The AlphaGo program

Libratus

Summary

Training Reinforcement Learning Agents Using OpenAI Gym

The OpenAI Gym

Understanding an OpenAI Gym environment

Programming an agent using an OpenAI Gym environment

Q-Learning

The Epsilon-Greedy approach

Using the Q-Network for real-world applications

Summary

Markov Decision Process

Markov decision processes

The Markov property

The S state set

Actions

Transition model

Rewards

Policy

The sequence of rewards - assumptions

The infinite horizons

Utility of sequences

The Bellman equations

Solving the Bellman equation to find policies

An example of value iteration using the Bellman equation

Policy iteration

Partially observable Markov decision processes

State estimation

Value iteration in POMDPs

Training the FrozenLake-v0 environment using MDP

Summary

Policy Gradients

The policy optimization method

Why policy optimization methods?

Why stochastic policy?

Example 1 - rock, paper, scissors

Example 2 - state aliased grid-world

Policy objective functions

Policy Gradient Theorem

Temporal difference rule

TD(1) rule

TD(0) rule

TD(λ) rule

Policy gradients

The Monte Carlo policy gradient

Actor-critic algorithms

Using a baseline to reduce variance

Vanilla policy gradient

Agent learning pong using policy gradients

Summary

Q-Learning and Deep Q-Networks

Why reinforcement learning?

Model based learning and model free learning

Monte Carlo learning

Temporal difference learning

On-policy and off-policy learning

Q-learning

The exploration exploitation dilemma

Q-learning for the mountain car problem in OpenAI gym

Deep Q-networks

Using a convolution neural network instead of a single layer neural network

Use of experience replay

Separate target network to compute the target Q-values

Advancements in deep Q-networks and beyond

Double DQN

Dueling DQN

Deep Q-network for mountain car problem in OpenAI gym

Deep Q-network for Cartpole problem in OpenAI gym

Deep Q-network for Atari Breakout in OpenAI gym

The Monte Carlo tree search algorithm

Minimax and game trees

The Monte Carlo Tree Search

The SARSA algorithm

SARSA algorithm for mountain car problem in OpenAI gym

Summary

Asynchronous Methods

Why asynchronous methods?

Asynchronous one-step Q-learning

Asynchronous one-step SARSA

Asynchronous n-step Q-learning

Asynchronous advantage actor critic

A3C for Pong-v0 in OpenAI gym

Summary

Robo Everything – Real Strategy Gaming

Real-time strategy games

Reinforcement learning and other approaches

Online case-based planning

Drawbacks to real-time strategy games

Why reinforcement learning?

Reinforcement learning in RTS gaming

Deep autoencoder

How is reinforcement learning better?

Summary

AlphaGo – Reinforcement Learning at Its Best

What is Go?

Go versus chess

How did DeepBlue defeat Gary Kasparov?

Why is the game tree approach no good for Go?

AlphaGo – mastering Go

Monte Carlo Tree Search

Architecture and properties of AlphaGo

Energy consumption analysis – Lee Sedol versus AlphaGo

AlphaGo Zero

Architecture and properties of AlphaGo Zero

Training process in AlphaGo Zero

Summary

Reinforcement Learning in Autonomous Driving

Machine learning for autonomous driving

Reinforcement learning for autonomous driving

Creating autonomous driving agents

Why reinforcement learning?

Proposed frameworks for autonomous driving

Spatial aggregation

Sensor fusion

Spatial features

Recurrent temporal aggregation

Planning

DeepTraffic – MIT simulator for autonomous driving

Summary

Financial Portfolio Management

Introduction

Problem definition

Data preparation

Reinforcement learning

Further improvements

Summary

Reinforcement Learning in Robotics

Reinforcement learning in robotics

Evolution of reinforcement learning

Challenges in robot reinforcement learning

High dimensionality problem

Real-world challenges

Issues due to model uncertainty

What's the final objective a robot wants to achieve?

Open questions and practical challenges

Open questions

Practical challenges for robotic reinforcement learning

Key takeaways

Summary

Deep Reinforcement Learning in Ad Tech

Computational advertising challenges and bidding strategies

Business models used in advertising
