Hands-On Computer Vision with TensorFlow 2 (e-book)

Author: Benjamin Planche

Publisher: Packt Publishing

Publication date: 2019-05-30

Word count: 519,000

A practical guide to building high-performance systems for object detection, segmentation, video processing, smartphone applications, and more. This book is based on the alpha version of TensorFlow 2.

Key Features

* Discover how to build, train, and serve your own deep neural networks with TensorFlow 2 and Keras
* Apply modern solutions to a wide range of applications such as object detection and video analysis
* Learn how to run your models on mobile devices and webpages and improve their performance

Book Description

Computer vision solutions are becoming increasingly common, making their way into fields such as health, automobile, social media, and robotics. This book will help you explore TensorFlow 2, the brand-new version of Google's open-source framework for machine learning. You will understand how to benefit from using convolutional neural networks (CNNs) for visual tasks.

Hands-On Computer Vision with TensorFlow 2 starts with the fundamentals of computer vision and deep learning, teaching you how to build a neural network from scratch. You will discover the features that have made TensorFlow the most widely used AI library, along with its intuitive Keras interface, and move on to building, training, and deploying CNNs efficiently.

Complete with concrete code examples, the book demonstrates how to classify images with modern solutions, such as Inception and ResNet, and extract specific content using You Only Look Once (YOLO), Mask R-CNN, and U-Net. You will also build Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) to create and edit images, and LSTMs to analyze videos. In the process, you will acquire advanced insights into transfer learning, data augmentation, domain adaptation, and mobile and web deployment, among other key concepts.

By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0.
What you will learn

* Create your own neural networks from scratch
* Classify images with modern architectures including Inception and ResNet
* Detect and segment objects in images with YOLO, Mask R-CNN, and U-Net
* Tackle problems in developing self-driving cars and facial emotion recognition systems
* Boost your application’s performance with transfer learning, GANs, and domain adaptation
* Use recurrent neural networks for video analysis
* Optimize and deploy your networks on mobile devices and in the browser

Who this book is for

If you’re new to deep learning and have some background in Python programming and image processing, such as reading and writing image files and editing pixels, this book is for you. Even if you’re an expert curious about the new TensorFlow 2 features, you’ll find this book useful. While some theoretical explanations require knowledge of algebra and calculus, the book covers concrete examples for learners focused on practical applications such as visual recognition for self-driving cars and smartphone apps.
Table of contents

Dedication

About Packt

Why subscribe?

Packt.com

Contributors

About the authors

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download and run the example code files

Download the code files

Study and run the experiments

Study the Jupyter notebooks online

Run the Jupyter notebooks on your machine

Run the Jupyter notebooks in Google Colab

Download the color images

Conventions used

Get in touch

Reviews

Section 1: TensorFlow 2 and Deep Learning Applied to Computer Vision

Computer Vision and Neural Networks

Technical requirements

Computer vision in the wild

Introducing computer vision

Main tasks and their applications

Content recognition

Object classification

Object identification

Object detection and localization

Object and instance segmentation

Pose estimation

Video analysis

Instance tracking

Action recognition

Motion estimation

Content-aware image edition

Scene reconstruction

A brief history of computer vision

First steps to first successes

Underestimating the perception task

Hand-crafting local features

Adding some machine learning on top

Rise of deep learning

Early attempts and failures

Rise and fall of the perceptron

Too heavy to scale

Reasons for a comeback

The internet – the new El Dorado of data science

More power than ever

Deep learning or the rebranding of artificial neural networks

What makes learning deep?

Deep learning era

Getting started with neural networks

Building a neural network

Imitating neurons

Biological inspiration

Mathematical model

Implementation

Layering neurons together

Mathematical model

Implementation

Applying our network to classification

Setting up the task

Implementing the network

Training a neural network

Learning strategies

Supervised learning

Unsupervised learning

Reinforcement learning

Teaching time

Evaluating the loss

Back-propagating the loss

Teaching our network to classify

Training considerations – underfitting and overfitting

Summary

Questions

Further reading

TensorFlow Basics and Training a Model

Technical requirements

Getting started with TensorFlow 2 and Keras

Introducing TensorFlow

TensorFlow main architecture

Introducing Keras

A simple computer vision model using Keras

Preparing the data

Building the model

Training the model

Model performance

TensorFlow 2 and Keras in detail

Core concepts

Introducing tensors

TensorFlow graph

Comparing lazy execution to eager execution

Creating graphs in TensorFlow 2

Introducing TensorFlow AutoGraph and tf.function

Backpropagating error using the gradient tape

Keras models and layers

Sequential and Functional APIs

Callbacks

Advanced concepts

How tf.function works

Variables in TensorFlow 2

Distribute strategies

Using the Estimator API

Available pre-made Estimators

Training a custom Estimator

TensorFlow ecosystem

TensorBoard

TensorFlow Addons and TensorFlow Extended

TensorFlow Lite and TensorFlow.js

Where to run your model

On a local machine

On a remote machine

On Google Cloud

Summary

Questions

Modern Neural Networks

Technical requirements

Discovering convolutional neural networks

Neural networks for multidimensional data

Problems with fully-connected networks

Explosive number of parameters

Lack of spatial reasoning

Introducing CNNs

CNN operations

Convolutional layers

Concept

Properties

Hyperparameters

TensorFlow/Keras methods

Pooling layers

Concept and hyperparameters

TensorFlow/Keras methods

Fully-connected layers

Usage in CNNs

TensorFlow/Keras methods

Effective receptive field

Definitions

Formula

CNNs with TensorFlow

Implementing our first CNN

LeNet-5 architecture

TensorFlow and Keras implementations

Application to MNIST

Refining the training process

Modern network optimizers

Gradient descent challenges

Training velocity and trade-off

Suboptimal local minima

A single hyperparameter for heterogeneous parameters

Advanced optimizers

Momentum algorithms

The Ada family

Regularization methods

Early stopping

L1 and L2 regularization

Principles

TensorFlow and Keras implementations

Dropout

Definition

TensorFlow and Keras methods

Batch normalization

Definition

TensorFlow and Keras methods

Summary

Questions

Further reading

Section 2: State-of-the-Art Solutions for Classic Recognition Problems

Influential Classification Tools

Technical requirements

Understanding advanced CNN architectures

VGG, a standard CNN architecture

Overview of the VGG architecture

Motivation

Architecture

Contributions – standardizing CNN architectures

Replacing large convolutions with multiple smaller ones

Increasing the depth of the feature maps

Augmenting data with scale jittering

Replacing fully-connected layers with convolutions

Implementations in TensorFlow and Keras

TensorFlow model

Keras model

GoogLeNet and the Inception module

Overview of the GoogLeNet architecture

Motivation

Architecture

Contributions – popularizing larger blocks and bottlenecks

Capturing various details with Inception modules

Using 1 x 1 convolutions as bottlenecks

Pooling instead of fully-connecting

Fighting vanishing gradient with intermediary losses

Implementations in TensorFlow and Keras

Inception module with the Keras Functional API

TensorFlow model and TensorFlow Hub

Keras model

ResNet – the residual network

Overview of the ResNet architecture

Motivation

Architecture

Contributions – forwarding the information deeper

Estimating a residual function instead of a mapping

Going "ultra-deep"

Implementations in TensorFlow and Keras

Residual blocks with the Keras Functional API

The TensorFlow model and TensorFlow Hub

Keras model

Leveraging transfer learning

Overview

Definition

Human inspiration

Motivation

Transferring CNN knowledge

Use-cases

Similar tasks with limited training data

Similar tasks with abundant training data

Dissimilar tasks with abundant training data

Dissimilar tasks with limited training data

Transfer learning with TensorFlow and Keras

Model surgery

Removing layers

Grafting layers

Selective training

Restoring pre-trained parameters

Freezing layers

Summary

Questions

Further reading

Object Detection Models

Technical requirements

Introducing object detection

Background

Applications

Brief history

Evaluating the performance of a model

Precision and recall

Precision-recall curve

Average precision and mean average precision

Average precision threshold

A fast object detection algorithm – YOLO

Introducing YOLO

Strengths and limitations of YOLO

YOLO's main concepts

Inferring with YOLO

YOLO backbone

YOLO's layers output

Introducing anchor boxes

How YOLO refines anchor boxes

Post-processing the boxes

NMS

YOLO inference summarized

Training YOLO

How the YOLO backbone is trained

YOLO loss

Bounding box loss

Object confidence loss

Classification loss

Full YOLO loss

Training techniques

Faster R-CNN – A powerful object detection model

Faster R-CNN's general architecture

Stage 1 – Region proposals

Stage 2 – Classification

Fast R-CNN architecture

RoI pooling

Training Faster R-CNN

Training the RPN

RPN loss

Fast R-CNN loss

Training regimen

TensorFlow object detection API

Using a pretrained model

Training on a custom dataset

Summary

Questions

Further reading

Enhancing and Segmenting Images

Technical requirements

Transforming images with encoders-decoders

Introduction to encoders-decoders

Encoding and decoding

Auto-encoding

Purpose

Basic example – image denoising

Simplistic fully-connected auto-encoder

Application to image denoising

Convolutional encoders-decoders

Unpooling, transposing, and dilating

Transposed convolution (deconvolution)

Unpooling

Upsampling and resizing

Dilated/atrous convolution

Exemplary architectures – FCN and U-Net

Fully Convolutional Networks

U-Net

Intermediary example – image super-resolution

FCN implementation

Application to upscaling images

Understanding semantic segmentation

Object segmentation with encoders-decoders

Overview

Decoding as label maps

Training with segmentation losses and metrics

Post-processing with conditional random fields

Advanced example – image segmentation for self-driving cars

Task presentation

Exemplary solution

The more difficult case of instance segmentation

From object segmentation to instance segmentation

Respecting boundaries

Post-processing into instance masks

From object detection to instance segmentation – Mask R-CNN

Applying semantic segmentation to bounding boxes

Building an instance segmentation model from Faster-RCNN

Summary

Questions

Further reading

Section 3: Advanced Concepts and New Frontiers of Computer Vision

Training on Complex and Scarce Datasets

Technical requirements

Efficient data serving

Introducing the TensorFlow Data API

Intuition behind the TensorFlow Data API

Feeding fast and data-hungry models

Inspiration from lazy structures

Structure of TensorFlow Data pipelines

Extract-Transform-Load

API interface

Setting up input pipelines

Extracting (from tensors, text files, TFRecord files, and more)

From NumPy and TensorFlow data

From files

From other inputs (generator, SQL database, range, and others)

Transforming the samples (parsing, augmenting, and more)

Parsing images and labels

Parsing TFRecord files

Editing samples

Transforming the datasets (shuffling, zipping, parallelizing, and more)

Structuring datasets

Merging datasets

Loading

Optimizing and monitoring input pipelines

Following best practices for optimization

Parallelizing and prefetching

Fusing operations

Passing options to ensure global properties

Monitoring and reusing datasets

Aggregating performance statistics

Caching and reusing datasets

How to deal with data scarcity?

Augmenting datasets

Overview

Why augmenting datasets?

Considerations

Augmenting images with TensorFlow

TensorFlow Image module

Example: augmenting images for our autonomous driving application

Rendering synthetic datasets

Overview

Rise of 3D databases

Benefits of synthetic data

Generating synthetic images from 3D models

Rendering from 3D models

Post-processing synthetic images

Problem – realism gap

Leveraging domain adaptation and generative models (VAEs and GANs)

Training models to be robust to domain changes

Supervised domain adaptation

Unsupervised domain adaptation

Domain randomization

Generating larger or more realistic datasets with VAEs and GANs

Discriminative versus generative models

VAEs

GANs

Augmenting datasets with conditional GANs

Summary

Questions

Further reading

Video and Recurrent Neural Networks

Technical requirements

Introducing RNNs

Basic formalism

General understanding of RNNs

Learning RNN weights

Backpropagation through time

Truncated backpropagation

Long short-term memory cells

LSTM general principles

LSTM inner workings

Classifying videos

Applying computer vision to video

Classifying videos with an LSTM

Extracting features from videos

Training the LSTM

Defining the model

Loading the data

Training the model

Summary

Questions

Further reading

Optimizing Models and Deploying on Mobile Devices

Technical requirements

Optimizing computational and disk footprint

Measuring inference speed

Measuring latency

Using tracing tools to understand computational performance

Improving model inference speed

Optimizing for hardware

Optimizing on CPU

Optimizing on GPU

Optimizing on specialized hardware

Optimizing input

Optimizing post-processing

When the model is still too slow

Interpolating and tracking

Model distillation

Reducing model size

Quantization

Channel pruning and weight sparsification

On-device machine learning

Considerations of on-device machine learning

Benefits of on-device ML

Latency

Privacy

Cost

Limitations of on-device ML

Practical on-device computer vision

On-device computer vision particularities

Generating a SavedModel

Generating a frozen graph

Importance of pre-processing

Example app – recognizing facial expressions

Introducing MobileNet

Deploying models on-device

Running on iOS devices using Core ML

Converting from TensorFlow or Keras

Loading the model

Using the model

Running on Android using TensorFlow Lite

Converting the model from TensorFlow or Keras

Loading the model

Using the model

Running in the browser using TensorFlow.js

Converting the model to the TensorFlow.js format

Using the model

Running on other devices

Summary

Questions

Appendix

Migrating from TensorFlow 1.x

Automatic migration

Migrating TensorFlow 1 code

Sessions

Placeholders

Variable management

Layers and models

Other concepts

References

Chapter 1: Computer Vision and Neural Networks

Chapter 2: TensorFlow Basics and Training a Model

Chapter 3: Modern Neural Networks

Chapter 4: Influential Classification Tools

Chapter 5: Object Detection Models

Chapter 6: Enhancing and Segmenting Images

Chapter 7: Training on Complex and Scarce Datasets

Chapter 8: Video and Recurrent Neural Networks

Chapter 9: Optimizing Models and Deploying on Mobile Devices

Assessments

Answers

Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Other Books You May Enjoy

Leave a review - let other readers know what you think
