Deep Learning for Computer Vision (e-book)

Author: Rajalingappaa Shanmugamani

Publisher: Packt Publishing

Publication date: 2018-01-23

Word count: 292,000

Learn how to model and train advanced neural networks to implement a variety of Computer Vision tasks

About This Book

  • Train different kinds of deep learning models from scratch to solve specific problems in Computer Vision
  • Combine the power of Python, Keras, and TensorFlow to build deep learning models for object detection, image classification, similarity learning, image captioning, and more
  • Includes tips on optimizing and improving the performance of your models under various constraints

Who This Book Is For

This book is targeted at data scientists and Computer Vision practitioners who wish to apply the concepts of Deep Learning to problems in Computer Vision. A basic knowledge of programming in Python, and some understanding of machine learning concepts, is required to get the best out of this book.

What You Will Learn

  • Set up an environment for deep learning with Python, TensorFlow, and Keras
  • Define and train a model for image and video classification
  • Use features from a pre-trained Convolutional Neural Network model for image retrieval
  • Understand and implement object detection using the real-world Pedestrian Detection scenario
  • Learn about various problems in image captioning and how to overcome them by training on images and text together
  • Implement similarity matching and train a model for face recognition
  • Understand the concept of generative models and use them for image generation
  • Deploy your deep learning models and optimize them for high performance

In Detail

Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it has many applications in areas such as robotics and automation. This book will show you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning.

In this book, you will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

Style and approach

This book teaches advanced techniques for Computer Vision, applying deep learning models to various datasets.
Table of contents

Deep Learning for Computer Vision

Why subscribe?

PacktPub.com

Foreword

Contributors

About the author

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Reviews

Getting Started

Understanding deep learning

Perceptron

Activation functions

Sigmoid

The hyperbolic tangent function

The Rectified Linear Unit (ReLU)

Artificial neural network (ANN)

One-hot encoding

Softmax

Cross-entropy

Dropout

Batch normalization

L1 and L2 regularization

Training neural networks

Backpropagation

Gradient descent

Stochastic gradient descent

Playing with TensorFlow playground

Convolutional neural network

Kernel

Max pooling

Recurrent neural networks (RNN)

Long short-term memory (LSTM)

Deep learning for computer vision

Classification

Detection or localization and segmentation

Similarity learning

Image captioning

Generative models

Video analysis

Development environment setup

Hardware and Operating Systems - OS

General Purpose - Graphics Processing Unit (GP-GPU)

Compute Unified Device Architecture - CUDA

CUDA Deep Neural Network - cuDNN

Installing software packages

Python

Open Computer Vision - OpenCV

The TensorFlow library

Installing TensorFlow

TensorFlow example to print Hello, TensorFlow

TensorFlow example for adding two numbers

TensorBoard

The TensorFlow Serving tool

The Keras library

Summary

Image Classification

Training the MNIST model in TensorFlow

The MNIST datasets

Loading the MNIST data

Building a perceptron

Defining placeholders for input data and targets

Defining the variables for a fully connected layer

Training the model with data

Building a multilayer convolutional network

Utilizing TensorBoard in deep learning

Training the MNIST model in Keras

Preparing the dataset

Building the model

Other popular image testing datasets

The CIFAR dataset

The Fashion-MNIST dataset

The ImageNet dataset and competition

The bigger deep learning models

The AlexNet model

The VGG-16 model

The Google Inception-V3 model

The Microsoft ResNet-50 model

The SqueezeNet model

Spatial transformer networks

The DenseNet model

Training a model for cats versus dogs

Preparing the data

Benchmarking with simple CNN

Augmenting the dataset

Augmentation techniques

Transfer learning or fine-tuning of a model

Training on bottleneck features

Fine-tuning several layers in deep learning

Developing real-world applications

Choosing the right model

Tackling the underfitting and overfitting scenarios

Gender and age detection from face

Fine-tuning apparel models

Brand safety

Summary

Image Retrieval

Understanding visual features

Visualizing activation of deep learning models

Embedding visualization

Guided backpropagation

The DeepDream

Adversarial examples

Model inference

Exporting a model

Serving the trained model

Content-based image retrieval

Building the retrieval pipeline

Extracting bottleneck features for an image

Computing similarity between query image and target database

Efficient retrieval

Matching faster using approximate nearest neighbour

Advantages of ANNOY

Autoencoders of raw images

Denoising using autoencoders

Summary

Object Detection

Detecting objects in an image

Exploring the datasets

ImageNet dataset

PASCAL VOC challenge

COCO object detection challenge

Evaluating datasets using metrics

Intersection over Union

The mean average precision

Localizing algorithms

Localizing objects using sliding windows

The scale-space concept

Training a fully connected layer as a convolution layer

Convolution implementation of sliding window

Thinking about localization as a regression problem

Applying regression to other problems

Combining regression with the sliding window

Detecting objects

Region-based convolutional neural network (R-CNN)

Fast R-CNN

Faster R-CNN

Single shot multi-box detector

Object detection API

Installation and setup

Pre-trained models

Re-training object detection models

Data preparation for the Pet dataset

Object detection training pipeline

Training the model

Monitoring loss and accuracy using TensorBoard

Training a pedestrian detector for a self-driving car

The YOLO object detection algorithm

Summary

Semantic Segmentation

Predicting pixels

Diagnosing medical images

Understanding the earth from satellite imagery

Enabling robots to see

Datasets

Algorithms for semantic segmentation

The Fully Convolutional Network

The SegNet architecture

Upsampling the layers by pooling

Sampling the layers by convolution

Skipping connections for better training

Dilated convolutions

DeepLab

RefineNet

PSPNet

Large kernel matters

DeepLab v3

Ultrasound nerve segmentation

Segmenting satellite images

Modeling FCN for segmentation

Segmenting instances

Summary

Similarity Learning

Algorithms for similarity learning

Siamese networks

Contrastive loss

FaceNet

Triplet loss

The DeepNet model

DeepRank

Visual recommendation systems

Human face analysis

Face detection

Face landmarks and attributes

The Multi-Task Facial Landmark (MTFL) dataset

The Kaggle keypoint dataset

The Multi-Attribute Facial Landmark (MAFL) dataset

Learning the facial key points

Face recognition

The labeled faces in the wild (LFW) dataset

The YouTube faces dataset

The CelebFaces Attributes dataset (CelebA)

CASIA web face database

The VGGFace2 dataset

Computing the similarity between faces

Finding the optimum threshold

Face clustering

Summary

Image Captioning

Understanding the problem and datasets

Understanding natural language processing for image captioning

Expressing words in vector form

Converting words to vectors

Training an embedding

Approaches for image captioning and related problems

Using a conditional random field for linking image and text

Using RNN on CNN features to generate captions

Creating captions using image ranking

Retrieving captions from images and images from captions

Dense captioning

Using RNN for captioning

Using multimodal metric space

Using attention network for captioning

Knowing when to look

Implementing attention-based image captioning

Summary

Generative Models

Applications of generative models

Artistic style transfer

Predicting the next frame in a video

Super-resolution of images

Interactive image generation

Image to image translation

Text to image generation

Inpainting

Blending

Transforming attributes

Creating training data

Creating new animation characters

3D models from photos

Neural artistic style transfer

Content loss

Style loss using the Gram matrix

Style transfer

Generative Adversarial Networks

Vanilla GAN

Conditional GAN

Adversarial loss

Image translation

InfoGAN

Drawbacks of GAN

Visual dialogue model

Algorithm for VDM

Generator

Discriminator

Summary

Video Classification

Understanding and classifying videos

Exploring video classification datasets

UCF101

YouTube-8M

Other datasets

Splitting videos into frames

Approaches for classifying videos

Fusing parallel CNN for video classification

Classifying videos over long periods

Streaming two CNNs for action recognition

Using 3D convolution for temporal learning

Using trajectory for classification

Multi-modal fusion

Attending regions for classification

Extending image-based approaches to videos

Regressing the human pose

Tracking facial landmarks

Segmenting videos

Captioning videos

Generating videos

Summary

Deployment

Performance of models

Quantizing the models

MobileNets

Deployment in the cloud

AWS

Google Cloud Platform

Deployment of models in devices

Jetson TX2

Android

iPhone

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think
