Mastering Java Machine Learning (eBook)

Authors: Dr. Uday Kamath, Krishna Choppella

Publisher: Packt Publishing

Publication date: 2017-07-11

Length: 13,113,000 characters

Book Description

Become an advanced practitioner with this progressive set of master classes on application-oriented machine learning.

About This Book

  • Comprehensive coverage of key topics in machine learning, with an emphasis on both the theoretical and practical aspects
  • More than 15 open source Java tools covering a wide range of techniques, with code and practical usage
  • More than 10 real-world case studies in machine learning, highlighting techniques ranging from data ingestion through to analyzing the results of experiments, all preparing the reader for the practical, real-world use of tools and data analysis

Who This Book Is For

This book will appeal to anyone with a serious interest in data science, as well as those already working in related areas: ideally, intermediate-level data analysts and data scientists with experience in Java. Preferably, you will have experience with the fundamentals of machine learning, a desire to explore the area further, a readiness to grapple with the mathematical complexities of its algorithms, and a wish to learn the complete ins and outs of practical machine learning.

What You Will Learn

  • Master key Java machine learning libraries, and learn what kind of problem each can solve, with theory and practical guidance
  • Explore powerful techniques in each major category of machine learning, such as classification, clustering, anomaly detection, graph modeling, and text mining
  • Apply machine learning to real-world data with methodologies, processes, applications, and analysis
  • Work through techniques and experiments built around the latest specializations in machine learning, such as deep learning, stream data mining, and active and semi-supervised learning
  • Build high-performing, real-time, adaptive predictive models for batch- and stream-based big data learning using the latest tools and methodologies
  • Gain a deeper understanding of the technologies leading towards a more powerful AI applicable in domains such as security, financial crime, the Internet of Things, and social networking

In Detail

Java is one of the main languages used by practicing data scientists; much of the Hadoop ecosystem is Java-based, and it is certainly the language that most production systems in data science are written in. If you know Java, Mastering Java Machine Learning is your next step on the path to becoming an advanced practitioner in data science.

This book introduces you to an array of advanced techniques in machine learning, including classification, clustering, anomaly detection, stream learning, active learning, semi-supervised learning, probabilistic graph modeling, text mining, deep learning, and big data batch and stream machine learning. Each chapter is accompanied by illustrative examples and real-world case studies that show how to apply the newly learned techniques using sound methodologies and the best Java-based tools available today. On completing this book, you will have an understanding of the tools and techniques for building powerful machine learning models to solve data science problems in just about any domain.

Style and Approach

A practical guide that helps you explore machine learning, and an array of Java-based tools and frameworks, with the help of practical examples and real-world use cases.
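
To give a flavor of the Java tooling the book works with, here is a minimal sketch of the kind of end-to-end Weka workflow that Chapter 2's case study builds on: load a dataset, train a classifier, and cross-validate it. This is an illustrative sketch, not code from the book; the file path and the choice of the J48 decision tree are assumptions, and it presumes a Weka 3.x JAR on the classpath.

    import java.util.Random;

    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class HorseColicQuickStart {
        public static void main(String[] args) throws Exception {
            // Load an ARFF dataset; the path is a placeholder for your local copy
            Instances data = DataSource.read("data/horse-colic.arff");
            // Tell Weka which attribute is the class label (here, the last one)
            data.setClassIndex(data.numAttributes() - 1);

            // A C4.5-style decision tree, one of the classifiers surveyed in Chapter 2
            J48 tree = new J48();

            // 10-fold cross-validation with a fixed seed for reproducibility
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(tree, data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }

Running this prints summary statistics such as accuracy and kappa; the book's case studies extend the same pattern with feature selection, model comparison, and evaluation curves.
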
Table of Contents

Mastering Java Machine Learning

Table of Contents

Mastering Java Machine Learning

Credits

Foreword

About the Authors

About the Reviewers

www.PacktPub.com

eBooks, discount offers, and more

Why subscribe?

Customer Feedback

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Errata

Piracy

Questions

1. Machine Learning Review

Machine learning – history and definition

What is not machine learning?

Machine learning – concepts and terminology

Machine learning – types and subtypes

Datasets used in machine learning

Machine learning applications

Practical issues in machine learning

Machine learning – roles and process

Roles

Process

Machine learning – tools and datasets

Datasets

Summary

2. Practical Approach to Real-World Supervised Learning

Formal description and notation

Data quality analysis

Descriptive data analysis

Basic label analysis

Basic feature analysis

Visualization analysis

Univariate feature analysis

Categorical features

Continuous features

Multivariate feature analysis

Data transformation and preprocessing

Feature construction

Handling missing values

Outliers

Discretization

Data sampling

Is sampling needed?

Undersampling and oversampling

Stratified sampling

Training, validation, and test set

Feature relevance analysis and dimensionality reduction

Feature search techniques

Feature evaluation techniques

Filter approach

Univariate feature selection

Information theoretic approach

Statistical approach

Multivariate feature selection

Minimal redundancy maximal relevance (mRMR)

Correlation-based feature selection (CFS)

Wrapper approach

Embedded approach

Model building

Linear models

Linear Regression

Algorithm input and output

How does it work?

Advantages and limitations

Naïve Bayes

Algorithm input and output

How does it work?

Advantages and limitations

Logistic Regression

Algorithm input and output

How does it work?

Advantages and limitations

Non-linear models

Decision Trees

Algorithm inputs and outputs

How does it work?

Advantages and limitations

K-Nearest Neighbors (KNN)

Algorithm inputs and outputs

How does it work?

Advantages and limitations

Support vector machines (SVM)

Algorithm inputs and outputs

How does it work?

Advantages and limitations

Ensemble learning and meta learners

Bootstrap aggregating or bagging

Algorithm inputs and outputs

How does it work?

Random Forest

Advantages and limitations

Boosting

Algorithm inputs and outputs

How does it work?

Advantages and limitations

Model assessment, evaluation, and comparisons

Model assessment

Model evaluation metrics

Confusion matrix and related metrics

ROC and PRC curves

Gain charts and lift curves

Model comparisons

Comparing two algorithms

McNemar's Test

Paired t-test

Wilcoxon signed-rank test

Comparing multiple algorithms

ANOVA test

Friedman's test

Case Study – Horse Colic Classification

Business problem

Machine learning mapping

Data analysis

Label analysis

Feature analysis

Supervised learning experiments

Weka experiments

Sample end-to-end process in Java

Weka experimenter and model selection

RapidMiner experiments

Visualization analysis

Feature selection

Model process flow

Model evaluation metrics

Evaluation on Confusion Metrics

ROC Curves, Lift Curves, and Gain Charts

Results, observations, and analysis

Summary

References

3. Unsupervised Machine Learning Techniques

Issues in common with supervised learning

Issues specific to unsupervised learning

Feature analysis and dimensionality reduction

Notation

Linear methods

Principal component analysis (PCA)

Inputs and outputs

How does it work?

Advantages and limitations

Random projections (RP)

Inputs and outputs

How does it work?

Advantages and limitations

Multidimensional Scaling (MDS)

Inputs and outputs

How does it work?

Advantages and limitations

Nonlinear methods

Kernel Principal Component Analysis (KPCA)

Inputs and outputs

How does it work?

Advantages and limitations

Manifold learning

Inputs and outputs

How does it work?

Advantages and limitations

Clustering

Clustering algorithms

k-Means

Inputs and outputs

How does it work?

Advantages and limitations

DBSCAN

Inputs and outputs

How does it work?

Advantages and limitations

Mean shift

Inputs and outputs

How does it work?

Advantages and limitations

Expectation maximization (EM) or Gaussian mixture modeling (GMM)

Input and output

How does it work?

Advantages and limitations

Hierarchical clustering

Input and output

How does it work?

Advantages and limitations

Self-organizing maps (SOM)

Inputs and outputs

How does it work?

Advantages and limitations

Spectral clustering

Inputs and outputs

How does it work?

Advantages and limitations

Affinity propagation

Inputs and outputs

How does it work?

Advantages and limitations

Clustering validation and evaluation

Internal evaluation measures

Notation

R-Squared

Dunn's Indices

Davies-Bouldin index

Silhouette's index

External evaluation measures

Rand index

F-Measure

Normalized mutual information index

Outlier or anomaly detection

Outlier algorithms

Statistical-based

Inputs and outputs

How does it work?

Advantages and limitations

Distance-based methods

Inputs and outputs

How does it work?

Advantages and limitations

Density-based methods

Inputs and outputs

How does it work?

Advantages and limitations

Clustering-based methods

Inputs and outputs

How does it work?

Advantages and limitations

High-dimensional-based methods

Inputs and outputs

How does it work?

Advantages and limitations

One-class SVM

Inputs and outputs

How does it work?

Advantages and limitations

Outlier evaluation techniques

Supervised evaluation

Unsupervised evaluation

Real-world case study

Tools and software

Business problem

Machine learning mapping

Data collection

Data quality analysis

Data sampling and transformation

Feature analysis and dimensionality reduction

PCA

Random projections

ISOMAP

Observations on feature analysis and dimensionality reduction

Clustering models, results, and evaluation

Observations and clustering analysis

Outlier models, results, and evaluation

Observations and analysis

Summary

References

4. Semi-Supervised and Active Learning

Semi-supervised learning

Representation, notation, and assumptions

Semi-supervised learning techniques

Self-training SSL

Inputs and outputs

How does it work?

Advantages and limitations

Co-training SSL or multi-view SSL

Inputs and outputs

How does it work?

Advantages and limitations

Cluster and label SSL

Inputs and outputs

How does it work?

Advantages and limitations

Transductive graph label propagation

Inputs and outputs

How does it work?

Advantages and limitations

Transductive SVM (TSVM)

Inputs and outputs

How does it work?

Advantages and limitations

Case study in semi-supervised learning

Tools and software

Business problem

Machine learning mapping

Data collection

Data quality analysis

Data sampling and transformation

Datasets and analysis

Feature analysis results

Experiments and results

Analysis of semi-supervised learning

Active learning

Representation and notation

Active learning scenarios

Active learning approaches

Uncertainty sampling

How does it work?

Least confident sampling

Smallest margin sampling

Label entropy sampling

Advantages and limitations

Version space sampling

Query by disagreement (QBD)

How does it work?

Query by Committee (QBC)

How does it work?

Advantages and limitations

Data distribution sampling

How does it work?

Expected model change

Expected error reduction

Variance reduction

Density weighted methods

Advantages and limitations

Case study in active learning

Tools and software

Business problem

Machine learning mapping

Data collection

Data sampling and transformation

Feature analysis and dimensionality reduction

Models, results, and evaluation

Pool-based scenarios

Stream-based scenarios

Analysis of active learning results

Summary

References

5. Real-Time Stream Machine Learning

Assumptions and mathematical notations

Basic stream processing and computational techniques

Stream computations

Sliding windows

Sampling

Concept drift and drift detection

Data management

Partial memory

Full memory

Detection methods

Monitoring model evolution

Widmer and Kubat

Drift Detection Method or DDM

Early Drift Detection Method or EDDM

Monitoring distribution changes

Welch's t test

Kolmogorov-Smirnov's test

CUSUM and Page-Hinckley test

Adaptation methods

Explicit adaptation

Implicit adaptation

Incremental supervised learning

Modeling techniques

Linear algorithms

Online linear models with loss functions

Inputs and outputs

How does it work?

Advantages and limitations

Online Naïve Bayes

Inputs and outputs

How does it work?

Advantages and limitations

Non-linear algorithms

Hoeffding trees or very fast decision trees (VFDT)

Inputs and outputs

How does it work?

Advantages and limitations

Ensemble algorithms

Weighted majority algorithm

Inputs and outputs

How does it work?

Advantages and limitations

Online Bagging algorithm

Inputs and outputs

How does it work?

Advantages and limitations

Online Boosting algorithm

Inputs and outputs

How does it work?

Advantages and limitations

Validation, evaluation, and comparisons in an online setting

Model validation techniques

Prequential evaluation

Holdout evaluation

Controlled permutations

Evaluation criteria

Comparing algorithms and metrics

Incremental unsupervised learning using clustering

Modeling techniques

Partition based

Online k-Means

Inputs and outputs

How does it work?

Advantages and limitations

Hierarchical based and micro clustering

Inputs and outputs

How does it work?

Advantages and limitations

Inputs and outputs

How does it work?

Advantages and limitations

Density based

Inputs and outputs

How does it work?

Advantages and limitations

Grid based

Inputs and outputs

How does it work?

Advantages and limitations

Validation and evaluation techniques

Key issues in stream cluster evaluation

Evaluation measures

Cluster Mapping Measures (CMM)

V-Measure

Other external measures

Unsupervised learning using outlier detection

Partition-based clustering for outlier detection

Inputs and outputs

How does it work?

Advantages and limitations

Distance-based clustering for outlier detection

Inputs and outputs

How does it work?

Exact Storm

Abstract-C

Direct Update of Events (DUE)

Micro Clustering based Algorithm (MCOD)

Approx Storm

Advantages and limitations

Validation and evaluation techniques

Case study in stream learning

Tools and software

Business problem

Machine learning mapping

Data collection

Data sampling and transformation

Feature analysis and dimensionality reduction

Models, results, and evaluation

Supervised learning experiments

Concept drift experiments

Clustering experiments

Outlier detection experiments

Analysis of stream learning results

Summary

References

6. Probabilistic Graph Modeling

Probability revisited

Concepts in probability

Conditional probability

Chain rule and Bayes' theorem

Random variables, joint, and marginal distributions

Marginal independence and conditional independence

Factors

Factor types

Distribution queries

Probabilistic queries

MAP queries and marginal MAP queries

Graph concepts

Graph structure and properties

Subgraphs and cliques

Path, trail, and cycles

Bayesian networks

Representation

Definition

Reasoning patterns

Causal or predictive reasoning

Evidential or diagnostic reasoning

Intercausal reasoning

Combined reasoning

Independencies, flow of influence, D-Separation, I-Map

Flow of influence

D-Separation

I-Map

Inference

Elimination-based inference

Variable elimination algorithm

Input and output

How does it work?

Advantages and limitations

Clique tree or junction tree algorithm

Input and output

How does it work?

Advantages and limitations

Propagation-based techniques

Belief propagation

Factor graph

Messaging in factor graph

Input and output

How does it work?

Advantages and limitations

Sampling-based techniques

Forward sampling with rejection

Input and output

How does it work?

Advantages and limitations

Learning

Learning parameters

Maximum likelihood estimation for Bayesian networks

Bayesian parameter estimation for Bayesian network

Prior and posterior using the Dirichlet distribution

Learning structures

Measures to evaluate structures

Methods for learning structures

Constraint-based techniques

Inputs and outputs

How does it work?

Advantages and limitations

Search and score-based techniques

Inputs and outputs

How does it work?

Advantages and limitations

Markov networks and conditional random fields

Representation

Parameterization

Gibbs parameterization

Factor graphs

Log-linear models

Independencies

Global

Pairwise Markov

Markov blanket

Inference

Learning

Conditional random fields

Specialized networks

Tree augmented network

Input and output

How does it work?

Advantages and limitations

Markov chains

Hidden Markov models

Most probable path in HMM

Posterior decoding in HMM

Tools and usage

OpenMarkov

Weka Bayesian Network GUI

Case study

Business problem

Machine learning mapping

Data sampling and transformation

Feature analysis

Models, results, and evaluation

Analysis of results

Summary

References

7. Deep Learning

Multi-layer feed-forward neural network

Inputs, neurons, activation function, and mathematical notation

Multi-layered neural network

Structure and mathematical notations

Activation functions in NN

Sigmoid function

Hyperbolic tangent ("tanh") function

Training a neural network

Empirical risk minimization

Parameter initialization

Loss function

Gradients

Gradient at the output layer

Gradient at the hidden layer

Parameter gradient

Feed forward and backpropagation

How does it work?

Regularization

L2 regularization

L1 regularization

Limitations of neural networks

Vanishing gradients, local optimum, and slow training

Deep learning

Building blocks for deep learning

Rectified linear activation function

Restricted Boltzmann Machines

Definition and mathematical notation

Conditional distribution

Free energy in RBM

Training the RBM

Sampling in RBM

Contrastive divergence

Inputs and outputs

How does it work?

Persistent contrastive divergence

Autoencoders

Definition and mathematical notations

Loss function

Limitations of Autoencoders

Denoising Autoencoder

Unsupervised pre-training and supervised fine-tuning

Deep feed-forward NN

Inputs and outputs

How does it work?

Deep Autoencoders

Deep Belief Networks

Inputs and outputs

How does it work?

Deep learning with dropouts

Definition and mathematical notation

Inputs and outputs

How does it work?

Learning, training, and testing with dropouts

Sparse coding

Convolutional Neural Network

Local connectivity

Parameter sharing

Discrete convolution

Pooling or subsampling

Normalization using ReLU

CNN Layers

Recurrent Neural Networks

Structure of Recurrent Neural Networks

Learning and associated problems in RNNs

Long Short Term Memory

Gated Recurrent Units

Case study

Tools and software

Business problem

Machine learning mapping

Data sampling and transformation

Feature analysis

Models, results, and evaluation

Basic data handling

Multi-layer perceptron

Parameters used for MLP

Code for MLP

Convolutional Network

Parameters used for ConvNet

Code for CNN

Variational Autoencoder

Parameters used for the Variational Autoencoder

Code for Variational Autoencoder

DBN

Parameter search using Arbiter

Results and analysis

Summary

References

8. Text Mining and Natural Language Processing

NLP, subfields, and tasks

Text categorization

Part-of-speech tagging (POS tagging)

Text clustering

Information extraction and named entity recognition

Sentiment analysis and opinion mining

Coreference resolution

Word sense disambiguation

Machine translation

Semantic reasoning and inferencing

Text summarization

Automating questions and answers

Issues with mining unstructured data

Text processing components and transformations

Document collection and standardization

Inputs and outputs

How does it work?

Tokenization

Inputs and outputs

How does it work?

Stop words removal

Inputs and outputs

How does it work?

Stemming or lemmatization

Inputs and outputs

How does it work?

Local/global dictionary or vocabulary?

Feature extraction/generation

Lexical features

Character-based features

Word-based features

Part-of-speech tagging features

Taxonomy features

Syntactic features

Semantic features

Feature representation and similarity

Vector space model

Binary

Term frequency (TF)

Inverse document frequency (IDF)

Term frequency-inverse document frequency (TF-IDF)

Similarity measures

Euclidean distance

Cosine distance

Pairwise-adaptive similarity

Extended Jaccard coefficient

Dice coefficient

Feature selection and dimensionality reduction

Feature selection

Information theoretic techniques

Statistical-based techniques

Frequency-based techniques

Dimensionality reduction

Topics in text mining

Text categorization/classification

Topic modeling

Probabilistic latent semantic analysis (PLSA)

Input and output

How does it work?

Advantages and limitations

Text clustering

Feature transformation, selection, and reduction

Clustering techniques

Generative probabilistic models

Input and output

How does it work?

Advantages and limitations

Distance-based text clustering

Non-negative matrix factorization (NMF)

Input and output

How does it work?

Advantages and limitations

Evaluation of text clustering

Named entity recognition

Hidden Markov models for NER

Input and output

How does it work?

Advantages and limitations

Maximum entropy Markov models for NER

Input and output

How does it work?

Advantages and limitations

Deep learning and NLP

Tools and usage

Mallet

KNIME

Topic modeling with Mallet

Business problem

Machine Learning mapping

Data collection

Data sampling and transformation

Feature analysis and dimensionality reduction

Models, results, and evaluation

Analysis of text processing results

Summary

References

9. Big Data Machine Learning – The Final Frontier

What are the characteristics of Big Data?

Big Data Machine Learning

General Big Data framework

Big Data cluster deployment frameworks

Hortonworks Data Platform

Cloudera CDH

Amazon Elastic MapReduce

Microsoft Azure HDInsight

Data acquisition

Publish-subscribe frameworks

Source-sink frameworks

SQL frameworks

Message queueing frameworks

Custom frameworks

Data storage

HDFS

NoSQL

Key-value databases

Document databases

Columnar databases

Graph databases

Data processing and preparation

Hive and HQL

Spark SQL

Amazon Redshift

Real-time stream processing

Machine Learning

Visualization and analysis

Batch Big Data Machine Learning

H2O as Big Data Machine Learning platform

H2O architecture

Machine learning in H2O

Tools and usage

Case study

Business problem

Machine Learning mapping

Data collection

Data sampling and transformation

Experiments, results, and analysis

Feature relevance and analysis

Evaluation on test data

Analysis of results

Spark MLlib as Big Data Machine Learning platform

Spark architecture

Machine Learning in MLlib

Tools and usage

Experiments, results, and analysis

k-Means

k-Means with PCA

Bisecting k-Means (with PCA)

Gaussian Mixture Model

Random Forest

Analysis of results

Real-time Big Data Machine Learning

SAMOA as a real-time Big Data Machine Learning framework

SAMOA architecture

Machine Learning algorithms

Tools and usage

Experiments, results, and analysis

Analysis of results

The future of Machine Learning

Summary

References

A. Linear Algebra

Vector

Scalar product of vectors

Matrix

Transpose of a matrix

Matrix addition

Scalar multiplication

Matrix multiplication

Properties of matrix product

Linear transformation

Matrix inverse

Eigendecomposition

Positive definite matrix

Singular value decomposition (SVD)

B. Probability

Axioms of probability

Bayes' theorem

Density estimation

Mean

Variance

Standard deviation

Gaussian standard deviation

Covariance

Correlation coefficient

Binomial distribution

Poisson distribution

Gaussian distribution

Central limit theorem

Error propagation

Index
