scikit-learn Cookbook - Second Edition (e-book)

Authors: Julian Avila, Trent Hauck

Publisher: Packt Publishing

Publication date: 2017-11-16

Word count: approx. 342,000

Learn to use scikit-learn operations and functions for machine learning and deep learning applications.

About This Book

  • Handle a variety of machine learning tasks by leveraging the power of scikit-learn
  • Perform supervised and unsupervised learning, and evaluate the performance of your model
  • Practical, easy-to-understand recipes aimed at helping you choose the right machine learning algorithm

Who This Book Is For

Data analysts who are already familiar with Python but less so with scikit-learn, and who want quick solutions to common machine learning problems, will find this book very useful. If you are a Python programmer who wants to dive into the world of machine learning in a practical manner, this book will help you too.

What You Will Learn

  • Build predictive models in minutes by using scikit-learn
  • Understand the differences and relationships between classification and regression, two types of supervised learning
  • Use distance metrics to predict in clustering, a type of unsupervised learning
  • Find points with similar characteristics with nearest neighbors
  • Use automation and cross-validation to find the best model and focus on it for a data product
  • Choose the best algorithm among many, or use them together in an ensemble
  • Create your own estimator with the simple syntax of sklearn
  • Explore the feed-forward neural networks available in scikit-learn

In Detail

Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. This book includes walkthroughs and solutions for common as well as not-so-common problems in machine learning, and shows how scikit-learn can be leveraged to perform various machine learning tasks effectively.

The second edition begins by taking you through recipes on evaluating the statistical properties of data and generating synthetic data for machine learning modeling. As you progress through the chapters, you will come across recipes that teach you to implement techniques such as data preprocessing, linear regression, logistic regression, k-NN, Naive Bayes, classification, decision trees, ensembles, and much more. Furthermore, you'll learn to optimize your models with multi-class classification, cross-validation, and model evaluation, and dive deeper into implementing deep learning with scikit-learn. Along with covering the enhanced features of model selection, the API, and new features such as classifiers, regressors, and estimators, the book also contains recipes on evaluating and fine-tuning the performance of your model. By the end of this book, you will have explored a plethora of features offered by scikit-learn for Python to solve the machine learning problems you come across.

Style and Approach

This book consists of practical recipes on scikit-learn that target novices as well as intermediate users. It goes deep into the technical issues, covers additional protocols, and includes many more real-life examples so that you can apply them in your day-to-day scenarios.
Table of Contents

Title Page

scikit-learn Cookbook

Second Edition

Copyright

scikit-learn Cookbook

Second Edition

Credits

About the Authors

About the Reviewer

www.PacktPub.com

Why subscribe?

Customer Feedback

Preface

What this book covers

Who this book is for

What you need for this book

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

High-Performance Machine Learning – NumPy

Introduction

NumPy basics

How to do it...

The shape and dimension of NumPy arrays

NumPy broadcasting

Initializing NumPy arrays and dtypes

Indexing

Boolean arrays

Arithmetic operations

NaN values

How it works...

Loading the iris dataset

Getting ready

How to do it...

How it works...

Viewing the iris dataset

How to do it...

How it works...

There's more...

Viewing the iris dataset with Pandas

How to do it...

How it works...

Plotting with NumPy and matplotlib

Getting ready

How to do it...

A minimal machine learning recipe – SVM classification

Getting ready

How to do it...

How it works...

There's more...

Introducing cross-validation

Getting ready

How to do it...

How it works...

There's more...

Putting it all together

How to do it...

There's more...

Machine learning overview – classification versus regression

The purpose of scikit-learn

Supervised versus unsupervised

Getting ready

How to do it...

Quick SVC – a classifier and regressor

Making a scorer

How it works...

There's more...

Linear versus nonlinear

Black box versus not

Interpretability

A pipeline

Pre-Model Workflow and Pre-Processing

Introduction

Creating sample data for toy analysis

Getting ready

How to do it...

Creating a regression dataset

Creating an unbalanced classification dataset

Creating a dataset for clustering

How it works...

Scaling data to the standard normal distribution

Getting ready

How to do it...

How it works...

Creating binary features through thresholding

Getting ready

How to do it...

There's more...

Sparse matrices

The fit method

Working with categorical variables

Getting ready

How to do it...

How it works...

There's more...

DictVectorizer class

Imputing missing values through various strategies

Getting ready

How to do it...

How it works...

There's more...

A linear model in the presence of outliers

Getting ready

How to do it...

How it works...

Putting it all together with pipelines

Getting ready

How to do it...

How it works...

There's more...

Using Gaussian processes for regression

Getting ready

How to do it…

Cross-validation with the noise parameter

There's more...

Using SGD for regression

Getting ready

How to do it…

How it works…

Dimensionality Reduction

Introduction

Reducing dimensionality with PCA

Getting ready

How to do it...

How it works...

There's more...

Using factor analysis for decomposition

Getting ready

How to do it...

How it works...

Using kernel PCA for nonlinear dimensionality reduction

Getting ready

How to do it...

How it works...

Using truncated SVD to reduce dimensionality

Getting ready

How to do it...

How it works...

There's more...

Sign flipping

Sparse matrices

Using decomposition to classify with DictionaryLearning

Getting ready

How to do it...

How it works...

Doing dimensionality reduction with manifolds – t-SNE

Getting ready

How to do it...

How it works...

Testing methods to reduce dimensionality with pipelines

Getting ready

How to do it...

How it works...

Linear Models with scikit-learn

Introduction

Fitting a line through data

Getting ready

How to do it...

How it works...

There's more...

Fitting a line through data with machine learning

Getting ready

How to do it...

Evaluating the linear regression model

Getting ready

How to do it...

How it works...

There's more...

Using ridge regression to overcome linear regression's shortfalls

Getting ready

How to do it...

Optimizing the ridge regression parameter

Getting ready

How to do it...

How it works...

There's more...

Bayesian ridge regression

Using sparsity to regularize models

Getting ready

How to do it...

How it works...

LASSO cross-validation – LASSOCV

LASSO for feature selection

Taking a more fundamental approach to regularization with LARS

Getting ready

How to do it...

How it works...

There's more...

References

Linear Models – Logistic Regression

Introduction

Using linear methods for classification – logistic regression

Loading data from the UCI repository

How to do it...

Viewing the Pima Indians diabetes dataset with pandas

How to do it...

Looking at the UCI Pima Indians dataset web page

How to do it...

View the citation policy

Read about missing values and context

Machine learning with logistic regression

Getting ready

Define X, y – the feature and target arrays

How to do it...

Provide training and testing sets

Train the logistic regression

Score the logistic regression

Examining logistic regression errors with a confusion matrix

Getting ready

How to do it...

Reading the confusion matrix

General confusion matrix in context

Varying the classification threshold in logistic regression

Getting ready

How to do it...

Receiver operating characteristic – ROC analysis

Getting ready

Sensitivity

A visual perspective

How to do it...

Calculating TPR in scikit-learn

Plotting sensitivity

There's more...

The confusion matrix in a non-medical context

Plotting an ROC curve without context

How to do it...

Perfect classifier

Imperfect classifier

AUC – the area under the ROC curve

Putting it all together – UCI breast cancer dataset

How to do it...

Outline for future projects

Building Models with Distance Metrics

Introduction

Using k-means to cluster data

Getting ready

How to do it…

How it works...

Optimizing the number of centroids

Getting ready

How to do it...

How it works...

Assessing cluster correctness

Getting ready

How to do it...

There's more...

Using MiniBatch k-means to handle more data

Getting ready

How to do it...

How it works...

Quantizing an image with k-means clustering

Getting ready

How to do it…

How it works…

Finding the closest object in the feature space

Getting ready

How to do it...

How it works...

There's more...

Probabilistic clustering with Gaussian mixture models

Getting ready

How to do it...

How it works...

Using k-means for outlier detection

Getting ready

How to do it...

How it works...

Using KNN for regression

Getting ready

How to do it…

How it works...

Cross-Validation and Post-Model Workflow

Introduction

Selecting a model with cross-validation

Getting ready

How to do it...

How it works...

K-fold cross validation

Getting ready

How to do it...

There's more...

Balanced cross-validation

Getting ready

How to do it...

There's more...

Cross-validation with ShuffleSplit

Getting ready

How to do it...

Time series cross-validation

Getting ready

How to do it...

There's more...

Grid search with scikit-learn

Getting ready

How to do it...

How it works...

Randomized search with scikit-learn

Getting ready

How to do it...

Classification metrics

Getting ready

How to do it...

There's more...

Regression metrics

Getting ready

How to do it...

Clustering metrics

Getting ready

How to do it...

Using dummy estimators to compare results

Getting ready

How to do it...

How it works...

Feature selection

Getting ready

How to do it...

How it works...

Feature selection on L1 norms

Getting ready

How to do it...

There's more...

Persisting models with joblib or pickle

Getting ready

How to do it...

Opening the saved model

There's more...

Support Vector Machines

Introduction

Classifying data with a linear SVM

Getting ready

Load the data

Visualize the two classes

How to do it...

How it works...

There's more...

Optimizing an SVM

Getting ready

How to do it...

Construct a pipeline

Construct a parameter grid for a pipeline

Provide a cross-validation scheme

Perform a grid search

There's more...

Randomized grid search alternative

Visualize the nonlinear RBF decision boundary

More meaning behind C and gamma

Multiclass classification with SVM

Getting ready

How to do it...

OneVsRestClassifier

Visualize it

How it works...

Support vector regression

Getting ready

How to do it...

Tree Algorithms and Ensembles

Introduction

Doing basic classifications with decision trees

Getting ready

How to do it...

Visualizing a decision tree with pydot

How to do it...

How it works...

There's more...

Tuning a decision tree

Getting ready

How to do it...

There's more...

Using decision trees for regression

Getting ready

How to do it...

There's more...

Reducing overfitting with cross-validation

How to do it...

There's more...

Implementing random forest regression

Getting ready

How to do it...

Bagging regression with nearest neighbors

Getting ready

How to do it...

Tuning gradient boosting trees

Getting ready

How to do it...

There's more...

Finding the best parameters of a gradient boosting classifier

Tuning an AdaBoost regressor

How to do it...

There's more...

Writing a stacking aggregator with scikit-learn

How to do it...

Text and Multiclass Classification with scikit-learn

Using LDA for classification

Getting ready

How to do it...

How it works...

Working with QDA – a nonlinear LDA

Getting ready

How to do it...

How it works...

Using SGD for classification

Getting ready

How to do it...

There's more...

Classifying documents with Naive Bayes

Getting ready

How to do it...

How it works...

There's more...

Label propagation with semi-supervised learning

Getting ready

How to do it...

How it works...

Neural Networks

Introduction

Perceptron classifier

Getting ready

How to do it...

How it works...

There's more...

Neural network – multilayer perceptron

Getting ready

How to do it...

How it works...

Philosophical thoughts on neural networks

Stacking with a neural network

Getting ready

How to do it...

First base model – neural network

Second base model – gradient boost ensemble

Third base model – bagging regressor of gradient boost ensembles

Some functions of the stacker

Meta-learner – extra trees regressor

There's more...

Create a Simple Estimator

Introduction

Create a simple estimator

Getting ready

How to do it...

How it works...

There's more...

Trying the new GEE classifier on the Pima diabetes dataset

Saving your trained estimator
