scikit-learn: Machine Learning Simplified
Credits
Preface
What this learning path covers
What you need for this learning path
Who this learning path is for
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Module 1
1. Machine Learning – A Gentle Introduction
Installing scikit-learn
Linux
Mac
Windows
Checking your installation
Datasets
Our first machine learning method – linear classification
Evaluating our results
Machine learning categories
Important concepts related to machine learning
Summary
2. Supervised Learning
Image recognition with Support Vector Machines
Training a Support Vector Machine
Text classification with Naïve Bayes
Preprocessing the data
Training a Naïve Bayes classifier
Evaluating the performance
Explaining Titanic hypothesis with decision trees
Preprocessing the data
Training a decision tree classifier
Interpreting the decision tree
Random Forests – randomizing decisions
Evaluating the performance
Predicting house prices with regression
First try – a linear model
Second try – Support Vector Machines for regression
Third try – Random Forests revisited
Evaluation
Summary
3. Unsupervised Learning
Principal Component Analysis
Clustering handwritten digits with k-means
Alternative clustering methods
Summary
4. Advanced Features
Feature extraction
Feature selection
Model selection
Grid search
Parallel grid search
Summary
2. Module 2
1. Premodel Workflow
Introduction
Getting sample data from external sources
Getting ready
How to do it…
How it works…
There's more…
See also
Creating sample data for toy analysis
Getting ready
How to do it...
How it works...
Scaling data to the standard normal
Getting ready
How to do it...
How it works...
There's more...
Creating idempotent scaler objects
Handling sparse imputations
Creating binary features through thresholding
Getting ready
How to do it...
How it works...
There's more...
Sparse matrices
The fit method
Working with categorical variables
Getting ready
How to do it...
How it works...
There's more...
DictVectorizer
Patsy
Binarizing label features
Getting ready
How to do it...
How it works...
There's more...
Imputing missing values through various strategies
Getting ready
How to do it...
How it works...
There's more...
Using Pipelines for multiple preprocessing steps
Getting ready
How to do it...
How it works...
Reducing dimensionality with PCA
Getting ready
How to do it...
How it works...
There's more...
Using factor analysis for decomposition
Getting ready
How to do it...
How it works...
Kernel PCA for nonlinear dimensionality reduction
Getting ready
How to do it...
How it works...
Using truncated SVD to reduce dimensionality
Getting ready
How to do it...
How it works...
There's more...
Sign flipping
Sparse matrices
Decomposition to classify with DictionaryLearning
Getting ready
How to do it...
How it works...
Putting it all together with Pipelines
Getting ready
How to do it...
How it works...
There's more...
Using Gaussian processes for regression
Getting ready
How to do it…
How it works…
There's more…
Defining the Gaussian process object directly
Getting ready
How to do it…
How it works…
Using stochastic gradient descent for regression
Getting ready
How to do it…
How it works…
2. Working with Linear Models
Introduction
Fitting a line through data
Getting ready
How to do it...
How it works...
There's more...
Evaluating the linear regression model
Getting ready
How to do it...
How it works...
There's more...
Using ridge regression to overcome linear regression's shortfalls
Getting ready
How to do it...
How it works...
Optimizing the ridge regression parameter
Getting ready
How to do it...
How it works...
There's more...
Using sparsity to regularize models
Getting ready
How to do it...
How it works...
Lasso cross-validation
Lasso for feature selection
Taking a more fundamental approach to regularization with LARS
Getting ready
How to do it...
How it works...
There's more...
Using linear methods for classification – logistic regression
Getting ready
How to do it...
There's more...
Directly applying Bayesian ridge regression
Getting ready
How to do it...
How it works...
There's more...
Using boosting to learn from errors
Getting ready
How to do it...
How it works...
3. Building Models with Distance Metrics
Introduction
Using KMeans to cluster data
Getting ready
How to do it…
How it works...
Optimizing the number of centroids
Getting ready
How to do it…
How it works…
Assessing cluster correctness
Getting ready
How to do it...
There's more...
Using MiniBatch KMeans to handle more data
Getting ready
How to do it...
How it works...
Quantizing an image with KMeans clustering
Getting ready
How to do it…
How it works…
Finding the closest objects in the feature space
Getting ready
How to do it...
How it works...
There's more...
Probabilistic clustering with Gaussian Mixture Models
Getting ready
How to do it...
How it works...
Using KMeans for outlier detection
Getting ready
How to do it...
How it works...
Using k-NN for regression
Getting ready
How to do it…
How it works...
4. Classifying Data with scikit-learn
Introduction
Doing basic classifications with Decision Trees
Getting ready
How to do it…
How it works…
Tuning a Decision Tree model
Getting ready
How to do it…
How it works…
Using many Decision Trees – random forests
Getting ready
How to do it…
How it works…
There's more…
Tuning a random forest model
Getting ready
How to do it…
How it works…
There's more…
Classifying data with support vector machines
Getting ready
How to do it…
How it works…
There's more…
Generalizing with multiclass classification
Getting ready
How to do it…
How it works…
Using LDA for classification
Getting ready
How to do it…
How it works…
Working with QDA – a nonlinear LDA
Getting ready
How to do it…
How it works…
Using Stochastic Gradient Descent for classification
Getting ready
How to do it…
Classifying documents with Naïve Bayes
Getting ready
How to do it…
How it works…
There's more…
Label propagation with semi-supervised learning
Getting ready
How to do it…
How it works…
5. Postmodel Workflow
Introduction
K-fold cross validation
Getting ready
How to do it...
How it works...
Automatic cross validation
Getting ready
How to do it...
How it works...
Cross validation with ShuffleSplit
Getting ready
How to do it...
Stratified k-fold
Getting ready
How to do it...
How it works...
Poor man's grid search
Getting ready
How to do it...
How it works...
Brute force grid search
Getting ready
How to do it...
How it works...
Using dummy estimators to compare results
Getting ready
How to do it...
How it works...
Regression model evaluation
Getting ready
How to do it...
How it works...
Feature selection
Getting ready
How to do it...
How it works...
Feature selection on L1 norms
Getting ready
How to do it...
How it works...
Persisting models with joblib
Getting ready
How to do it...
How it works...
There's more...
3. Module 3
1. The Fundamentals of Machine Learning
Learning from experience
Machine learning tasks
Training data and test data
Performance measures, bias, and variance
An introduction to scikit-learn
Installing scikit-learn
Installing scikit-learn on Windows
Installing scikit-learn on Linux
Installing scikit-learn on OS X
Verifying the installation
Installing pandas and matplotlib
Summary
2. Linear Regression
Simple linear regression
Evaluating the fitness of a model with a cost function
Solving ordinary least squares for simple linear regression
Evaluating the model
Multiple linear regression
Polynomial regression
Regularization
Applying linear regression
Exploring the data
Fitting and evaluating the model
Fitting models with gradient descent
Summary
3. Feature Extraction and Preprocessing
Extracting features from categorical variables
Extracting features from text
The bag-of-words representation
Stop-word filtering
Stemming and lemmatization
Extending bag-of-words with TF-IDF weights
Space-efficient feature vectorizing with the hashing trick
Extracting features from images
Extracting features from pixel intensities
Extracting points of interest as features
SIFT and SURF
Data standardization
Summary
4. From Linear Regression to Logistic Regression
Binary classification with logistic regression
Spam filtering
Binary classification performance metrics
Accuracy
Precision and recall
Calculating the F1 measure
ROC AUC
Tuning models with grid search
Multi-class classification
Multi-class classification performance metrics
Multi-label classification and problem transformation
Multi-label classification performance metrics
Summary
5. Nonlinear Classification and Regression with Decision Trees
Decision trees
Training decision trees
Selecting the questions
Information gain
Gini impurity
Decision trees with scikit-learn
Tree ensembles
The advantages and disadvantages of decision trees
Summary
6. Clustering with K-Means
Clustering with the K-Means algorithm
Local optima
The elbow method
Evaluating clusters
Image quantization
Clustering to learn features
Summary
7. Dimensionality Reduction with PCA
An overview of PCA
Performing Principal Component Analysis
Variance, Covariance, and Covariance Matrices
Eigenvectors and eigenvalues
Dimensionality reduction with Principal Component Analysis
Using PCA to visualize high-dimensional data
Face recognition with PCA
Summary
8. The Perceptron
Activation functions
The perceptron learning algorithm
Binary classification with the perceptron
Document classification with the perceptron
Limitations of the perceptron
Summary
9. From the Perceptron to Support Vector Machines
Kernels and the kernel trick
Maximum margin classification and support vectors
Classifying characters in scikit-learn
Classifying handwritten digits
Classifying characters in natural images
Summary
10. From the Perceptron to Artificial Neural Networks
Nonlinear decision boundaries
Feedforward and feedback artificial neural networks
Multilayer perceptrons
Minimizing the cost function
Forward propagation
Backpropagation
Approximating XOR with multilayer perceptrons
Classifying handwritten digits
Summary
Bibliography
Index