Explore the exciting world of machine learning, one of the world's fastest-growing technologies.

Key Features
* Understand various machine learning concepts with real-world examples
* Implement a supervised machine learning pipeline from data ingestion to validation
* Gain insights into how you can use machine learning in everyday life

Book Description
Machine learning, the ability of a machine to give right answers based on input data, has revolutionized the way we do business. Applied Supervised Learning with Python provides a rich understanding of how you can apply machine learning techniques in your data science projects using Python. You'll explore Jupyter Notebooks, a tool commonly used in academic and commercial circles that supports running code in-line.

With the help of fun examples, you'll gain experience working with the Python machine learning toolkit, from performing basic data cleaning and processing to working with a range of regression and classification algorithms. Once you've grasped the basics, you'll learn how to build and train your own models using advanced techniques such as decision trees, ensemble modeling, validation, and error metrics. You'll also learn data visualization techniques using powerful Python libraries such as Matplotlib and Seaborn.

This book also covers ensemble modeling and random forest classifiers, along with other methods for combining results from multiple models, and concludes by delving into cross-validation to test your algorithm and check how well the model works on unseen data. By the end of this book, you'll be equipped not only to work with machine learning algorithms, but also to create some of your own!

What you will learn
* Understand the concept of supervised learning and its applications
* Implement common supervised learning algorithms using machine learning Python libraries
* Validate models using the k-fold technique
* Build your models with decision trees to get results effortlessly
* Use ensemble modeling techniques to improve the performance of your model
* Apply a variety of metrics to compare machine learning models

Who this book is for
Applied Supervised Learning with Python is for you if you want to gain a solid understanding of machine learning using Python. It'll help if you have some experience in any functional or object-oriented language and a basic understanding of Python libraries and expressions, such as arrays and dictionaries.
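As a rough illustration of the k-fold validation technique named in the description above, the sketch below splits a dataset into k folds and scores a trivial "predict the training mean" baseline on each held-out fold. It is not taken from the book: the function names (`k_fold_indices`, `k_fold_splits`, `mean_baseline_cv_error`) and the baseline model are assumptions made for illustration, and it uses only the Python standard library.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices), one pair per fold."""
    folds = k_fold_indices(n_samples, k)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def mean_baseline_cv_error(targets, k):
    """Mean absolute error per fold for a hypothetical baseline model
    that always predicts the mean of its training targets."""
    scores = []
    for train, validation in k_fold_splits(len(targets), k):
        prediction = sum(targets[i] for i in train) / len(train)
        errors = [abs(targets[i] - prediction) for i in validation]
        scores.append(sum(errors) / len(errors))
    return scores
```

Calling `mean_baseline_cv_error(targets, k)` returns one error value per fold; averaging them gives a single estimate of performance on unseen data. In practice, a library such as scikit-learn offers `KFold` and `cross_val_score` for the same purpose, typically with shuffling or stratification that this sketch omits.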


About the Book

About the Authors




Hardware Requirements

Software Requirements


Installation and Setup

Installing the Code Bundle

Additional Resources

Chapter 1

Python Machine Learning Toolkit


Supervised Machine Learning

When to Use Supervised Learning

Why Python?

Jupyter Notebooks

Exercise 1: Launching a Jupyter Notebook

Exercise 2: Hello World

Exercise 3: Order of Execution in a Jupyter Notebook

Exercise 4: Advantages of Jupyter Notebooks

Python Packages and Modules


Loading Data in pandas

Exercise 5: Loading and Summarizing the Titanic Dataset

Exercise 6: Indexing and Selecting Data

Exercise 7: Advanced Indexing and Selection

pandas Methods

Exercise 8: Splitting, Applying, and Combining Data Sources

Lambda Functions

Exercise 9: Lambda Functions

Data Quality Considerations

Managing Missing Data

Class Imbalance

Low Sample Size

Activity 1: pandas Functions


Chapter 2

Exploratory Data Analysis and Visualization


Exploratory Data Analysis (EDA)

Exercise 10: Importing Libraries for Data Exploration

Summary Statistics and Central Values

Standard Deviation


Exercise 11: Summary Statistics of Our Dataset

Missing Values

Finding Missing Values

Exercise 12: Visualizing Missing Values

Imputation Strategies for Missing Values

Exercise 13: Imputation Using pandas

Exercise 14: Imputation Using scikit-learn

Exercise 15: Imputation Using Inferred Values

Activity 2: Summary Statistics and Missing Values

Distribution of Values

Target Variable

Exercise 16: Plotting a Bar Chart

Categorical Data

Exercise 17: Datatypes for Categorical Variables

Exercise 18: Calculating Category Value Counts

Exercise 19: Plotting a Pie Chart

Continuous Data

Exercise 20: Plotting a Histogram

Exercise 21: Skew and Kurtosis

Activity 3: Visually Representing the Distribution of Values

Relationships within the Data

Relationship between Two Continuous Variables

Exercise 22: Plotting a Scatter Plot

Exercise 23: Correlation Heatmap

Exercise 24: Pairplot

Relationship between a Continuous and a Categorical Variable

Exercise 25: Bar Chart

Exercise 26: Box Plot

Relationship between Two Categorical Variables

Exercise 27: Stacked Bar Chart

Activity 4: Relationships Within the Data


Chapter 3

Regression Analysis


Regression and Classification Problems

Data, Models, Training, and Evaluation

Linear Regression

Exercise 28: Plotting Data with a Moving Average

Activity 5: Plotting Data with a Moving Average

Least Squares Method

The scikit-learn Model API

Exercise 29: Fitting a Linear Model Using the Least Squares Method

Activity 6: Linear Regression Using the Least Squares Method

Linear Regression with Dummy Variables

Exercise 30: Introducing Dummy Variables

Activity 7: Dummy Variables

Parabolic Model with Linear Regression

Exercise 31: Parabolic Models with Linear Regression

Activity 8: Other Model Types with Linear Regression

Generic Model Training

Gradient Descent

Exercise 32: Linear Regression with Gradient Descent

Exercise 33: Optimizing Gradient Descent

Activity 9: Gradient Descent

Multiple Linear Regression

Exercise 34: Multiple Linear Regression

Autoregression Models

Exercise 35: Creating an Autoregression Model

Activity 10: Autoregressors


Chapter 4

Classification

Linear Regression as a Classifier

Exercise 36: Linear Regression as a Classifier

Logistic Regression

Exercise 37: Logistic Regression as a Classifier – Two-Class Classifier

Exercise 38: Logistic Regression – Multiclass Classifier

Activity 11: Linear Regression Classifier – Two-Class Classifier

Activity 12: Iris Classification Using Logistic Regression

Classification Using K-Nearest Neighbors

Exercise 39: K-NN Classification

Exercise 40: Visualizing K-NN Boundaries

Activity 13: K-NN Multiclass Classifier

Classification Using Decision Trees

Exercise 41: ID3 Classification

Exercise 42: Iris Classification Using a CART Decision Tree


Chapter 5

Ensemble Modeling


Exercise 43: Importing Modules and Preparing the Dataset

Overfitting and Underfitting



Overcoming the Problem of Underfitting and Overfitting



Bootstrap Aggregation

Exercise 44: Using the Bagging Classifier

Random Forest

Exercise 45: Building the Ensemble Model Using Random Forest


Adaptive Boosting

Exercise 46: Adaptive Boosting

Gradient Boosting

Exercise 47: GradientBoostingClassifier


Exercise 48: Building a Stacked Model

Activity 14: Stacking with Standalone and Ensemble Algorithms


Chapter 6

Model Evaluation


Exercise 49: Importing the Modules and Preparing Our Dataset

Evaluation Metrics


Exercise 50: Regression Metrics


Exercise 51: Classification Metrics

Splitting the Dataset

Hold-out Data

K-Fold Cross-Validation


Exercise 52: K-Fold Cross-Validation with Stratified Sampling

Performance Improvement Tactics

Variation in Train and Test Error

Hyperparameter Tuning

Exercise 53: Hyperparameter Tuning with Random Search

Feature Importance

Exercise 54: Feature Importance Using Random Forest

Activity 15: Final Test Project



Chapter 1: Python Machine Learning Toolkit

Activity 1: pandas Functions

Chapter 2: Exploratory Data Analysis and Visualization

Activity 2: Summary Statistics and Missing Values

Activity 3: Visually Representing the Distribution of Values

Activity 4: Relationships Within the Data

Chapter 3: Regression Analysis

Activity 5: Plotting Data with a Moving Average

Activity 6: Linear Regression Using the Least Squares Method

Activity 7: Dummy Variables

Activity 8: Other Model Types with Linear Regression

Activity 9: Gradient Descent

Activity 10: Autoregressors

Chapter 4: Classification

Activity 11: Linear Regression Classifier – Two-Class Classifier

Activity 12: Iris Classification Using Logistic Regression

Activity 13: K-NN Multiclass Classifier

Chapter 5: Ensemble Modeling

Activity 14: Stacking with Standalone and Ensemble Algorithms

Chapter 6: Model Evaluation

Activity 15: Final Test Project

