About the Book
About the Authors
Objectives
Audience
Approach
Minimum Hardware Requirements
Software Requirements
Conventions
Installation and Setup
Installing the Code Bundle
Additional Resources
Chapter 1
Data Preparation and Cleaning
Introduction
Data Models and Structured Data
pandas
Importing and Exporting Data With pandas DataFrames
Viewing and Inspecting Data in DataFrames
Exercise 1: Importing JSON Files into pandas
Exercise 2: Identifying Semi-Structured and Unstructured Data
Structure of a pandas Series
Data Manipulation
Selecting and Filtering in pandas
Creating Test DataFrames in Python
Adding and Removing Attributes and Observations
Exercise 3: Creating and Modifying Test DataFrames
Combining Data
Handling Missing Data
Exercise 4: Combining DataFrames and Handling Missing Values
Applying Functions and Operations on DataFrames
Grouping Data
Exercise 5: Applying Data Transformations
Activity 1: Addressing Data Spilling
Summary
Chapter 2
Data Exploration and Visualization
Introduction
Identifying the Right Attributes
Exercise 6: Exploring the Attributes in Sales Data
Generating Targeted Insights
Selecting and Renaming Attributes
Transforming Values
Exercise 7: Targeting Insights for Specific Use Cases
Reshaping the Data
Exercise 8: Understanding Stacking and Unstacking
Pivot Tables
Visualizing Data
Exercise 9: Visualizing Data With pandas
Visualization through Seaborn
Visualization with Matplotlib
Activity 2: Analyzing Advertisements
Summary
Chapter 3
Unsupervised Learning: Customer Segmentation
Introduction
Customer Segmentation Methods
Traditional Segmentation Methods
Unsupervised Learning (Clustering) for Customer Segmentation
Similarity and Data Standardization
Determining Similarity
Standardizing Data
Exercise 10: Standardizing Age and Income Data of Customers
Calculating Distance
Exercise 11: Calculating Distance Between Three Customers
Activity 3: Loading, Standardizing, and Calculating Distance with a Dataset
k-means Clustering
Understanding k-means Clustering
Exercise 12: k-means Clustering on Income/Age Data
High-Dimensional Data
Exercise 13: Dealing with High-Dimensional Data
Activity 4: Using k-means Clustering on Customer Behavior Data
Summary
Chapter 4
Choosing the Best Segmentation Approach
Introduction
Choosing the Number of Clusters
Simple Visual Inspection
Exercise 14: Choosing the Number of Clusters Based on Visual Inspection
The Elbow Method with Sum of Squared Errors
Exercise 15: Determining the Number of Clusters Using the Elbow Method
Activity 5: Determining Clusters for High-End Clothing Customer Data Using the Elbow Method with the Sum of Squared Errors
Different Methods of Clustering
Mean-Shift Clustering
Exercise 16: Performing Mean-Shift Clustering to Cluster Data
k-modes and k-prototypes Clustering
Exercise 17: Clustering Data Using the k-prototypes Method
Activity 6: Using Different Clustering Techniques on Customer Behavior Data
Evaluating Clustering
Silhouette Score
Exercise 18: Calculating Silhouette Score to Pick the Best k for k-means and Comparing to the Mean-Shift Algorithm
Train and Test Split
Exercise 19: Using a Train-Test Split to Evaluate Clustering Performance
Activity 7: Evaluating Clustering on Customer Behavior Data
Summary
Chapter 5
Predicting Customer Revenue Using Linear Regression
Introduction
Understanding Regression
Feature Engineering for Regression
Feature Creation
Data Cleaning
Exercise 20: Creating Features for Transaction Data
Assessing Features Using Visualizations and Correlations
Exercise 21: Examining Relationships between Predictors and Outcome
Activity 8: Examining Relationships Between Storefront Locations and Features about Their Area
Performing and Interpreting Linear Regression
Exercise 22: Building a Linear Model Predicting Customer Spend
Activity 9: Building a Regression Model to Predict Storefront Location Revenue
Summary
Chapter 6
Other Regression Techniques and Tools for Evaluation
Introduction
Evaluating the Accuracy of a Regression Model
Residuals and Errors
Mean Absolute Error
Root Mean Squared Error
Exercise 23: Evaluating Regression Models of Location Revenue Using MAE and RMSE
Activity 10: Testing Which Variables are Important for Predicting Responses to a Marketing Offer
Using Regularization for Feature Selection
Exercise 24: Using Lasso Regression for Feature Selection
Activity 11: Using Lasso Regression to Choose Features for Predicting Customer Spend
Tree-Based Regression Models
Random Forests
Exercise 25: Using Tree-Based Regression Models to Capture Non-Linear Trends
Activity 12: Building the Best Regression Model for Customer Spend Based on Demographic Data
Summary
Chapter 7
Supervised Learning: Predicting Customer Churn
Introduction
Classification Problems
Understanding Logistic Regression
Revisiting Linear Regression
Logistic Regression
Exercise 26: Plotting the Sigmoid Function
Cost Function for Logistic Regression
Assumptions of Logistic Regression
Exercise 27: Loading, Splitting, and Applying Linear and Logistic Regression to Data
Creating a Data Science Pipeline
Obtaining the Data
Exercise 28: Obtaining the Data
Scrubbing the Data
Exercise 29: Imputing Missing Values
Exercise 30: Renaming Columns and Changing the Data Type
Exploring the Data
Statistical Overview
Correlation
Exercise 31: Obtaining the Statistical Overview and Correlation Plot
Visualizing the Data
Exercise 32: Performing Exploratory Data Analysis (EDA)
Activity 13: Performing OSE of OSEMN
Modeling the Data
Feature Selection
Exercise 33: Performing Feature Selection
Model Building
Exercise 34: Building a Logistic Regression Model
Interpreting the Data
Activity 14: Performing MN of OSEMN
Summary
Chapter 8
Fine-Tuning Classification Algorithms
Introduction
Support Vector Machines
Intuition Behind Maximum Margin
Linearly Inseparable Cases
Linearly Inseparable Cases Using Kernel
Exercise 35: Training an SVM Algorithm Over a Dataset
Decision Trees
Exercise 36: Implementing a Decision Tree Algorithm Over a Dataset
Important Terminology of Decision Trees
Decision Tree Algorithm Formulation
Random Forest
Exercise 37: Implementing a Random Forest Model Over a Dataset
Activity 15: Implementing Different Classification Algorithms
Preprocessing Data for Machine Learning Models
Standardization
Exercise 38: Standardizing Data
Scaling
Exercise 39: Scaling Data After Feature Selection
Normalization
Exercise 40: Performing Normalization on Data
Model Evaluation
Exercise 41: Implementing Stratified k-fold
Fine-Tuning of the Model
Exercise 42: Fine-Tuning a Model
Activity 16: Tuning and Optimizing the Model
Performance Metrics
Precision
Recall
F1 Score
Exercise 43: Evaluating the Performance Metrics for a Model
ROC Curve
Exercise 44: Plotting the ROC Curve
Activity 17: Comparison of the Models
Summary
Chapter 9
Modeling Customer Choice
Introduction
Understanding Multiclass Classification
Classifiers in Multiclass Classification
Exercise 45: Implementing a Multiclass Classification Algorithm on a Dataset
Performance Metrics
Exercise 46: Evaluating Performance Using Multiclass Performance Metrics
Activity 18: Performing Multiclass Classification and Evaluating Performance
Class Imbalanced Data
Exercise 47: Performing Classification on Imbalanced Data
Dealing with Class-Imbalanced Data
Exercise 48: Visualizing Sampling Techniques
Exercise 49: Fitting a Random Forest Classifier Using SMOTE and Building the Confusion Matrix
Activity 19: Dealing with Imbalanced Data
Summary
Appendix
Chapter 1: Data Preparation and Cleaning
Activity 1: Addressing Data Spilling
Chapter 2: Data Exploration and Visualization
Activity 2: Analyzing Advertisements
Chapter 3: Unsupervised Learning: Customer Segmentation
Activity 3: Loading, Standardizing, and Calculating Distance with a Dataset
Activity 4: Using k-means Clustering on Customer Behavior Data
Chapter 4: Choosing the Best Segmentation Approach
Activity 5: Determining Clusters for High-End Clothing Customer Data Using the Elbow Method with the Sum of Squared Errors
Activity 6: Using Different Clustering Techniques on Customer Behavior Data
Activity 7: Evaluating Clustering on Customer Behavior Data
Chapter 5: Predicting Customer Revenue Using Linear Regression
Activity 8: Examining Relationships Between Storefront Locations and Features about Their Area
Activity 9: Building a Regression Model to Predict Storefront Location Revenue
Chapter 6: Other Regression Techniques and Tools for Evaluation
Activity 10: Testing Which Variables are Important for Predicting Responses to a Marketing Offer
Activity 11: Using Lasso Regression to Choose Features for Predicting Customer Spend
Activity 12: Building the Best Regression Model for Customer Spend Based on Demographic Data
Chapter 7: Supervised Learning: Predicting Customer Churn
Activity 13: Performing OSE of OSEMN
Activity 14: Performing MN of OSEMN
Chapter 8: Fine-Tuning Classification Algorithms
Activity 15: Implementing Different Classification Algorithms
Activity 16: Tuning and Optimizing the Model
Activity 17: Comparison of the Models
Chapter 9: Modeling Customer Choice
Activity 18: Performing Multiclass Classification and Evaluating Performance
Activity 19: Dealing with Imbalanced Data