Preface
About the Book
About the Authors
Elevator Pitch
Key Features
Description
Learning Objectives
Audience
Approach
Hardware Requirements
Software Requirements
Conventions
Installation and Setup
Installing R on Windows
Installing R on macOS
Installing R on Linux
Chapter 1
Introduction to Clustering Methods
Introduction
Introduction to Clustering
Uses of Clustering
Introduction to the Iris Dataset
Exercise 1: Exploring the Iris Dataset
Types of Clustering
Introduction to k-means Clustering
Euclidean Distance
Manhattan Distance
Cosine Distance
The Hamming Distance
k-means Clustering Algorithm
Steps to Implement k-means Clustering
Exercise 2: Implementing k-means Clustering on the Iris Dataset
Activity 1: k-means Clustering with Three Clusters
Introduction to k-means Clustering with Built-In Functions
k-means Clustering with Three Clusters
Exercise 3: k-means Clustering with R Libraries
Introduction to Market Segmentation
Exercise 4: Exploring the Wholesale Customer Dataset
Activity 2: Customer Segmentation with k-means
Introduction to k-medoids Clustering
The k-medoids Clustering Algorithm
k-medoids Clustering Code
Exercise 5: Implementing k-medoids Clustering
k-means Clustering versus k-medoids Clustering
Activity 3: Performing Customer Segmentation with k-medoids Clustering
Deciding the Optimal Number of Clusters
Types of Clustering Metrics
Silhouette Score
Exercise 6: Calculating the Silhouette Score
Exercise 7: Identifying the Optimum Number of Clusters
WSS/Elbow Method
Exercise 8: Using WSS to Determine the Number of Clusters
The Gap Statistic
Exercise 9: Calculating the Ideal Number of Clusters with the Gap Statistic
Activity 4: Finding the Ideal Number of Market Segments
Summary
Chapter 2
Advanced Clustering Methods
Introduction
Introduction to k-modes Clustering
Steps for k-modes Clustering
Exercise 10: Implementing k-modes Clustering
Activity 5: Implementing k-modes Clustering on the Mushroom Dataset
Introduction to Density-Based Clustering (DBSCAN)
Steps for DBSCAN
Exercise 11: Implementing DBSCAN
Uses of DBSCAN
Activity 6: Implementing DBSCAN and Visualizing the Results
Introduction to Hierarchical Clustering
Types of Similarity Metrics
Steps to Perform Agglomerative Hierarchical Clustering
Exercise 12: Agglomerative Clustering with Different Similarity Measures
Divisive Clustering
Steps to Perform Divisive Clustering
Exercise 13: Performing DIANA Clustering
Activity 7: Performing Hierarchical Cluster Analysis on the Seeds Dataset
Summary
Chapter 3
Probability Distributions
Introduction
Basic Terminology of Probability Distributions
Uniform Distribution
Exercise 14: Generating and Plotting Uniform Samples in R
Normal Distribution
Exercise 15: Generating and Plotting a Normal Distribution in R
Skew and Kurtosis
Log-Normal Distributions
Exercise 16: Generating a Log-Normal Distribution from a Normal Distribution
The Binomial Distribution
Exercise 17: Generating a Binomial Distribution
The Poisson Distribution
The Pareto Distribution
Introduction to Kernel Density Estimation
KDE Algorithm
Exercise 18: Visualizing and Understanding KDE
Exercise 19: Studying the Effect of Changing Kernels on a Distribution
Activity 8: Finding the Standard Distribution Closest to the Distribution of Variables of the Iris Dataset
Introduction to the Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov Test Algorithm
Exercise 20: Performing the Kolmogorov-Smirnov Test on Two Samples
Activity 9: Calculating the CDF and Performing the Kolmogorov-Smirnov Test with the Normal Distribution
Summary
Chapter 4
Dimension Reduction
Introduction
The Idea of Dimension Reduction
Exercise 21: Examining a Dataset that Contains the Chemical Attributes of Different Wines
Importance of Dimension Reduction
Market Basket Analysis
Exercise 22: Data Preparation for the Apriori Algorithm
Exercise 23: Passing through the Data to Find the Most Common Baskets
Exercise 24: More Passes through the Data
Exercise 25: Generating Associative Rules as the Final Step of the Apriori Algorithm
Principal Component Analysis
Linear Algebra Refresher
Matrices
Variance
Covariance
Exercise 26: Examining Variance and Covariance on the Wine Dataset
Eigenvectors and Eigenvalues
The Idea of PCA
Exercise 27: Performing PCA
Exercise 28: Performing Dimension Reduction with PCA
Activity 10: Performing PCA and Market Basket Analysis on a New Dataset
Summary
Chapter 5
Data Comparison Methods
Introduction
Hash Functions
Exercise 29: Creating and Using a Hash Function
Exercise 30: Verifying Our Hash Function
Analytic Signatures
Exercise 31: Perform the Data Preparation for Creating an Analytic Signature for an Image
Exercise 32: Creating a Brightness Comparison Function
Exercise 33: Creating a Function to Compare Image Sections to All of the Neighboring Sections
Exercise 34: Creating a Function that Generates an Analytic Signature for an Image
Activity 11: Creating an Image Signature for a Photograph of a Person
Comparison of Signatures
Activity 12: Creating an Image Signature for the Watermarked Image
Applying Other Unsupervised Learning Methods to Analytic Signatures
Latent Variable Models – Factor Analysis
Exercise 35: Preparing for Factor Analysis
Linear Algebra behind Factor Analysis
Exercise 36: More Exploration with Factor Analysis
Activity 13: Performing Factor Analysis
Summary
Chapter 6
Anomaly Detection
Introduction
Univariate Outlier Detection
Exercise 37: Performing an Exploratory Visual Check for Outliers Using R's boxplot Function
Exercise 38: Transforming a Fat-Tailed Dataset to Improve Outlier Classification
Exercise 39: Finding Outliers without Using R's Built-In boxplot Function
Exercise 40: Detecting Outliers Using a Parametric Method
Multivariate Outlier Detection
Exercise 41: Calculating Mahalanobis Distance
Detecting Anomalies in Clusters
Other Methods for Multivariate Outlier Detection
Exercise 42: Classifying Outliers based on Comparisons of Mahalanobis Distances
Detecting Outliers in Seasonal Data
Exercise 43: Performing Seasonality Modeling
Exercise 44: Finding Anomalies in Seasonal Data Using a Parametric Method
Contextual and Collective Anomalies
Exercise 45: Detecting Contextual Anomalies
Exercise 46: Detecting Collective Anomalies
Kernel Density
Exercise 47: Finding Anomalies Using Kernel Density Estimation
Continuing in Your Studies of Anomaly Detection
Activity 14: Finding Univariate Anomalies Using a Parametric Method and a Non-parametric Method
Activity 15: Using Mahalanobis Distance to Find Anomalies
Summary
Appendix
Chapter 1: Introduction to Clustering Methods
Activity 1: k-means Clustering with Three Clusters
Activity 2: Customer Segmentation with k-means
Activity 3: Performing Customer Segmentation with k-medoids Clustering
Activity 4: Finding the Ideal Number of Market Segments
Chapter 2: Advanced Clustering Methods
Activity 5: Implementing k-modes Clustering on the Mushroom Dataset
Activity 6: Implementing DBSCAN and Visualizing the Results
Activity 7: Performing Hierarchical Cluster Analysis on the Seeds Dataset
Chapter 3: Probability Distributions
Activity 8: Finding the Standard Distribution Closest to the Distribution of Variables of the Iris Dataset
Activity 9: Calculating the CDF and Performing the Kolmogorov-Smirnov Test with the Normal Distribution
Chapter 4: Dimension Reduction
Activity 10: Performing PCA and Market Basket Analysis on a New Dataset
Chapter 5: Data Comparison Methods
Activity 11: Creating an Image Signature for a Photograph of a Person
Activity 12: Creating an Image Signature for the Watermarked Image
Activity 13: Performing Factor Analysis
Chapter 6: Anomaly Detection
Activity 14: Finding Univariate Anomalies Using a Parametric Method and a Non-parametric Method
Activity 15: Using Mahalanobis Distance to Find Anomalies