


Applied Unsupervised Learning with Python (e-book)


Author: Benjamin Johnston

Publisher: Packt Publishing


Word count: 8.862 million




Design clever algorithms that can uncover interesting structures and hidden relationships in unstructured, unlabeled data.

Key Features
* Learn how to select the most suitable Python library to solve your problem
* Compare k-Nearest Neighbor (k-NN) and non-parametric methods and decide when to use them
* Delve into the applications of neural networks using real-world datasets

Book Description
Unsupervised learning is a useful and practical solution in situations where labeled data is not available. Applied Unsupervised Learning with Python guides you through the best practices for using unsupervised learning techniques in tandem with Python libraries and extracting meaningful information from unstructured data.

The course begins by explaining how basic clustering works to find similar data points in a set. Once you are well versed in the k-means algorithm and how it operates, you'll learn what dimensionality reduction is and where to apply it. As you progress, you'll learn various neural network techniques and how they can improve your model. While studying the applications of unsupervised learning, you will also learn how to mine topics that are trending on Twitter and Facebook and build a news recommendation engine for users. You will complete the course by challenging yourself through various activities, such as performing a Market Basket Analysis and identifying relationships between different items of merchandise. By the end of this course, you will have the skills you need to confidently build your own models using Python.

What you will learn
* Understand the basics and importance of clustering
* Build k-means, hierarchical, and DBSCAN clustering algorithms from scratch with built-in packages
* Explore dimensionality reduction and its applications
* Use scikit-learn (sklearn) to implement and analyse principal component analysis (PCA) on the Iris dataset
* Employ Keras to build autoencoder models for the CIFAR-10 dataset
* Apply the Apriori algorithm with machine learning extensions (Mlxtend) to study transaction data

Who this book is for
This course is designed for developers, data scientists, and machine learning enthusiasts who are interested in unsupervised learning. Some familiarity with Python programming along with basic knowledge of mathematical concepts including exponents, square roots, means, and medians will be beneficial.
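To give a feel for the "k-means from scratch" item in the list above, here is a minimal sketch in plain Python. It is not taken from the book: the function names `euclidean` and `k_means`, the sample points, and the choice to seed the centroids with the first k points are all assumptions made for illustration (the usual approach is random initialisation).

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, k, max_iter=100):
    """Plain k-means; centroids start at the first k points (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels

# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, labels = k_means(points, k=2)
```

With this sample data the points near (1, 1) end up in one cluster and the points near (9, 9) in the other. In practice a library implementation such as scikit-learn's `KMeans` would normally be preferred over hand-written code.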


About the Book

About the Authors

Learning Objectives



Hardware Requirements

Software Requirements


Installation and Setup

Install Anaconda on Windows

Install Anaconda on Linux

Install Anaconda on macOS

Install Python on Windows

Install Python on Linux

Install Python on macOS

Additional Resources

Chapter 1

Introduction to Clustering


Unsupervised Learning versus Supervised Learning


Identifying Clusters

Two-Dimensional Data

Exercise 1: Identifying Clusters in Data

Introduction to k-means Clustering

No-Math k-means Walkthrough

k-means Clustering In-Depth Walkthrough

Alternative Distance Metric – Manhattan Distance

Deeper Dimensions

Exercise 2: Calculating Euclidean Distance in Python

Exercise 3: Forming Clusters with the Notion of Distance

Exercise 4: Implementing k-means from Scratch

Exercise 5: Implementing k-means with Optimization

Clustering Performance: Silhouette Score

Exercise 6: Calculating the Silhouette Score

Activity 1: Implementing k-means Clustering


Chapter 2

Hierarchical Clustering


Clustering Refresher

k-means Refresher

The Organization of Hierarchy

Introduction to Hierarchical Clustering

Steps to Perform Hierarchical Clustering

An Example Walk-Through of Hierarchical Clustering

Exercise 7: Building a Hierarchy


Activity 2: Applying Linkage Criteria

Agglomerative versus Divisive Clustering

Exercise 8: Implementing Agglomerative Clustering with scikit-learn

Activity 3: Comparing k-means with Hierarchical Clustering

k-means versus Hierarchical Clustering


Chapter 3

Neighborhood Approaches and DBSCAN


Clusters as Neighborhoods

Introduction to DBSCAN


Walkthrough of the DBSCAN Algorithm

Exercise 9: Evaluating the Impact of Neighborhood Radius Size

DBSCAN Attributes – Neighborhood Radius

Activity 4: Implement DBSCAN from Scratch

DBSCAN Attributes – Minimum Points

Exercise 10: Evaluating the Impact of Minimum Points Threshold

Activity 5: Comparing DBSCAN with k-means and Hierarchical Clustering

DBSCAN Versus k-means and Hierarchical Clustering


Chapter 4

Dimension Reduction and PCA


What Is Dimensionality Reduction?

Applications of Dimensionality Reduction

The Curse of Dimensionality

Overview of Dimensionality Reduction Techniques

Dimensionality Reduction and Unsupervised Learning



Standard Deviation


Covariance Matrix

Exercise 11: Understanding the Foundational Concepts of Statistics

Eigenvalues and Eigenvectors

Exercise 12: Computing Eigenvalues and Eigenvectors

The Process of PCA

Exercise 13: Manually Executing PCA

Exercise 14: Scikit-Learn PCA

Activity 6: Manual PCA versus scikit-learn

Restoring the Compressed Dataset

Exercise 15: Visualizing Variance Reduction with Manual PCA

Exercise 16: Visualizing Variance Reduction with

Exercise 17: Plotting 3D Plots in Matplotlib

Activity 7: PCA Using the Expanded Iris Dataset


Chapter 5

Autoencoders

Fundamentals of Artificial Neural Networks

The Neuron

Sigmoid Function

Rectified Linear Unit (ReLU)

Exercise 18: Modeling the Neurons of an Artificial Neural Network

Activity 8: Modeling Neurons with a ReLU Activation Function

Neural Networks: Architecture Definition

Exercise 19: Defining a Keras Model

Neural Networks: Training

Exercise 20: Training a Keras Neural Network Model

Activity 9: MNIST Neural Network


Exercise 21: Simple Autoencoder

Activity 10: Simple MNIST Autoencoder

Exercise 22: Multi-Layer Autoencoder

Convolutional Neural Networks

Exercise 23: Convolutional Autoencoder

Activity 11: MNIST Convolutional Autoencoder


Chapter 6

t-Distributed Stochastic Neighbor Embedding (t-SNE)


Stochastic Neighbor Embedding (SNE)

t-Distributed SNE

Exercise 24: t-SNE MNIST

Activity 12: Wine t-SNE

Interpreting t-SNE Plots


Exercise 25: t-SNE MNIST and Perplexity

Activity 13: t-SNE Wine and Perplexity


Exercise 26: t-SNE MNIST and Iterations

Activity 14: t-SNE Wine and Iterations

Final Thoughts on Visualizations


Chapter 7

Topic Modeling


Topic Models

Exercise 27: Setting Up the Environment

A High-Level Overview of Topic Models

Business Applications

Exercise 28: Data Loading

Cleaning Text Data

Data Cleaning Techniques

Exercise 29: Cleaning Data Step by Step

Exercise 30: Complete Data Cleaning

Activity 15: Loading and Cleaning Twitter Data

Latent Dirichlet Allocation

Variational Inference

Bag of Words

Exercise 31: Creating a Bag-of-Words Model Using the Count Vectorizer


Exercise 32: Selecting the Number of Topics

Exercise 33: Running Latent Dirichlet Allocation

Exercise 34: Visualize LDA

Exercise 35: Trying Four Topics

Activity 16: Latent Dirichlet Allocation and Health Tweets

Bag-of-Words Follow-Up

Exercise 36: Creating a Bag-of-Words Using TF-IDF

Non-Negative Matrix Factorization

Frobenius Norm

Multiplicative Update

Exercise 37: Non-negative Matrix Factorization

Exercise 38: Visualizing NMF

Activity 17: Non-Negative Matrix Factorization


Chapter 8

Market Basket Analysis


Market Basket Analysis

Use Cases

Important Probabilistic Metrics

Exercise 39: Creating Sample Transaction Data



Lift and Leverage


Exercise 40: Computing Metrics

Characteristics of Transaction Data

Exercise 41: Loading Data

Data Cleaning and Formatting

Exercise 42: Data Cleaning and Formatting

Data Encoding

Exercise 43: Data Encoding

Activity 18: Loading and Preparing Full Online Retail Data

Apriori Algorithm

Computational Fixes

Exercise 44: Executing the Apriori algorithm

Activity 19: Apriori on the Complete Online Retail Dataset

Association Rules

Exercise 45: Deriving Association Rules

Activity 20: Finding the Association Rules on the Complete Online Retail Dataset


Chapter 9

Hotspot Analysis


Spatial Statistics

Probability Density Functions

Using Hotspot Analysis in Business

Kernel Density Estimation

The Bandwidth Value

Exercise 46: The Effect of the Bandwidth Value

Selecting the Optimal Bandwidth

Exercise 47: Selecting the Optimal Bandwidth Using Grid Search

Kernel Functions

Exercise 48: The Effect of the Kernel Function

Kernel Density Estimation Derivation

Exercise 49: Simulating the Derivation of Kernel Density Estimation

Activity 21: Estimating Density in One Dimension

Hotspot Analysis

Exercise 50: Loading Data and Modeling with Seaborn

Exercise 51: Working with Basemaps

Activity 22: Analyzing Crime in London



Chapter 1: Introduction to Clustering

Activity 1: Implementing k-means Clustering

Chapter 2: Hierarchical Clustering

Activity 3: Comparing k-means with Hierarchical Clustering

Chapter 3: Neighborhood Approaches and DBSCAN

Activity 4: Implement DBSCAN from Scratch

Activity 5: Comparing DBSCAN with k-means and Hierarchical Clustering

Chapter 4: Dimension Reduction and PCA

Activity 6: Manual PCA versus scikit-learn

Activity 7: PCA Using the Expanded Iris Dataset

Chapter 5: Autoencoders

Activity 8: Modeling Neurons with a ReLU Activation Function

Activity 9: MNIST Neural Network

Activity 10: Simple MNIST Autoencoder

Activity 11: MNIST Convolutional Autoencoder

Chapter 6: t-Distributed Stochastic Neighbor Embedding (t-SNE)

Activity 12: Wine t-SNE

Activity 13: t-SNE Wine and Perplexity

Activity 14: t-SNE Wine and Iterations

Chapter 7: Topic Modeling

Activity 15: Loading and Cleaning Twitter Data

Activity 16: Latent Dirichlet Allocation and Health Tweets

Activity 17: Non-Negative Matrix Factorization

Chapter 8: Market Basket Analysis

Activity 18: Loading and Preparing Full Online Retail Data

Activity 19: Apriori on the Complete Online Retail Dataset

Activity 20: Finding the Association Rules on the Complete Online Retail Dataset

Chapter 9: Hotspot Analysis

Activity 21: Estimating Density in One Dimension

Activity 22: Analyzing Crime in London





