Apache Mahout Clustering Designs
Table of Contents
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Understanding Clustering
The clustering concept
Application of clustering
Understanding distance measures
Understanding different clustering techniques
Hierarchical methods
The partitioning method
The density-based method
Probabilistic clustering
Algorithm support in Mahout
Clustering algorithms in Mahout
Installing Mahout
Building Mahout code using Maven
Setting up the development environment using Eclipse
Setting up Mahout for Windows users
Preparing data for use with clustering techniques
Summary
2. Understanding K-means Clustering
Learning K-means
Running K-means on Mahout
Dataset selection
Executing K-means
The clusterdump result
Visualizing clusters
Summary
3. Understanding Canopy Clustering
Running Canopy clustering on Mahout
The Canopy generation phase
The Canopy clustering phase
Running Canopy clustering
Using the Canopy output for K-means
Visualizing clusters
Working with CSV files
Summary
4. Understanding the Fuzzy K-means Algorithm Using Mahout
Learning Fuzzy K-means clustering
Running Fuzzy K-means on Mahout
Dataset
Creating a vector for the dataset
Vector reader
Visualizing clusters
Summary
5. Understanding Model-based Clustering
Learning model-based clustering
Understanding Dirichlet clustering
Topic modeling
Running LDA using Mahout
Dataset selection
Steps to execute CVB (LDA)
Summary
6. Understanding Streaming K-means
Learning Streaming K-means
The Streaming step
The BallKMeans step
Using Mahout for streaming K-means
Dataset selection
Converting CSV to a vector file
Running Streaming K-means
Summary
7. Spectral Clustering
Understanding spectral clustering
Affinity (similarity) graph
Getting graph Laplacian from the affinity matrix
Eigenvectors and eigenvalues
The spectral clustering algorithm
Normalized spectral clustering
Mahout implementation of spectral clustering
Summary
8. Improving Cluster Quality
Evaluating clusters
Extrinsic methods
Intrinsic methods
Using DistanceMeasure interface
Summary
9. Creating a Cluster Model for Production
Preparing the dataset
Launching the Mahout job on the cluster
Performance tuning for the job
Summary
Index