Spark Cookbook
Table of Contents
Spark Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Started with Apache Spark
Introduction
Installing Spark from binaries
Getting ready
How to do it...
Building the Spark source code with Maven
Getting ready
How to do it...
Launching Spark on Amazon EC2
Getting ready
How to do it...
See also
Deploying on a cluster in standalone mode
Getting ready
How to do it...
How it works...
See also
Deploying on a cluster with Mesos
How to do it...
Deploying on a cluster with YARN
Getting ready
How to do it...
How it works…
Using Tachyon as an off-heap storage layer
How to do it...
See also
2. Developing Applications with Spark
Introduction
Exploring the Spark shell
How to do it...
Developing Spark applications in Eclipse with Maven
Getting ready
How to do it...
Developing Spark applications in Eclipse with SBT
How to do it...
Developing a Spark application in IntelliJ IDEA with Maven
How to do it...
Developing a Spark application in IntelliJ IDEA with SBT
How to do it...
3. External Data Sources
Introduction
Loading data from the local filesystem
How to do it...
Loading data from HDFS
How to do it...
There's more…
Loading data from HDFS using a custom InputFormat
How to do it...
Loading data from Amazon S3
How to do it...
Loading data from Apache Cassandra
How to do it...
There's more...
Merge strategies in sbt-assembly
Loading data from relational databases
Getting ready
How to do it...
How it works…
4. Spark SQL
Introduction
Understanding the Catalyst optimizer
How it works…
Analysis
Logical plan optimization
Physical planning
Code generation
Creating HiveContext
Getting ready
How to do it...
Inferring schema using case classes
How to do it...
Programmatically specifying the schema
How to do it...
How it works…
Loading and saving data using the Parquet format
How to do it...
How it works…
There's more…
Loading and saving data using the JSON format
How to do it...
How it works…
There's more…
Loading and saving data from relational databases
Getting ready
How to do it...
Loading and saving data from an arbitrary source
How to do it...
There's more…
5. Spark Streaming
Introduction
Word count using Streaming
How to do it...
Streaming Twitter data
How to do it...
Streaming using Kafka
Getting ready
How to do it...
There's more…
6. Getting Started with Machine Learning Using MLlib
Introduction
Creating vectors
How to do it…
How it works...
Creating a labeled point
How to do it…
Creating matrices
How to do it…
Calculating summary statistics
How to do it…
Calculating correlation
Getting ready
How to do it…
Doing hypothesis testing
How to do it…
Creating machine learning pipelines using ML
Getting ready
How to do it…
7. Supervised Learning with MLlib – Regression
Introduction
Using linear regression
Getting ready
How to do it…
Understanding cost function
Doing linear regression with lasso
How to do it…
Doing ridge regression
How to do it…
8. Supervised Learning with MLlib – Classification
Introduction
Doing classification using logistic regression
Getting ready
How to do it…
Doing binary classification using SVM
How to do it…
Doing classification using decision trees
Getting ready
How to do it…
How it works…
Doing classification using Random Forests
Getting ready
How to do it…
How it works…
Doing classification using Gradient Boosted Trees
Getting ready
How to do it…
Doing classification with Naïve Bayes
Getting ready
How to do it…
9. Unsupervised Learning with MLlib
Introduction
Clustering using k-means
Getting ready
How to do it…
Dimensionality reduction with principal component analysis
Getting ready
How to do it…
Dimensionality reduction with singular value decomposition
Getting ready
How to do it…
10. Recommender Systems
Introduction
Collaborative filtering using explicit feedback
Getting ready
How to do it…
Collaborative filtering using implicit feedback
Getting ready
How to do it…
How it works…
There's more…
11. Graph Processing Using GraphX
Introduction
Fundamental operations on graphs
Getting ready
How to do it…
Using PageRank
Getting ready
How to do it…
Finding connected components
Getting ready
How to do it…
Performing neighborhood aggregation
Getting ready
How to do it…
12. Optimizations and Performance Tuning
Introduction
Optimizing memory
Using compression to improve performance
Using serialization to improve performance
How to do it…
Optimizing garbage collection
How to do it…
Optimizing the level of parallelism
How to do it…
Understanding the future of optimization – project Tungsten
Manual memory management by leveraging application semantics
Using algorithms and data structures
Code generation
Index