Apache Spark Graph Processing
Table of Contents
Apache Spark Graph Processing
Credits
Foreword
About the Author
About the Reviewer
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
Distinctive features
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Started with Spark and GraphX
Downloading and installing Spark 1.4.1
Experimenting with the Spark shell
Getting started with GraphX
Building a tiny social network
Loading the data
The property graph
Transforming RDDs to VertexRDD and EdgeRDD
Introducing graph operations
Building and submitting a standalone application
Writing and configuring a Spark program
Building the program with the Scala Build Tool
Deploying and running with spark-submit
Summary
2. Building and Exploring Graphs
Network datasets
The communication network
Flavor networks
Social ego networks
Graph builders
The Graph factory method
edgeListFile
fromEdges
fromEdgeTuples
Building graphs
Building directed graphs
Building a bipartite graph
Building a weighted social ego network
Computing the degrees of the network nodes
In-degree and out-degree of the Enron email network
Degrees in the bipartite food network
Degree histogram of the social ego networks
Summary
3. Graph Analysis and Visualization
Network datasets
The graph visualization
Installing the GraphStream and BreezeViz libraries
Visualizing the graph data
Plotting the degree distribution
The analysis of network connectedness
Finding the connected components
Counting triangles and computing clustering coefficients
The network centrality and PageRank
How PageRank works
Ranking web pages
Scala Build Tool revisited
Organizing build definitions
Managing library dependencies
A preview of the steps
Step 1 – Enable the sbt-assembly plugin
Step 2 – Create a build.sbt file
Step 3 – Declare library dependencies and resolvers
Step 4 – Set up the sbt-assembly plugin
Step 5 – Create the uber JAR
Running tasks with SBT commands
Summary
4. Transforming and Shaping Up Graphs to Your Needs
Transforming the vertex and edge attributes
mapVertices
mapEdges
mapTriplets
Modifying graph structures
The reverse operator
The subgraph operator
The mask operator
The groupEdges operator
Joining graph datasets
joinVertices
outerJoinVertices
Example – Hollywood movie graph
Data operations on VertexRDD and EdgeRDD
Mapping VertexRDD and EdgeRDD
Filtering VertexRDDs
Joining VertexRDDs
Joining EdgeRDDs
Reversing edge directions
Collecting neighboring information
Example – from food network to flavor pairing
Summary
5. Creating Custom Graph Aggregation Operators
NCAA College Basketball datasets
The aggregateMessages operator
EdgeContext
Abstracting out the aggregation
Keeping things DRY
Coach wants more numbers
Calculating average points per game
Defense stats – D matters as in direction
Joining average stats into a graph
Performance optimization
The MapReduceTriplets operator
Summary
6. Iterative Graph-Parallel Processing with Pregel
The Pregel computational model
Example – iterating towards the social equality
The Pregel API in GraphX
Community detection through label propagation
The Pregel implementation of PageRank
Summary
7. Learning Graph Structures
Community clustering in graphs
Spectral clustering
Power iteration clustering
Applications – music fan community detection
Step 1 – load the data into a Spark graph property
Step 2 – extract the features of nodes
Step 3 – define a similarity measure between two nodes
Step 4 – create an affinity matrix
Step 5 – run k-means clustering on the affinity matrix
Exercise – collaborative clustering through playlists
Summary
A. References
Chapter 2, Building and Exploring Graphs
Chapter 3, Graph Analysis and Visualization
Chapter 7, Learning Graph Structures
Index