Real-time Analytics with Storm and Cassandra
Table of Contents
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Let's Understand Storm
Distributed computing problems
Real-time business solution for credit or debit card fraud detection
Aircraft Communications Addressing and Reporting System
Healthcare
Other applications
Solutions for complex distributed use cases
The Hadoop solution
A custom solution
Licensed proprietary solutions
Other real-time processing tools
A high-level view of various components of Storm
Delving into the internals of Storm
Quiz time
Summary
2. Getting Started with Your First Topology
Prerequisites for setting up Storm
Components of a Storm topology
Spouts
Bolts
Streams
Tuples – the data model in Storm
Executing a sample Storm topology – local mode
WordCount topology from the Storm-starter project
Executing the topology in the distributed mode
Setting up Zookeeper (V 3.3.5) for Storm
Setting up Storm in the distributed mode
Launching Storm daemons
Executing the topology from Command Prompt
Tweaking the WordCount topology to customize it
Quiz time
Summary
3. Understanding Storm Internals by Examples
Customizing Storm spouts
Creating FileSpout
Tweaking WordCount topology to use FileSpout
The SocketSpout class
Anchoring and acking
The unreliable topology
Stream groupings
Local or shuffle grouping
Fields grouping
All grouping
Global grouping
Custom grouping
Direct grouping
Quiz time
Summary
4. Storm in a Clustered Mode
The Storm cluster setup
Zookeeper configurations
Cleaning up Zookeeper
Storm configurations
Storm logging configurations
The Storm UI
Section 1
Section 2
Section 3
Section 4
The visualization section
Storm monitoring tools
Quiz time
Summary
5. Storm High Availability and Failover
An overview of RabbitMQ
Installing the RabbitMQ cluster
Prerequisites for the setup of RabbitMQ
Setting up a RabbitMQ server
Testing the RabbitMQ server
Creating a RabbitMQ cluster
Enabling the RabbitMQ UI
Creating mirror queues for high availability
Integrating Storm with RabbitMQ
Creating a RabbitMQ feeder component
Wiring the topology for the AMQP spout
Building high availability of components
High availability of the Storm cluster
Guaranteed processing of the Storm cluster
The Storm isolation scheduler
Quiz time
Summary
6. Adding NoSQL Persistence to Storm
The advantages of Cassandra
Columnar database fundamentals
Types of column families
Types of columns
Setting up the Cassandra cluster
Installing Cassandra
Multiple data centers
Prerequisites for setting up multiple data centers
Installing Cassandra data centers
Introduction to CQLSH
Introduction to CLI
Using different client APIs to access Cassandra
Storm topology wired to the Cassandra store
The best practices for Storm/Cassandra applications
Quiz time
Summary
7. Cassandra Partitioning, High Availability, and Consistency
Consistent hashing
One or more nodes go down
One or more nodes come back up
Replication in Cassandra and strategies
Cassandra consistency
Write consistency
Read consistency
Consistency maintenance features
Quiz time
Summary
8. Cassandra Management and Maintenance
Cassandra – gossip protocol
Bootstrapping
Failure scenario handling – detection and recovery
Cassandra cluster scaling – adding a new node
Cassandra cluster – replacing a dead node
The replication factor
The nodetool commands
Cassandra fault tolerance
Cassandra monitoring systems
JMX monitoring
DataStax OpsCenter
Quiz time
Summary
9. Storm Management and Maintenance
Scaling the Storm cluster – adding new supervisor nodes
Scaling the Storm cluster and rebalancing the topology
Rebalancing using the GUI
Rebalancing using the CLI
Setting up workers and parallelism to enhance processing
Scenario 1
Scenario 2
Scenario 3
Storm troubleshooting
The Storm UI
Storm logs
Quiz time
Summary
10. Advanced Concepts in Storm
Building a Trident topology
Understanding the Trident API
Local partition manipulation operation
Functions
Filters
partitionAggregate
Sum aggregate
CombinerAggregator
ReducerAggregator
Aggregator
Operations related to stream repartitioning
Data aggregations over the streams
Grouping over a field in a stream
Merge and join
Examples and illustrations
Quiz time
Summary
11. Distributed Cache and CEP with Storm
The need for distributed caching in Storm
Introduction to memcached
Setting up memcached
Building a topology with a cache
Introduction to the complex event processing engine
Esper
Getting started with Esper
Integrating Esper with Storm
Quiz time
Summary
A. Quiz Answers
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Index