Title Page
Credits
About the Author
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Getting Up and Running with Cassandra
What is big data?
Challenges of modern applications
Why not relational databases?
How to handle big data
What is Cassandra and why Cassandra?
Horizontal scalability
High availability
Write optimization
Structured records
Secondary indexes
Materialized views
Efficient result ordering
Immediate consistency
Discretely writable collections
Relational joins
MapReduce and Spark
Rich and flexible data model
Lightweight transactions
Multi-data center replication
Comparing Cassandra to the alternatives
Installing Cassandra
Installing the JDK
Installing on Debian-based systems (Ubuntu)
Installing on RHEL-based systems
Installing on Windows
Installing on Mac OS X
Installing the binary tarball
Bootstrapping the project
CQL—the Cassandra Query Language
Interacting with Cassandra
Getting started with CQL
Creating a keyspace
Selecting a keyspace
Creating a table
Inserting and reading data
New features in Cassandra 2.2, 3.0, and 3.X
Summary
The First Table
How to configure keyspaces
Creating the users table
Structuring of tables
Table and column options
The type system
Strings
Integers
Floating point and decimal numbers
Timestamp
UUIDs
Booleans
Blobs
Collections
Other data types
The purpose of types
Inserting data
Writing data does not yield feedback
Partial inserts
Selecting data
Missing rows
Selecting more than one row
Retrieving all the rows
Paginating through results
Inserts are always upserts
Developing a mental model for Cassandra
Summary
Organizing Related Data
A table for status updates
Creating a table with a compound primary key
The structure of the status updates table
UUIDs and timestamps
Working with status updates
Extracting timestamps
Looking up a specific status update
Automatically generating UUIDs
Anatomy of a compound primary key
Anatomy of a single-column primary key
Beyond two columns
Multiple clustering columns
Composite partition keys
Composite partition key table
Structure of composite partition key tables
Composite partition key with multiple clustering columns
Compound keys represent parent-child relationships
Coupling parents and children using static columns
Defining static columns
Working with static columns
Interacting only with the static columns
Static-only inserts
Static columns act like predefined joins
When to use static columns
Refining our mental model
Summary
Beyond Key-Value Lookup
Looking up rows by partition
The limits of the WHERE keyword
Restricting by clustering column
Restricting by part of a partition key
Retrieving status updates for a specific time range
Creating time UUID ranges
Selecting a slice of a partition
Paginating over rows in a partition
Counting rows
Reversing the order of rows
Reversing clustering order at query time
Reversing clustering order in the schema
Limitations of ORDER BY
ORDER BY summary
Paginating over multiple partitions
JSON support
INSERT JSON
SELECT JSON
Building an autocomplete function
Summary
Establishing Relationships
Modeling follow relationships
Outbound follows
Inbound follows
Storing follow relationships
Cassandra data modelling
Conceptual data model (entity relationship model)
Logical data model (query-driven design)
Physical data model
Denormalization
Looking up follow relationships
Unfollowing users
Using secondary indexes to avoid denormalization
The form of the single table
Adding a secondary index
Other uses of secondary indexes
Limitations of secondary indexes
Secondary indexes can only have one column
Secondary indexes can only be tested for equality
Secondary index lookup is not as efficient as primary key lookup
Materialized views
Adding a view
Summary
Denormalizing Data for Maximum Performance
A normalized approach
Generating the timeline
Ordering and pagination
Multiple partitions and read efficiency
Partial denormalization
Displaying the home timeline
Read performance and write complexity
Fully denormalizing the home timeline
Creating a status update
Displaying the home timeline
Write complexity and data integrity
Batching in Cassandra
Logged batches
Unlogged batches
When to use unlogged batches
Misuse of BATCH statements
Summary
Expanding Your Data Model
Viewing a keyspace schema
Viewing a table schema in cqlsh
Adding columns to tables
Deleting columns
Updating the existing rows
Updating multiple columns
Updating multiple rows
Removing a value from a column
Missing columns in Cassandra
Deleting specific columns
Syntactic sugar for deletion
Deleting table data (TRUNCATE)
Deleting table/keyspace with schema (DROP)
Inserts, updates, and upserts
Inserts can overwrite existing data
Checking before inserting isn't enough
Another advantage of UUIDs
Conditional inserts and lightweight transactions
Updates can create new rows
Optimistic locking with conditional updates
Optimistic locking in action
Optimistic locking and accidental updates
Lightweight transactions and their cost
When lightweight transactions aren't necessary
Summary
Collections, Tuples, and User-Defined Types
The problem with concurrent updates
Serializing the collection
Introducing concurrency
Collection columns and concurrent updates
Defining collection columns
Reading and writing sets
Advanced set manipulation
Removing values from a set
Sets and uniqueness
Collections and upserts
Using lists for ordered, non-unique values
Defining a list column
Writing a list
Discrete list manipulation
Writing data at a specific index
Removing elements from the list
Using maps to store key-value pairs
Writing a map
Updating discrete values in a map
Removing values from maps
Collections in inserts
Collections and secondary indexes
Secondary indexes on map columns
The limitations of collections
Reading discrete values from collections
Collection size limit
Reading a collection column from multiple rows
Unable to reuse collection names
Performance of collection operations
Working with tuples
Creating a tuple column
Writing to tuples
Indexing tuples
User-defined types
Creating a user-defined type
Assigning a user-defined type to a column
Adding data to a user-defined column
Indexing and querying user-defined types
Partial selection of user-defined types
Choosing between tuples and user-defined types
Nested collections
Nested tuples/UDTs
Comparing data structures
Summary
Aggregating Time-Series Data
Recording discrete analytics observations
Using discrete analytics observations
Slicing and dicing our data
Recording aggregate analytics observations
Answering the right question
Precomputation versus read-time aggregation
The many possibilities for aggregation
The role of discrete observations
Recording analytics observations
Updating a counter column
Counters and upserts
Setting and resetting counter columns
Counter columns and deletion
Counter columns need their own table
Cassandra configuration
Configuration location
Modifying configuration
Restarting Cassandra
User-defined functions
User-defined aggregate functions
Standard aggregate functions
Summary
How Cassandra Distributes Data
Data distribution in Cassandra
Cassandra's partitioning strategy - partition key tokens
Distributing partition tokens
Partitioners
Partition keys group data on the same node
Virtual nodes
Virtual nodes facilitate redistribution
Data replication in Cassandra
Masterless replication
Replication without a master
Gossip protocol
Multi-data center cluster
Snitch
Replication strategy
Durable writes
Consistency
Immediate and eventual consistency
Consistency in Cassandra
The anatomy of a successful request
Tuning consistency
Eventual consistency with ONE
Immediate consistency with ALL
Fault-tolerant immediate consistency with QUORUM
Local consistency levels
Comparing consistency levels
Choosing the right consistency level
The CAP theorem
Handling conflicting data
Last-write-wins conflict resolution
Introspecting write timestamps
Overriding write timestamps
Distributed deletion
Stumbling on tombstones
Expiring columns with TTL
Table configuration options
Summary
Cassandra Multi-Node Cluster
3-node cluster
Prerequisites
Tuning configuration options for setting up a 3-node cluster
Tuning configuration
cassandra.yaml
cassandra-env.sh
Starting the 3-node cluster
Consistency in action
Write consistency
Consistency QUORUM
Consistency ANY
Cassandra internals
The write path
Compaction
The read path
Cassandra repair mechanisms
Hinted handoff
Read repair
Anti-entropy repair
Summary
Application Development Using the Java Driver
A simple query
Cluster API
Getting metadata
Querying
Prepared statements
QueryBuilder API
Building an INSERT statement
Building an UPDATE statement
Building a SELECT statement
Asynchronous querying
Execute asynchronously
Processing future results
Driver policies
Load-balancing policy
RoundRobinPolicy
DCAwareRoundRobinPolicy
TokenAwarePolicy
Retry policy
Summary
Peeking under the Hood
Using cassandra-cli
The structure of a simple primary key table
Exploring cells
A model of column families: RowKey and cells
Compound primary keys in column families
A complete mapping
The wide row data structure
The empty cell
Collection columns in column families
Set columns in column families
Map columns in column families
List columns in column families
Appending and prepending values to lists
Other list operations
Summary
Authentication and Authorization
Enabling authentication and authorization
Authentication, authorization, and fault-tolerance
Authentication with cqlsh
Authentication in your application
Setting up a user
Changing a user's password
Viewing user accounts
Controlling access
Viewing permissions
Revoking access
Authorization in action
Authorization as a hedge against mistakes
Security beyond authentication and authorization
Security protects against vulnerabilities
Summary
Wrapping up