Learning HBase
Table of Contents
Learning HBase
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Understanding the HBase Ecosystem
HBase layout on top of Hadoop
Comparing architectural differences between RDBMS and HBase
HBase features
HBase in the Hadoop ecosystem
Data representation in HBase
Hadoop
Core daemons of Hadoop
Comparing HBase with Hadoop
Comparing functional differences between RDBMS and HBase
Logical view of row-oriented databases
Logical view of column-oriented databases
Pros and cons of column-oriented databases
About the internal storage architecture of HBase
Getting started with HBase
When it started
HBase components and functionalities
ZooKeeper
Why an odd number of ZooKeepers?
HMaster
If a master node goes down
RegionServer
Components of a RegionServer
Client
Catalog tables
Who is using HBase and why?
When should we think of using HBase?
When not to use HBase
Understanding some open source HBase tools
The Hadoop-HBase version compatibility table
Applications of HBase
HBase pros and cons
Summary
2. Let's Begin with HBase
Understanding HBase components in detail
HFile
Region
Scalability – understanding the scale up and scale out processes
Scale up
Scale out
Reading and writing cycle
Write-Ahead Logs
MemStore
HBase housekeeping
Compaction
Minor compaction
Major compaction
Region split
Region assignment
Region merge
RegionServer failovers
The HBase delete request
The reading and writing cycle
List of available HBase distributions
Prerequisites and capacity planning for HBase
The forward DNS resolution
The reverse DNS resolution
Java
SSH
Domain Name Server
Using Network Time Protocol to keep your node on time
OS-level changes and tuning up OS for HBase
Summary
3. Let's Start Building It
Downloading Java on Ubuntu
Considering host configurations
Host file based
Command based
File based
DNS based
Installing and configuring SSH
Installing SSH on Ubuntu/Red Hat/CentOS
Configuring SSH
Installing and configuring NTP
Performing capacity planning
Installing and configuring Hadoop
core-site.xml
hdfs-site.xml
yarn-site.xml
mapred-site.xml
hadoop-env.sh
yarn-env.sh
Slaves file
Hadoop start up steps
Configuring Apache HBase
Configuring HBase in the standalone mode
Configuring HBase in the distributed mode
hbase-site.xml
hbase-env.sh
regionservers
Installing and configuring ZooKeeper
Installing Cloudera Hadoop and HBase
Downloading the required RPM packages
Installing Cloudera in an easier way
Installing the Hadoop and MapReduce packages
Installing Hadoop on Windows
Summary
4. Optimizing the HBase/Hadoop Cluster
Setup types for Hadoop and HBase clusters
Recommendations for CDH cluster configuration
Capacity planning
Hadoop optimization
General optimization tips
Optimizing Java GC
Optimizing Linux OS
Optimizing the Hadoop parameter
Optimizing MapReduce
Rack awareness in Hadoop
Number of Map and Reduce limits in configuration files
Considering and deciding the maximum number of Map and Reduce tasks
Optimizing HBase
Hadoop
Memory
Java
OS
HBase
Optimizing ZooKeeper
Important files in Hadoop
Important files in HBase
Summary
5. The Storage, Structure Layout, and Data Model of HBase
Data types in HBase
Storing data in HBase – logical view versus actual physical view
Namespace
Commands available for namespaces
Services of HBase
Row key
Column family
Column
Cell
Version
Timestamp
Data model operations
Get
Put
Scan
Delete
Versioning and why
Deciding the number of the version
Lower bound of versions
Upper bound of versions
Schema designing
Types of table designs
Benefits of Short Wide and Tall-Thin design patterns
Composite key designing
Real-time use case of schema in an HBase table
Schema change operations
Calculating the data size stored in HBase
Summary
6. HBase Cluster Maintenance and Troubleshooting
Hadoop shell commands
Types of Hadoop shell commands
Administration commands
User commands
File system-related commands
Difference between copyToLocal/copyFromLocal and get/put
HBase shell commands
HBase administration tools
hbck – HBase check
HBase health check script
Writing HBase shell scripts
Using the Hadoop tool or JARs for HBase
Connecting HBase with Hive
HBase region management
Compaction
Merge
HBase node management
Commissioning
Decommissioning
Implementing security
Secure access
Requirement
Kerberos KDC
Client-side security configuration
Client-side security configuration for thrift requests
Server-side security configuration
Simple security
Server-side configuration
Client-side configuration
The tag security feature
Access control in HBase
Server-side access control
Cell-level access using tags
Configuring ZooKeeper for security
Troubleshooting the most frequent HBase errors and their explanations
What might fail in cluster
Monitoring HBase health
HBase web UI
Master
RegionServer
ZooKeeper command line
Linux tools
Summary
7. Scripting in HBase
HBase backup and restore techniques
Offline backup / full-shutdown backup
Backup
Restore
Online backup
The HBase snapshot
Online
Offline
The HBase replication method
Setting up cluster replication
Backup and restore using Export and Import commands
Export
Import
Miscellaneous utilities
CopyTable
HTable API
Backup using a Mozilla tool
HBase on Windows
Scripting in HBase
The .irbrc file
Getting the HBase timestamp from HBase shell
Enabling debugging shell
Enabling the debug level in HBase shell
Enabling SQL in HBase
Contributing to HBase
Summary
8. Coding HBase in Java
Setting up the environment for development
Building a Java client to code in HBase
Data types
Data model Java operations
Read
Get()
Constructors
Supported methods
Scan()
Constructors
Methods
Write
Put()
Constructors
Methods
Modify
Delete()
Constructors
Methods
HBase filters
Types of filters
Client APIs
Summary
9. Advance Coding in Java for HBase
Interfaces, classes, and exceptions
Code related to administrative tasks
Data operation code
MapReduce and HBase
RESTful services and Thrift services interface
REST service interfaces
Thrift
Coding for HDFS operations
Some advance topics in brief
Coprocessors
Types of coprocessors
Bloom filters
The Lily project
Features
Summary
10. HBase Use Cases
HBase in industry today
The future of HBase against relational databases
Some real-world project examples' use cases
HBase at Facebook
Choosing HBase
Storing in HBase
The architecture of a Facebook message
Facts and figures
HBase at Pinterest
The layout architecture
HBase at Groupon
The layout architecture
HBase at LongTail Video
The layout architecture
HBase at Aadhaar (UIDAI)
The layout architecture
Useful links and references
Summary
Index