Learning HBase (e-book)

Author: Shashwat Shriparv

Publisher: Packt Publishing

Publication date: 2014-11-25

Word count: 2.372 million

Category: Imported books > Foreign-language originals > Computers/Internet

If you are an administrator or developer who wants to enter the world of Big Data and BigTables and would like to learn about HBase, this is the book for you.
Learning HBase

Table of Contents

Learning HBase

Credits

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Understanding the HBase Ecosystem

HBase layout on top of Hadoop

Comparing architectural differences between RDBMS and HBase

HBase features

HBase in the Hadoop ecosystem

Data representation in HBase

Hadoop

Core daemons of Hadoop

Comparing HBase with Hadoop

Comparing functional differences between RDBMS and HBase

Logical view of row-oriented databases

Logical view of column-oriented databases

Pros and cons of column-oriented databases

About the internal storage architecture of HBase

Getting started with HBase

When it started

HBase components and functionalities

ZooKeeper

Why an odd number of ZooKeepers?

HMaster

If a master node goes down

RegionServer

Components of a RegionServer

Client

Catalog tables

Who is using HBase and why?

When should we think of using HBase?

When not to use HBase

Understanding some open source HBase tools

The Hadoop-HBase version compatibility table

Applications of HBase

HBase pros and cons

Summary

2. Let's Begin with HBase

Understanding HBase components in detail

HFile

Region

Scalability – understanding the scale up and scale out processes

Scale in

Scale out

Reading and writing cycle

Write-Ahead Logs

MemStore

HBase housekeeping

Compaction

Minor compaction

Major compaction

Region split

Region assignment

Region merge

RegionServer failovers

The HBase delete request

The reading and writing cycle

List of available HBase distributions

Prerequisites and capacity planning for HBase

The forward DNS resolution

The reverse DNS resolution

Java

SSH

Domain Name Server

Using Network Time Protocol to keep your node on time

OS-level changes and tuning up OS for HBase

Summary

3. Let's Start Building It

Downloading Java on Ubuntu

Considering host configurations

Host file based

Command based

File based

DNS based

Installing and configuring SSH

Installing SSH on Ubuntu/Red Hat/CentOS

Configuring SSH

Installing and configuring NTP

Performing capacity planning

Installing and configuring Hadoop

core-site.xml

hdfs-site.xml

yarn-site.xml

mapred-site.xml

hadoop-env.sh

yarn-env.sh

Slaves file

Hadoop start up steps

Configuring Apache HBase

Configuring HBase in the standalone mode

Configuring HBase in the distributed mode

hbase-site.xml

hbase-env.sh

regionservers

Installing and configuring ZooKeeper

Installing Cloudera Hadoop and HBase

Downloading the required RPM packages

Installing Cloudera in an easier way

Installing the Hadoop and MapReduce packages

Installing Hadoop on Windows

Summary

4. Optimizing the HBase/Hadoop Cluster

Setup types for Hadoop and HBase clusters

Recommendations for CDH cluster configuration

Capacity planning

Hadoop optimization

General optimization tips

Optimizing Java GC

Optimizing Linux OS

Optimizing the Hadoop parameter

Optimizing MapReduce

Rack awareness in Hadoop

Number of Map and Reduce limits in configuration files

Considering and deciding the maximum number of Map and Reduce tasks

Optimizing HBase

Hadoop

Memory

Java

OS

HBase

Optimizing ZooKeeper

Important files in Hadoop

Important files in HBase

Summary

5. The Storage, Structure Layout, and Data Model of HBase

Data types in HBase

Storing data in HBase – logical view versus actual physical view

Namespace

Commands available for namespaces

Services of HBase

Row key

Column family

Column

Cell

Version

Timestamp

Data model operations

Get

Put

Scan

Delete

Versioning and why

Deciding the number of the version

Lower bound of versions

Upper bound of versions

Schema designing

Types of table designs

Benefits of Short Wide and Tall-Thin design patterns

Composite key designing

Real-time use case of schema in an HBase table

Schema change operations

Calculating the data size stored in HBase

Summary

6. HBase Cluster Maintenance and Troubleshooting

Hadoop shell commands

Types of Hadoop shell commands

Administration commands

User commands

File system-related commands

Difference between copyToLocal/copyFromLocal and get/put

HBase shell commands

HBase administration tools

hbck – HBase check

HBase health check script

Writing HBase shell scripts

Using the Hadoop tool or JARs for HBase

Connecting HBase with Hive

HBase region management

Compaction

Merge

HBase node management

Commissioning

Decommissioning

Implementing security

Secure access

Requirement

Kerberos KDC

Client-side security configuration

Client-side security configuration for thrift requests

Server-side security configuration

Simple security

Server-side configuration

Client-side configuration

The tag security feature

Access control in HBase

Server-side access control

Cell-level access using tags

Configuring ZooKeeper for security

Troubleshooting the most frequent HBase errors and their explanations

What might fail in cluster

Monitoring HBase health

HBase web UI

Master

RegionServer

ZooKeeper command line

Linux tools

Summary

7. Scripting in HBase

HBase backup and restore techniques

Offline backup / full-shutdown backup

Backup

Restore

Online backup

The HBase snapshot

Online

Offline

The HBase replication method

Setting up cluster replication

Backup and restore using Export and Import commands

Export

Import

Miscellaneous utilities

CopyTable

HTable API

Backup using a Mozilla tool

HBase on Windows

Scripting in HBase

The .irbrc file

Getting the HBase timestamp from HBase shell

Enabling debugging shell

Enabling the debug level in HBase shell

Enabling SQL in HBase

Contributing to HBase

Summary

8. Coding HBase in Java

Setting up the environment for development

Building a Java client to code in HBase

Data types

Data model Java operations

Read

Get()

Constructors

Supported methods

Scan()

Constructors

Methods

Write

Put()

Constructors

Methods

Modify

Delete()

Constructors

Methods

HBase filters

Types of filters

Client APIs

Summary

9. Advanced Coding in Java for HBase

Interfaces, classes, and exceptions

Code related to administrative tasks

Data operation code

MapReduce and HBase

RESTful services and Thrift services interface

REST service interfaces

Thrift

Coding for HDFS operations

Some advanced topics in brief

Coprocessors

Types of coprocessors

Bloom filters

The Lily project

Features

Summary

10. HBase Use Cases

HBase in industry today

The future of HBase against relational databases

Some real-world project examples' use cases

HBase at Facebook

Choosing HBase

Storing in HBase

The architecture of a Facebook message

Facts and figures

HBase at Pinterest

The layout architecture

HBase at Groupon

The layout architecture

HBase at LongTail Video

The layout architecture

HBase at Aadhaar (UIDAI)

The layout architecture

Useful links and references

Summary

Index
