Cloudera Administration Handbook
Table of Contents
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Started with Apache Hadoop
History of Apache Hadoop and its trends
Components of Apache Hadoop
Understanding the Apache Hadoop daemons
Namenode
Secondary namenode
Jobtracker
Tasktracker
ResourceManager
NodeManager
Job submission in YARN
Introducing Cloudera
Introducing CDH
Responsibilities of a Hadoop administrator
Summary
2. HDFS and MapReduce
Essentials of HDFS
Configuring HDFS
The read/write operational flow in HDFS
Writing files in HDFS
Reading files in HDFS
Understanding the namenode UI
Understanding the secondary namenode UI
Exploring HDFS commands
Commonly used HDFS commands
Commands to administer HDFS
Getting acquainted with MapReduce
Understanding the map phase
Understanding the reduce phase
Learning all about the MapReduce job flow
Configuring MapReduce
Understanding the jobtracker UI
Getting MapReduce job information
Summary
3. Cloudera's Distribution Including Apache Hadoop
Getting started with CDH
Understanding the CDH components
Apache Hadoop
Apache Flume NG
Apache Sqoop
Apache Pig
Apache Hive
Apache ZooKeeper
Apache HBase
Apache Whirr
Snappy – previously known as Zippy
Apache Mahout
Apache Avro
Apache Oozie
Cloudera Search
Cloudera Impala
Cloudera Hue
Beeswax – Hive UI
Cloudera Impala UI
Pig UI
File Browser
Metastore Manager
Sqoop Jobs
Job Browser
Job Designs
Dashboard
Collection Manager
Hue Shell
HBase Browser
Installing CDH
Stopping Hadoop services
Understanding a YARN cluster
Installing the CDH components
Installing Apache Flume
Installing Apache Sqoop
Installing Apache Sqoop 2
Installing Apache Pig
Installing Apache Hive
Installing Apache Oozie
Installing Apache ZooKeeper
Summary
4. Exploring HDFS Federation and Its High Availability
Implementing HDFS Federation
Configuring HDFS Federation
Configuring ViewFS for a federated HDFS
Implementing HDFS High Availability
The Quorum-based storage
Configuring HDFS high availability by the Quorum-based storage
Shared storage using NFS
Configuring HDFS high availability by shared storage using NFS
NameNode Journal Status for Quorum-based storage approach
NameNode Journal Status for the Shared Storage-based approach
Configuring automatic failover for HDFS high availability
Jobtracker high availability
Configuring jobtracker high availability
Configuring automatic failover for jobtracker high availability
Summary
5. Using Cloudera Manager
Introducing Cloudera Manager
Understanding the Cloudera Manager architecture
Installing Cloudera Manager
Navigating the Cloudera Manager Web console
Navigating the Home screen
Navigating the Clusters menu
Exploring the Hosts menu
Understanding the Diagnostics menu
Understanding the Audits screen
Understanding the Charts menu
Understanding the Backup menu
Understanding the Administration menu
Configuring High Availability using Cloudera Manager
Summary
6. Implementing Security Using Kerberos
Understanding authentication and authorization
Introducing Kerberos
Understanding the Kerberos Architecture
Authenticating a user
Accessing a secure file server
Understanding important Kerberos terms
Installing Kerberos
Configuring the KDC Server
Testing the KDC installation
Configuring the Kerberos clients
Configuring Kerberos for Apache Hadoop
Configuring Kerberos principal for Cloudera Manager Server
Configuring the Cloudera Manager Server for Kerberos
Authorization in Apache Hadoop
Configuring access control lists in Hadoop
Summary
7. Managing an Apache Hadoop Cluster
Configuring Hadoop services using Cloudera Manager
Adding a service to the cluster
Removing a service from the cluster
Role management in Cloudera Manager
Adding a role instance to a host
Adding a DataNode role to a host
Adding a TaskTracker role to a host
Managing hosts using Cloudera Manager
Adding a new host
Removing an existing host
Managing multiple clusters with Cloudera Manager
Rebalancing a Hadoop cluster from Cloudera Manager
Adding the Balancer service to the cluster
Rebalancing the cluster
Summary
8. Cluster Monitoring Using Events and Alerts
Monitoring Hadoop services from Cloudera Manager
Understanding events and alerts
Configuring events and alerts
Configuring the alert delivery by an e-mail
Summary
9. Configuring Backups
Understanding backups
Types of backups
Types of storage media for backups
Using cloud services for backups
Understanding HDFS backups
Using the distributed copy (DistCp)
Configuring backups using Cloudera Manager
Configuring HDFS replication
Configuring Hive replication
Configuring snapshots
Enabling snapshot paths in HDFS
Configuring a snapshot policy
Summary
Index