Cloudera Administration Handbook (e-book)

Author: Rohit Menon

Publisher: Packt Publishing

Publication date: 2014-07-18

Word count: 1.399 million

An easy-to-follow Apache Hadoop administrator’s guide filled with practical screenshots and explanations for each step and configuration. This book is great for administrators interested in setting up and managing a large Hadoop cluster. If you are an administrator, or want to be an administrator, and you are ready to build and maintain a production-level cluster running CDH5, then this book is for you.
Cloudera Administration Handbook

Table of Contents

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Getting Started with Apache Hadoop

History of Apache Hadoop and its trends

Components of Apache Hadoop

Understanding the Apache Hadoop daemons

Namenode

Secondary namenode

Jobtracker

Tasktracker

ResourceManager

NodeManager

Job submission in YARN

Introducing Cloudera

Introducing CDH

Responsibilities of a Hadoop administrator

Summary

2. HDFS and MapReduce

Essentials of HDFS

Configuring HDFS

The read/write operational flow in HDFS

Writing files in HDFS

Reading files in HDFS

Understanding the namenode UI

Understanding the secondary namenode UI

Exploring HDFS commands

Commonly used HDFS commands

Commands to administer HDFS

Getting acquainted with MapReduce

Understanding the map phase

Understanding the reduce phase

Learning all about the MapReduce job flow

Configuring MapReduce

Understanding the jobtracker UI

Getting MapReduce job information

Summary

3. Cloudera's Distribution Including Apache Hadoop

Getting started with CDH

Understanding the CDH components

Apache Hadoop

Apache Flume NG

Apache Sqoop

Apache Pig

Apache Hive

Apache ZooKeeper

Apache HBase

Apache Whirr

Snappy – previously known as Zippy

Apache Mahout

Apache Avro

Apache Oozie

Cloudera Search

Cloudera Impala

Cloudera Hue

Beeswax – Hive UI

Cloudera Impala UI

Pig UI

File Browser

Metastore Manager

Sqoop Jobs

Job Browser

Job Designs

Dashboard

Collection Manager

Hue Shell

HBase Browser

Installing CDH

Stopping Hadoop services

Understanding a YARN cluster

Installing the CDH components

Installing Apache Flume

Installing Apache Sqoop

Installing Apache Sqoop 2

Installing Apache Pig

Installing Apache Hive

Installing Apache Oozie

Installing Apache ZooKeeper

Summary

4. Exploring HDFS Federation and Its High Availability

Implementing HDFS Federation

Configuring HDFS Federation

Configuring ViewFS for a federated HDFS

Implementing HDFS High Availability

The Quorum-based storage

Configuring HDFS high availability by the Quorum-based storage

Shared storage using NFS

Configuring HDFS high availability by shared storage using NFS

NameNode Journal Status for Quorum-based storage approach

NameNode Journal Status for the Shared Storage-based approach

Configuring automatic failover for HDFS high availability

Jobtracker high availability

Configuring jobtracker high availability

Configuring automatic failover for jobtracker high availability

Summary

5. Using Cloudera Manager

Introducing Cloudera Manager

Understanding the Cloudera Manager architecture

Installing Cloudera Manager

Navigating the Cloudera Manager Web console

Navigating the Home screen

Navigating the Clusters menu

Exploring the Hosts menu

Understanding the Diagnostics menu

Understanding the Audits screen

Understanding the Charts menu

Understanding the Backup menu

Understanding the Administration menu

Configuring High Availability using Cloudera Manager

Summary

6. Implementing Security Using Kerberos

Understanding authentication and authorization

Introducing Kerberos

Understanding the Kerberos Architecture

Authenticating a user

Accessing a secure file server

Understanding important Kerberos terms

Installing Kerberos

Configuring the KDC Server

Testing the KDC installation

Configuring the Kerberos clients

Configuring Kerberos for Apache Hadoop

Configuring Kerberos principal for Cloudera Manager Server

Configuring the Cloudera Manager Server for Kerberos

Authorization in Apache Hadoop

Configuring access control lists in Hadoop

Summary

7. Managing an Apache Hadoop Cluster

Configuring Hadoop services using Cloudera Manager

Adding a service to the cluster

Removing a service from the cluster

Role management in Cloudera Manager

Adding a role instance to a host

Adding a DataNode role to a host

Adding a TaskTracker role to a host

Managing hosts using Cloudera Manager

Adding a new host

Removing an existing host

Managing multiple clusters with Cloudera Manager

Rebalancing a Hadoop cluster from Cloudera Manager

Adding the Balancer service to the cluster

Rebalancing the cluster

Summary

8. Cluster Monitoring Using Events and Alerts

Monitoring Hadoop services from Cloudera Manager

Understanding events and alerts

Configuring events and alerts

Configuring the alert delivery by an e-mail

Summary

9. Configuring Backups

Understanding backups

Types of backups

Types of storage media for backups

Using cloud services for backups

Understanding HDFS backups

Using the distributed copy (DistCp)

Configuring backups using Cloudera Manager

Configuring HDFS replication

Configuring Hive replication

Configuring snapshots

Enabling snapshot paths in HDFS

Configuring a snapshot policy

Summary

Index
