Hadoop Real-World Solutions Cookbook Second Edition
Table of Contents
Hadoop Real-World Solutions Cookbook Second Edition
Credits
About the Author
Acknowledgements
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more
Why Subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Started with Hadoop 2.X
Introduction
Installing a single-node Hadoop Cluster
Getting ready
How to do it...
How it works...
Hadoop Distributed File System (HDFS)
Yet Another Resource Negotiator (YARN)
There's more
Installing a multi-node Hadoop cluster
Getting ready
How to do it...
How it works...
Adding new nodes to existing Hadoop clusters
Getting ready
How to do it...
How it works...
Executing the balancer command for uniform data distribution
Getting ready
How to do it...
How it works...
There's more...
Entering and exiting from the safe mode in a Hadoop cluster
How to do it...
How it works...
Decommissioning DataNodes
Getting ready
How to do it...
How it works...
Performing benchmarking on a Hadoop cluster
Getting ready
How to do it...
TestDFSIO
NNBench
MRBench
How it works...
2. Exploring HDFS
Introduction
Loading data from a local machine to HDFS
Getting ready
How to do it...
How it works...
Exporting HDFS data to a local machine
Getting ready
How to do it...
How it works...
Changing the replication factor of an existing file in HDFS
Getting ready
How to do it...
How it works...
Setting the HDFS block size for all the files in a cluster
Getting ready
How to do it...
How it works...
Setting the HDFS block size for a specific file in a cluster
Getting ready
How to do it...
How it works...
Enabling transparent encryption for HDFS
Getting ready
How to do it...
How it works...
Importing data from another Hadoop cluster
Getting ready
How to do it...
How it works...
Recycling deleted data from trash to HDFS
Getting ready
How to do it...
How it works...
Saving compressed data in HDFS
Getting ready
How to do it...
How it works...
3. Mastering Map Reduce Programs
Introduction
Writing the Map Reduce program in Java to analyze web log data
Getting ready
How to do it...
How it works...
Executing the Map Reduce program in a Hadoop cluster
Getting ready
How to do it
How it works...
Adding support for a new writable data type in Hadoop
Getting ready
How to do it...
How it works...
Implementing a user-defined counter in a Map Reduce program
Getting ready
How to do it...
How it works...
Map Reduce program to find the top X
Getting ready
How to do it...
How it works
Map Reduce program to find distinct values
Getting ready
How to do it
How it works...
Map Reduce program to partition data using a custom partitioner
Getting ready
How to do it...
How it works...
Writing Map Reduce results to multiple output files
Getting ready
How to do it...
How it works...
Performing Reduce side Joins using Map Reduce
Getting ready
How to do it
How it works...
Unit testing the Map Reduce code using MRUnit
Getting ready
How to do it...
How it works...
4. Data Analysis Using Hive, Pig, and HBase
Introduction
Storing and processing Hive data in a sequential file format
Getting ready
How to do it...
How it works...
Storing and processing Hive data in the RC file format
Getting ready
How to do it...
How it works...
Storing and processing Hive data in the ORC file format
Getting ready
How to do it...
How it works...
Storing and processing Hive data in the Parquet file format
Getting ready
How to do it...
How it works...
Performing FILTER By queries in Pig
Getting ready
How to do it...
How it works...
Performing Group By queries in Pig
Getting ready
How to do it...
How it works...
Performing Order By queries in Pig
Getting ready
How to do it...
How it works...
Performing JOINS in Pig
Getting ready
How to do it...
How it works
Replicated Joins
Skewed Joins
Merge Joins
Writing a user-defined function in Pig
Getting ready
How to do it...
How it works...
There's more...
Analyzing web log data using Pig
Getting ready
How to do it...
How it works...
Performing HBase operations in the CLI
Getting ready
How to do it
How it works...
Performing HBase operations in Java
Getting ready
How to do it
How it works...
Executing a MapReduce program with an HBase table
Getting ready
How to do it
How it works
5. Advanced Data Analysis Using Hive
Introduction
Processing JSON data in Hive using JSON SerDe
Getting ready
How to do it...
How it works...
Processing XML data in Hive using XML SerDe
Getting ready
How to do it...
How it works
Processing Hive data in the Avro format
Getting ready
How to do it...
How it works...
Writing a user-defined function in Hive
Getting ready
How to do it
How it works...
Performing table joins in Hive
Getting ready
How to do it...
Left outer join
Right outer join
Full outer join
Left semi join
How it works...
Executing map side joins in Hive
Getting ready
How to do it...
How it works...
Performing context Ngram in Hive
Getting ready
How to do it...
How it works...
Call Data Record Analytics using Hive
Getting ready
How to do it...
How it works...
Twitter sentiment analysis using Hive
Getting ready
How to do it...
How it works
Implementing Change Data Capture using Hive
Getting ready
How to do it
How it works
Multiple table inserting using Hive
Getting ready
How to do it
How it works
6. Data Import/Export Using Sqoop and Flume
Introduction
Importing data from RDBMS to HDFS using Sqoop
Getting ready
How to do it...
How it works...
Exporting data from HDFS to RDBMS
Getting ready
How to do it...
How it works...
Using query operator in Sqoop import
Getting ready
How to do it...
How it works...
Importing data using Sqoop in compressed format
Getting ready
How to do it...
How it works...
Performing Atomic export using Sqoop
Getting ready
How to do it...
How it works...
Importing data into Hive tables using Sqoop
Getting ready
How to do it...
How it works...
Importing data into HDFS from Mainframes
Getting ready
How to do it...
How it works...
Incremental import using Sqoop
Getting ready
How to do it...
How it works...
Creating and executing Sqoop job
Getting ready
How to do it...
How it works...
Importing data from RDBMS to HBase using Sqoop
Getting ready
How to do it...
How it works...
Importing Twitter data into HDFS using Flume
Getting ready
How to do it...
How it works
Importing data from Kafka into HDFS using Flume
Getting ready
How to do it...
How it works
Importing web logs data into HDFS using Flume
Getting ready
How to do it...
How it works...
7. Automation of Hadoop Tasks Using Oozie
Introduction
Implementing a Sqoop action job using Oozie
Getting ready
How to do it...
How it works
Implementing a Map Reduce action job using Oozie
Getting ready
How to do it...
How it works...
Implementing a Java action job using Oozie
Getting ready
How to do it
How it works
Implementing a Hive action job using Oozie
Getting ready
How to do it...
How it works...
Implementing a Pig action job using Oozie
Getting ready
How to do it...
How it works
Implementing an e-mail action job using Oozie
Getting ready
How to do it...
How it works...
Executing parallel jobs using Oozie (fork)
Getting ready
How to do it...
How it works...
Scheduling a job in Oozie
Getting ready
How to do it...
How it works...
8. Machine Learning and Predictive Analytics Using Mahout and R
Introduction
Setting up the Mahout development environment
Getting ready
How to do it...
How it works...
Creating an item-based recommendation engine using Mahout
Getting ready
How to do it...
How it works...
Creating a user-based recommendation engine using Mahout
Getting ready
How to do it...
How it works...
Predictive analytics on Bank Data using Mahout
Getting ready
How to do it...
How it works...
Text data clustering using K-Means using Mahout
Getting ready
How to do it...
How it works...
Population Data Analytics using R
Getting ready
How to do it...
How it works...
Twitter Sentiment Analytics using R
Getting ready
How to do it...
How it works...
Performing Predictive Analytics using R
Getting ready
How to do it...
How it works...
9. Integration with Apache Spark
Introduction
Running Spark standalone
Getting ready
How to do it...
How it works...
Running Spark on YARN
Getting ready
How to do it...
How it works...
Performing Olympics Athletes analytics using the Spark Shell
Getting ready
How to do it...
How it works...
Creating Twitter trending topics using Spark Streaming
Getting ready
How to do it...
How it works...
Twitter trending topics using Spark streaming
Getting ready
How to do it...
How it works...
Analyzing Parquet files using Spark
Getting ready
How to do it...
How it works...
Analyzing JSON data using Spark
Getting ready
How to do it...
How it works...
Processing graphs using GraphX
Getting ready
How to do it...
How it works...
Conducting predictive analytics using Spark MLlib
Getting ready
How to do it...
How it works...
10. Hadoop Use Cases
Introduction
Call Data Record analytics
Getting ready
How to do it...
Problem Statement
Solution
How it works...
Web log analytics
Getting ready
How to do it...
Problem statement
Solution
How it works...
Sensitive data masking and encryption using Hadoop
Getting ready
How to do it...
Problem statement
Solution
How it works...
Index