Building Python Real-Time Applications with Storm (e-book)

Author: Kartik Bhatnagar

Publisher: Packt Publishing

Publication date: 2015-12-02

Word count: 615,000

Learn to process massive real-time data streams using Storm and Python, with no Java required!

About This Book

  • Learn to use Apache Storm and the Python Petrel library to build distributed applications that process large streams of data
  • Explore sample applications in real time and analyze them in the popular NoSQL databases MongoDB and Redis
  • Discover how to apply software development best practices to improve performance, productivity, and quality in your Storm projects

Who This Book Is For

This book is intended for Python developers who want to benefit from Storm’s real-time data processing capabilities. If you are new to Python, you’ll benefit from the attention to key supporting tools and techniques such as automated testing, virtual environments, and logging. If you’re an experienced Python developer, you’ll appreciate the thorough and detailed examples.

What You Will Learn

  • Install Storm and learn about the prerequisites
  • Get to know the components of a Storm topology and how to control the flow of data between them
  • Ingest Twitter data directly into Storm
  • Use Storm with MongoDB and Redis
  • Build topologies and run them in Storm
  • Use an interactive graphical debugger to debug your topology as it’s running in Storm
  • Test your topology components outside of Storm
  • Configure your topology using YAML

In Detail

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today. It has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily.

You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios for creating a Storm cluster. Next, you’ll be provided with an overview of Petrel, followed by an example Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.

Style and approach

This book takes an easy-to-follow, practical approach to help you understand all the concepts related to Storm and Python.


Building Python Real-Time Applications with Storm

Table of Contents

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Getting Acquainted with Storm

Overview of Storm

Before the Storm era

Key features of Storm

Storm cluster modes

Developer mode

Single-machine Storm cluster

Multimachine Storm cluster

The Storm client

Prerequisites for a Storm installation

Zookeeper installation

Storm installation

Enabling native (Netty only) dependency

Netty configuration

Starting daemons

Playing with optional configurations

Summary

2. The Storm Anatomy

Storm processes

Supervisor

Zookeeper

The Storm UI

Storm-topology-specific terminologies

The worker process, executor, and task

Worker processes

Executors

Tasks

Interprocess communication

A physical view of a Storm cluster

Stream grouping

Fault tolerance in Storm

Guaranteed tuple processing in Storm

XOR magic in acking

Tuning parallelism in Storm – scaling a distributed computation

Summary

3. Introducing Petrel

What is Petrel?

Building a topology

Packaging a topology

Logging events and errors

Managing third-party dependencies

Installing Petrel

Creating your first topology

Sentence spout

Splitter bolt

Word Counting Bolt

Defining a topology

Running the topology

Troubleshooting

Productivity tips with Petrel

Improving startup performance

Enabling and using logging

Automatic logging of fatal errors

Summary

4. Example Topology – Twitter

Twitter analysis

Twitter's Streaming API

Creating a Twitter app to use the Streaming API

The topology configuration file

The Twitter stream spout

Splitter bolt

Rolling word count bolt

The intermediate rankings bolt

The total rankings bolt

Defining the topology

Running the topology

Summary

5. Persistence Using Redis and MongoDB

Finding the top n ranked topics using Redis

The topology configuration file – the Redis case

Rolling word count bolt – the Redis case

Total rankings bolt – the Redis case

Defining the topology – the Redis case

Running the topology – the Redis case

Finding the hourly count of tweets by city name using MongoDB

Defining the topology – the MongoDB case

Running the topology – the MongoDB case

Summary

6. Petrel in Practice

Testing a bolt

Example – testing SplitSentenceBolt

Example – testing SplitSentenceBolt with WordCountBolt

Debugging

Installing Winpdb

Add Winpdb breakpoint

Launching and attaching the debugger

Profiling your topology's performance

Split sentence bolt log

Word count bolt log

Summary

A. Managing Storm Using Supervisord

Storm administration over a cluster

Introducing supervisord

Supervisord components

Supervisord installation

Configuration of supervisord.conf

Configuration of supervisord.conf on 172-31-19-62

Summary

Index
