Hands-On Data Warehousing with Azure Data Factory (e-book)

Authors: Christian Coté, Michelle Gutzait, Giuseppe Ciaburro

Publisher: Packt Publishing

Publication date: 2018-05-31

Word count: 164,000

Leverage the power of Microsoft Azure Data Factory v2 to build hybrid data solutions

About This Book

  • Combine the power of Azure Data Factory v2 and SQL Server Integration Services
  • Design and enhance the performance and scalability of a modern ETL hybrid solution
  • Interact with the loaded data in the data warehouse and data lake using Power BI

Who This Book Is For

This book is for you if you are a software professional who develops and implements ETL solutions using Microsoft SQL Server or Azure cloud. It will be an added advantage if you are a software engineer, DW/ETL architect, or ETL developer, and know how to create a new ETL implementation or enhance an existing one with ADF or SSIS.

What You Will Learn

  • Understand the key components of an ETL solution using Azure Data Factory and Integration Services
  • Design the architecture of a modern ETL hybrid solution
  • Implement ETL solutions for both on-premises and Azure data
  • Improve the performance and scalability of your ETL solution
  • Gain thorough knowledge of new capabilities and features added to Azure Data Factory and Integration Services

In Detail

ETL is one of the essential techniques in data processing. Given that data is everywhere, ETL will always be a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how Azure Data Factory and SSIS can be used to understand the key components of an ETL solution. With the help of practical examples, you will go through different Azure services that can be used by ADF and SSIS, such as Azure Data Lake Analytics, Machine Learning, and Databricks Spark. You will explore how to design and implement ETL hybrid solutions using different integration services with a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights.

By the end of this book, you will not only learn how to build your own ETL solutions but also address the key challenges that are faced while building them.

Style and approach

A step-by-step guide to developing data movement code using SSIS, Azure Data Factory, and database stored procedures for implementing intelligent BI solutions.
Table of Contents

Title Page

Copyright and Credits

Hands-On Data Warehousing with Azure Data Factory

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the authors

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

The Modern Data Warehouse

The need for a data warehouse

Driven by IT

Self-service BI

Cloud-based BI – big data and artificial intelligence

The modern data warehouse

Main components of a data warehouse

Staging area

Data warehouse

Cubes

Consumption layer – BI and analytics

What is Azure Data Factory

Limitations of ADF V1.0

What's new in V2.0?

Integration runtime

Linked services

Datasets

Pipelines

Activities

Parameters

Expressions

Controlling the flow of activities

SSIS package deployment in Azure

Spark cluster data store

Summary

Getting Started with Our First Data Factory

Resource group

Azure Data Factory

Datasets

Linked services

Integration runtimes

Activities

Monitoring the data factory pipeline runs

Azure Blob storage

Blob containers

Types of blobs

Block blobs

Page blobs

Replication of storage

Creating an Azure Blob storage account

SQL Azure database

Creating the Azure SQL Server

Attaching the BACPAC to our database

Copying data using our data factory

Summary

SSIS Lift and Shift

SSIS in ADF

Sample setup

Sample databases

SSIS components

Integration services catalog setup

Sample solution in Visual Studio

Deploying the project on-premises

Leveraging our package in ADF V2

Integration runtimes

Azure integration runtime

Self-hosted runtime

SSIS integration runtime

Adding an SSIS integration runtime to the factory

SSIS execution from a pipeline

Summary

Azure Data Lake

Creating and configuring Data Lake Store

Next Steps

Ways to copy/import data from a database to the Data Lake

Ways to store imported data in files in the Data Lake

Easily moving data to the Data Lake Store

Ways to directly copy files into the Data Lake

Prerequisites for the next steps

Creating a Data Lake Analytics resource

Using the data factory to manipulate data in the Data Lake

Task 1 – copy/import data from SQL Server to a blob storage file using data factory

Task 2 – run a U-SQL task from the data factory pipeline to summarize data

Service principal authentication

Run U-SQL from a job in the Data Lake Analytics

Summary

Machine Learning on the Cloud

Machine learning overview

Machine learning algorithms

Supervised learning

Unsupervised learning

Reinforcement learning

Machine learning tasks

Making predictions with regression algorithms

Automated classification using machine learning

Identifying groups using clustering methods

Dimensionality reduction to improve performance

Feature selection

Feature extraction

Azure Machine Learning Studio

Azure Machine Learning Studio account

Azure Machine Learning Studio experiment

Dataset

Module

Work area

Breast cancer detection

Get the data

Prepare the data

Train the model

Score and evaluate the model

Summary

Introduction to Azure Databricks

Azure Databricks setup

Prepare the data to ingest

Setting up the folder in the Azure storage account

Self-hosted integration runtime

Linked service setup

Datasets setup

SQL Server dataset

Blob storage dataset

Linked service

Dataset

Copy data from SQL Server to sales-data

Publish and trigger the copy activity

Databricks notebook

Calling Databricks notebook execution in ADF

Summary

Reporting on the Modern Data Warehouse

Different types of BI

Self-service – personal

Team BI – sharing personal BI data

Corporate BI

Power BI Premium

Power BI Report Server

Power BI consumption

Creating our Power BI reports

Reporting with on-premise data sources

Incorporating Spark data

Summary
