
Modern Scala Projects (e-book)

Price: ¥

2 readers | 0 reviews | Rating: 9.8

Author: Ilango Gurusamy

Publisher: Packt Publishing

Publication date: 2018-07-30

Word count: 337,000

Category: Imported Books > Foreign-Language Originals > Computers/Networking

Note: Digital products cannot be returned or exchanged; source files are not provided; export and printing are not supported.


  • Book description
  • Table of contents
  • Reviews (0)
Use an open source firewall and features such as failover, load balancer, OpenVPN, IPSec, and Squid to protect your network

Key Features
  • Explore pfSense, a trusted open source network security solution
  • Configure pfSense as a firewall and create and manage firewall rules
  • Test pfSense for failover and load balancing across multiple WAN connections

Book Description
While connected to the internet, you're a potential target for an array of cyber threats, such as hackers, keyloggers, and Trojans that attack through unpatched security holes. A firewall works as a barrier (or 'shield') between your computer and cyberspace. pfSense is highly versatile firewall software. With thousands of enterprises using pfSense, it is fast becoming the world's most trusted open source network security solution.

Network Security with pfSense begins with an introduction to pfSense, where you will gain an understanding of what pfSense is, its key features, and advantages. Next, you will learn how to configure pfSense as a firewall and create and manage firewall rules. As you make your way through the chapters, you will test pfSense for failover and load balancing across multiple wide area network (WAN) connections. You will then configure pfSense with OpenVPN for secure remote connectivity and implement IPsec VPN tunnels with pfSense. In the concluding chapters, you'll understand how to configure and integrate pfSense as a Squid proxy server. By the end of this book, you will be able to leverage the power of pfSense to build a secure network.

What you will learn
  • Understand what pfSense is, its key features, and advantages
  • Configure pfSense as a firewall
  • Set up pfSense for failover and load balancing
  • Connect clients through an OpenVPN client
  • Configure an IPsec VPN tunnel with pfSense
  • Integrate the Squid proxy into pfSense

Who this book is for
Network Security with pfSense is for IT administrators, security administrators, technical architects, chief experience officers, and individuals who own a home or small office network and want to secure it.
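pfSense's firewall rules are, under the hood, FreeBSD pf rules. As an illustrative sketch only (the interface name em0 and the default-deny policy are assumptions for the example, not configuration taken from the book), a minimal pf.conf expressing "block everything, allow outbound web traffic" might look like:

```
# Default-deny policy: block all inbound traffic
block in all

# Allow outbound HTTP/HTTPS on interface em0, keeping state
# so reply packets are permitted automatically
pass out on em0 proto tcp from any to any port { 80 443 } keep state
```

In pfSense itself, rules like these are normally created through the web GUI rather than by editing pf.conf directly.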
Table of contents

Title Page

Copyright and Credits

Modern Scala Projects

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the author

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Predict the Class of a Flower from the Iris Dataset

A multivariate classification problem

Understanding multivariate

Different kinds of variables

Categorical variables

Fisher's Iris dataset

The Iris dataset represents a multiclass, multidimensional classification task

The training dataset

The mapping function

An algorithm and its mapping function

Supervised learning – how it relates to the Iris classification task

Random Forest classification algorithm

Project overview – problem formulation

Getting started with Spark

Setting up prerequisite software

Installing Spark in standalone deploy mode

Developing a simple interactive data analysis utility

Reading a data file and deriving DataFrame out of it

Implementing the Iris pipeline

Iris pipeline implementation objectives

Step 1 – getting the Iris dataset from the UCI Machine Learning Repository

Step 2 – preliminary EDA

Firing up Spark shell

Loading the iris.csv file and building a DataFrame

Calculating statistics

Inspecting your SparkConf again

Calculating statistics again

Step 3 – creating an SBT project

Step 4 – creating Scala files in SBT project

Step 5 – preprocessing, data transformation, and DataFrame creation

DataFrame creation

Step 6 – creating, training, and testing data

Step 7 – creating a Random Forest classifier

Step 8 – training the Random Forest classifier

Step 9 – applying the Random Forest classifier to test data

Step 10 – evaluating the Random Forest classifier

Step 11 – running the pipeline as an SBT application

Step 12 – packaging the application

Step 13 – submitting the pipeline application to Spark local

Summary

Questions

Build a Breast Cancer Prognosis Pipeline with the Power of Spark and Scala

Breast cancer classification problem

Breast cancer dataset at a glance

Logistic regression algorithm

Salient characteristics of LR

Binary logistic regression assumptions

A fictitious dataset and LR

LR as opposed to linear regression

Formulation of a linear regression classification model

Logit function as a mathematical equation

LR function

Getting started

Setting up prerequisite software

Implementation objectives

Implementation objective 1 – getting the breast cancer dataset

Implementation objective 2 – deriving a dataframe for EDA

Step 1 – conducting preliminary EDA

Step 2 – loading data and converting it to an RDD[String]

Step 3 – splitting the resilient distributed dataset and reorganizing individual rows into an array

Step 4 – purging the dataset of rows containing question mark characters

Step 5 – running a count after purging the dataset of rows with questionable characters

Step 6 – getting rid of header

Step 7 – creating a two-column DataFrame

Step 8 – creating the final DataFrame

Random Forest breast cancer pipeline

Step 1 – creating an RDD and preprocessing the data

Step 2 – creating training and test data

Step 3 – training the Random Forest classifier

Step 4 – applying the classifier to the test data

Step 5 – evaluating the classifier

Step 6 – running the pipeline as an SBT application

Step 7 – packaging the application

Step 8 – deploying the pipeline app into Spark local

LR breast cancer pipeline

Implementation objectives

Implementation objectives 1 and 2

Implementation objective 3 – Spark ML workflow for the breast cancer classification task

Implementation objective 4 – coding steps for building the indexer and logit machine learning model

Extending our pipeline object with the WisconsinWrapper trait

Importing the StringIndexer algorithm and using it

Splitting the DataFrame into training and test datasets

Creating a LogisticRegression classifier and setting hyperparameters on it

Running the LR model on the test dataset

Building a breast cancer pipeline with two stages

Implementation objective 5 – evaluating the binary classifier's performance

Summary

Questions

Stock Price Predictions

Stock price binary classification problem

Stock price prediction dataset at a glance

Getting started

Support for hardware virtualization

Installing the supported virtualization application

Downloading the HDP Sandbox and importing it

Hortonworks Sandbox virtual appliance overview

Turning on the virtual machine and powering up the Sandbox

Setting up SSH access for data transfer between Sandbox and the host machine

Setting up PuTTY, a third-party SSH and Telnet client

Setting up WinSCP, an SFTP client for Windows

Updating the default Python required by Zeppelin

What is Zeppelin?

Updating our Zeppelin instance

Launching the Ambari Dashboard and Zeppelin UI

Updating Zeppelin Notebook configuration by adding or updating interpreters

Updating a Spark 2 interpreter

Implementation objectives

List of implementation goals

Step 1 – creating a Scala representation of the path to the dataset file

Step 2 – creating an RDD[String]

Step 3 – splitting the RDD around the newline character in the dataset

Step 4 – transforming the RDD[String]

Step 5 – carrying out preliminary data analysis

Creating DataFrame from the original dataset

Dropping the Date and Label columns from the DataFrame

Having Spark describe the DataFrame

Adding a new column to the DataFrame and deriving Vector out of it

Removing stop words – a preprocessing step

Transforming the merged DataFrame

Transforming a DataFrame into an array of NGrams

Adding a new column to the DataFrame, devoid of stop words

Constructing a vocabulary from our dataset corpus

Training CountVectorizer

Using StringIndexer to transform our input label column

Dropping the input label column

Adding a new column to our DataFrame

Dividing the DataSet into training and test sets

Creating labelIndexer to index the indexedLabel column

Creating StringIndexer to index a column label

Creating RandomForestClassifier

Creating a new data pipeline with three stages

Creating a new data pipeline with hyperparameters

Training our new data pipeline

Generating stock price predictions

Summary

Questions

Building a Spam Classification Pipeline

Spam classification problem

Relevant background topics

Multidimensional data

Features and their importance

Classification task

Classification outcomes

Two possible classification outcomes

Project overview – problem formulation

Getting started

Setting up prerequisite software

Spam classification pipeline

Implementation steps

Step 1 – setting up your project folder

Step 2 – upgrading your build.sbt file

Step 3 – creating a trait called SpamWrapper

Step 4 – describing the dataset

Description of the SpamHam dataset

Step 5 – creating a new spam classifier class

Step 6 – listing the data preprocessing steps

Step 7 – regex to remove punctuation marks and whitespaces

Step 8 – creating a ham dataframe with punctuation removed

Creating a labeled ham dataframe

Step 9 – creating a spam dataframe devoid of punctuation

Step 10 – joining the spam and ham datasets

Step 11 – tokenizing our features

Step 12 – removing stop words

Step 13 – feature extraction

Step 14 – creating training and test datasets

Summary

Questions

Further reading

Build a Fraud Detection System

Fraud detection problem

Fraud detection dataset at a glance

Precision, recall, and the F1 score

Feature selection

The Gaussian Distribution function

Where does Spark fit in all this?

Fraud detection approach

Project overview – problem formulation

Getting started

Setting up Hortonworks Sandbox in the cloud

Creating your Azure free account, and signing in

The Azure Marketplace

The HDP Sandbox home page

Implementation objectives

Implementation steps

Create the FraudDetection trait

Broadcasting mean and standard deviation vectors

Calculating PDFs

F1 score

Calculating the best error term and best F1 score

Maximum and minimum values of a probability density

Step size for best error term calculation

A loop to generate the best F1 and the best error term

Generating predictions – outliers that represent fraud

Generating the best error term and best F1 measure

Preparing to compute precision and recall

A recap of how we looped through a range of epsilons, the best error term, and the best F1 measure

Function to calculate false positives

Summary

Questions

Further reading

Build Flights Performance Prediction Model

Overview of flight delay prediction

The flight dataset at a glance

Problem formulation of flight delay prediction

Getting started

Setting up prerequisite software

Increasing Java memory

Reviewing the JDK version

MongoDB installation

Implementation and deployment

Implementation objectives

Creating a new Scala project

Building the AirlineWrapper Scala trait

Summary

Questions

Further reading

Building a Recommendation Engine

Problem overviews

Recommendations on Amazon

Brief overview

Detailed overview

On-site recommendations

Recommendation systems

Definition

Categorizing recommendations

Implicit recommendations

Explicit recommendations

Recommendations for machine learning

Collaborative filtering algorithms

Recommendations problem formulation

Understanding datasets

Detailed overview

Recommendations regarding problem formulation

Defining explicit feedback

Building a narrative

Sales leads and past sales

Weapon sales leads and past sales data

Implementation and deployment

Implementation

Step 1 – creating the Scala project

Step 2 – creating the AirlineWrapper definition

Step 3 – creating a weapon sales orders schema

Step 4 – creating a weapon sales leads schema

Step 5 – building a weapon sales order dataframe

Step 6 – displaying the weapons sales dataframe

Step 7 – displaying the customer-weapons-system dataframe

Step 8 – generating predictions

Step 9 – displaying predictions

Compilation and deployment

Compiling the project

What is an assembly.sbt file?

Creating assembly.sbt

Contents of assembly.sbt

Running the sbt assembly task

Upgrading the build.sbt file

Rerunning the assembly command

Deploying the recommendation application

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think
