Scala Machine Learning Projects (e-book)

Author: Md. Rezaul Karim

Publisher: Packt Publishing

Publication date: 2018-01-31

Word count: 551,000

Book description

Build powerful smart applications using deep learning algorithms to dominate numerical computing, deep learning, and functional programming.

About This Book

  • Explore machine learning techniques with prominent open source Scala libraries such as Spark ML, H2O, MXNet, Zeppelin, and DeepLearning4j
  • Solve real-world machine learning problems by delving into complex numerical computing with Scala functional programming in a scalable, faster way
  • Cover all key aspects, such as collecting, storing, processing, analyzing, and evaluating data, required to build and deploy machine learning models on computing clusters using the Scala Play framework

Who This Book Is For

If you want to leverage the power of both Scala and Spark to make sense of Big Data, then this book is for you. If you are well versed in machine learning concepts and want to expand your knowledge by delving into practical implementation using the power of Scala, then this book is what you need! A strong understanding of the Scala programming language is recommended, and basic familiarity with machine learning techniques will also help.

What You Will Learn

  • Apply advanced regression techniques to boost the performance of predictive models
  • Use different classification algorithms for business analytics
  • Generate trading strategies for Bitcoin and stock trading using ensemble techniques
  • Train Deep Neural Networks (DNN) using H2O and Spark ML
  • Utilize NLP to build scalable machine learning models
  • Learn how to apply reinforcement learning algorithms such as Q-learning to develop ML applications
  • Learn how to use autoencoders to develop a fraud detection application
  • Implement LSTM and CNN models using DeepLearning4j and MXNet

In Detail

Machine learning has had a huge impact on academia and industry by turning data into actionable information. Scala has seen a steady rise in adoption over the past few years, especially in the fields of data science and analytics.

This book is for data scientists, data engineers, and deep learning enthusiasts who have a background in complex numerical computing and want to learn more about hands-on machine learning application development. Through 11 end-to-end projects, you will become acquainted with popular machine learning libraries such as Spark ML, H2O, DeepLearning4j, and MXNet. By the end, you will be able to use numerical computing and functional programming to carry out complex numerical tasks and to develop, build, and deploy research or commercial projects in a production-ready environment.

Style and approach

This book applies machine learning and deep learning across different domains, offering best practices and tips from real-world case studies that help you avoid pitfalls and fallacies when making decisions based on predictive analytics with ML models.
Table of contents

Title Page

Copyright and Credits

Scala Machine Learning Projects

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the author

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Analyzing Insurance Severity Claims

Machine learning and learning workflow

Typical machine learning workflow

Hyperparameter tuning and cross-validation

Analyzing and predicting insurance severity claims

Motivation

Description of the dataset

Exploratory analysis of the dataset

Data preprocessing

LR for predicting insurance severity claims

Developing insurance severity claims predictive model using LR

GBT regressor for predicting insurance severity claims

Boosting the performance using random forest regressor

Random Forest for classification and regression

Comparative analysis and model deployment

Spark-based model deployment for large-scale dataset

Summary

Analyzing and Predicting Telecommunication Churn

Why do we perform churn analysis, and how do we do it?

Developing a churn analytics pipeline

Description of the dataset

Exploratory analysis and feature engineering

LR for churn prediction

SVM for churn prediction

DTs for churn prediction

Random Forest for churn prediction

Selecting the best model for deployment

Summary

High Frequency Bitcoin Price Prediction from Historical and Live Data

Bitcoin, cryptocurrency, and online trading

State-of-the-art automated trading of Bitcoin

Training

Prediction

High-level data pipeline of the prototype

Historical and live-price data collection

Historical data collection

Transformation of historical data into a time series

Assumptions and design choices

Data preprocessing

Real-time data through the Cryptocompare API

Model training for prediction

Scala Play web service

Concurrency through Akka actors

Web service workflow

JobModule

Scheduler

SchedulerActor

PredictionActor and the prediction step

TraderActor

Predicting prices and evaluating the model

Demo prediction using Scala Play framework

Why RESTful architecture?

Project structure

Running the Scala Play web app

Summary

Population-Scale Clustering and Ethnicity Prediction

Population-scale clustering and geographic ethnicity

Machine learning for genetic variants

1000 Genomes Project dataset description

Algorithms, tools, and techniques

H2O and Sparkling Water

ADAM for large-scale genomics data processing

Unsupervised machine learning

Population genomics and clustering

How does K-means work?

DNNs for geographic ethnicity prediction

Configuring programming environment

Data pre-processing and feature engineering

Model training and hyperparameter tuning

Spark-based K-means for population-scale clustering

Determining the number of optimal clusters

Using H2O for ethnicity prediction

Using random forest for ethnicity prediction

Summary

Topic Modeling - A Better Insight into Large-Scale Texts

Topic modeling and text clustering

How does the LDA algorithm work?

Topic modeling with Spark MLlib and Stanford NLP

Implementation

Step 1 - Creating a Spark session

Step 2 - Creating vocabulary and tokens count to train the LDA after text pre-processing

Step 3 - Instantiate the LDA model before training

Step 4 - Set the NLP optimizer

Step 5 - Training the LDA model

Step 6 - Prepare the topics of interest

Step 7 - Topic modeling

Step 8 - Measuring the likelihood of two documents

Other topic models versus the scalability of LDA

Deploying the trained LDA model

Summary

Developing Model-based Movie Recommendation Engines

Recommendation system

Collaborative filtering approaches

Content-based filtering approaches

Hybrid recommender systems

Model-based collaborative filtering

The utility matrix

Spark-based movie recommendation systems

Item-based collaborative filtering for movie similarity

Step 1 - Importing necessary libraries and creating a Spark session

Step 2 - Reading and parsing the dataset

Step 3 - Computing similarity

Step 4 - Testing the model

Model-based recommendation with Spark

Data exploration

Movie recommendation using ALS

Step 1 - Import packages, load, parse, and explore the movie and rating dataset

Step 2 - Register both DataFrames as temp tables to make querying easier

Step 3 - Explore and query for related statistics

Step 4 - Prepare training and test rating data and check the counts

Step 5 - Prepare the data for building the recommendation model using ALS

Step 6 - Build an ALS user product matrix

Step 7 - Making predictions

Step 8 - Evaluating the model

Selecting and deploying the best model

Summary

Options Trading Using Q-learning and Scala Play Framework

Reinforcement versus supervised and unsupervised learning

Using RL

Notation, policy, and utility in RL

Policy

Utility

A simple Q-learning implementation

Components of the Q-learning algorithm

States and actions in QLearning

The search space

The policy and action-value

QLearning model creation and training

QLearning model validation

Making predictions using the trained model

Developing an options trading web app using Q-learning

Problem description

Implementing an options trading web application

Creating an option property

Creating an option model

Putting it all together

Evaluating the model

Wrapping up the options trading app as a Scala web app

The backend

The frontend

Running and Deployment Instructions

Model deployment

Summary

Clients Subscription Assessment for Bank Telemarketing using Deep Neural Networks

Client subscription assessment through telemarketing

Dataset description

Installing and getting started with Apache Zeppelin

Building from the source

Starting and stopping Apache Zeppelin

Creating notebooks

Exploratory analysis of the dataset

Label distribution

Job distribution

Marital distribution

Education distribution

Default distribution

Housing distribution

Loan distribution

Contact distribution

Month distribution

Day distribution

Previous outcome distribution

Age feature

Duration distribution

Campaign distribution

Pdays distribution

Previous distribution

emp_var_rate distributions

cons_price_idx features

cons_conf_idx distribution

Euribor3m distribution

nr_employed distribution

Statistics of numeric features

Implementing a client subscription assessment model

Hyperparameter tuning and feature selection

Number of hidden layers

Number of neurons per hidden layer

Activation functions

Weight and bias initialization

Regularization

Summary

Fraud Analytics Using Autoencoders and Anomaly Detection

Outlier and anomaly detection

Autoencoders and unsupervised learning

Working principles of an autoencoder

Efficient data representation with autoencoders

Developing a fraud analytics model

Description of the dataset and using linear models

Problem description

Preparing programming environment

Step 1 - Loading required packages and libraries

Step 2 - Creating a Spark session and importing implicits

Step 3 - Loading and parsing input data

Step 4 - Exploratory analysis of the input data

Step 5 - Preparing the H2O DataFrame

Step 6 - Unsupervised pre-training using autoencoder

Step 7 - Dimensionality reduction with hidden layers

Step 8 - Anomaly detection

Step 9 - Pre-trained supervised model

Step 10 - Model evaluation on the highly-imbalanced data

Step 11 - Stopping the Spark session and H2O context

Auxiliary classes and methods

Hyperparameter tuning and feature selection

Summary

Human Activity Recognition using Recurrent Neural Networks

Working with RNNs

Contextual information and the architecture of RNNs

RNN and the long-term dependency problem

LSTM networks

Human activity recognition using the LSTM model

Dataset description

Setting and configuring MXNet for Scala

Implementing an LSTM model for HAR

Step 1 - Importing necessary libraries and packages

Step 2 - Creating MXNet context

Step 3 - Loading and parsing the training and test set

Step 4 - Exploratory analysis of the dataset

Step 5 - Defining internal RNN structure and LSTM hyperparameters

Step 6 - LSTM network construction

Step 7 - Setting up an optimizer

Step 8 - Training the LSTM network

Step 9 - Evaluating the model

Tuning LSTM hyperparameters and GRU

Summary

Image Classification using Convolutional Neural Networks

Image classification and drawbacks of DNNs

CNN architecture

Convolutional operations

Pooling layer and padding operations

Subsampling operations

Convolutional and subsampling operations in DL4j

Configuring DL4j, ND4s, and ND4j

Convolutional and subsampling operations in DL4j

Large-scale image classification using CNN

Problem description

Description of the image dataset

Workflow of the overall project

Implementing CNNs for image classification

Image processing

Extracting image metadata

Image feature extraction

Preparing the ND4j dataset

Training the CNNs and saving the trained models

Evaluating the model

Wrapping up by executing the main() method

Tuning and optimizing CNN hyperparameters

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think
