Mastering Predictive Analytics with R (e-book)

Author: Rui Miguel Forte

Publisher: Packt Publishing

Publication date: 2015-06-17

Word count: 2.626 million

This book is intended for the budding data scientist, predictive modeler, or quantitative analyst with only a basic exposure to R and statistics. It is also designed as a reference for experienced professionals who want to brush up on the details of a particular type of predictive model. Mastering Predictive Analytics with R assumes familiarity with only the fundamentals of R, such as the main data types, simple functions, and how to move data around. No prior experience with machine learning or predictive modeling is assumed; however, you should have a basic understanding of statistics and calculus at a high school level.
Mastering Predictive Analytics with R

Table of Contents

Credits

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Gearing Up for Predictive Modeling

Models

Learning from data

The core components of a model

Our first model: k-nearest neighbors

Types of models

Supervised, unsupervised, semi-supervised, and reinforcement learning models

Parametric and nonparametric models

Regression and classification models

Real-time and batch machine learning models

The process of predictive modeling

Defining the model's objective

Collecting the data

Picking a model

Preprocessing the data

Exploratory data analysis

Feature transformations

Encoding categorical features

Missing data

Outliers

Removing problematic features

Feature engineering and dimensionality reduction

Training and assessing the model

Repeating with different models and final model selection

Deploying the model

Performance metrics

Assessing regression models

Assessing classification models

Assessing binary classification models

Summary

2. Linear Regression

Introduction to linear regression

Assumptions of linear regression

Simple linear regression

Estimating the regression coefficients

Multiple linear regression

Predicting CPU performance

Predicting the price of used cars

Assessing linear regression models

Residual analysis

Significance tests for linear regression

Performance metrics for linear regression

Comparing different regression models

Test set performance

Problems with linear regression

Multicollinearity

Outliers

Feature selection

Regularization

Ridge regression

Least absolute shrinkage and selection operator (lasso)

Implementing regularization in R

Summary

3. Logistic Regression

Classifying with linear regression

Introduction to logistic regression

Generalized linear models

Interpreting coefficients in logistic regression

Assumptions of logistic regression

Maximum likelihood estimation

Predicting heart disease

Assessing logistic regression models

Model deviance

Test set performance

Regularization with the lasso

Classification metrics

Extensions of the binary logistic classifier

Multinomial logistic regression

Predicting glass type

Ordinal logistic regression

Predicting wine quality

Summary

4. Neural Networks

The biological neuron

The artificial neuron

Stochastic gradient descent

Gradient descent and local minima

The perceptron algorithm

Linear separation

The logistic neuron

Multilayer perceptron networks

Training multilayer perceptron networks

Predicting the energy efficiency of buildings

Evaluating multilayer perceptrons for regression

Predicting glass type revisited

Predicting handwritten digits

Receiver operating characteristic curves

Summary

5. Support Vector Machines

Maximal margin classification

Support vector classification

Inner products

Kernels and support vector machines

Predicting chemical biodegradation

Cross-validation

Predicting credit scores

Multiclass classification with support vector machines

Summary

6. Tree-based Methods

The intuition for tree models

Algorithms for training decision trees

Classification and regression trees

CART regression trees

Tree pruning

Missing data

Regression model trees

CART classification trees

C5.0

Predicting class membership on synthetic 2D data

Predicting the authenticity of banknotes

Predicting complex skill learning

Tuning model parameters in CART trees

Variable importance in tree models

Regression model trees in action

Summary

7. Ensemble Methods

Bagging

Margins and out-of-bag observations

Predicting complex skill learning with bagging

Predicting heart disease with bagging

Limitations of bagging

Boosting

AdaBoost

Predicting atmospheric gamma ray radiation

Predicting complex skill learning with boosting

Limitations of boosting

Random forests

The importance of variables in random forests

Summary

8. Probabilistic Graphical Models

A little graph theory

Bayes' Theorem

Conditional independence

Bayesian networks

The Naïve Bayes classifier

Predicting the sentiment of movie reviews

Hidden Markov models

Predicting promoter gene sequences

Predicting letter patterns in English words

Summary

9. Time Series Analysis

Fundamental concepts of time series

Time series summary functions

Some fundamental time series

White noise

Fitting a white noise time series

Random walk

Fitting a random walk

Stationarity

Stationary time series models

Moving average models

Autoregressive models

Autoregressive moving average models

Non-stationary time series models

Autoregressive integrated moving average models

Autoregressive conditional heteroscedasticity models

Generalized autoregressive heteroscedasticity models

Predicting intense earthquakes

Predicting lynx trappings

Predicting foreign exchange rates

Other time series models

Summary

10. Topic Modeling

An overview of topic modeling

Latent Dirichlet Allocation

The Dirichlet distribution

The generative process

Fitting an LDA model

Modeling the topics of online news stories

Model stability

Finding the number of topics

Topic distributions

Word distributions

LDA extensions

Summary

11. Recommendation Systems

Rating matrix

Measuring user similarity

Collaborative filtering

User-based collaborative filtering

Item-based collaborative filtering

Singular value decomposition

R and Big Data

Predicting recommendations for movies and jokes

Loading and preprocessing the data

Exploring the data

Evaluating binary top-N recommendations

Evaluating non-binary top-N recommendations

Evaluating individual predictions

Other approaches to recommendation systems

Summary

Index
