Python Machine Learning By Example (e-book)

Author: Yuxi (Hayden) Liu

Publisher: Packt Publishing

Publication date: 2019-02-28

Word count: 458,000

Grasp machine learning concepts, techniques, and algorithms with the help of real-world examples using Python libraries such as TensorFlow and scikit-learn.

Key Features

* Exploit the power of Python to explore the world of data mining and data analytics
* Discover machine learning algorithms to solve complex challenges faced by data scientists today
* Use Python libraries such as TensorFlow and Keras to create smart cognitive actions for your projects

Book Description

Interest in machine learning (ML) has surged because it revolutionizes automation by learning patterns in data and using them to make predictions and decisions. If you’re interested in ML, this book will serve as your entry point.

Python Machine Learning By Example begins with an introduction to important ML concepts and implementations using Python libraries. Each chapter walks you through an industry-adopted application. You’ll implement ML techniques in areas such as exploratory data analysis, feature engineering, and natural language processing (NLP) in a clear and easy-to-follow way.

With the help of this extended and updated edition, you’ll understand how to tackle data-driven problems and implement your solutions with the powerful yet simple Python language and popular Python packages and tools such as TensorFlow, scikit-learn, gensim, and Keras. To aid your understanding of popular ML algorithms, the book covers interesting and easy-to-follow examples such as news topic modeling and classification, spam email detection, stock price forecasting, and more.

By the end of the book, you’ll have put together a broad picture of the ML ecosystem and will be well versed in the best practices of applying ML techniques to make the most of new opportunities.

What you will learn

* Understand the important concepts in machine learning and data science
* Use Python to explore the world of data mining and analytics
* Scale up model training using varied data complexities with Apache Spark
* Delve deep into text and NLP using Python libraries such as NLTK and gensim
* Select and build an ML model and evaluate and optimize its performance
* Implement ML algorithms from scratch in Python, TensorFlow, and scikit-learn

Who this book is for

If you’re a machine learning aspirant, data analyst, or data engineer who is passionate about machine learning and wants to begin working on ML assignments, this book is for you. Prior knowledge of Python coding is assumed; basic familiarity with statistical concepts is beneficial but not necessary.
Table of Contents

Title Page

Copyright and Credits

Python Machine Learning By Example Second Edition

About Packt

Why subscribe?

Packt.com

Dedication

Foreword

Contributors

About the author

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1: Fundamentals of Machine Learning

Getting Started with Machine Learning and Python

Defining machine learning and why we need it

A very high-level overview of machine learning technology

Types of machine learning tasks

A brief history of the development of machine learning algorithms

Core of machine learning – generalizing with data

Overfitting, underfitting, and the bias-variance trade-off

Avoiding overfitting with cross-validation

Avoiding overfitting with regularization

Avoiding overfitting with feature selection and dimensionality reduction

Preprocessing, exploration, and feature engineering

Missing values

Label encoding

One hot encoding

Scaling

Polynomial features

Power transform

Binning

Combining models

Voting and averaging

Bagging

Boosting

Stacking

Installing software and setting up

Setting up Python and environments

Installing the various packages

NumPy

SciPy

Pandas

Scikit-learn

TensorFlow

Summary

Exercises

Section 2: Practical Python Machine Learning By Example

Exploring the 20 Newsgroups Dataset with Text Analysis Techniques

How computers understand language - NLP

Picking up NLP basics while touring popular NLP libraries

Corpus

Tokenization

PoS tagging

Named-entity recognition

Stemming and lemmatization

Semantics and topic modeling

Getting the newsgroups data

Exploring the newsgroups data

Thinking about features for text data

Counting the occurrence of each word token

Text preprocessing

Dropping stop words

Stemming and lemmatizing words

Visualizing the newsgroups data with t-SNE

What is dimensionality reduction?

t-SNE for dimensionality reduction

Summary

Exercises

Mining the 20 Newsgroups Dataset with Clustering and Topic Modeling Algorithms

Learning without guidance – unsupervised learning

Clustering newsgroups data using k-means

How does k-means clustering work?

Implementing k-means from scratch

Implementing k-means with scikit-learn

Choosing the value of k

Clustering newsgroups data using k-means

Discovering underlying topics in newsgroups

Topic modeling using NMF

Topic modeling using LDA

Summary

Exercises

Detecting Spam Email with Naive Bayes

Getting started with classification

Types of classification

Applications of text classification

Exploring Naïve Bayes

Learning Bayes' theorem by examples

The mechanics of Naïve Bayes

Implementing Naïve Bayes from scratch

Implementing Naïve Bayes with scikit-learn

Classification performance evaluation

Model tuning and cross-validation

Summary

Exercise

Classifying Newsgroup Topics with Support Vector Machines

Finding separating boundary with support vector machines

Understanding how SVM works through different use cases

Case 1 – identifying a separating hyperplane

Case 2 – determining the optimal hyperplane

Case 3 – handling outliers

Implementing SVM

Case 4 – dealing with more than two classes

The kernels of SVM

Case 5 – solving linearly non-separable problems

Choosing between linear and RBF kernels

Classifying newsgroup topics with SVMs

More example – fetal state classification on cardiotocography

A further example – breast cancer classification using SVM with TensorFlow

Summary

Exercise

Predicting Online Ad Click-Through with Tree-Based Algorithms

Brief overview of advertising click-through prediction

Getting started with two types of data – numerical and categorical

Exploring decision tree from root to leaves

Constructing a decision tree

The metrics for measuring a split

Implementing a decision tree from scratch

Predicting ad click-through with decision tree

Ensembling decision trees – random forest

Implementing random forest using TensorFlow

Summary

Exercise

Predicting Online Ad Click-Through with Logistic Regression

Converting categorical features to numerical – one-hot encoding and ordinal encoding

Classifying data with logistic regression

Getting started with the logistic function

Jumping from the logistic function to logistic regression

Training a logistic regression model

Training a logistic regression model using gradient descent

Predicting ad click-through with logistic regression using gradient descent

Training a logistic regression model using stochastic gradient descent

Training a logistic regression model with regularization

Training on large datasets with online learning

Handling multiclass classification

Implementing logistic regression using TensorFlow

Feature selection using random forest

Summary

Exercises

Scaling Up Prediction to Terabyte Click Logs

Learning the essentials of Apache Spark

Breaking down Spark

Installing Spark

Launching and deploying Spark programs

Programming in PySpark

Learning on massive click logs with Spark

Loading click logs

Splitting and caching the data

One-hot encoding categorical features

Training and testing a logistic regression model

Feature engineering on categorical variables with Spark

Hashing categorical features

Combining multiple variables – feature interaction

Summary

Exercises

Stock Price Prediction with Regression Algorithms

Brief overview of the stock market and stock prices

What is regression?

Mining stock price data

Getting started with feature engineering

Acquiring data and generating features

Estimating with linear regression

How does linear regression work?

Implementing linear regression

Estimating with decision tree regression

Transitioning from classification trees to regression trees

Implementing decision tree regression

Implementing regression forest

Estimating with support vector regression

Implementing SVR

Estimating with neural networks

Demystifying neural networks

Implementing neural networks

Evaluating regression performance

Predicting stock price with four regression algorithms

Summary

Exercise

Section 3: Python Machine Learning Best Practices

Machine Learning Best Practices

Machine learning solution workflow

Best practices in the data preparation stage

Best practice 1 – completely understanding the project goal

Best practice 2 – collecting all fields that are relevant

Best practice 3 – maintaining the consistency of field values

Best practice 4 – dealing with missing data

Best practice 5 – storing large-scale data

Best practices in the training sets generation stage

Best practice 6 – identifying categorical features with numerical values

Best practice 7 – deciding on whether or not to encode categorical features

Best practice 8 – deciding on whether or not to select features, and if so, how to do so

Best practice 9 – deciding on whether or not to reduce dimensionality, and if so, how to do so

Best practice 10 – deciding on whether or not to rescale features

Best practice 11 – performing feature engineering with domain expertise

Best practice 12 – performing feature engineering without domain expertise

Best practice 13 – documenting how each feature is generated

Best practice 14 – extracting features from text data

Best practices in the model training, evaluation, and selection stage

Best practice 15 – choosing the right algorithm(s) to start with

Naïve Bayes

Logistic regression

SVM

Random forest (or decision tree)

Neural networks

Best practice 16 – reducing overfitting

Best practice 17 – diagnosing overfitting and underfitting

Best practice 18 – modeling on large-scale datasets

Best practices in the deployment and monitoring stage

Best practice 19 – saving, loading, and reusing models

Best practice 20 – monitoring model performance

Best practice 21 – updating models regularly

Summary

Exercises

Other Books You May Enjoy

Leave a review - let other readers know what you think
