Title Page
Credits
About the Author
About the Reviewer
www.PacktPub.com
Customer Feedback
Dedication
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
Introduction to Machine Learning and Predictive Analytics
Introducing Amazon Machine Learning
Machine Learning as a Service
Leveraging full AWS integration
Comparing performances
Engineering data versus model variety
Amazon's expertise and the gradient descent algorithm
Pricing
Understanding predictive analytics
Building the simplest predictive analytics algorithm
Regression versus classification
Expanding regression to classification with logistic regression
Extracting features to predict outcomes
Diving further into linear modeling for prediction
Validating the dataset
Missing from Amazon ML
The statistical approach versus the machine learning approach
Summary
Machine Learning Definitions and Concepts
What's an algorithm? What's a model?
Dealing with messy data
Classic datasets versus real-world datasets
Assumptions for multiclass linear models
Missing values
Normalization
Imbalanced datasets
Addressing multicollinearity
Detecting outliers
Accepting non-linear patterns
Adding features?
Preprocessing recapitulation
The predictive analytics workflow
Training and evaluation in Amazon ML
Identifying and correcting poor performances
Underfitting
Overfitting
Regularization on linear models
L2 regularization and Ridge
L1 regularization and Lasso
Evaluating the performance of your model
Summary
Overview of an Amazon Machine Learning Workflow
Opening an Amazon Web Services Account
Security
Setting up the account
Creating a user
Defining policies
Creating login credentials
Choosing a region
Overview of a standard Amazon Machine Learning workflow
The dataset
Loading the data on S3
Declaring a datasource
Creating the datasource
The model
The evaluation of the model
Comparing with a baseline
Making batch predictions
Summary
Loading and Preparing the Dataset
Working with datasets
Finding open datasets
Introducing the Titanic dataset
Preparing the data
Splitting the data
Loading data on S3
Creating a bucket
Loading the data
Granting permissions
Formatting the data
Creating the datasource
Verifying the data schema
Reusing the schema
Examining data statistics
Feature engineering with Athena
Introducing Athena
A brief tour of AWS Athena
Creating a titanic database
Using the wizard
Creating the database and table directly in SQL
Data munging in SQL
Missing values
Handling outliers in the fare
Extracting the title from the name
Inferring the deck from the cabin
Calculating family size
Wrapping up
Creating an improved datasource
Summary
Model Creation
Transforming data with recipes
Managing variables
Grouping variables
Naming variables with assignments
Specifying outputs
Data processing through seven transformations
Using simple transformations
Text mining
Coupling variables
Binning numeric values
Creating a model
Editing the suggested recipe
Applying recipes to the Titanic dataset
Choosing between recipes and data preprocessing
Parametrizing the model
Setting model memory
Setting the number of data passes
Choosing regularization
Creating an evaluation
Evaluating the model
Evaluating binary classification
Exploring the model performances
Evaluating linear regression
Evaluating multiclass classification
Analyzing the logs
Optimizing the learning rate
Visualizing convergence
Impact of regularization
Comparing different recipes on the Titanic dataset
Keeping variables as numeric or applying quantile binning?
Parsing the model logs
Summary
Predictions and Performances
Making batch predictions
Creating the batch prediction job
Interpreting prediction outputs
Reading the manifest file
Reading the results file
Assessing our predictions
Evaluating the held-out dataset
Finding out who will survive
Multiplying trials
Making real-time predictions
Manually exploring variable influence
Setting up real-time predictions
AWS SDK
Setting up AWS credentials
AWS access keys
Setting up AWS CLI
Python SDK
Summary
Command Line and SDK
Getting started and setting up
Using the CLI versus SDK
Installing AWS CLI
Picking up CLI syntax
Passing parameters using JSON files
Introducing the Ames Housing dataset
Splitting the dataset with shell commands
A simple project using the CLI
An overview of Amazon ML CLI commands
Creating the datasource
Creating the model
Evaluating our model with create-evaluation
What is cross-validation?
Implementing Monte Carlo cross-validation
Generating the shuffled datasets
Generating the datasources template
Generating the models template
Generating the evaluations template
The results
Conclusion
Boto3, the Python SDK
Working with the Python SDK for Amazon Machine Learning
Waiting on operation completion
Wrapping up the Python-based workflow
Implementing recursive feature selection with Boto3
Managing schema and recipe
Summary
Creating Datasources from Redshift
Choosing between RDS and Redshift
Creating a Redshift instance
Connecting through the command line
Executing Redshift queries using psql
Creating our own non-linear dataset
Uploading the nonlinear data to Redshift
Introducing polynomial regression
Establishing a baseline
Polynomial regression in Amazon ML
Driving the trials in Python
Interpreting the results
Summary
Building a Streaming Data Analysis Pipeline
Streaming Twitter sentiment analysis
Popularity contest on Twitter
The training dataset and the model
Kinesis
Kinesis Stream
Kinesis Analytics
Setting up Kinesis Firehose
Producing tweets
The Redshift database
Adding Redshift to the Kinesis Firehose
Setting up the roles and policies
Dependencies and debugging
Data format synchronization
Debugging
Preprocessing with Lambda
Analyzing the results
Downloading the dataset from Redshift
Sentiment analysis with TextBlob
Removing duplicate tweets
And what is the most popular vegetable?
Going beyond classification and regression
Summary