Title Page
Copyright and Credits
Hands-On Machine Learning for Algorithmic Trading
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Machine Learning for Trading
How to read this book
What to expect
Who should read this book
How the book is organized
Part 1 – the framework – from data to strategy design
Part 2 – ML fundamentals
Part 3 – natural language processing
Part 4 – deep and reinforcement learning
What you need to succeed
Data sources
GitHub repository
Python libraries
The rise of ML in the investment industry
From electronic to high-frequency trading
Factor investing and smart beta funds
Algorithmic pioneers outperform humans at scale
ML-driven funds attract $1 trillion AUM
The emergence of quantamental funds
Investments in strategic capabilities
ML and alternative data
Crowdsourcing of trading algorithms
Design and execution of a trading strategy
Sourcing and managing data
Alpha factor research and evaluation
Portfolio optimization and risk management
Strategy backtesting
ML and algorithmic trading strategies
Use Cases of ML for Trading
Data mining for feature extraction
Supervised learning for alpha factor creation and aggregation
Asset allocation
Testing trade ideas
Reinforcement learning
Summary
Market and Fundamental Data
How to work with market data
Market microstructure
Marketplaces
Types of orders
Working with order book data
The FIX protocol
Nasdaq TotalView-ITCH order book data
Parsing binary ITCH messages
Reconstructing trades and the order book
Regularizing tick data
Tick bars
Time bars
Volume bars
Dollar bars
API access to market data
Remote data access using pandas
Reading HTML tables
pandas-datareader for market data
The Investor Exchange
Quantopian
Zipline
Quandl
Other market-data providers
How to work with fundamental data
Financial statement data
Automated processing – XBRL
Building a fundamental data time series
Extracting the financial statements and notes dataset
Retrieving all quarterly Apple filings
Building a price/earnings time series
Other fundamental data sources
pandas_datareader – macro and industry data
Efficient data storage with pandas
Summary
Alternative Data for Finance
The alternative data revolution
Sources of alternative data
Individuals
Business processes
Sensors
Satellites
Geolocation data
Evaluating alternative datasets
Evaluation criteria
Quality of the signal content
Asset classes
Investment style
Risk premiums
Alpha content and quality
Quality of the data
Legal and reputational risks
Exclusivity
Time horizon
Frequency
Reliability
Technical aspects
Latency
Format
The market for alternative data
Data providers and use cases
Social sentiment data
Dataminr
StockTwits
RavenPack
Satellite data
Geolocation data
Email receipt data
Working with alternative data
Scraping OpenTable data
Extracting data from HTML using requests and BeautifulSoup
Introducing Selenium – using browser automation
Building a dataset of restaurant bookings
One step further – Scrapy and Splash
Earnings call transcripts
Parsing HTML using regular expressions
Summary
Alpha Factor Research
Engineering alpha factors
Important factor categories
Momentum and sentiment factors
Rationale
Key metrics
Value factors
Rationale
Key metrics
Volatility and size factors
Rationale
Key metrics
Quality factors
Rationale
Key metrics
How to transform data into factors
Useful pandas and NumPy methods
Loading the data
Resampling from daily to monthly frequency
Computing momentum factors
Using lagged returns and different holding periods
Computing factor betas
Built-in Quantopian factors
TA-Lib
Seeking signals – how to use zipline
The architecture – event-driven trading simulation
A single alpha factor from market data
Combining factors from diverse data sources
Separating signal and noise – how to use alphalens
Creating forward returns and factor quantiles
Predictive performance by factor quantiles
The information coefficient
Factor turnover
Alpha factor resources
Alternative algorithmic trading libraries
Summary
Strategy Evaluation
How to build and test a portfolio with zipline
Scheduled trading and portfolio rebalancing
How to measure performance with pyfolio
The Sharpe ratio
The fundamental law of active management
In- and out-of-sample performance with pyfolio
Getting pyfolio input from alphalens
Getting pyfolio input from a zipline backtest
Walk-forward testing – out-of-sample returns
Summary – performance statistics
Drawdown periods and factor exposure
Modeling event risk
How to avoid the pitfalls of backtesting
Data challenges
Look-ahead bias
Survivorship bias
Outlier control
Unrepresentative period
Implementation issues
Mark-to-market performance
Trading costs
Timing of trades
Data snooping and backtest overfitting
The minimum backtest length and the deflated SR
Optimal stopping for backtests
How to manage portfolio risk and return
Mean-variance optimization
How it works
The efficient frontier in Python
Challenges and shortcomings
Alternatives to mean-variance optimization
The 1/n portfolio
The minimum-variance portfolio
Global Portfolio Optimization – the Black-Litterman approach
How to size your bets – the Kelly rule
The optimal size of a bet
Optimal investment – single asset
Optimal investment – multiple assets
Risk parity
Risk factor investment
Hierarchical risk parity
Summary
The Machine Learning Process
Learning from data
Supervised learning
Unsupervised learning
Applications
Clustering algorithms
Dimensionality reduction
Reinforcement learning
The machine learning workflow
Basic walkthrough – k-nearest neighbors
Frame the problem – goals and metrics
Prediction versus inference
Causal inference
Regression problems
Classification problems
Receiver operating characteristics and the area under the curve
Precision-recall curves
Collecting and preparing the data
Explore, extract, and engineer features
Using information theory to evaluate features
Selecting an ML algorithm
Design and tune the model
The bias-variance trade-off
Underfitting versus overfitting
Managing the trade-off
Learning curves
How to use cross-validation for model selection
How to implement cross-validation in Python
Basic train-test split
Cross-validation
Using a hold-out test set
KFold iterator
Leave-one-out CV
Leave-P-Out CV
ShuffleSplit
Parameter tuning with scikit-learn
Validation curves with yellowbrick
Learning curves
Parameter tuning using GridSearchCV and pipeline
Challenges with cross-validation in finance
Time series cross-validation with sklearn
Purging, embargoing, and combinatorial CV
Summary
Linear Models
Linear regression for inference and prediction
The multiple linear regression model
How to formulate the model
How to train the model
Least squares
Maximum likelihood estimation
Gradient descent
The Gauss-Markov theorem
How to conduct statistical inference
How to diagnose and remedy problems
Goodness of fit
Heteroskedasticity
Serial correlation
Multicollinearity
How to run linear regression in practice
OLS with statsmodels
Stochastic gradient descent with sklearn
How to build a linear factor model
From the CAPM to the Fama-French five-factor model
Obtaining the risk factors
Fama-MacBeth regression
Shrinkage methods – regularization for linear regression
How to hedge against overfitting
How ridge regression works
How lasso regression works
How to use linear regression to predict returns
Prepare the data
Universe creation and time horizon
Target return computation
Alpha factor selection and transformation
Data cleaning – missing data
Data exploration
Dummy encoding of categorical variables
Creating forward returns
Linear OLS regression using statsmodels
Diagnostic statistics
Linear OLS regression using sklearn
Custom time series cross-validation
Select features and target
Cross-validating the model
Test results – information coefficient and RMSE
Ridge regression using sklearn
Tuning the regularization parameters using cross-validation
Cross-validation results and ridge coefficient paths
Top 10 coefficients
Lasso regression using sklearn
Cross-validated information coefficient and lasso path
Linear classification
The logistic regression model
Objective function
The logistic function
Maximum likelihood estimation
How to conduct inference with statsmodels
How to use logistic regression for prediction
How to predict price movements using sklearn
Summary
Time Series Models
Analytical tools for diagnostics and feature extraction
How to decompose time series patterns
How to compute rolling window statistics
Moving averages and exponential smoothing
How to measure autocorrelation
How to diagnose and achieve stationarity
Time series transformations
How to diagnose and address unit roots
Unit root tests
How to apply time series transformations
Univariate time series models
How to build autoregressive models
How to identify the number of lags
How to diagnose model fit
How to build moving average models
How to identify the number of lags
The relationship between AR and MA models
How to build ARIMA models and extensions
How to identify the number of AR and MA terms
Adding features – ARMAX
Adding seasonal differencing – SARIMAX
How to forecast macro fundamentals
How to use time series models to forecast volatility
The autoregressive conditional heteroskedasticity (ARCH) model
Generalizing ARCH – the GARCH model
Selecting the lag order
How to build a volatility-forecasting model
Multivariate time series models
Systems of equations
The vector autoregressive (VAR) model
How to use the VAR model for macro fundamentals forecasts
Cointegration – time series with a common trend
Testing for cointegration
How to use cointegration for a pairs-trading strategy
Summary
Bayesian Machine Learning
How Bayesian machine learning works
How to update assumptions from empirical evidence
Exact inference – maximum a posteriori estimation
How to select priors
How to keep inference simple – conjugate priors
How to dynamically estimate the probabilities of asset price moves
Approximate inference – stochastic versus deterministic approaches
Sampling-based stochastic inference
Markov chain Monte Carlo sampling
Gibbs sampling
Metropolis-Hastings sampling
Hamiltonian Monte Carlo – going NUTS
Variational Inference
Automatic Differentiation Variational Inference
Probabilistic programming with PyMC3
Bayesian machine learning with Theano
The PyMC3 workflow
Model definition – Bayesian logistic regression
Visualization and plate notation
The Generalized Linear Models module
MAP inference
Approximate inference – MCMC
Credible intervals
Approximate inference – variational Bayes
Model diagnostics
Convergence
Posterior Predictive Checks
Prediction
Practical applications
Bayesian Sharpe ratio and performance comparison
Model definition
Performance comparison
Bayesian time series models
Stochastic volatility models
Summary
Decision Trees and Random Forests
Decision trees
How trees learn and apply decision rules
How to use decision trees in practice
How to prepare the data
How to code a custom cross-validation class
How to build a regression tree
How to build a classification tree
How to optimize for node purity
How to train a classification tree
How to visualize a decision tree
How to evaluate decision tree predictions
Feature importance
Overfitting and regularization
How to regularize a decision tree
Decision tree pruning
How to tune the hyperparameters
GridSearchCV for decision trees
How to inspect the tree structure
Learning curves
Strengths and weaknesses of decision trees
Random forests
Ensemble models
How bagging lowers model variance
Bagged decision trees
How to build a random forest
How to train and tune a random forest
Feature importance for random forests
Out-of-bag testing
Pros and cons of random forests
Summary
Gradient Boosting Machines
Adaptive boosting
The AdaBoost algorithm
AdaBoost with sklearn
Gradient boosting machines
How to train and tune GBM models
Ensemble size and early stopping
Shrinkage and learning rate
Subsampling and stochastic gradient boosting
How to use gradient boosting with sklearn
How to tune parameters with GridSearchCV
Parameter impact on test scores
How to test on the holdout set
Fast, scalable GBM implementations
How algorithmic innovations drive performance
Second-order loss function approximation
Simplified split-finding algorithms
Depth-wise versus leaf-wise growth
GPU-based training
DART – dropout for trees
Treatment of categorical features
Additional features and optimizations
How to use XGBoost, LightGBM, and CatBoost
How to create binary data formats
How to tune hyperparameters
Objectives and loss functions
Learning parameters
Regularization
Randomized grid search
How to evaluate the results
Cross-validation results across models
How to interpret GBM results
Feature importance
Partial dependence plots
SHapley Additive exPlanations
How to summarize SHAP values by feature
How to use force plots to explain a prediction
How to analyze feature interaction
Summary
Unsupervised Learning
Dimensionality reduction
Linear and non-linear algorithms
The curse of dimensionality
Linear dimensionality reduction
Principal Component Analysis
Visualizing PCA in 2D
The assumptions made by PCA
How the PCA algorithm works
PCA based on the covariance matrix
PCA using Singular Value Decomposition
PCA with sklearn
Independent Component Analysis
ICA assumptions
The ICA algorithm
ICA with sklearn
PCA for algorithmic trading
Data-driven risk factors
Eigen portfolios
Manifold learning
t-SNE
UMAP
Clustering
k-Means clustering
Evaluating cluster quality
Hierarchical clustering
Visualization – dendrograms
Density-based clustering
DBSCAN
Hierarchical DBSCAN
Gaussian mixture models
The expectation-maximization algorithm
Hierarchical risk parity
Summary
Working with Text Data
How to extract features from text data
Challenges of NLP
The NLP workflow
Parsing and tokenizing text data
Linguistic annotation
Semantic annotation
Labeling
Use cases
From text to tokens – the NLP pipeline
NLP pipeline with spaCy and textacy
Parsing, tokenizing, and annotating a sentence
Batch-processing documents
Sentence boundary detection
Named entity recognition
N-grams
spaCy's streaming API
Multi-language NLP
NLP with TextBlob
Stemming
Sentiment polarity and subjectivity
From tokens to numbers – the document-term matrix
The BoW model
Measuring the similarity of documents
Document-term matrix with sklearn
Using CountVectorizer
Visualizing vocabulary distribution
Finding the most similar documents
TfidfTransformer and TfidfVectorizer
The effect of smoothing
How to summarize news articles using TfidfVectorizer
Text preprocessing – review
Text classification and sentiment analysis
The Naive Bayes classifier
Bayes' theorem refresher
The conditional independence assumption
News article classification
Training and evaluating a multinomial Naive Bayes classifier
Sentiment analysis
Twitter data
Multinomial Naive Bayes
Comparison with TextBlob sentiment scores
Business reviews – the Yelp dataset challenge
Benchmark accuracy
Multinomial Naive Bayes model
One-versus-all logistic regression
Combining text and numerical features
Multinomial logistic regression
Gradient-boosting machine
Summary
Topic Modeling
Learning latent topics – goals and approaches
From linear algebra to hierarchical probabilistic models
Latent semantic indexing
How to implement LSI using sklearn
Pros and cons
Probabilistic latent semantic analysis
How to implement pLSA using sklearn
Latent Dirichlet allocation
How LDA works
The Dirichlet distribution
The generative model
Reverse-engineering the process
How to evaluate LDA topics
Perplexity
Topic coherence
How to implement LDA using sklearn
How to visualize LDA results using pyLDAvis
How to implement LDA using gensim
Topic modeling for earnings calls
Data preprocessing
Model training and evaluation
Running experiments
Topic modeling for Yelp business reviews
Summary
Word Embeddings
How word embeddings encode semantics
How neural language models learn usage in context
The Word2vec model – learn embeddings at scale
Model objective – simplifying the softmax
Automatic phrase detection
How to evaluate embeddings – vector arithmetic and analogies
How to use pre-trained word vectors
GloVe – global vectors for word representation
How to train your own word vector embeddings
The Skip-Gram architecture in Keras
Noise-contrastive estimation
The model components
Visualizing embeddings using TensorBoard
Word vectors from SEC filings using gensim
Preprocessing
Automatic phrase detection
Model training
Model evaluation
Performance impact of parameter settings
Sentiment analysis with Doc2vec
Training Doc2vec on Yelp sentiment data
Create input data
Bonus – Word2vec for translation
Summary
Deep Learning
Deep learning and AI
The challenges of high-dimensional data
DL as representation learning
How DL extracts hierarchical features from data
Universal function approximation
DL and manifold learning
How DL relates to ML and AI
How to design a neural network
How neural networks work
A simple feedforward network architecture
Key design choices
Cost functions
Output units
Hidden units
How to regularize deep neural networks
Parameter norm penalties
Early stopping
Dropout
Optimization for DL
SGD
Momentum
Adaptive learning rates
AdaGrad
RMSProp
Adam
How to build a neural network using Python
The input layer
The hidden layer
The output layer
Forward propagation
The cross-entropy cost function
How to train a neural network
How to implement backprop using Python
How to compute the gradient
The loss function gradient
The output layer gradients
The hidden layer gradients
Putting it all together
Testing the gradients
Implementing momentum updates using Python
Training the network
How to use DL libraries
How to use Keras
How to use TensorBoard
How to use PyTorch 1.0
How to create a PyTorch DataLoader
How to define the neural network architecture
How to train the model
How to evaluate the model predictions
How to use TensorFlow 2.0
How to optimize neural network architectures
Creating a stock return series to predict asset price movement
Defining a neural network architecture with placeholders
Defining a custom loss metric for early stopping
Running GridSearchCV to tune the neural network architecture
How to further improve the results
Summary
Convolutional Neural Networks
How ConvNets work
How a convolutional layer works
The convolution stage – detecting local features
The convolution operation
How to scan the input – strides and padding
Parameter sharing
The detector stage – adding non-linearity
The pooling stage – downsampling the feature maps
Max pooling
Inspiration from neuroscience
Reference ConvNet architectures
LeNet5 – the first modern CNN (1998)
AlexNet – putting CNN on the map (2012)
VGGNet – going for smaller filters
GoogLeNet – fewer parameters through Inception
ResNet – current state-of-the-art
Benchmarks
Lessons learned
Computer vision beyond classification – detection and segmentation
How to design and train a CNN using Python
LeNet5 and MNIST using Keras
How to prepare the data
How to define the architecture
AlexNet and CIFAR10 with Keras
How to prepare the data using image augmentation
How to define the model architecture
How to use CNNs with time series data
Transfer learning – faster training with less data
How to build on a pre-trained CNN
How to extract bottleneck features
How to further train a pre-trained model
How to detect objects
Google Street View house number dataset
How to define a CNN with multiple outputs
Recent developments
Fast detection of objects on satellite images
How capsule networks capture pose
Summary
Recurrent Neural Networks
How RNNs work
Unfolding a computational graph with cycles
Backpropagation through time
Alternative RNN architectures
Output recurrence and teacher forcing
Bidirectional RNNs
Encoder-decoder architectures and the attention mechanism
How to design deep RNNs
The challenge of learning long-range dependencies
Long Short-Term Memory Units
GRUs
How to build and train RNNs using Python
Univariate time series regression
How to get time series data into shape for an RNN
How to define a two-layer RNN using a single LSTM layer
Stacked LSTMs for time series classification
How to prepare the data
How to define the architecture
Multivariate time series regression
Loading the data
Preparing the data
Defining and training the model
LSTM and word embeddings for sentiment classification
Loading the IMDB movie review data
Defining embedding and RNN architectures
Sentiment analysis with pretrained word vectors
Preprocessing the text data
Loading the pretrained GloVe embeddings
Summary
Autoencoders and Generative Adversarial Nets
How autoencoders work
Nonlinear dimensionality reduction
Convolutional autoencoders
Sparsity constraints with regularized autoencoders
Fixing corrupted data with denoising autoencoders
Sequence-to-sequence autoencoders
Variational autoencoders
Designing and training autoencoders using Python
Preparing the data
One-layer feedforward autoencoder
Defining the encoder
Defining the decoder
Training the model
Evaluating the results
Feedforward autoencoder with sparsity constraints
Deep feedforward autoencoder
Visualizing the encoding
Convolutional autoencoders
Denoising autoencoders
How GANs work
How generative and discriminative models differ
How adversarial training works
How GAN architectures are evolving
Deep Convolutional GAN (DCGAN)
Conditional GANs
Successful and emerging GAN applications
CycleGAN – unpaired image-to-image translation
StackGAN – text-to-photo image synthesis
Photo-realistic image super-resolution
Synthetic time series with recurrent cGANs
How to build GANs using Python
Defining the discriminator network
Defining the generator network
Combining both networks to define the GAN
Adversarial training
Evaluating the results
Summary
Reinforcement Learning
Key elements of RL
Components of an interactive RL system
The policy – from states to actions
Rewards – learning from actions
The value function – good decisions for the long run
Model-free versus model-based agents
How to solve RL problems
Key challenges in solving RL problems
Credit assignment
Exploration versus exploitation
Fundamental approaches to solving RL problems
Dynamic programming – Value and Policy iteration
Finite MDPs
Sequences of states, actions, and rewards
Value functions – how to estimate the long-run reward
The Bellman equation
From a value function to an optimal policy
Policy iteration
Value iteration
Generalized policy iteration
Dynamic programming using Python
Setting up the GridWorld
Computing the transition matrix
Value iteration
Policy iteration
Solving MDPs using pymdptoolbox
Conclusion
Q-learning
The exploration-exploitation trade-off – the ε-greedy policy
The Q-learning algorithm
Training a Q-learning agent using Python
Deep reinforcement learning
Value function approximation with neural networks
The deep Q-learning algorithm and extensions
Experience replay
Slowly changing target network
Double deep Q-learning
The OpenAI Gym – the Lunar Lander environment
Double deep Q-learning using TensorFlow
The DQN architecture
Setting up the OpenAI environment
Hyperparameters
The DDQN computational graph
Performance
Reinforcement learning for trading
How to design an OpenAI trading environment
A basic trading game
How to build a deep Q-learning agent for the stock market
Summary
Next Steps
Key takeaways and lessons learned
Data is the single most important ingredient
Quality control
Data integration
Domain expertise helps unlock value in data
Feature engineering and alpha factor research
ML is a toolkit for solving problems with data
Model diagnostics help speed up optimization
Making do without a free lunch
Managing the bias-variance trade-off
Define targeted model objectives
The optimization verification test
Beware of backtest overfitting
How to gain insights from black-box models
ML for trading in practice
Data management technologies
Database systems
Big Data technologies – Hadoop and Spark
ML tools
Online trading platforms
Quantopian
QuantConnect
QuantRocket
Conclusion
Other Books You May Enjoy
Leave a review – let other readers know what you think