Hands-On Machine Learning for Cybersecurity (ebook)

Author: Soma Halder

Publisher: Packt Publishing

Publication date: 2018-12-31

Word count: 305,000

Get into the world of smart data security using machine learning algorithms and Python libraries

Key Features
* Learn machine learning algorithms and cybersecurity fundamentals
* Automate your daily workflow by applying use cases to many facets of security
* Implement smart machine learning solutions to detect various cybersecurity problems

Book Description
Cyber threats are among the costliest risks an organization faces today. This book applies machine learning to major problems in the cybersecurity domain. It begins with the basics of ML in cybersecurity using Python and its libraries. You will explore various ML domains (such as time series analysis and ensemble modeling) to get your foundations right. You will implement various examples, such as building a system to identify malicious URLs and a program to detect fraudulent emails and spam. Later, you will learn how to use the k-means algorithm to build a solution that detects malicious activity in the network and alerts you to it. You will also learn how to use biometrics and fingerprints to validate whether a user is legitimate. Finally, you will see how we change the game with TensorFlow and learn how deep learning is effective for creating models and training systems.

What you will learn
* Use machine learning algorithms with complex datasets to implement cybersecurity concepts
* Implement machine learning algorithms such as clustering, k-means, and Naive Bayes to solve real-world problems
* Learn to speed up a system using Python libraries with NumPy, scikit-learn, and CUDA
* Understand how to combat malware, detect spam, and fight financial fraud to mitigate cyber crimes
* Use TensorFlow in the cybersecurity domain and implement real-world examples
* Learn how machine learning and Python can be used in complex cyber issues

Who this book is for
This book is for data scientists, machine learning developers, security researchers, and anyone keen to apply machine learning to up-skill computer security. Some working knowledge of Python and familiarity with the basics of machine learning and cybersecurity will help you get the most out of the book.
Table of contents

Title Page

Copyright and Credits

Hands-On Machine Learning for Cybersecurity

About Packt

Why subscribe?

Packt.com

Contributors

About the authors

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Basics of Machine Learning in Cybersecurity

What is machine learning?

Problems that machine learning solves

Why use machine learning in cybersecurity?

Current cybersecurity solutions

Data in machine learning

Structured versus unstructured data

Labelled versus unlabelled data

Machine learning phases

Inconsistencies in data

Overfitting

Underfitting

Different types of machine learning algorithm

Supervised learning algorithms

Unsupervised learning algorithms

Reinforcement learning

Another categorization of machine learning

Classification problems

Clustering problems

Regression problems

Dimensionality reduction problems

Density estimation problems

Deep learning

Algorithms in machine learning

Support vector machines

Bayesian networks

Decision trees

Random forests

Hierarchical algorithms

Genetic algorithms

Similarity algorithms

ANNs

The machine learning architecture

Data ingestion

Data store

The model engine

Data preparation

Feature generation

Training

Testing

Performance tuning

Mean squared error

Mean absolute error

Precision, recall, and accuracy

How can model performance be improved?

Fetching the data to improve performance

Switching machine learning algorithms

Ensemble learning to improve performance

Hands-on machine learning

Python for machine learning

Comparing Python 2.x with 3.x

Python installation

Python interactive development environment

Jupyter Notebook installation

Python packages

NumPy

SciPy

Scikit-learn

pandas

Matplotlib

MongoDB with Python

Installing MongoDB

PyMongo

Setting up the development and testing environment

Use case

Data

Code

Summary

Time Series Analysis and Ensemble Modeling

What is a time series?

Time series analysis

Stationarity of time series models

Strictly stationary process

Correlation in time series

Autocorrelation

Partial autocorrelation function

Classes of time series models

Stochastic time series model

Artificial neural network time series model

Support vector time series models

Time series components

Systematic models

Non-systematic models

Time series decomposition

Level

Trend

Seasonality

Noise

Use cases for time series

Signal processing

Stock market predictions

Weather forecasting

Reconnaissance detection

Time series analysis in cybersecurity

Time series trends and seasonal spikes

Detecting distributed denial of service with time series

Dealing with the time element in time series

Tackling the use case

Importing packages

Importing data in pandas

Data cleansing and transformation

Feature computation

Predicting DDoS attacks

ARMA

ARIMA

ARFIMA

Ensemble learning methods

Types of ensembling

Averaging

Majority vote

Weighted average

Types of ensemble algorithm

Bagging

Boosting

Stacking

Bayesian parameter averaging

Bayesian model combination

Bucket of models

Cybersecurity with ensemble techniques

Voting ensemble method to detect cyber attacks

Summary

Segregating Legitimate and Lousy URLs

Introduction to the types of abnormalities in URLs

URL blacklisting

Drive-by download URLs

Command and control URLs

Phishing URLs

Using heuristics to detect malicious pages

Data for the analysis

Feature extraction

Lexical features

Web-content-based features

Host-based features

Site-popularity features

Using machine learning to detect malicious URLs

Logistic regression to detect malicious URLs

Dataset

Model

TF-IDF

SVM to detect malicious URLs

Multiclass classification for URL classification

One-versus-rest

Summary

Knocking Down CAPTCHAs

Characteristics of CAPTCHA

Using artificial intelligence to crack CAPTCHA

Types of CAPTCHA

reCAPTCHA

No CAPTCHA reCAPTCHA

Breaking a CAPTCHA

Solving CAPTCHAs with a neural network

Dataset

Packages

Theory of CNN

Model

Code

Training the model

Testing the model

Summary

Using Data Science to Catch Email Fraud and Spam

Email spoofing

Bogus offers

Requests for help

Types of spam emails

Deceptive emails

CEO fraud

Pharming

Dropbox phishing

Google Docs phishing

Spam detection

Types of mail servers

Data collection from mail servers

Using the Naive Bayes theorem to detect spam

Laplace smoothing

Featurization techniques that convert text-based emails into numeric values

Log-space

TF-IDF

N-grams

Tokenization

Logistic regression spam filters

Logistic regression

Dataset

Python

Results

Summary

Efficient Network Anomaly Detection Using k-means

Stages of a network attack

Phase 1 – Reconnaissance

Phase 2 – Initial compromise

Phase 3 – Command and control

Phase 4 – Lateral movement

Phase 5 – Target attainment

Phase 6 – Exfiltration, corruption, and disruption

Dealing with lateral movement in networks

Using Windows event logs to detect network anomalies

Logon/Logoff events

Account logon events

Object access events

Account management events

Active directory events

Ingesting active directory data

Data parsing

Modeling

Detecting anomalies in a network with k-means

Network intrusion data

Coding the network intrusion attack

Model evaluation

Sum of squared errors

Choosing k for k-means

Normalizing features

Manual verification

Summary

Decision Tree and Context-Based Malicious Event Detection

Adware

Bots

Bugs

Ransomware

Rootkit

Spyware

Trojan horses

Viruses

Worms

Malicious data injection within databases

Malicious injections in wireless sensors

Use case

The dataset

Importing packages

Features of the data

Model

Decision tree

Types of decision trees

Categorical variable decision tree

Continuous variable decision tree

Gini coefficient

Random forest

Anomaly detection

Isolation forest

Supervised and outlier detection with Knowledge Discovery Databases (KDD)

Revisiting malicious URL detection with decision trees

Summary

Catching Impersonators and Hackers Red Handed

Understanding impersonation

Different types of impersonation fraud

Impersonators gathering information

How an impersonation attack is constructed

Using data science to detect domains that are impersonations

Levenshtein distance

Finding domain similarity between malicious URLs

Authorship attribution

AA detection for tweets

Difference between test and validation datasets

Sklearn pipeline

Naive Bayes classifier for multinomial models

Identifying impersonation as a means of intrusion detection

Summary

Changing the Game with TensorFlow

Introduction to TensorFlow

Installation of TensorFlow

TensorFlow for Windows users

Hello world in TensorFlow

Importing the MNIST dataset

Computation graphs

What is a computation graph?

Tensor processing unit

Using TensorFlow for intrusion detection

Summary

Financial Fraud and How Deep Learning Can Mitigate It

Machine learning to detect financial fraud

Imbalanced data

Handling imbalanced datasets

Random under-sampling

Random oversampling

Cluster-based oversampling

Synthetic minority oversampling technique

Modified synthetic minority oversampling technique

Detecting credit card fraud

Logistic regression

Loading the dataset

Approach

Logistic regression classifier – under-sampled data

Tuning hyperparameters

Detailed classification reports

Predictions on test sets and plotting a confusion matrix

Logistic regression classifier – skewed data

Investigating precision-recall curve and area

Deep learning time

Adam gradient optimizer

Summary

Case Studies

Introduction to our password dataset

Text feature extraction

Feature extraction with scikit-learn

Using the cosine similarity to quantify bad passwords

Putting it all together

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think
