About the Book
About the Authors
Learning Objectives
Audience
Approach
Hardware Requirements
Software Requirements
Conventions
Installation and Setup
Working with the Jupyter Notebook
Importing Python Libraries
Installing the Code Bundle
Additional Resources
Chapter 1
Introduction to Natural Language Processing
Introduction
History of NLP
Text Analytics and NLP
Exercise 1: Basic Text Analytics
Various Steps in NLP
Tokenization
Exercise 2: Tokenization of a Simple Sentence
PoS Tagging
Exercise 3: PoS Tagging
Stop Word Removal
Exercise 4: Stop Word Removal
Text Normalization
Exercise 5: Text Normalization
Spelling Correction
Exercise 6: Spelling Correction of a Word and a Sentence
Stemming
Exercise 7: Stemming
Lemmatization
Exercise 8: Extracting the Base Word Using Lemmatization
NER
Exercise 9: Treating Named Entities
Word Sense Disambiguation
Exercise 10: Word Sense Disambiguation
Sentence Boundary Detection
Exercise 11: Sentence Boundary Detection
Activity 1: Preprocessing of Raw Text
Kick Starting an NLP Project
Data Collection
Data Preprocessing
Feature Extraction
Model Development
Model Assessment
Model Deployment
Summary
Chapter 2
Basic Feature Extraction Methods
Introduction
Types of Data
Categorizing Data Based on Structure
Categorization of Data Based on Content
Cleaning Text Data
Tokenization
Exercise 12: Text Cleaning and Tokenization
Exercise 13: Extracting n-grams
Exercise 14: Tokenizing Texts with Different Packages – Keras and TextBlob
Types of Tokenizers
Exercise 15: Tokenizing Text Using Various Tokenizers
Issues with Tokenization
Stemming
RegexpStemmer
Exercise 16: Converting Words in Gerund Form into Base Words Using RegexpStemmer
The Porter Stemmer
Exercise 17: The Porter Stemmer
Lemmatization
Exercise 18: Lemmatization
Exercise 19: Singularizing and Pluralizing Words
Language Translation
Exercise 20: Language Translation
Stop-Word Removal
Exercise 21: Stop-Word Removal
Feature Extraction from Texts
Extracting General Features from Raw Text
Exercise 22: Extracting General Features from Raw Text
Activity 2: Extracting General Features from Text
Bag of Words
Exercise 23: Creating a BoW
Zipf's Law
Exercise 24: Zipf's Law
TF-IDF
Exercise 25: TF-IDF Representation
Activity 3: Extracting Specific Features from Texts
Feature Engineering
Exercise 26: Feature Engineering (Text Similarity)
Word Clouds
Exercise 27: Word Clouds
Other Visualizations
Exercise 28: Other Visualizations (Dependency Parse Trees and Named Entities)
Activity 4: Text Visualization
Summary
Chapter 3
Developing a Text Classifier
Introduction
Machine Learning
Unsupervised Learning
Hierarchical Clustering
Exercise 29: Hierarchical Clustering
K-Means Clustering
Exercise 30: K-Means Clustering
Supervised Learning
Classification
Logistic Regression
Naive Bayes Classifiers
K-Nearest Neighbors
Exercise 31: Text Classification (Logistic Regression, Naive Bayes, and KNN)
Regression
Linear Regression
Exercise 32: Regression Analysis Using Textual Data
Tree Methods
Random Forest
GBM and XGBoost
Exercise 33: Tree-Based Methods (Decision Tree, Random Forest, GBM, and XGBoost)
Sampling
Exercise 34: Sampling (Simple Random, Stratified, Multi-Stage)
Developing a Text Classifier
Feature Extraction
Feature Engineering
Removing Correlated Features
Exercise 35: Removing Highly Correlated Features (Tokens)
Dimensionality Reduction
Exercise 36: Dimensionality Reduction (PCA)
Deciding on a Model Type
Evaluating the Performance of a Model
Exercise 37: Calculate the RMSE and MAPE
Activity 5: Developing End-to-End Text Classifiers
Building Pipelines for NLP Projects
Exercise 38: Building Pipelines for NLP Projects
Saving and Loading Models
Exercise 39: Saving and Loading Models
Summary
Chapter 4
Collecting Text Data from the Web
Introduction
Collecting Data by Scraping Web Pages
Exercise 40: Extraction of Tag-Based Information from HTML Files
Requesting Content from Web Pages
Exercise 41: Collecting Online Text Data
Exercise 42: Analyzing the Content of Jupyter Notebooks (in HTML Format)
Activity 6: Extracting Information from an Online HTML Page
Activity 7: Extracting and Analyzing Data Using Regular Expressions
Dealing with Semi-Structured Data
JSON
Exercise 43: Dealing with JSON Files
Activity 8: Dealing with Online JSON Files
XML
Exercise 44: Dealing with a Local XML File
Using APIs to Retrieve Real-Time Data
Exercise 45: Collecting Data Using APIs
API Creation
Activity 9: Extracting Data from Twitter
Extracting Data from Local Files
Exercise 46: Extracting Data from Local Files
Exercise 47: Performing Various Operations on Local Files
Summary
Chapter 5
Topic Modeling
Introduction
Topic Discovery
Discovering Themes
Exploratory Data Analysis
Document Clustering
Dimensionality Reduction
Historical Analysis
Bag of Words
Topic Modeling Algorithms
Latent Semantic Analysis
LSA – How It Works
Exercise 48: Analyzing Reuters News Articles with Latent Semantic Analysis
Latent Dirichlet Allocation
LDA – How It Works
Exercise 49: Topics in Airline Tweets
Topic Fingerprinting
Exercise 50: Visualizing Documents Using Topic Vectors
Activity 10: Topic Modeling Jeopardy Questions
Summary
Chapter 6
Text Summarization and Text Generation
Introduction
What is Automated Text Summarization?
Benefits of Automated Text Summarization
High-Level View of Text Summarization
Purpose
Input
Output
Extractive Text Summarization
Abstractive Text Summarization
Sequence to Sequence
Encoder Decoder
TextRank
Exercise 51: TextRank from Scratch
Summarizing Text Using Gensim
Activity 11: Summarizing a Downloaded Page Using the Gensim Text Summarizer
Summarizing Text Using Word Frequency
Exercise 52: Word Frequency Text Summarization
Generating Text with Markov Chains
Markov Chains
Exercise 53: Generating Text Using Markov Chains
Summary
Chapter 7
Vector Representation
Introduction
Vector Definition
Why Vector Representations?
Encoding
Character-Level Encoding
Exercise 54: Character Encoding Using ASCII Values
Exercise 55: Character Encoding with the Help of NumPy Arrays
Positional Character-Level Encoding
Exercise 56: Character-Level Encoding Using Positions
One-Hot Encoding
Key Steps in One-Hot Encoding
Exercise 57: Character One-Hot Encoding – Manual
Exercise 58: Character-Level One-Hot Encoding with Keras
Word-Level One-Hot Encoding
Exercise 59: Word-Level One-Hot Encoding
Word Embeddings
Word2Vec
Exercise 60: Training Word Vectors
Using Pre-Trained Word Vectors
Exercise 61: Loading Pre-Trained Word Vectors
Document Vectors
Uses of Document Vectors
Exercise 62: From Movie Dialogue to Document Vectors
Activity 12: Finding Similar Movie Lines Using Document Vectors
Summary
Chapter 8
Sentiment Analysis
Introduction
Why is Sentiment Analysis Required?
Growth of Sentiment Analysis
Monetization of Emotion
Types of Sentiments
Key Ideas and Terms
Applications of Sentiment Analysis
Tools Used for Sentiment Analysis
NLP Services from Major Cloud Providers
Online Marketplaces
Python NLP Libraries
Deep Learning Libraries
TextBlob
Exercise 63: Basic Sentiment Analysis Using the TextBlob Library
Activity 13: Tweet Sentiment Analysis Using the TextBlob Library
Understanding Data for Sentiment Analysis
Exercise 64: Loading Data for Sentiment Analysis
Training Sentiment Models
Exercise 65: Training a Sentiment Model Using TF-IDF and Logistic Regression
Summary
Appendix
Chapter 1: Introduction to Natural Language Processing
Activity 1: Preprocessing of Raw Text
Chapter 2: Basic Feature Extraction Methods
Activity 2: Extracting General Features from Text
Activity 3: Extracting Specific Features from Texts
Activity 4: Text Visualization
Chapter 3: Developing a Text Classifier
Activity 5: Developing End-to-End Text Classifiers
Chapter 4: Collecting Text Data from the Web
Activity 6: Extracting Information from an Online HTML Page
Activity 7: Extracting and Analyzing Data Using Regular Expressions
Activity 8: Dealing with Online JSON Files
Activity 9: Extracting Data from Twitter
Chapter 5: Topic Modeling
Activity 10: Topic Modeling Jeopardy Questions
Chapter 6: Text Summarization and Text Generation
Activity 11: Summarizing a Downloaded Page Using the Gensim Text Summarizer
Chapter 7: Vector Representation
Activity 12: Finding Similar Movie Lines Using Document Vectors
Solution
Chapter 8: Sentiment Analysis
Activity 13: Tweet Sentiment Analysis Using the TextBlob Library