Mastering Clojure Data Analysis
Table of Contents
About the Author
About the Reviewers
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
What this book covers
What you need for this book
Who this book is for
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
1. Network Analysis – The Six Degrees of Kevin Bacon
Analyzing social networks
Getting the data
Understanding graphs
Implementing the graphs
Loading the data
Measuring social network graphs
Average path length
Network diameter
Clustering coefficient
Degrees of separation
Visualizing the graph
Setting up ClojureScript
A force-directed layout
A hive plot
A pie chart
2. GIS Analysis – Mapping Climate Change
Understanding GIS
Mapping climate change
Downloading and extracting the data
Downloading the files
Extracting the files
Transforming the data – filtering
Rolling averages
Reading the data
Interpolating sample points and generating heat maps using inverse distance weighting (IDW)
Working with map projections
Finding a base map
Working with ArcGIS
3. Topic Modeling – Changing Concerns in the State of the Union Addresses
Understanding data in the State of the Union addresses
Understanding topic modeling
Preparing for visualizations
Setting up the project
Getting the data
Loading the data into MALLET
Visualizing with D3 and ClojureScript
Exploring the topics
Exploring topic 43
Exploring topic 26
Exploring topic 42
4. Classifying UFO Sightings
Getting the data
Extracting the data
Dealing with messy data
Visualizing UFO data
Topic modeling descriptions
Preparing the data
Reading the data into a sequence of data records
Splitting the NUFORC comments
Categorizing the documents based on the comments
Partitioning the documents into directories based on the categories
Dividing them into training and test sets
Classifying the data
Coding the classifier interface
Setting up the Pipe and InstanceList
Tying it all together
Running the classifier and examining the results
5. Benford's Law – Detecting Natural Progressions of Numbers
Learning about Benford's Law
Applying Benford's Law to compound interest
Looking at the world population data
Failing Benford's Law
Case studies
6. Sentiment Analysis – Categorizing Hotel Reviews
Understanding sentiment analysis
Getting hotel review data
Exploring the data
Preparing the data
Creating feature vectors
Creating feature vector functions and POS tagging
Cross-validating the results
Calculating error rates
Using the Weka machine learning library
Connecting Weka and cross-validation
Understanding maximum entropy classifiers
Understanding naive Bayesian classifiers
Running the experiment
Examining the results
Combining the error rates
Improving the results
7. Null Hypothesis Tests – Analyzing Crime Data
Introducing confirmatory data analysis
Understanding null hypothesis testing
Understanding the process
Formulating an initial hypothesis
Stating the null and alternative hypotheses
Determining appropriate tests
Selecting the significance level
Determining the critical region
Calculating the test statistic and its probability
Deciding whether to reject the null hypothesis or not
Flipping coins
Formulating an initial hypothesis
Stating the null and alternative hypotheses
Identifying the statistical assumptions in the sample
Determining appropriate tests
Selecting the significance level
Determining the critical region
Calculating the test statistic and its probability
Deciding whether to reject the null hypothesis or not
Understanding burglary rates
Getting the data
Parsing the Excel files
Pulling out raw data
Growing a data tree
Cutting down the data tree
Putting it all together
Transforming the data
Joining the data sources
Pivoting the data
Filtering the missing data
Putting it all together
Exploring the data
Generating summary statistics
Summarizing UNODC crime data
Summarizing World Bank land area and GNI data
Generating more charts and graphs
Conducting the experiment
Formulating an initial hypothesis
Stating the null and alternative hypotheses
Identifying the statistical assumptions in the sample
Determining which tests are appropriate
Understanding Spearman's rank correlation coefficient
Selecting the significance level
Determining the critical region
Calculating the test statistic and its probability
Deciding whether to reject the null hypothesis or not
Interpreting the results
8. A/B Testing – Statistical Experiments for the Web
Defining A/B testing
Conducting an A/B test
Planning the experiment
Framing the statistics
Building the experiment
Looking at options to build the site
Implementing A/B testing on the server
Understanding the scaffolded site
Building the test site
Implementing A/B testing
Viewing the results
Looking at A/B testing as a user
Analyzing the results
Understanding the t-test
Testing coin tosses
Testing the results
9. Analyzing Social Data Participation
Setting up the project
Understanding the analyses
Understanding social network data
Understanding knowledge-based social networks
Introducing the 80/20 rule
Getting the data
Looking at the amount of data
Looking at the data format
Defining and loading the data
Counting frequencies
Sorting and ranking
Finding the patterns of participation
Matching the 80/20 rule
Looking for the 20 percent of questioners
Looking for the 20 percent of respondents
Combining ranks
Looking at those who only post questions
Looking at those who only post answers
Looking at those who post both questions and answers
Finding the up-voted answers
Processing the answers
Predicting the accepted answer
Setting up
Creating the InstanceList object
Training sets and test sets
Evaluating the outcome
10. Modeling Stock Data
Learning about financial data analysis
Setting up the basics
Setting up the library
Getting the data
Getting prepared with data
Working with news articles
Working with stock data
Analyzing the text
Analyzing vocabulary
Stop lists
Hapax and Dis Legomena
Inspecting the stock prices
Merging text and stock features
Analyzing both text and stock features together with neural nets
Understanding neural nets
Setting up the neural net
Training the neural net
Running the neural net
Validating the neural net
Finding the best parameters
Predicting the future
Loading stock prices
Loading news articles
Creating training and test sets
Finding the best parameters for the neural network
Training and validating the neural network
Running the network on new data
Taking it with a grain of salt
Related to this project
Related to machine learning and market modeling in general