Practical Data Analysis (eBook)

Author: Hector Cuesta

Publisher: Packt Publishing

Publication date: 2013-10-22

Word count: 1.086 million

Each chapter quickly introduces a key theme of data analysis before immersing you in its practical aspects, so you learn quickly how to perform each kind of analysis. Practical Data Analysis is ideal for home and small-business users who want to slice and dice the data they have on hand with minimum hassle.
Table of Contents

Practical Data Analysis

Credits

Foreword

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers and more

Why Subscribe?

Free Access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Getting Started

Computer science

Artificial intelligence (AI)

Machine Learning (ML)

Statistics

Mathematics

Knowledge domain

Data, information, and knowledge

The nature of data

The data analysis process

The problem

Data preparation

Data exploration

Predictive modeling

Visualization of results

Quantitative versus qualitative data analysis

Importance of data visualization

What about big data?

Sensors and cameras

Social networks analysis

Tools and toys for this book

Why Python?

Why mlpy?

Why D3.js?

Why MongoDB?

Summary

2. Working with Data

Datasource

Open data

Text files

Excel files

SQL databases

NoSQL databases

Multimedia

Web scraping

Data scrubbing

Statistical methods

Text parsing

Data transformation

Data formats

CSV

Parsing a CSV file with the csv module

Parsing a CSV file using NumPy

JSON

Parsing a JSON file using json module

XML

Parsing an XML file in Python using xml module

YAML

Getting started with OpenRefine

Text facet

Clustering

Text filters

Numeric facets

Transforming data

Exporting data

Operation history

Summary

3. Data Visualization

Data-Driven Documents (D3)

HTML

DOM

CSS

JavaScript

SVG

Getting started with D3.js

Bar chart

Pie chart

Scatter plot

Single line chart

Multi-line chart

Interaction and animation

Summary

4. Text Classification

Learning and classification

Bayesian classification

Naïve Bayes algorithm

E-mail subject line tester

The algorithm

Classifier accuracy

Summary

5. Similarity-based Image Retrieval

Image similarity search

Dynamic time warping (DTW)

Processing the image dataset

Implementing DTW

Analyzing the results

Summary

6. Simulation of Stock Prices

Financial time series

Random walk simulation

Monte Carlo methods

Generating random numbers

Implementation in D3.js

Summary

7. Predicting Gold Prices

Working with the time series data

Components of a time series

Smoothing the time series

The data – historical gold prices

Nonlinear regression

Kernel ridge regression

Smoothing the gold prices time series

Predicting in the smoothed time series

Contrasting the predicted value

Summary

8. Working with Support Vector Machines

Understanding the multivariate dataset

Dimensionality reduction

Linear Discriminant Analysis

Principal Component Analysis

Getting started with support vector machine

Kernel functions

Double spiral problem

SVM implemented on mlpy

Summary

9. Modeling Infectious Disease with Cellular Automata

Introduction to epidemiology

The epidemiology triangle

The epidemic models

The SIR model

Solving ordinary differential equation for the SIR model with SciPy

The SIRS model

Modeling with cellular automata

Cell, state, grid, and neighborhood

Global stochastic contact model

Simulation of the SIRS model in CA with D3.js

Summary

10. Working with Social Graphs

Structure of a graph

Undirected graph

Directed graph

Social Networks Analysis

Acquiring my Facebook graph

Using Netvizz

Representing graphs with Gephi

Statistical analysis

Male to female ratio

Degree distribution

Histogram of a graph

Centrality

Transforming GDF to JSON

Graph visualization with D3.js

Summary

11. Sentiment Analysis of Twitter Data

The anatomy of Twitter data

Tweet

Followers

Trending topics

Using OAuth to access Twitter API

Getting started with Twython

Simple search

Working with timelines

Working with followers

Working with places and trends

Sentiment classification

Affective Norms for English Words

Text corpus

Getting started with Natural Language Toolkit (NLTK)

Bag of words

Naive Bayes

Sentiment analysis of tweets

Summary

12. Data Processing and Aggregation with MongoDB

Getting started with MongoDB

Database

Collection

Document

Mongo shell

Insert/Update/Delete

Queries

Data preparation

Data transformation with OpenRefine

Inserting documents with PyMongo

Group

The aggregation framework

Pipelines

Expressions

Summary

13. Working with MapReduce

MapReduce overview

Programming model

Using MapReduce with MongoDB

The map function

The reduce function

Using mongo shell

Using UMongo

Using PyMongo

Filtering the input collection

Grouping and aggregation

Word cloud visualization of the most common positive words in tweets

Summary

14. Online Data Analysis with IPython and Wakari

Getting started with Wakari

Creating an account in Wakari

Getting started with IPython Notebook

Data visualization

Introduction to image processing with PIL

Opening an image

Image histogram

Filtering

Operations

Transformations

Getting started with Pandas

Working with time series

Working with multivariate dataset with DataFrame

Grouping, aggregation, and correlation

Multiprocessing with IPython

Pool

Sharing your Notebook

The data

Summary

A. Setting Up the Infrastructure

Installing and running Python 3

Installing and running Python 3.2 on Ubuntu

Installing and running IDLE on Ubuntu

Installing and running Python 3.2 on Windows

Installing and running IDLE on Windows

Installing and running NumPy

Installing and running NumPy on Ubuntu

Installing and running NumPy on Windows

Installing and running SciPy

Installing and running SciPy on Ubuntu

Installing and running SciPy on Windows

Installing and running mlpy

Installing and running mlpy on Ubuntu

Installing and running mlpy on Windows

Installing and running OpenRefine

Installing and running OpenRefine on Linux

Installing and running OpenRefine on Windows

Installing and running MongoDB

Installing and running MongoDB on Ubuntu

Installing and running MongoDB on Windows

Connecting Python with MongoDB

Installing and running UMongo

Installing and running UMongo on Ubuntu

Installing and running UMongo on Windows

Installing and running Gephi

Installing and running Gephi on Linux

Installing and running Gephi on Windows

Index
