Mastering Python Data Analysis (e-book)

Authors: Magnus Vilhelm Persson, Luiz Felipe Martins

Publisher: Packt Publishing

Publication date: 2016-06-01

Word count: 1.618 million

Category: Imported books > Foreign-language originals > Computers/Internet

Become an expert at using Python for advanced statistical analysis of data using real-world examples.

About This Book

  • Clean, format, and explore data using graphical and numerical summaries
  • Leverage the IPython environment to efficiently analyze data with Python
  • Packed with easy-to-follow examples to develop advanced computational skills for the analysis of complex data

Who This Book Is For

If you are a competent Python developer who wants to take your data analysis skills to the next level by solving complex problems, then this advanced guide is for you. Familiarity with the basics of applying Python libraries to data sets is assumed.

What You Will Learn

  • Read, sort, and map various data into Python and Pandas
  • Recognise patterns so you can understand and explore data
  • Use statistical models to discover patterns in data
  • Review classical statistical inference using Python, Pandas, and SciPy
  • Detect similarities and differences in data with clustering
  • Clean your data to make it useful
  • Work in Jupyter Notebook to produce publication-ready figures to be included in reports

In Detail

Python, a multi-paradigm programming language, has become the language of choice for data scientists for data analysis, visualization, and machine learning. Ever imagined how to become an expert at effectively approaching data analysis problems, solving them, and extracting all of the available information from your data? Well, look no further: this is the book you want!

Through this comprehensive guide, you will explore data and present results and conclusions from statistical analysis in a meaningful way. You’ll be able to quickly and accurately perform the hands-on sorting, reduction, and subsequent analysis, and fully appreciate how data analysis methods can support business decision-making.

You’ll start off by learning about the tools available for data analysis in Python and will then explore the statistical models that are used to identify patterns in data. Gradually, you’ll move on to review statistical inference using Python, Pandas, and SciPy. After that, we’ll focus on performing regression using computational tools and you’ll get to understand the problem of identifying clusters in data in an algorithmic way. Finally, we delve into advanced techniques to quantify cause and effect using Bayesian methods and you’ll discover how to use Python’s tools for supervised machine learning.

Style and Approach

This book takes a step-by-step approach to reading, processing, and analyzing data in Python using various methods and tools. The book is rich in examples: each topic connects to real-world cases and retrieves data directly online where possible. With this book, you are given the knowledge and tools to explore any data on your own, encouraging a curiosity befitting all data scientists.

Table of contents

Mastering Python Data Analysis

Credits

About the Authors

About the Reviewer

www.PacktPub.com

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. Tools of the Trade

Before you start

Using the notebook interface

Imports

An example using the Pandas library

Summary

2. Exploring Data

The General Social Survey

Obtaining the data

Reading the data

Univariate data

Histograms

Making things pretty

Characterization

Concept of statistical inference

Numeric summaries and boxplots

Relationships between variables – scatterplots

Summary

3. Learning About Models

Models and experiments

The cumulative distribution function

Working with distributions

The probability density function

Where do models come from?

Multivariate distributions

Summary

4. Regression

Introducing linear regression

Getting the dataset

Testing with linear regression

Multivariate regression

Adding economic indicators

Taking a step back

Logistic regression

Some notes

Summary

5. Clustering

Introduction to cluster finding

Starting out simple – John Snow on cholera

K-means clustering

Suicide rate versus GDP versus absolute latitude

Hierarchical clustering analysis

Reading in and reducing the data

Hierarchical cluster algorithm

Summary

6. Bayesian Methods

The Bayesian method

Credible versus confidence intervals

Bayes formula

Python packages

U.S. air travel safety record

Getting the NTSB database

Binning the data

Bayesian analysis of the data

Binning by month

Plotting coordinates

Cartopy

Mpl toolkits – basemap

Climate change - CO2 in the atmosphere

Getting the data

Creating and sampling the model

Summary

7. Supervised and Unsupervised Learning

Introduction to machine learning

Scikit-learn

Linear regression

Climate data

Checking with Bayesian analysis and OLS

Clustering

Seeds classification

Visualizing the data

Feature selection

Classifying the data

The SVC linear kernel

The SVC Radial Basis Function

The SVC polynomial

K-Nearest Neighbour

Random Forest

Choosing your classifier

Summary

8. Time Series Analysis

Introduction

Pandas and time series data

Indexing and slicing

Resampling, smoothing, and other estimates

Stationarity

Patterns and components

Decomposing components

Differencing

Time series models

Autoregressive – AR

Moving average – MA

Selecting p and q

Automatic function

The (Partial) AutoCorrelation Function

Autoregressive Integrated Moving Average – ARIMA

Summary

A. More on Jupyter Notebook and matplotlib Styles

Jupyter Notebook

Useful keyboard shortcuts

Command mode shortcuts

Edit mode shortcuts

Markdown cells

Notebook Python extensions

Installing the extensions

Codefolding

Collapsible headings

Help panel

Initialization cells

NbExtensions menu item

Ruler

Skip-traceback

Table of contents

Other Jupyter Notebook tips

External connections

Export

Additional file types

Matplotlib styles

Useful resources

General resources

Packages

Data repositories

Visualization of data

Summary
