R Data Analysis Cookbook - Second Edition (e-book)

Author: Kuntal Ganguly

Publisher: Packt Publishing

Publication date: 2017-09-20

Word count: 515,000

Over 80 recipes to help you breeze through your data analysis projects using R

About This Book

  • Analyse your data using popular R packages like ggplot2 with ready-to-use and customizable recipes
  • Find meaningful insights from your data and generate dynamic reports
  • A practical guide to help you put your data analysis skills in R to practical use

Who This Book Is For

This book is for data scientists, analysts and even enthusiasts who want to learn and implement the various data analysis techniques using R in a practical way. Those looking for quick, handy solutions to common tasks and challenges in data analysis will find this book very useful. Basic knowledge of statistics and R programming is assumed.

What You Will Learn

  • Acquire, format and visualize your data using R
  • Use R to perform exploratory data analysis
  • Get an introduction to machine learning algorithms such as classification and regression
  • Get started with social network analysis
  • Generate dynamic reporting with Shiny
  • Get started with geospatial analysis
  • Handle large data with R using Spark and MongoDB
  • Build recommendation systems: collaborative filtering, content-based and hybrid
  • Learn from real-world dataset examples: fraud detection and image recognition

In Detail

Data analytics with R has emerged as a very important focus for organizations of all kinds. R enables even those with only an intuitive grasp of the underlying concepts, without a deep mathematical background, to unleash powerful and detailed examinations of their data. This book will show you how you can put your data analysis skills in R to practical use, with recipes catering to basic as well as advanced data analysis tasks. Right from acquiring your data and preparing it for analysis to the more complex data analysis techniques, the book will show you how you can implement each technique in the best possible manner.

You will also visualize your data using popular R packages like ggplot2 and gain hidden insights from it. Starting with basic data analysis concepts, from handling your data to creating basic plots, you will master more advanced data analysis techniques such as performing cluster analysis and generating effective analysis reports and visualizations. Throughout the book, you will get to know the common problems and obstacles you might encounter while implementing each of the data analysis techniques in R, along with the easiest ways of overcoming them.

By the end of this book, you will have all the knowledge you need to become an expert in data analysis with R, and put your skills to the test in real-world scenarios.

Style and Approach

  • Hands-on recipes to walk through data science challenges using R
  • A one-stop solution for common and not-so-common pain points that arise when carrying out real-world tasks
  • Addressing your common and not-so-common pain points, this is a book that you must have on the shelf
Table of Contents

Title Page

R Data Analysis Cookbook

Second Edition

Copyright

R Data Analysis Cookbook

Second Edition

Credits

About the Author

About the Reviewers

www.PacktPub.com

Why subscribe?

Customer Feedback

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Acquire and Prepare the Ingredients - Your Data

Introduction

Working with data

Reading data from CSV files

Getting ready

How to do it...

How it works...

There's more...

Handling different column delimiters

Handling column headers/variable names

Handling missing values

Reading strings as characters and not as factors

Reading data directly from a website

Reading XML data

Getting ready

How to do it...

How it works...

There's more...

Extracting HTML table data from a web page

Extracting a single HTML table from a web page

Reading JSON data

Getting ready

How to do it...

How it works...

Reading data from fixed-width formatted files

Getting ready

How to do it...

How it works...

There's more...

Files with headers

Excluding columns from data

Reading data from R files and R libraries

Getting ready

How to do it...

How it works...

There's more...

Saving all objects in a session

Saving objects selectively in a session

Attaching/detaching R data files to an environment

Listing all datasets in loaded packages

Removing cases with missing values

Getting ready

How to do it...

How it works...

There's more...

Eliminating cases with NA for selected variables

Finding cases that have no missing values

Converting specific values to NA

Excluding NA values from computations

Replacing missing values with the mean

Getting ready

How to do it...

How it works...

There's more...

Imputing random values sampled from non-missing values

Removing duplicate cases

Getting ready

How to do it...

How it works...

There's more...

Identifying duplicates without deleting them

Rescaling a variable to specified min-max range

Getting ready

How to do it...

How it works...

There's more...

Rescaling many variables at once

See also

Normalizing or standardizing data in a data frame

Getting ready

How to do it...

How it works...

There's more...

Standardizing several variables simultaneously

See also

Binning numerical data

Getting ready

How to do it...

How it works...

There's more...

Creating a specified number of intervals automatically

Creating dummies for categorical variables

Getting ready

How to do it...

How it works...

There's more...

Choosing which variables to create dummies for

Handling missing data

Getting ready

How to do it...

How it works...

There's more...

Understanding missing data pattern

Correcting data

Getting ready

How to do it...

How it works...

There's more...

Combining multiple columns to single columns

Splitting single column to multiple columns

Imputing data

Getting ready

How to do it...

How it works...

There's more...

Detecting outliers

Getting ready

How to do it...

How it works...

There's more...

Treating the outliers with mean/median imputation

Handling extreme values with capping

Transforming and binning values

Outlier detection with LOF

What's in There - Exploratory Data Analysis

Introduction

Creating standard data summaries

Getting ready

How to do it...

How it works...

There's more...

Using the str() function for an overview of a data frame

Computing the summary and the str() function for a single variable

Finding other measures

Extracting a subset of a dataset

Getting ready

How to do it...

How it works...

There's more...

Excluding columns

Selecting based on multiple values

Selecting using logical vector

Splitting a dataset

Getting ready

How to do it...

How it works...

Creating random data partitions

Getting ready

How to do it...

Case 1 - Numerical target variable and two partitions

Case 2 - Numerical target variable and three partitions

Case 3 - Categorical target variable and two partitions

Case 4 - Categorical target variable and three partitions

How it works...

There's more...

Using a convenience function for partitioning

Sampling from a set of values

Generating standard plots, such as histograms, boxplots, and scatterplots

Getting ready

How to do it...

Creating histograms

Creating boxplots

Creating scatterplots

Creating scatterplot matrices

How it works...

Histograms

Boxplots

There's more...

Overlay a density plot on a histogram

Overlay a regression line on a scatterplot

Color specific points on a scatterplot

Generating multiple plots on a grid

Getting ready

How to do it...

How it works...

Graphics parameters

Creating plots with the lattice package

Getting ready

How to do it...

How it works...

There's more...

Adding flair to your graphs

See also

Creating charts that facilitate comparisons

Getting ready

How to do it...

Using base plotting system

How it works...

There's more...

Creating beanplots with the beanplot package

See also

Creating charts that help to visualize possible causality

Getting ready

How to do it...

How it works...

See also

Where Does It Belong? Classification

Introduction

Generating error/classification confusion matrices

Getting ready

How to do it...

How it works...

There's more...

Visualizing the error/classification confusion matrix

Comparing the model's performance for different classes

Principal Component Analysis

Getting ready

How to do it...

How it works...

Generating receiver operating characteristic charts

Getting ready

How to do it...

How it works...

There's more...

Using arbitrary class labels

Building, plotting, and evaluating with classification trees

Getting ready

How to do it...

How it works...

There's more...

Computing raw probabilities

Creating the ROC chart

See also

Using random forest models for classification

Getting ready

How to do it...

How it works...

There's more...

Computing raw probabilities

Generating the ROC chart

Specifying cutoffs for classification

See also

Classifying using the support vector machine approach

Getting ready

How to do it...

How it works...

There's more...

Controlling the scaling of variables

Determining the type of SVM model

Assigning weights to the classes

Choosing the cost of SVM

Tuning the SVM

See also

Classifying using the Naive Bayes approach

Getting ready

How to do it...

How it works...

See also

Classifying using the KNN approach

Getting ready

How to do it...

How it works...

There's more...

Automating the process of running KNN for many k values

Selecting appropriate values of k using caret

Using KNN to compute raw probabilities instead of classifications

Using neural networks for classification

Getting ready

How to do it...

How it works...

There's more...

Exercising greater control over nnet

Generating raw probabilities and plotting the ROC curve

Classifying using linear discriminant function analysis

Getting ready

How to do it...

How it works...

There's more...

Using the formula interface for lda

See also

Classifying using logistic regression

Getting ready

How to do it...

How it works...

Text classification for sentiment analysis

Getting ready

How to do it...

How it works...

Give Me a Number - Regression

Introduction

Computing the root-mean-square error

Getting ready

How to do it...

How it works...

There's more...

Using a convenience function to compute the RMS error

Building KNN models for regression

Getting ready

How to do it...

How it works...

There's more...

Running KNN with cross-validation in place of a validation partition

Using a convenience function to run KNN

Using a convenience function to run KNN for multiple k values

See also

Performing linear regression

Getting ready

How to do it...

How it works...

There's more...

Forcing lm to use a specific factor level as the reference

Using other options in the formula expression for linear models

See also

Performing variable selection in linear regression

Getting ready

How to do it...

How it works...

See also

Building regression trees

Getting ready

How to do it...

How it works...

There's more...

Generating regression trees for data with categorical predictors

Generating regression trees using the ensemble method - Bagging and Boosting

See also

Building random forest models for regression

Getting ready

How to do it...

How it works...

There's more...

Controlling forest generation

See also

Using neural networks for regression

Getting ready

How to do it...

How it works...

See also

Performing k-fold cross-validation

Getting ready

How to do it...

How it works...

See also

Performing leave-one-out cross-validation to limit overfitting

How to do it...

How it works...

See also

Can you Simplify That? Data Reduction Techniques

Introduction

Performing cluster analysis using hierarchical clustering

Getting ready

How to do it...

How it works...

There's more...

Cutting trees into clusters

Getting ready

How to do it...

How it works...

Performing cluster analysis using partitioning clustering

Getting ready

How to do it...

How it works...

There's more...

Image segmentation using mini-batch K-means

Getting ready

How to do it...

Partitioning around medoids

Getting ready

How to do it...

How it works...

Clustering large application

Getting ready

How to do it...

How it works...

Performing cluster validation

Getting ready

How to do it...

How it works...

Performing Advance clustering

Density-based spatial clustering of applications with noise

Getting ready

How to do it...

How it works...

Model-based clustering with the EM algorithm

Getting ready

How to do it...

How it works...

Reducing dimensionality with principal component analysis

Getting ready

How to do it...

How it works...

Lessons from History - Time Series Analysis

Introduction

Exploring finance datasets

Getting ready

How to do it...

How it works...

There's more...

Creating and examining date objects

Getting ready

How to do it...

How it works...

Operating on date objects

Getting ready

How to do it...

How it works...

See also

Performing preliminary analyses on time series data

Getting ready

How to do it...

How it works...

See also

Using time series objects

Getting ready

How to do it...

How it works...

See also

Decomposing time series

Getting ready

How to do it...

How it works...

See also

Filtering time series data

Getting ready

How to do it...

How it works...

See also

Smoothing and forecasting using the Holt-Winters method

Getting ready

How to do it...

How it works...

See also

Building an automated ARIMA model

Getting ready

How to do it...

How it works...

See also

How does it look? - Advanced data visualization

Introduction

Creating scatter plots

Getting ready

How to do it...

How it works...

There's more...

Graph using qplot

Creating line graphs

Getting ready

How to do it...

How it works...

Creating bar graphs

Getting ready

How to do it...

Creating bar charts with ggplot2

How it works...

Making distributions plots

Getting ready

How to do it...

How it works...

Creating mosaic graphs

Getting ready

How to do it...

How it works...

Making treemaps

Getting ready

How to do it...

How it works...

Plotting a correlations matrix

Getting ready

How to do it...

How it works...

There's more...

Visualizing a correlation matrix with ggplot2

Creating heatmaps

Getting ready

How to do it...

How it works...

There's more...

Plotting a heatmap over geospatial data

See also

Plotting network graphs

Getting ready

How to do it...

How it works...

See also

Labeling and legends

Getting ready

How to do it...

How it works...

Coloring and themes

Getting ready

How to do it...

How it works...

Creating multivariate plots

Getting ready

How to do it...

How it works...

There's more...

Multivariate plots with the GGally package

Creating 3D graphs and animation

Getting ready

How to do it...

How it works...

There's more...

Adding text to an existing 3D plot

Using a 3D histogram

Using a line graph

Selecting a graphics device

Getting ready

How to do it...

How it works...

This may also interest you - Building Recommendations

Introduction

Building collaborative filtering systems

Getting ready

How to do it...

How it works...

There's more...

Using collaborative filtering on binary data

Performing content-based systems

Getting ready

How to do it...

How it works...

Building hybrid systems

Getting ready

How to do it...

How it works...

Performing similarity measures

Getting ready

How to do it...

How it works...

Application of ML algorithms - image recognition system

Getting ready

How to do it...

How it works...

Evaluating models and optimization

Getting ready

How to do it...

How it works...

There's more...

Identifying a suitable model

Optimizing parameters

A practical example - fraud detection system

Getting ready

How to do it...

How it works...

It's All About Your Connections - Social Network Analysis

Introduction

Downloading social network data using public APIs

Getting ready

How to do it...

How it works...

See also

Creating adjacency matrices and edge lists

Getting ready

How to do it...

How it works...

See also

Plotting social network data

Getting ready

How to do it...

How it works...

There's more...

Specifying plotting preferences

Plotting directed graphs

Creating a graph object with weights

Extracting the network as an adjacency matrix from the graph object

Extracting an adjacency matrix with weights

Extracting an edge list from a graph object

Creating a bipartite network graph

Generating projections of a bipartite network

Computing important network metrics

Getting ready

How to do it...

How it works...

There's more...

Getting edge sequences

Getting immediate and distant neighbors

Adding vertices or nodes

Adding edges

Deleting isolates from a graph

Creating subgraphs

Cluster analysis

Getting ready

How to do it...

How it works...

Force layout

Getting ready

How to do it...

How it works...

There's more...

Force Atlas 2

YiFan Hu layout

Getting ready

How to do it...

How it works...

There's more...

Put Your Best Foot Forward - Document and Present Your Analysis

Introduction

Generating reports of your data analysis with R Markdown and knitr

Getting ready

How to do it...

How it works...

There's more...

Using the render function

Adding output options

Creating interactive web applications with shiny

Getting ready

How to do it...

How it works...

There's more...

Adding images

Adding HTML

Adding tab sets

Adding a dynamic UI

Creating a single-file web application

Dynamic integration of Shiny with knitr

Creating PDF presentations of your analysis with R presentation

Getting ready

How to do it...

How it works...

There's more...

Using hyperlinks

Controlling the display

Enhancing the look of the presentation

Generating dynamic reports

Getting ready

How to do it...

How it works...

Work Smarter, Not Harder - Efficient and Elegant R Code

Introduction

Exploiting vectorized operations

Getting ready

How to do it...

How it works...

There's more...

Processing entire rows or columns using the apply function

Getting ready

How to do it...

How it works...

There's more...

Using apply on a three-dimensional array

Applying a function to all elements of a collection with lapply and sapply

Getting ready

How to do it...

How it works...

There's more...

Dynamic output

One caution

Applying functions to subsets of a vector

Getting ready

How to do it...

How it works...

There's more...

Applying a function on groups from a data frame

Using the split-apply-combine strategy with plyr

Getting ready

How to do it...

How it works...

There's more...

Adding a new column using transform or mutate

Using summarize along with the plyr function

Concatenating the list of data frames into a big data frame

Common grouping functions in plyr

Split-apply-combine with dplyr

Slicing, dicing, and combining data with data tables

Getting ready

How to do it...

How it works...

There's more...

Adding multiple aggregated columns

Counting groups

Deleting a column

Joining data tables

Using symbols

Where in the World? Geospatial Analysis

Introduction

Downloading and plotting a Google map of an area

Getting ready

How to do it...

How it works...

There's more...

Saving the downloaded map as an image file

Getting a satellite image

Overlaying data on the downloaded Google map

Getting ready

How to do it...

How it works...

There's more...

Importing ESRI shape files to R

Getting ready

How to do it...

How it works...

Using the sp package to plot geographic data

Getting ready

How to do it...

How it works...

Getting maps from the maps package

Getting ready

How to do it...

How it works...

Creating spatial data frames from regular data frames containing spatial and other data

Getting ready

How to do it...

How it works...

Creating spatial data frames by combining regular data frames with spatial objects

Getting ready

How to do it...

How it works...

Adding variables to an existing spatial data frame

Getting ready

How to do it...

How it works...

Spatial data analysis with R and QGIS

Getting ready

How to do it...

How it works...

Playing Nice - Connecting to Other Systems

Introduction

Using Java objects in R

Getting ready

How to do it...

How it works...

There's more...

Checking JVM properties

Displaying available methods

Using JRI to call R functions from Java

Getting ready

How to do it...

How it works...

There's more...

Using Rserve to call R functions from Java

Getting ready

How to do it...

How it works...

There's more...

Retrieving an array from R

Executing R scripts from Java

Getting ready

How to do it...

How it works...

Using the xlsx package to connect to Excel

Getting ready

How to do it...

How it works...

Reading data from relational databases - MySQL

Getting ready

How to do it...

Using RODBC

Using RMySQL

Using RJDBC

How it works...

Using RODBC

Using RMySQL

Using RJDBC

There's more...

Fetching all rows

When the SQL query is long

Reading data from NoSQL databases - MongoDB

Getting ready

How to do it...

How it works...

There's more...

Find most severe crime zone

Plotting the crimes on the Chicago map

Working with in-memory data processing with Apache Spark

Getting ready

How to do it...

How it works...

There's more...

Classification with SparkR

Movie lens recommendation system with SparkR
