Title Page
Copyright
R Programming By Example
Credits
About the Author
About the Reviewer
www.PacktPub.com
Why subscribe?
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Introduction to R
What R is and what it isn't
The inspiration for R – the S language
R is a high quality statistical computing system
R is a flexible programming language
R is free, as in freedom and as in free beer
What R is not good for
Comparing R with other software
The interpreter and the console
Tools to work efficiently with R
Pick an IDE or a powerful editor
The send to console functionality
The efficient write-execute loop
Executing R code in non-interactive sessions
How to use this book
Tracking state with symbols and variables
Working with data types and data structures
Numerics
Special values
Characters
Logicals
Vectors
Factors
Matrices
Lists
Data frames
Divide and conquer with functions
Optional arguments
Functions as arguments
Operators are functions
Coercion
Complex logic with control structures
If… else conditionals
For loops
While loops
The examples in this book
Summary
Understanding Votes with Descriptive Statistics
This chapter's required packages
The Brexit votes example
Cleaning and setting up the data
Summarizing the data into a data frame
Getting intuition with graphs and correlations
Visualizing variable distributions
Using matrix scatter plots for a quick overview
Getting a better look with detailed scatter plots
Understanding interactions with correlations
Creating a new dataset with what we've learned
Building new variables with principal components
Putting it all together into high-quality code
Planning before programming
Understanding the fundamentals of high-quality code
Programming by visualizing the big picture
Summary
Predicting Votes with Linear Models
Required packages
Setting up the data
Training and testing datasets
Predicting votes with linear models
Checking model assumptions
Checking linearity with scatter plots
Checking normality with histograms and quantile-quantile plots
Checking homoscedasticity with residual plots
Checking no collinearity with correlations
Measuring accuracy with score functions
Programmatically finding the best model
Generating model combinations
Predicting votes from wards with unknown data
Summary
Simulating Sales Data and Working with Databases
Required packages
Designing our data tables
The basic variables
Simplifying assumptions
Potential pitfalls
The too-much-empty-space problem
The too-much-repeated-data problem
Simulating the sales data
Simulating numeric data according to distribution assumptions
Simulating categorical values using factors
Simulating dates within a range
Simulating numbers under shared restrictions
Simulating strings for complex identifiers
Putting everything together
Simulating the client data
Simulating the client messages data
Working with relational databases
Summary
Communicating Sales with Visualizations
Required packages
Extending our data with profit metrics
Building blocks for reusable high-quality graphs
Starting with simple applications for bar graphs
Adding a third dimension with colors
Graphing top performers with bar graphs
Graphing disaggregated data with boxplots
Scatter plots with joint and marginal distributions
Pricing and profitability by protein source and continent
Client birth dates, gender, and ratings
Developing our own graph type – radar graphs
Exploring with interactive 3D scatter plots
Looking at dynamic data with time-series
Looking at geographical data with static maps
Navigating geographical data with interactive maps
Maps you can navigate and zoom-in to
High-tech-looking interactive globe
Summary
Understanding Reviews with Text Analysis
This chapter's required packages
What is text analysis and how does it work?
Preparing, training, and testing data
Building the corpus with tokenization and data cleaning
Document feature matrices
Training models with cross validation
Training our first predictive model
Improving speed with parallelization
Computing predictive accuracy and confusion matrices
Improving our results with TF-IDF
Adding flexibility with N-grams
Reducing dimensionality with SVD
Extending our analysis with cosine similarity
Digging deeper with sentiment analysis
Testing our predictive model with unseen data
Retrieving text data from Twitter
Summary
Developing Automatic Presentations
Required packages
Why invest in automation?
Literate programming as a content creation methodology
Reproducibility as a benefit of literate programming
The basic tools for an automation pipeline
A gentle introduction to Markdown
Text
Headers
Header Level 1
Header Level 2
Header Level 3
Header Level 4
Lists
Tables
Links
Images
Quotes
Code
Mathematics
Extending Markdown with R Markdown
Code chunks
Tables
Graphs
Chunk options
Global chunk options
Caching
Producing the final output with knitr
Developing graphs and analysis as we normally would
Building our presentation with R Markdown
Summary
Object-Oriented System to Track Cryptocurrencies
This chapter's required packages
The cryptocurrencies example
A brief introduction to object-oriented programming
The purpose of object-oriented programming
Important concepts behind object-oriented languages
Encapsulation
Polymorphism
Hierarchies
Classes and constructors
Public and private methods
Interfaces, factories, and patterns in general
Introducing three object models in R – S3, S4, and R6
The first source of confusion – various object models
The second source of confusion – generic functions
The S3 object model
Classes, constructors, and composition
Public methods and polymorphism
Encapsulation and mutability
Inheritance
The S4 object model
Classes, constructors, and composition
Public methods and polymorphism
Encapsulation and mutability
Inheritance
The R6 object model
Classes, constructors, and composition
Public methods and polymorphism
Encapsulation and mutability
Inheritance
Active bindings
Finalizers
The architecture behind our cryptocurrencies system
Starting simple with timestamps using S3 classes
Implementing cryptocurrency assets using S4 classes
Implementing our storage layer with R6 classes
Communicating available behavior with a database interface
Implementing a database-like storage system with CSV files
Easily allowing new database integration with a factory
Encapsulating multiple databases with a storage layer
Retrieving live data for markets and wallets with R6 classes
Creating a very simple requester to isolate API calls
Developing our exchanges infrastructure
Developing our wallets infrastructure
Implementing our wallet requesters
Finally introducing users with S3 classes
Helping ourselves with a centralized settings file
Saving our initial user data into the system
Activating our system with two simple functions
Some advice when working with object-oriented systems
Summary
Implementing an Efficient Simple Moving Average
Required packages
Starting by using good algorithms
Just how much impact can algorithm selection have?
How fast is fast enough?
Calculating simple moving averages inefficiently
Simulating the time-series
Our first (very inefficient) attempt at an SMA
Understanding why R can be slow
Object immutability
Interpreted dynamic typings
Memory-bound processes
Single-threaded processes
Measuring by profiling and benchmarking
Profiling fundamentals with Rprof()
Benchmarking manually with system.time()
Benchmarking automatically with microbenchmark()
Easily achieving high benefit-cost improvements
Using the simple data structure for the job
Vectorizing as much as possible
Removing unnecessary logic
Moving checks out of iterative processes
If you can, avoid iterating at all
Using R's way of iterating efficiently
Avoiding sending data structures with overheads
Using parallelization to divide and conquer
How deep does the parallelization rabbit hole go?
Practical parallelization with R
Using C++ and Fortran to accelerate calculations
Using an old-school approach with Fortran
Using a modern approach with C++
Looking back at what we have achieved
Other topics of interest to enhance performance
Preallocating memory to avoid duplication
Making R code a bit faster with byte code compilation
Just-in-time (JIT) compilation of R code
Using memoization or cache layers
Improving our data and memory management
Using specialized packages for performance
Flexibility and power with cloud computing
Specialized R distributions
Summary
Adding Interactivity with Dashboards
Required packages
Introducing the Shiny application architecture and reactivity
What is functional reactive programming and why is it useful?
How is functional reactivity handled within Shiny?
The building blocks for reactivity in Shiny
The input, output, and rendering functions
Designing our high-level application structure
Setting up a two-column distribution
Introducing sections with panels
Inserting a dynamic data table
Introducing interactivity with user input
Setting up static user inputs
Setting up dynamic options in a drop-down
Setting up dynamic input panels
Adding a summary table with shared data
Adding a simple moving average graph
Adding interactivity with a secondary zoom-in graph
Styling our application with themes
Other topics of interest
Adding static images
Adding HTML to your web application
Adding custom CSS styling
Sharing your newly created application
Summary
Required Packages
External requirements – software outside of R
Dependencies for the RMySQL R package
Ubuntu 17.10
macOS High Sierra
Setting up user/password in both Linux and macOS
Dependencies for the rgl and rgdal R packages
Ubuntu 17.10
macOS High Sierra
Dependencies for the Rcpp package and the .Fortran() function
Ubuntu 17.10
macOS High Sierra
Internal requirements – R packages
Loading R packages