Learning pandas
Table of Contents
Learning pandas
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. A Tour of pandas
pandas and why it is important
pandas and IPython Notebooks
Referencing pandas in the application
Primary pandas objects
The pandas Series object
The pandas DataFrame object
Loading data from files and the Web
Loading CSV data from files
Loading data from the Web
Simplicity of visualization of pandas data
Summary
2. Installing pandas
Getting Anaconda
Installing Anaconda
Installing Anaconda on Linux
Installing Anaconda on Mac OS X
Installing Anaconda on Windows
Ensuring pandas is up to date
Running a small pandas sample in IPython
Starting the IPython Notebook server
Installing and running IPython Notebooks
Using Wakari for pandas
Summary
3. NumPy for pandas
Installing and importing NumPy
Benefits and characteristics of NumPy arrays
Creating NumPy arrays and performing basic array operations
Selecting array elements
Logical operations on arrays
Slicing arrays
Reshaping arrays
Combining arrays
Splitting arrays
Useful numerical methods of NumPy arrays
Summary
4. The pandas Series Object
The Series object
Importing pandas
Creating Series
Size, shape, uniqueness, and counts of values
Peeking at data with heads, tails, and take
Looking up values in Series
Alignment via index labels
Arithmetic operations
The special case of Not-A-Number (NaN)
Boolean selection
Reindexing a Series
Modifying a Series in-place
Slicing a Series
Summary
5. The pandas DataFrame Object
Creating DataFrame from scratch
Example data
S&P 500
Monthly stock historical prices
Selecting columns of a DataFrame
Selecting rows and values of a DataFrame using the index
Slicing using the [] operator
Selecting rows by index label and location: .loc[] and .iloc[]
Selecting rows by index label and/or location: .ix[]
Scalar lookup by label or location using .at[] and .iat[]
Selecting rows of a DataFrame by Boolean selection
Modifying the structure and content of DataFrame
Renaming columns
Adding and inserting columns
Replacing the contents of a column
Deleting columns in a DataFrame
Adding rows to a DataFrame
Appending rows with .append()
Concatenating DataFrame objects with pd.concat()
Adding rows (and columns) via setting with enlargement
Removing rows from a DataFrame
Removing rows using .drop()
Removing rows using Boolean selection
Removing rows using a slice
Changing scalar values in a DataFrame
Arithmetic on a DataFrame
Resetting and reindexing
Hierarchical indexing
Summarized data and descriptive statistics
Summary
6. Accessing Data
Setting up the IPython notebook
CSV and Text/Tabular format
The sample CSV data set
Reading a CSV file into a DataFrame
Specifying the index column when reading a CSV file
Data type inference and specification
Specifying column names
Specifying specific columns to load
Saving DataFrame to a CSV file
General field-delimited data
Handling noise rows in field-delimited data
Reading and writing data in an Excel format
Reading and writing JSON files
Reading HTML data from the Web
Reading and writing HDF5 format files
Accessing data on the web and in the cloud
Reading and writing from/to SQL databases
Reading data from remote data services
Reading stock data from Yahoo! and Google Finance
Retrieving data from Yahoo! Finance Options
Reading economic data from the Federal Reserve Bank of St. Louis
Accessing Kenneth French's data
Reading from the World Bank
Summary
7. Tidying Up Your Data
What is tidying your data?
Setting up the IPython notebook
Working with missing data
Determining NaN values in Series and DataFrame objects
Selecting out or dropping missing data
How pandas handles NaN values in mathematical operations
Filling in missing data
Forward and backward filling of missing values
Filling using index labels
Interpolation of missing values
Handling duplicate data
Transforming Data
Mapping
Replacing values
Applying functions to transform data
Summary
8. Combining and Reshaping Data
Setting up the IPython notebook
Concatenating data
Merging and joining data
An overview of merges
Specifying the join semantics of a merge operation
Pivoting
Stacking and unstacking
Stacking using nonhierarchical indexes
Unstacking using hierarchical indexes
Melting
Performance benefits of stacked data
Summary
9. Grouping and Aggregating Data
Setting up the IPython notebook
The split, apply, and combine (SAC) pattern
Split
Data for the examples
Grouping by a single column's values
Accessing the results of grouping
Grouping using index levels
Apply
Applying aggregation functions to groups
The transformation of group data
An overview of transformation
Practical examples of transformation
Filtering groups
Discretization and Binning
Summary
10. Time-series Data
Setting up the IPython notebook
Representation of dates, time, and intervals
The datetime, day, and time objects
Timestamp objects
Timedelta
Introducing time-series data
DatetimeIndex
Creating time-series data with specific frequencies
Calculating new dates using offsets
Date offsets
Anchored offsets
Representing durations of time using Period objects
The Period object
PeriodIndex
Handling holidays using calendars
Normalizing timestamps using time zones
Manipulating time-series data
Shifting and lagging
Frequency conversion
Up and down resampling
Time-series moving-window operations
Summary
11. Visualization
Setting up the IPython notebook
Plotting basics with pandas
Creating time-series charts with .plot()
Adorning and styling your time-series plot
Adding a title and changing axes labels
Specifying the legend content and position
Specifying line colors, styles, thickness, and markers
Specifying tick mark locations and tick labels
Formatting axes tick date labels using formatters
Common plots used in statistical analyses
Bar plots
Histograms
Box and whisker charts
Area plots
Scatter plots
Density plot
The scatter plot matrix
Heatmaps
Multiple plots in a single chart
Summary
12. Applications to Finance
Setting up the IPython notebook
Obtaining and organizing stock data from Yahoo!
Plotting time-series prices
Plotting volume-series data
Calculating the simple daily percentage change
Calculating simple daily cumulative returns
Resampling data from daily to monthly returns
Analyzing distribution of returns
Performing a moving-average calculation
The comparison of average daily returns across stocks
The correlation of stocks based on the daily percentage change of the closing price
Volatility calculation
Determining risk relative to expected returns
Summary
Index