
Learn Python by Building Data Science Applications (e-book)


Author: Philipp Kats

Publisher: Packt Publishing

Publication date: 2019-08-30

Word count: 632,000

Understand the constructs of the Python programming language and use them to build data science projects

Key Features

* Learn the basics of developing applications with Python and deploy your first data application
* Take your first steps in Python programming by understanding and using data structures, variables, and loops
* Delve into Jupyter, NumPy, Pandas, SciPy, and sklearn to explore the data science ecosystem in Python

Book Description

Python is one of the most widely used programming languages for building data science applications. Complete with step-by-step instructions, this book contains easy-to-follow tutorials to help you learn Python and develop real-world data science projects. The “secret sauce” of the book is its curated list of topics and solutions, put together using a range of real-world projects, covering initial data collection, data analysis, and production.

The book starts by taking you through the basics of programming, from variables and data types to classes and functions. You’ll learn how to write idiomatic code, test and debug it, and discover how you can create packages or use the range of built-in ones. You’ll also be introduced to the extensive ecosystem of Python data science packages, including NumPy, Pandas, scikit-learn, Altair, and Datashader. You’ll then perform data analysis, train models, and interpret and communicate the results. Finally, you’ll get to grips with structuring and scheduling scripts using Luigi and sharing your machine learning models with the world as a microservice.

By the end of the book, you’ll have learned not only how to use Python in data science projects, but also how to design and maintain those projects to meet high programming standards.

What you will learn

* Code in Python using Jupyter and VS Code
* Explore the basics of coding – loops, variables, functions, and classes
* Deploy continuous integration with Git, Bash, and DVC
* Get to grips with Pandas, NumPy, and scikit-learn
* Perform data visualization with Matplotlib, Altair, and Datashader
* Create a package out of your code using poetry and test it with PyTest
* Make your machine learning model accessible to anyone through a web API

Who this book is for

If you want to learn Python or data science in a fun and engaging way, this book is for you. You’ll also find this book useful if you’re a high school student, researcher, analyst, or anyone with little or no coding experience who has an interest in the subject and the courage to learn, fail, and learn from failing. A basic understanding of how computers work will be useful.
Table of Contents

About Packt

Why subscribe?

Contributors

About the authors

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Code in Action

Conventions used

Get in touch

Reviews

Section 1: Getting Started with Python

Preparing the Workspace

Technical requirements

Installing Python

Downloading materials for running the code

Installing Python packages

Working with VS Code

The VS Code interface

Beginning with Jupyter

Notebooks

The Jupyter interface

Pre-flight check

Summary

Questions

Further reading

First Steps in Coding - Variables and Data Types

Technical requirements

Assigning variables

Naming the variable

Understanding data types

Floats and integers

Operations with self-assignment

Order of execution

Strings

Formatting

Format method

F-strings

Legacy formatting

Formatting mini-language

Strings as sequences

Booleans

Logical operators

Converting the data types

Exercise

Summary

Questions

Further reading

Functions

Technical requirements

Understanding a function

Interface functions

The input function

The eval function

Variable properties

The help function

The type function

The isinstance function

dir

Math

abs

The round function

Iterables

The len function

The sorted function

The range function

The all and any functions

The max, min, and sum functions

Defining the function

Default values

Var-positional and var-keyword

Docstrings

Type annotations

Refactoring the temperature conversion

Understanding anonymous (lambda) functions

Understanding recursion

Summary

Questions

Further reading

Data Structures

Technical requirements

What are data structures?

Lists

Slicing

Tuples

Immutability

Dictionaries

Sets

More data structures

frozenset

defaultdict

Counter

Queue

deque

namedtuple

Enumerations

Using generators

Useful functions to use with data structures

The sum, max, and min functions

The all and any functions

The zip function

The map, filter, and reduce functions

Comprehensions

Summary

Questions

Further reading

Loops and Other Compound Statements

Technical requirements

Understanding if, else, and elif statements

Inline if statements

Using if in a comprehension

Running code many times with loops

The for loop

itertools

cycle

chain

product

Enumeration

The while loop

Additional loop functionality – break and continue

Handling exceptions with try/except and try/finally

Exceptions

try/except

try/except/finally

Understanding the with statements

Summary

Questions

Further reading

First Script – Geocoding with Web APIs

Technical requirements

Geocoding as a service

Learning about web APIs

Working with HTTPS

Working with the Nominatim API

The requests library

Starting to code

Caching with decorators

Reading and writing data

Geocoding the addresses

Moving code to a separate module

Collecting NYC Open Data from the Socrata service

Summary

Questions

Further reading

Scraping Data from the Web with Beautiful Soup 4

Technical requirements

When there is no API

HTML in a nutshell

Scraping with Beautiful Soup 4

CSS and XPath selectors

Developer console

Scraping WWII battles

Step 1 – Scraping the list of battles

Unordered list

Step 2 – Scraping information from the Wiki page

Key information

Additional information

Step 3 – Scraping data as a whole

Quality control

Beyond Beautiful Soup

Summary

Questions

Further reading

Simulation with Classes and Inheritance

Technical requirements

Understanding classes

Special (dunder) methods

__init__

__repr__ and __str__

Arithmetical and logical operations

Equality/relationship methods

__len__

__getitem__

__class__

Inheritance

Using super()

Data classes

Using classes in simulation

Writing the base classes

Writing the Island class

Herbivore haven

Harsh islands

Visualization

Summary

Questions

Further reading

Shell, Git, Conda, and More – at Your Command

Technical requirements

Shell

Pipes

Executing Python scripts

Command-line interface

Git

Concept

GitHub

Practical example

gitignore

Conda

Conda for virtual environments

Conda and Jupyter

Make

Cookiecutter

Summary

Questions

Section 2: Hands-On with Data

Python for Data Applications

Technical requirements

Introducing Python for data science

Exploring NumPy

Beginning with pandas

Trying SciPy and scikit-learn

Understanding Jupyter

Summary

Questions

Data Cleaning and Manipulation

Technical requirements

Getting started with pandas

Selection – by columns, indices, or both

Masking

Data types and data conversion

Math

Merging

Working with real data

Initial exploration

Defining the scope of work to be done

Getting to know regular expressions

Parsing locations

Geocoding

Time

Belligerents

Understanding casualties

Multilevel slicing

Quality assurance

Writing the file

Summary

Questions

Further reading

Data Exploration and Visualization

Technical requirements

Exploring the dataset

Descriptive statistics

Data visualization with matplotlib (and its pandas interface)

Aggregating the data to calculate summary statistics

Resampling

Mapping

Declarative visualization with Vega and Altair

Drawing maps with Altair

Storing the Altair chart

Big data visualization with Datashader

Summary

Questions

Further reading

Training a Machine Learning Model

Technical requirements

Understanding the basics of ML

Exploring unsupervised learning

Moving on to supervised learning

k-nearest neighbors

Linear regression

Decision trees

Summary

Questions

Further reading

Improving Your Model – Pipelines and Experiments

Technical requirements

Understanding cross-validation

Exploring feature engineering

Failed attempts

Optimizing the hyperparameters

Using a random forest model

Tracking your data and metrics with version control

Starting with data

Adding code to the equation

Metrics

Summary

Questions

Further reading

Section 3: Moving to Production

Packaging and Testing with Poetry and PyTest

Technical requirements

Building a package

Bringing your own package

Using a package manager – pip and conda

Creating a package scaffolding

A few ways to build your package

Trying out code with Poetry

Adding actual code

Defining dependencies

Non-code resources

Publishing the package

Development workflow

Testing the code so far

Testing with PyTest

Writing our own tests

Automating the process with CI services

Generating documentation with Sphinx

Installing a package in editable mode

Summary

Questions

Further reading

Data Pipelines with Luigi

Technical requirements

Introducing the ETL pipeline

Redesigning your code as a pipeline

Building our first task in Luigi

Connecting the dots

Understanding time-based tasks

Scheduling with cron

Exploring the different output formats

Writing to an S3 bucket

Writing to SQL

Expanding Luigi with custom template classes

Summary

Questions

Further reading

Let's Build a Dashboard

Technical requirements

Building a dashboard – three types of dashboard

Static dashboards

Debugging Altair

Connecting your app to the Luigi pipeline

Understanding dynamic dashboards

First try with panel

Reading data from the database

Creating an interactive dashboard in Jupyter

Summary

Questions

Further reading

Serving Models with a RESTful API

Technical requirements

What is a RESTful API?

Python web frameworks

Building a basic API service

Exploring service with OpenAPI

Finalizing our naive first iteration

Data validation

Sending data in with POST requests

Adding features to our service

Building a web page

Speeding up with asynchronous calls

Deploying and testing your API loads with Locust

Summary

Questions

Further reading

Serverless API Using Chalice

Technical requirements

Understanding serverless

Getting started with Chalice

Setting up a simple model

Externalizing medians

Building a serverless API for an ML model

When we're still out of memory

Building a serverless function as a data pipeline

S3-triggered events

Summary

Questions

Further reading

Best Practices and Python Performance

Technical requirements

Speeding up your Python code

Rewriting the code with NumPy

Specialized data structures and algorithms

Dask

Dask-ML

Numba

Concurrency and parallelism

Different types of concurrency

Two types of problems

Before you start rewriting your code

Using best practices for coding in your project

Code formatting with black

Measuring code quality with Wily

Writing tests with hypothesis

Beyond this book – packages and technologies to look out for

Different Python flavors

Docker containers

Kubernetes

Summary

Questions

Further reading

Assessments

Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Chapter 10

Chapter 11

Chapter 12

Chapter 13

Chapter 14

Chapter 15

Chapter 16

Chapter 17

Chapter 18

Chapter 19

Chapter 20

Other Books You May Enjoy

Leave a review - let other readers know what you think
