Elasticsearch Server Second Edition (e-book)

Author: Rafał Kuć

Publisher: Packt Publishing

Publication date: 2014-04-24

Word count: 6.978 million

This book is a detailed, practical, hands-on guide packed with real-life scenarios and examples that show you how to implement an Elasticsearch search engine on your own websites. If you are a web developer or a user who wants to learn more about Elasticsearch, then this is the book for you. You do not need to know anything about Elasticsearch, Java, or Apache Lucene in order to use this book, though basic knowledge of databases and queries is required.
Elasticsearch Server Second Edition

Table of Contents

Credits

About the Author

Acknowledgments

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Getting Started with the Elasticsearch Cluster

Full-text searching

The Lucene glossary and architecture

Input data analysis

Indexing and querying

Scoring and query relevance

The basics of Elasticsearch

Key concepts of data architecture

Index

Document

Document type

Mapping

Key concepts of Elasticsearch

Node and cluster

Shard

Replica

Gateway

Indexing and searching

Installing and configuring your cluster

Installing Java

Installing Elasticsearch

Installing Elasticsearch from binary packages on Linux

Installing Elasticsearch using the RPM package

Installing Elasticsearch using the DEB package

The directory layout

Configuring Elasticsearch

Running Elasticsearch

Shutting down Elasticsearch

Running Elasticsearch as a system service

Elasticsearch as a system service on Linux

Elasticsearch as a system service on Windows

Manipulating data with the REST API

Understanding the Elasticsearch RESTful API

Storing data in Elasticsearch

Creating a new document

Automatic identifier creation

Retrieving documents

Updating documents

Deleting documents

Versioning

An example of versioning

Using the version provided by an external system

Searching with the URI request query

Sample data

The URI request

The Elasticsearch query response

Query analysis

URI query string parameters

The query

The default search field

Analyzer

The default operator

Query explanation

The fields returned

Sorting the results

The search timeout

The results window

The search type

Lowercasing the expanded terms

Analyzing the wildcard and prefixes

The Lucene query syntax

Summary

2. Indexing Your Data

Elasticsearch indexing

Shards and replicas

Creating indices

Altering automatic index creation

Settings for a newly created index

Mappings configuration

Type determining mechanism

Disabling field type guessing

Index structure mapping

Type definition

Fields

Core types

Common attributes

String

Number

Boolean

Binary

Date

Multifields

The IP address type

The token_count type

Using analyzers

Out-of-the-box analyzers

Defining your own analyzers

Analyzer fields

Default analyzers

Different similarity models

Setting per-field similarity

Available similarity models

Configuring DFR similarity

Configuring IB similarity

The postings format

Configuring the postings format

Doc values

Configuring the doc values

Doc values formats

Batch indexing to speed up your indexing process

Preparing data for bulk indexing

Indexing the data

Even quicker bulk requests

Extending your index structure with additional internal information

Identifier fields

The _type field

The _all field

The _source field

Exclusion and inclusion

The _index field

The _size field

The _timestamp field

The _ttl field

Introduction to segment merging

Segment merging

The need for segment merging

The merge policy

The merge scheduler

The merge factor

Throttling

Introduction to routing

Default indexing

Default searching

Routing

The routing parameters

Routing fields

Summary

3. Searching Your Data

Querying Elasticsearch

The example data

A simple query

Paging and result size

Returning the version value

Limiting the score

Choosing the fields that we want to return

The partial fields

Using the script fields

Passing parameters to the script fields

Understanding the querying process

Query logic

Search types

Search execution preferences

The Search shards API

Basic queries

The term query

The terms query

The match_all query

The common terms query

The match query

The Boolean match query

The match_phrase query

The match_phrase_prefix query

The multi_match query

The query_string query

Running the query_string query against multiple fields

The simple_query_string query

The identifiers query

The prefix query

The fuzzy_like_this query

The fuzzy_like_this_field query

The fuzzy query

The wildcard query

The more_like_this query

The more_like_this_field query

The range query

The dismax query

The regular expression query

Compound queries

The bool query

The boosting query

The constant_score query

The indices query

Filtering your results

Using filters

Filter types

The range filter

The exists filter

The missing filter

The script filter

The type filter

The limit filter

The identifiers filter

If this is not enough

Combining filters

A word about the bool filter

Named filters

Caching filters

Highlighting

Getting started with highlighting

Field configuration

Under the hood

Configuring HTML tags

Controlling the highlighted fragments

Global and local settings

Require matching

The postings highlighter

Validating your queries

Using the validate API

Sorting data

Default sorting

Selecting fields used for sorting

Specifying the behavior for missing fields

Dynamic criteria

Collation and national characters

Query rewrite

An example of the rewrite process

Query rewrite properties

Summary

4. Extending Your Index Structure

Indexing tree-like structures

Data structure

Analysis

Indexing data that is not flat

Data

Objects

Arrays

Mappings

Final mappings

Sending the mappings to Elasticsearch

To be or not to be dynamic

Using nested objects

Scoring and nested queries

Using the parent-child relationship

Index structure and data indexing

Parent mappings

Child mappings

The parent document

The child documents

Querying

Querying data in the child documents

The top children query

Querying data in the parent documents

The parent-child relationship and filtering

Performance considerations

Modifying your index structure with the update API

The mappings

Adding a new field

Modifying fields

Summary

5. Make Your Search Better

An introduction to Apache Lucene scoring

When a document is matched

Default scoring formula

Relevancy matters

Scripting capabilities of Elasticsearch

Objects available during script execution

MVEL

Using other languages

Using our own script library

Using native code

The factory implementation

Implementing the native script

Installing scripts

Running the script

Searching content in different languages

Handling languages differently

Handling multiple languages

Detecting the language of the documents

Sample document

The mappings

Querying

Queries with the identified language

Queries with unknown languages

Combining queries

Influencing scores with query boosts

The boost

Adding boost to queries

Modifying the score

The constant_score query

The boosting query

The function_score query

The structure of the function query

Deprecated queries

Replacing the custom_boost_factor query

Replacing the custom_score query

Replacing the custom_filters_score query

When does index-time boosting make sense?

Defining field boosting in input data

Defining boosting in mapping

Words with the same meaning

The synonym filter

Synonyms in the mappings

Synonyms stored in the filesystem

Defining synonym rules

Using Apache Solr synonyms

Explicit synonyms

Equivalent synonyms

Expanding synonyms

Using WordNet synonyms

Query- or index-time synonym expansion

Understanding the explain information

Understanding field analysis

Explaining the query

Summary

6. Beyond Full-text Searching

Aggregations

General query structure

Available aggregations

Metric aggregations

Min, max, sum, and avg aggregations

Using scripts

The value_count aggregation

The stats and extended_stats aggregations

Bucketing

The terms aggregation

The range aggregation

The date_range aggregation

IPv4 range aggregation

The missing aggregation

Nested aggregation

The histogram aggregation

The date_histogram aggregation

Time zones

The geo_distance aggregation

The geohash_grid aggregation

Nesting aggregations

Bucket ordering and nested aggregations

Global and subsets

Inclusions and exclusions

Faceting

The document structure

Returned results

Using queries for faceting calculations

Using filters for faceting calculations

Terms faceting

Ranges based faceting

Choosing different fields for an aggregated data calculation

Numerical and date histogram faceting

The date_histogram facet

Computing numerical field statistical data

Computing statistical data for terms

Geographical faceting

Filtering faceting results

Memory considerations

Using suggesters

Available suggester types

Including suggestions

The suggester response

The term suggester

The term suggester configuration options

Additional term suggester options

The phrase suggester

Configuration

The completion suggester

Indexing data

Querying the indexed completion suggester data

Custom weights

Percolator

The index

Percolator preparation

Getting deeper

Getting the number of matching queries

Indexed documents percolation

Handling files

Adding additional information about the file

Geo

Mappings preparation for spatial search

Example data

Sample queries

Distance-based sorting

Bounding box filtering

Limiting the distance

Arbitrary geo shapes

Point

Envelope

Polygon

Multipolygon

An example usage

Storing shapes in the index

The scroll API

Problem definition

Scrolling to the rescue

The terms filter

Terms lookup

The terms lookup query structure

Terms lookup cache settings

Summary

7. Elasticsearch Cluster in Detail

Node discovery

Discovery types

The master node

Configuring the master and data nodes

The master-election configuration

Setting the cluster name

Configuring multicast

Configuring unicast

Ping settings for nodes

The gateway and recovery modules

The gateway

Recovery control

Additional gateway recovery options

Preparing Elasticsearch cluster for high query and indexing throughput

The filter cache

The field data cache and circuit breaker

The circuit breaker

The store

Index buffers and the refresh rate

The index refresh rate

The thread pool configuration

Combining it all together – some general advice

Choosing the right store

The index refresh rate

Tuning the thread pools

Tuning your merge process

The field data cache and breaking the circuit

RAM buffer for indexing

Tuning transaction logging

Things to keep in mind

Templates and dynamic templates

Templates

An example of a template

Storing templates in files

Dynamic templates

The matching pattern

Field definitions

Summary

8. Administrating Your Cluster

The Elasticsearch time machine

Creating a snapshot repository

Creating snapshots

Additional parameters

Restoring a snapshot

Cleaning up – deleting old snapshots

Monitoring your cluster's state and health

The cluster health API

Controlling information details

Additional parameters

The indices stats API

Docs

Store

Indexing, get, and search

Additional information

The status API

The nodes info API

The nodes stats API

The cluster state API

The pending tasks API

The indices segments API

The cat API

Limiting returned information

Controlling cluster rebalancing

Rebalancing

Cluster being ready

The cluster rebalance settings

Controlling when rebalancing will start

Controlling the number of shards being moved between nodes concurrently

Controlling the number of shards initialized concurrently on a single node

Controlling the number of primary shards initialized concurrently on a single node

Controlling types of shards allocation

Controlling the number of concurrent streams on a single node

Controlling the shard and replica allocation

Explicitly controlling allocation

Specifying node parameters

Configuration

Index creation

Excluding nodes from allocation

Requiring node attributes

Using IP addresses for shard allocation

Disk-based shard allocation

Enabling disk-based shard allocation

Configuring disk-based shard allocation

Cluster wide allocation

Number of shards and replicas per node

Moving shards and replicas manually

Moving shards

Canceling shard allocation

Forcing shard allocation

Multiple commands per HTTP request

Warming up

Defining a new warming query

Retrieving the defined warming queries

Deleting a warming query

Disabling the warming up functionality

Choosing queries

Index aliasing and using it to simplify your everyday work

An alias

Creating an alias

Modifying aliases

Combining commands

Retrieving all aliases

Removing aliases

Filtering aliases

Aliases and routing

Elasticsearch plugins

The basics

Installing plugins

Removing plugins

The update settings API

Summary

Index
