_source and store

http://stackoverflow.com/questions/18833899/in-elasticsearch-what-happens-if-i-set-store-to-yes-on-a-few-fields-but-sou

http://stackoverflow.com/questions/17103047/why-do-i-need-storeyes-in-elasticsearch

You usually send a field to Elasticsearch because you want to either search on it or retrieve it. But it's true that if you don't store the field explicitly and you don't disable the source, you can still retrieve the field using the _source. This means that in some cases it might actually make sense to have a field that is neither indexed nor stored.

When you store a field, that happens in the underlying Lucene index. Lucene is built around an inverted index, which allows fast full-text search and returns document ids for text queries. Beyond the inverted index, Lucene also has a storage area where field values can be kept so they can be retrieved by document id. You usually store in Lucene the fields that you want to return in search results. Elasticsearch doesn't require you to store every field you want to return, because by default it stores every document you send to it (as the _source field), so it can always return everything you sent as part of a search result.

In just a few cases it can be useful to store fields explicitly in Lucene: when the _source field is disabled, or when we want to avoid parsing it (even though Elasticsearch does the parsing automatically). Keep in mind, though, that retrieving many stored fields from Lucene might require one disk seek per field, while retrieving only the _source and parsing it to extract the needed fields is a single disk seek and is faster in most cases.

If a field's store attribute is set to no, you can still fetch the document through _source and parse out the field's content, provided that _source has "enabled": true.
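The fallback described above can be sketched in Python. This is only an illustration, not Elasticsearch's own code: the type name "article", the field names, and the helper read_field are hypothetical, and the mapping uses the older string-type syntax these notes are based on. The hit layout (stored values under "fields" as lists, the original document under "_source") follows the usual shape of a search response.

```python
# Hypothetical mapping: "title" is stored explicitly, "body" is not,
# and _source stays enabled so "body" can still be recovered from it.
mapping = {
    "article": {
        "_source": {"enabled": True},
        "properties": {
            "title": {"type": "string", "store": True},
            "body": {"type": "string"},
        },
    }
}


def read_field(hit, name):
    """Return a field from a search hit (hypothetical helper).

    Prefers an explicitly stored field, then falls back to the
    parsed _source document. Returns None if neither has it.
    """
    stored = hit.get("fields", {})
    if name in stored:
        values = stored[name]
        # Stored fields come back as lists; unwrap single values.
        return values[0] if len(values) == 1 else values
    source = hit.get("_source")
    if source is not None and name in source:
        return source[name]
    return None


# Made-up hits, for illustration only.
hit_with_source = {"_id": "1", "_source": {"title": "Hello", "body": "World"}}
hit_with_stored = {"_id": "2", "fields": {"title": ["Hello"]}}

print(read_field(hit_with_source, "body"))
print(read_field(hit_with_stored, "title"))
print(read_field(hit_with_stored, "body"))
```

The last call returns None: with only stored fields in the hit and no _source, a field that was not stored cannot be recovered.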

Aggregation

http://chrissimpson.co.uk/elasticsearch-aggregations-overview.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html

http://stackoverflow.com/questions/21018493/how-to-access-aggregations-result-with-elasticsearch-java-api-in-searchresponse

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-order

Top Hit Aggregation

https://www.elastic.co/guide/en/elasticsearch/reference/1.6/search-aggregations-metrics-top-hits-aggregation.html

Shards and replicas

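A shard is actually a Lucene index.

Primary shards accept index (write) requests; replicas do not, since they only receive the writes forwarded from their primary. Both primaries and replicas can serve search queries.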

http://blog.trifork.com/2014/01/07/elasticsearch-how-many-shards/

Aggregation results are not exact

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
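Why the counts of a terms aggregation can be approximate is easier to see with a small simulation. The sketch below is a simplified model, not Elasticsearch's implementation: each shard is a Counter of term counts (made-up data), every shard reports only its own top shard_size terms, and the coordinating node sums whatever it receives. The function names are hypothetical.

```python
from collections import Counter


def approximate_terms(shards, shard_size):
    """Merge only the top `shard_size` terms reported by each shard."""
    merged = Counter()
    for shard in shards:
        for term, count in shard.most_common(shard_size):
            merged[term] += count
    return merged


def exact_terms(shards):
    """Sum every term count from every shard (the true totals)."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total


# Made-up per-shard term counts.
shards = [
    Counter({"a": 10, "b": 6, "c": 5}),
    Counter({"a": 10, "c": 6, "b": 5}),
    Counter({"b": 7, "c": 6, "a": 3}),
]

approx = approximate_terms(shards, shard_size=2)
exact = exact_terms(shards)

print(approx["a"], exact["a"])  # 20 23
print(approx["b"], exact["b"])  # 13 18
print(approx["c"], exact["c"])  # 12 17
```

Every term is undercounted here because at least one shard left it out of its local top 2. Asking each shard for more terms (a larger shard_size) narrows the gap at the cost of more data sent to the coordinating node.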

Mapping

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-intro.html

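Every document in an index has a type, and every type has its own mapping, or schema definition. The mapping defines the fields in the type, the data type of each field, and how Elasticsearch handles each field. The mapping is also used to configure metadata associated with the type.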

Elasticsearch supports the following simple field data types:

  • String: string
  • Whole number: byte, short, integer, long
  • Floating-point: float, double
  • Boolean: boolean
  • Date: date

When you index a document that contains a new field, Elasticsearch guesses the field's data type from the basic JSON data types, using the following rules:

JSON type                          Field type
Boolean: true or false             boolean
Whole number: 123                  long
Floating point: 123.45             double
String, valid date: 2014-09-15     date
String: foo bar                    string

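Note:
  This means that if you index a number in quotes ("123"), the field is mapped as type string, not long. However, if the field already exists and is mapped as long, Elasticsearch will try to convert the string to a long, and will throw an exception if it cannot (for example, if the string contains letters).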
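The guessing rules above can be mimicked in a few lines of Python. This is a rough sketch of the rules as listed in these notes, not Elasticsearch's actual detection code: guess_field_type is a hypothetical helper, and date detection is reduced to the single yyyy-MM-dd pattern from the example, which is an assumption.

```python
import json
from datetime import datetime


def guess_field_type(value):
    """Guess a field type from a parsed JSON value (hypothetical helper)."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Assumption: only the yyyy-MM-dd format counts as a date.
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "string"
    raise TypeError("unsupported JSON value: %r" % (value,))


# Made-up document, for illustration only.
doc = json.loads(
    '{"ok": true, "count": 123, "price": 123.45,'
    ' "when": "2014-09-15", "text": "foo bar", "quoted": "123"}'
)

for name, value in doc.items():
    print(name, "->", guess_field_type(value))
```

Note how "quoted" comes out as string even though it looks numeric, which is the case the note above warns about.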
 
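Customizing field mappings
The most important attribute of a field is type. For fields other than string fields, you will rarely need to specify anything besides type.
String fields are full text by default: the value is passed through an analyzer before it is indexed, and a full-text query on the field passes the query string through an analyzer before searching.
The two most important attributes for string fields are index and analyzer.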
 
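The index attribute accepts three values:
analyzed
    Analyze the string first, then index it.
not_analyzed
    Index the value as is, so it is searchable, but only as the exact whole value, with no analysis.
no
    Do not index the field at all, so it is not searchable.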

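For string fields, index defaults to analyzed, so to index the raw value you need to set it to not_analyzed.

Other types (such as long, double, date) also accept the index attribute, but the only valid values are no and not_analyzed, because their values are never analyzed.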
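As a concrete illustration of these attributes, here is a sketch of a mapping body built as a Python dict, plus a small check that encodes the rule about which index values each type accepts. It assumes the older string-type mapping syntax described in these notes; the type name "tweet", the field names, and both helper functions are hypothetical.

```python
import json

# Hypothetical mapping using the index attribute on several fields.
mapping = {
    "tweet": {
        "properties": {
            # Full text: analyzed with the built-in english analyzer.
            "text": {"type": "string", "index": "analyzed", "analyzer": "english"},
            # Exact value: searchable, but indexed whole.
            "tag": {"type": "string", "index": "not_analyzed"},
            # Not searchable at all.
            "internal_note": {"type": "string", "index": "no"},
            # Non-string types are never analyzed.
            "views": {"type": "long", "index": "not_analyzed"},
        }
    }
}


def allowed_index_values(field_type):
    """Values of `index` permitted for a field type, per the notes above."""
    if field_type == "string":
        return {"analyzed", "not_analyzed", "no"}
    return {"not_analyzed", "no"}


def invalid_fields(properties):
    """Return the names of fields whose `index` value is not permitted."""
    bad = []
    for name, spec in properties.items():
        # Default: analyzed for strings, not_analyzed for everything else.
        default = "analyzed" if spec["type"] == "string" else "not_analyzed"
        if spec.get("index", default) not in allowed_index_values(spec["type"]):
            bad.append(name)
    return bad


print(json.dumps(mapping, indent=2))
print(invalid_fields(mapping["tweet"]["properties"]))
```

The JSON printed by json.dumps is what would be sent as the body of a put-mapping request; a field such as {"type": "long", "index": "analyzed"} would be flagged by invalid_fields.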

 
