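I: When a request reaches the ES cluster, one node acts as the coordinating node and routes the request to the relevant shard. Writes always go to the primary shard first, while reads can be served by either the primary or a replica. If the distribution strategy is round-robin and the shard has one primary and one replica, then 4 search requests are split so that 2 hit the primary shard and 2 hit the replica shard.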

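II: When an index is split across several shards and the data is heavily skewed, search scores can come out inaccurate, because IDF is computed locally on each shard. For example, if shard1 holds many documents containing a term, that term's IDF on shard1 is low and scores there drop across the board, while on shard2, where the term appears in few documents, scores come out higher. Content that is not particularly relevant can then end up with a high score, which distorts the ranking. The fix is to keep data distributed as evenly as possible across shards in production, so that locally computed scores are not skewed as a whole.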
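The effect can be illustrated with a small calculation. The sketch below assumes the BM25-style IDF formula ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of documents on the shard and n the number of those documents containing the term. The shard sizes are made-up numbers, not measurements.

```python
import math


def idf(doc_count: int, docs_with_term: int) -> float:
    """BM25-style IDF computed from a single shard's local statistics."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))


# Hypothetical, skewed shards: the term is common on shard1 and rare on shard2.
shard1_idf = idf(doc_count=10_000, docs_with_term=5_000)
shard2_idf = idf(doc_count=100, docs_with_term=1)

# The same term is weighted more heavily on shard2, so a hit from shard2
# can outrank a more relevant hit from shard1.
print(f"shard1 idf = {shard1_idf:.3f}")
print(f"shard2 idf = {shard2_idf:.3f}")
```

When rebalancing the data is not practical, the dfs_query_then_fetch search type, which collects term statistics from all shards before scoring, may be worth evaluating as an alternative.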

III: ES has two basic strategies for computing the score:

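1) most_fields -> With the default behavior (a bool query with should clauses), the more fields of a document match the full-text keywords, the higher its score. Rule (this is the coordination factor, which newer ES versions may no longer apply): (sum of the scores of the matching match clauses) * (number of match clauses the document actually matched) / (number of match clauses in the query). For example: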

GET /index3/type3/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "spark" // matches in title
}
},
{
"match": {
"content": "java" // also matches in content
}
}
]
}
}
}
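The rule above can be worked through with a few numbers. This is a minimal sketch of the formula as stated in this post, not of Elasticsearch internals. The function name should_score and the per-clause scores are hypothetical.

```python
def should_score(clause_scores, total_clauses):
    """Score of a bool/should query under the rule stated above.

    clause_scores holds the scores of the clauses that actually matched.
    """
    matched = len(clause_scores)
    if matched == 0:
        return 0.0
    return sum(clause_scores) * matched / total_clauses


# Hypothetical scores for the title and content clauses.
both_match = should_score([0.8, 0.6], total_clauses=2)  # matched 2 of 2 clauses
one_match = should_score([0.8], total_clauses=2)        # matched 1 of 2 clauses
print(both_match, one_match)
```

A document that matches only one of the two clauses loses half of its summed score, which is why documents matching in more fields rank higher under this rule.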

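2) best_fields -> With dis_max, a document's score is determined by whichever match clause scores highest. In other words, the scores of the other fields do not count (unless tie_breaker is set).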

GET /index3/type3/_search
{
"query": {
"dis_max": {
"queries": [
{
"match": {
"title": "spark"
}
},
{
"match": {
"content": "java"
}
}
]
}
}
}
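As a rough illustration of how dis_max combines clause scores, the sketch below takes the best clause score and adds tie_breaker times the remaining scores. The function name dis_max_score and the clause scores are hypothetical. With the default tie_breaker of 0, only the best clause counts.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best matching clause score plus tie_breaker times the other scores."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    return best + tie_breaker * (sum(clause_scores) - best)


# Hypothetical scores for the title and content clauses.
scores = [0.8, 0.6]
default_score = dis_max_score(scores)                      # only the best clause counts
blended_score = dis_max_score(scores, tie_breaker=0.3)     # other clauses add a little
print(default_score, blended_score)
```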

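3) Besides most_fields and best_fields, cross_fields is also used fairly often. The cross_fields strategy built into ES essentially treats the fields listed in "fields": ["name","content"] as if their contents had been combined into one field, so the full-text search runs as though against a single combined field and the results are more accurate.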

Search request:
GET /index2/type2/_search
{
"query": {
"multi_match": {
"query": "happening like",
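// the terms in query are matched against both the content and name fields; because the two fields may have different mappings, their scores can differ and the ranking may vary
"fields": ["name","content"],
// best_fields: each document's score equals the score of its best-matching field, so once the best field is picked the scores of other documents are not necessarily accurate; most_fields: computes a combined document score from the scores of all fields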
"type":"cross_fields",
"operator":"and"
}
}
}
Result:
{
"took": 36,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0.84968257,
"hits": [
{
"_index": "index2",
"_type": "type2",
"_id": "2",
"_score": 0.84968257,
"_source": {
"num": 10,
"title": "他的名字",
"name": "yes happening like write",
"content": "happening like"
}
},
{
"_index": "index2",
"_type": "type2",
"_id": "4",
"_score": 0.8164005,
"_source": {
"num": 1000,
"title": "我的名字",
"name": "happening like write",
"content": "happening hello like yeas and he happening like had read a lot about happening hello like"
}
},
{
"_index": "index2",
"_type": "type2",
"_id": "3",
"_score": 0.5063205,
"_source": {
"num": 105,
"title": "这是谁的名字",
"name": "happening like write",
"content": " national treasure because of its rare number and cute appearance. Many foreign people are so crazy about pandas and they can’t watching these lovely creatures all the time. Though some action"
}
}
]
}
}
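The difference between cross_fields and best_fields under "operator": "and" can be sketched in plain Python. This is a simplified model, assuming whitespace tokenization and identical analyzers on both fields. The helper names and the sample document are hypothetical and are not part of any Elasticsearch client.

```python
def tokens(text):
    """Lowercase whitespace tokenization (a stand-in for a real analyzer)."""
    return set(text.lower().split())


def matches_best_fields_and(doc, fields, query):
    """Field-centric: all query terms must appear in one single field."""
    terms = tokens(query)
    return any(terms <= tokens(doc.get(field, "")) for field in fields)


def matches_cross_fields_and(doc, fields, query):
    """Term-centric: each query term must appear in at least one field."""
    combined = set()
    for field in fields:
        combined |= tokens(doc.get(field, ""))
    return tokens(query) <= combined


# Hypothetical document whose query terms are split across the two fields.
doc = {"name": "happening write", "content": "like"}
fields = ["name", "content"]

best_hit = matches_best_fields_and(doc, fields, "happening like")
cross_hit = matches_cross_fields_and(doc, fields, "happening like")
print(best_hit, cross_hit)
```

Under this model the split document is found only by the term-centric check, which is the behavior that makes cross_fields behave like a search over one combined field.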

IV: Two ways to improve full-text search results

1) Use boost to raise the score

GET index3/type3/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"content": {
"query": "from",
"boost": 5 // boost multiplies the score of this match by 5
}
}
},{
"match": {
"content": {
"query": "foot" // without the boost above, documents matching foot would score higher
}
}
}
]
}
}
}
Result:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1.3150566,
"hits": [
{
"_index": "index3",
"_type": "type3",
"_id": "1",
"_score": 1.3150566,
"_source": {
"date": "2019-01-02",
"name": "the little",
"content": "Half the hello book ideas in his talk were plagiarized from an article I wrote last month.",
"no": "123"
}
},
{
"_index": "index3",
"_type": "type3",
"_id": "5",
"_score": 1.3114156,
"_source": {
"date": "2019-05-01",
"name": "http litty",
"content": "There are hello moments in life when you miss book someone so much that you just want to pick them from your dreams",
"no": "564",
"description": "描述"
}
},
{
"_index": "index3",
"_type": "type3",
"_id": "3",
"_score": 0.28582606,
"_source": {
"date": "2019-07-01",
"name": "very tag",
"content": "Some of our hello comrades love book to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly",
"no": "123"
}
}
]
}
}
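The scores in this result make the effect of the boost easy to check. The sketch below assumes that boost scales a clause's score linearly, so dividing the boosted scores by 5 approximates what the "from" documents would have scored without it. The _score values are copied from the result above.

```python
BOOST = 5

# _score values copied from the result above.
boosted_scores = {"1": 1.3150566, "5": 1.3114156}  # documents matching "from" (boosted)
foot_score = 0.28582606                             # document 3 matching "foot" (no boost)

# Assumption: boost multiplies the clause score, so dividing undoes it.
unboosted = {doc_id: score / BOOST for doc_id, score in boosted_scores.items()}

for doc_id, score in unboosted.items():
    # Each "from" document would rank below the "foot" document without the boost.
    print(doc_id, round(score, 4), score < foot_score)
```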

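2) Use the positive and negative clauses of a boosting query to demote results: documents that also match the negative query have their score lowered by negative_boost (for example negative_boost: 0.5; the request below uses 0.1)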

GET index3/type3/_search
{
"query": {
"boosting": {
// matched normally
"positive": {
"match": {
"content": "from"
}
},
// matched in order to lower the score: documents that match this query have their score multiplied by negative_boost
"negative": {
"match": {
"content": {
"query": "Half"
}
}
},
"negative_boost": 0.1
}
}
}
Result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.26228312,
"hits": [
{
"_index": "index3",
"_type": "type3",
"_id": "5",
"_score": 0.26228312,
"_source": {
"date": "2019-05-01",
"name": "http litty",
"content": "There are hello moments in life when you miss book someone so much that you just want to pick them from your dreams",
"no": "564",
"description": "描述"
}
},
{
"_index": "index3",
"_type": "type3",
"_id": "1",
"_score": 0.026301134,
"_source": {
"date": "2019-01-02",
"name": "the little",
"content": "Half the hello book ideas in his talk were plagiarized from an article I wrote last month.",
"no": "123"
}
}
]
}
}
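The two scores in this result follow directly from the negative_boost multiplication. In the minimal sketch below, the function name boosting_score is hypothetical. The raw score of document 1 is not shown anywhere in the post, so it is an assumption inferred as 0.026301134 / 0.1.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Keep the positive score, scaled down only if the negative query matches."""
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


NEGATIVE_BOOST = 0.1

# Document 5 matches "from" but not "Half", so its score is unchanged.
doc5 = boosting_score(0.26228312, matches_negative=False, negative_boost=NEGATIVE_BOOST)

# Document 1 matches both; 0.26301134 is an assumed raw score inferred from the result.
doc1 = boosting_score(0.26301134, matches_negative=True, negative_boost=NEGATIVE_BOOST)

print(doc5, doc1)
```

Unlike a must_not clause, the negative query does not exclude document 1. It stays in the result but drops to the bottom of the ranking.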
