Common Elasticsearch query methods

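1. term and terms
term and terms queries do not analyze (tokenize) the query text, which makes them suitable for exact values such as dates, numbers, and IDs.

To search for a keyword, that keyword must be one of the tokens produced when the field was analyzed at index time; otherwise no documents are returned.

Example: the age field, an analyzed text field, holds values such as 80后 ("born in the 1980s") and 90后 ("born in the 1990s"). The term query {"term":{"age":"80后"}} does not find the documents whose age is "80后". term leaves "80后" unanalyzed, while Elasticsearch split the stored age values into the tokens "80" and "后" at index time. The index therefore contains no token "80后", and the term query has nothing to match.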
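
The mismatch described above can be sketched in a few lines of Python. The `toy_analyze` function is a hypothetical stand-in for an analyzer: it only mimics the behaviour described in the text (runs of ASCII letters and digits stay together, every other character becomes its own token) and is not Elasticsearch's real analysis chain.

```python
import re

def toy_analyze(text):
    """Hypothetical analyzer: keep runs of ASCII letters/digits together and
    emit every other non-space character as its own token."""
    return re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", text)

def term_matches(indexed_tokens, query_value):
    """A term query compares the unanalyzed query value with the indexed tokens."""
    return query_value in indexed_tokens

indexed = toy_analyze("80后")           # tokens stored for the age field
print(indexed)                          # ['80', '后']
print(term_matches(indexed, "80后"))    # False: no single token equals "80后"
print(term_matches(indexed, "80"))      # True: "80" is one of the stored tokens
```

Under these assumptions, a term query only succeeds when the query value is exactly one of the stored tokens.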

Using term to find documents that contain a single keyword:

# Find documents whose content contains the keyword "学习" ("study")

GET index_1/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "content": "学习"
        }
      }
    }
  }
}

Using terms to find documents that contain any of several keywords:

# Find documents whose content contains "学习" ("study") or "生活" ("life")

GET index_1/_search
{
  "query": {
    "bool": {
      "filter": {
        "terms": {
          "content": [
            "学习",
            "生活"
          ]
        }
      }
    }
  }
}
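
A terms query hits a document when at least one of the listed values equals one of its indexed tokens. The following Python sketch illustrates that "any of" behaviour; the document IDs and token lists are hypothetical and only stand in for what an index might hold.

```python
def terms_matches(indexed_tokens, values):
    """A terms query matches when at least one listed value equals an indexed token."""
    return any(value in indexed_tokens for value in values)

# Hypothetical token lists, as if produced for the content field at index time.
docs = {
    "doc_a": ["学习", "方法"],
    "doc_b": ["生活", "记录"],
    "doc_c": ["工作"],
}

hits = [doc_id for doc_id, tokens in docs.items()
        if terms_matches(tokens, ["学习", "生活"])]
print(hits)  # ['doc_a', 'doc_b']
```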

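2. match
Every query in the match family analyzes the query text. By default (operator OR), a document is returned as soon as the queried field contains at least one of the resulting tokens.

Example: the age field holds 80后 and 90后. The query {"match":{"age":"80后"}} returns every document.

Why: match splits "80后" into "80" and "后", and Elasticsearch has likewise indexed the age values as "80", "后" and "90", "后". Every document contains the token "后", so all of them match.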
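
The reasoning above can be replayed with a small Python sketch. `toy_analyze` is a hypothetical analyzer that only reproduces the tokenization described in the text; real analyzers differ.

```python
import re

def toy_analyze(text):
    """Hypothetical analyzer: ASCII letter/digit runs stay whole,
    every other non-space character is its own token."""
    return re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", text)

def match_matches(field_value, query_text):
    """match analyzes the query text and, by default, hits on any shared token."""
    return bool(set(toy_analyze(field_value)) & set(toy_analyze(query_text)))

ages = ["70后", "80后", "90后"]
matched = [age for age in ages if match_matches(age, "80后")]
print(matched)  # ['70后', '80后', '90后'] because every value shares the token "后"
```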

Using match to search for documents whose age is 80后:

GET index_1/_search
{
  "query": {
    "bool": {
      "filter": {
        "match": {
          "age": "80后"
        }
      }
    }
  }
}

Result:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 6433,
    "max_score": 0,
    "hits": [
      {
        "_index": "index_1",
        "_type": "type_1",
        "_id": "1",
        "_score": 0,
        "_source": {
          "age": "70后"
          ...}
      },
      {
        "_index": "index_1",
        "_type": "type_1",
        "_id": "2",
        "_score": 0,
        "_source": {
          "age": "80后"
          ...}
      },
      {
        "_index": "index_1",
        "_type": "type_1",
        "_id": "2",
        "_score": 0,
        "_source": {
          "age": "90后"
          ...}
      }
    ]
  }
}

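match_phrase: a phrase query. Every token of the phrase must match, and the tokens must keep the same relative positions.

Example: the age field holds 80后 and 90后. The query {"match_phrase":{"age":"80后"}} returns only the documents whose age field is "80后".

Why: match_phrase splits "80后" into "80" and "后", and Elasticsearch has indexed the age values as "80", "后" and "90", "后". At query time, only documents in which the token "80" is immediately followed by the token "后" match.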
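
The position requirement described above can be sketched in Python as a search for the query tokens as a consecutive run inside the field tokens. `toy_analyze` is again a hypothetical analyzer that only mirrors the tokenization in the text.

```python
import re

def toy_analyze(text):
    """Hypothetical analyzer: ASCII letter/digit runs stay whole,
    every other non-space character is its own token."""
    return re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", text)

def phrase_matches(field_value, query_text):
    """True when the query tokens occur consecutively, in order, in the field tokens."""
    field, query = toy_analyze(field_value), toy_analyze(query_text)
    n = len(query)
    return n > 0 and any(field[i:i + n] == query
                         for i in range(len(field) - n + 1))

ages = ["70后", "80后", "90后"]
matched = [age for age in ages if phrase_matches(age, "80后")]
print(matched)  # ['80后']
```

Unlike the match case, sharing the token "后" alone is no longer enough.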

Using match_phrase to search for documents whose age is 80后:

GET index_1/_search
{
  "query": {
    "bool": {
      "filter": {
        "match_phrase": {
          "age": "80后"
        }
      }
    }
  }
}

Result:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 6433,
    "max_score": 0,
    "hits": [
      {
        "_index": "index_1",
        "_type": "type_1",
        "_id": "1",
        "_score": 0,
        "_source": {
          "age": "80后"
          ...}
      },
      {
        "_index": "index_1",
        "_type": "type_1",
        "_id": "2",
        "_score": 0,
        "_source": {
          "age": "80后"
          ...}
      },
      {
        "_index": "index_1",
        "_type": "type_1",
        "_id": "2",
        "_score": 0,
        "_source": {
          "age": "80后"
          ...}
      }
    ]
  }
}

multi_match: searches several fields for the same keyword

# Find documents whose content or education contains "大学" ("university")

GET index_1/_search
{
  "query": {
    "bool": {
      "filter": {
        "multi_match": {
          "query": "大学",
          "fields": ["content", "education"]
        }
      }
    }
  }
}

match_all: returns all documents

GET index_1/_search
{
  "query": {
    "match_all": {}
  }
}

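3. range

A range query returns all documents whose field value falls within a given range.

gt: greater than

gte: greater than or equal to

lt: less than

lte: less than or equal to

# Find documents dated from 2019-08-10 10:08:29 through 2019-08-13 10:08:29, both inclusive

GET index_4/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "date": {
            "gte": "2019-08-10 10:08:29",
            "lte": "2019-08-13 10:08:29"
          }
        }
      }
    }
  }
}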
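
When the bounds come from application code, the same request body can be assembled from datetime objects. This is a minimal Python sketch: the helper name `date_range_query` is made up for illustration, and it assumes the date field's mapping accepts the "yyyy-MM-dd HH:mm:ss" format used in the example above.

```python
import json
from datetime import datetime

def date_range_query(field, start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Build a bool/filter range query body covering start <= field <= end."""
    if start > end:
        raise ValueError("start must not be later than end")
    bounds = {"gte": start.strftime(fmt), "lte": end.strftime(fmt)}
    return {"query": {"bool": {"filter": {"range": {field: bounds}}}}}

body = date_range_query(
    "date",
    datetime(2019, 8, 10, 10, 8, 29),
    datetime(2019, 8, 13, 10, 8, 29),
)
print(json.dumps(body, indent=2))
```

Swapping "gte"/"lte" for "gt"/"lt" in the helper would make the bounds exclusive.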

4. sort

sort orders the results by one or more fields.

desc: descending

asc: ascending

# Return documents sorted by date in descending order

GET index_1/_search
{
  "sort": [
    {
      "date": {
        "order": "desc"
      }
    }
  ],
  "query": {
    "match_all": {}
  }
}
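
The example above sorts by a single field. For several fields, the sort array simply holds one entry per field, with earlier entries taking precedence. A minimal Python sketch of building such a clause follows; the helper `build_sort` and the second field `price` are hypothetical and used only for illustration.

```python
def build_sort(*keys):
    """Turn (field, order) pairs into a sort clause; earlier pairs take precedence."""
    clause = []
    for field, order in keys:
        if order not in ("asc", "desc"):
            raise ValueError(f"order must be 'asc' or 'desc', got {order!r}")
        clause.append({field: {"order": order}})
    return clause

# "price" is a hypothetical second sort field, not part of the original index.
body = {
    "sort": build_sort(("date", "desc"), ("price", "asc")),
    "query": {"match_all": {}},
}
print(body["sort"])
```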

5. _source

Use _source when only some fields of each hit are of interest.

# Return only name and age for all documents

GET index_1/_search
{
  "_source": ["name", "age"],
  "query": {
    "match_all": {}
  }
}

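6. from and size

from: the offset of the first hit to return. The minimum is 0; in some cases it can be -1 (covered in the next post).

size: the number of hits to return.

By default, from + size must not exceed 10000 (the index.max_result_window setting); otherwise Elasticsearch returns an error (a workaround is covered in the next post).

# Return the first 20 documents, sorted by date in descending order

GET index_1/_search
{
  "from": 0,
  "size": 20,
  "sort": [
    {
      "date": {
        "order": "desc"
      }
    }
  ],
  "query": {
    "match_all": {}
  }
}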
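
Page-based interfaces usually have to translate a page number into from and size, and it helps to reject requests that would cross the 10000 limit before they reach Elasticsearch. A minimal Python sketch, where the helper `page_params` and the 1-based page numbering are assumptions made for illustration:

```python
MAX_RESULT_WINDOW = 10000  # default limit on from + size

def page_params(page, page_size):
    """Translate a 1-based page number into from/size, rejecting windows past the limit."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    if start + page_size > MAX_RESULT_WINDOW:
        raise ValueError(
            f"from + size = {start + page_size} exceeds {MAX_RESULT_WINDOW}"
        )
    return {"from": start, "size": page_size}

print(page_params(1, 20))    # {'from': 0, 'size': 20}
print(page_params(500, 20))  # {'from': 9980, 'size': 20}
```

With a page size of 20, page 500 is the last one that fits; page 501 would raise a ValueError.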

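7. fuzzy
Fuzzy matching.

value: the term to search for

boost: a weight applied to this query's score. The default is 1.0. It must be used together with value, and it affects _score (Elasticsearch sorts by _score by default).

fuzziness: the maximum edit distance allowed between the query term and a matching term. Accepted values are 0, 1, 2, or AUTO, which chooses the distance from the length of the term. The default is AUTO.

prefix_length: the number of leading characters that must match exactly. The default is 0.

max_expansions: the maximum number of terms the query may expand to. The default is 50.

GET index_4/_search
{
  "query": {
    "fuzzy": {
      "type": {
        "value": "分期",
        "boost": 0.5
      }
    }
  }
}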
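
Fuzzy matching is based on edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one term into another. The Python sketch below computes the plain Levenshtein distance to give a feel for which terms fall within a given distance. It is an illustration only; it ignores the transposition handling that Elasticsearch also supports, and the comparison terms are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]

print(edit_distance("分期", "分期"))        # 0
print(edit_distance("分期", "分歧"))        # 1
print(edit_distance("kitten", "sitting"))   # 3
```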

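8. wildcard

Wildcard queries.

*: matches zero or more characters

?: matches any single character

Note: the field queried with wildcard should be of type keyword, that is, not analyzed. Use wildcard sparingly, because it is slow.

GET index_1/_search
{
  "query": {
    "wildcard": {
      "content": {
        "value": "*学习*"
      }
    }
  }
}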
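
The two wildcard characters behave much like shell-style patterns, so Python's fnmatch module can serve as a rough local illustration. This is only an analogy: fnmatch also gives special meaning to square brackets, which the sketch avoids, and the sample values are hypothetical keyword values.

```python
from fnmatch import fnmatchcase

def wildcard_hits(values, pattern):
    """Return the values that match a pattern made of literals, * and ?."""
    return [value for value in values if fnmatchcase(value, pattern)]

# Hypothetical keyword values stored in the content field.
values = ["学习", "我爱学习", "学习笔记", "生活"]

print(wildcard_hits(values, "*学习*"))  # ['学习', '我爱学习', '学习笔记']
print(wildcard_hits(values, "学习??"))  # ['学习笔记']
print(wildcard_hits(values, "??"))      # ['学习', '生活']
```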

Source: https://blog.csdn.net/Misaki_root/article/details/101203647?spm=1001.2014.3001.5501
