Elasticsearch-24: Search API and Query DSL Basic Syntax

Basic syntax of the search API

GET /_search
{}

GET /_search
{
    "from":0,
    "size":10
}

GET /_search?from=0&size=10

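Parameters can be appended directly to the request URL, or they can be placed in the request body.

In HTTP, a GET request is generally not expected to carry a request body. However, GET describes a data query better than any other method, so Elasticsearch uses it this way anyway, and many browsers and servers support the GET + request body pattern. If you run into an environment that does not support it, you can use a POST request instead: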

POST /_search
{
    "from":0,
    "size":10
}
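
As a small illustration of the two ways of passing parameters described above, the following Python sketch builds both forms with the standard library only. It sends nothing over the network, and the variable names are chosen for this example only.

```python
import json
from urllib.parse import urlencode

# The paging parameters used in the examples above.
params = {"from": 0, "size": 10}

# Form 1: parameters appended to the URL as a query string.
url_with_params = "/_search?" + urlencode(params)

# Form 2: the same parameters carried in a JSON request body
# (sent with GET, or with POST where GET + body is not supported).
request_body = json.dumps(params)

print(url_with_params)  # /_search?from=0&size=10
print(request_body)     # {"from": 0, "size": 10}
```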

Query DSL

Basic syntax:

{
    QUERY_NAME: {
        ARGUMENT: VALUE,
        ARGUMENT: VALUE,...
    }
}

{
    QUERY_NAME: {
        FIELD_NAME: {
            ARGUMENT: VALUE,
            ARGUMENT: VALUE,...
        }
    }
}
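
To make the two template shapes above concrete, here is a minimal Python sketch that fills them in as plain dictionaries. The helper names `simple_query` and `field_query` are hypothetical and exist only for this illustration; they are not part of any Elasticsearch client.

```python
import json

def simple_query(query_name, **arguments):
    """Shape 1: {QUERY_NAME: {ARGUMENT: VALUE, ...}}"""
    return {query_name: dict(arguments)}

def field_query(query_name, field_name, **arguments):
    """Shape 2: {QUERY_NAME: {FIELD_NAME: {ARGUMENT: VALUE, ...}}}"""
    return {query_name: {field_name: dict(arguments)}}

# Shape 1: a query whose arguments sit directly under the query name.
q1 = simple_query("match_all", boost=1.2)

# Shape 2: a query whose arguments sit under a field name.
q2 = field_query("range", "age", gte=30)

print(json.dumps({"query": q1}))  # {"query": {"match_all": {"boost": 1.2}}}
print(json.dumps({"query": q2}))  # {"query": {"range": {"age": {"gte": 30}}}}
```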

Example

GET /test_index/test_type/_search
{
  "query": {
    "match": {  // query condition
      "test_field": "test"
    }
  }
}

Example: combining multiple search conditions

First, let's add a few documents to search against:

PUT /query_index/query_type/1
{
  "title": "my elasticsearch article",
  "content": "es is very bad",
  "author_id": 110
}


PUT /query_index/query_type/2
{
  "title": "my hadoop article",
  "content": "hadoop is very bad",
  "author_id": 111
}

PUT /query_index/query_type/3
{
  "title": "my elasticsearch article",
  "content": "es is very goods",
  "author_id": 111
}

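Next we define the search conditions: title must contain elasticsearch, content may or may not contain elasticsearch, and author_id must not be 111.

Let's check the data first:

title must contain elasticsearch: documents 1 and 3 match
content may or may not contain elasticsearch: none of the three documents contains it, which is fine because this condition is optional
author_id must not be 111: documents 2 and 3 have author_id 111, so they are excluded
Based on these conditions, the only expected result is the document with id 1. Now let's combine the conditions into a single search: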

GET /query_index/query_type/_search
{
  "query": {    // the query
    "bool": {   // combines multiple query clauses
      "must": [ // clauses that must match
        {
          "match": {
            "title": "elasticsearch"
          }
        }
      ],
      "should": [ // clauses that may or may not match
        {
          "match": { 
            "content": "elasticsearch"
          }
        }
      ],
      "must_not": [ // clauses that must not match
        {
          "match": {
            "author_id": 111
          }
        }
      ]
    }
  }
}

Result after execution:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.25316024,
    "hits": [
      {
        "_index": "query_index",
        "_type": "query_type",
        "_id": "1",
        "_score": 0.25316024,
        "_source": {
          "title": "my elasticsearch article",
          "content": "es is very bad",
          "author_id": 110
        }
      }
    ]
  }
}

Only the document with id 1 is returned.
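
The must / should / must_not reasoning can be replayed in memory with a short Python sketch. This is a deliberately simplified stand-in: it compares lowercased whitespace tokens instead of using Elasticsearch's analyzers, it computes no relevance score, and the function names `matches` and `bool_search` are invented for this illustration.

```python
# The three documents indexed earlier, keyed by id.
DOCS = {
    "1": {"title": "my elasticsearch article", "content": "es is very bad", "author_id": 110},
    "2": {"title": "my hadoop article", "content": "hadoop is very bad", "author_id": 111},
    "3": {"title": "my elasticsearch article", "content": "es is very goods", "author_id": 111},
}

def matches(doc, field, value):
    """Rough stand-in for `match`: whole-token comparison after lowercasing."""
    tokens = str(doc[field]).lower().split()
    return str(value).lower() in tokens

def bool_search(docs, must=(), should=(), must_not=()):
    """Return (id, number of matching should clauses) for every surviving document."""
    hits = []
    for doc_id, doc in docs.items():
        if not all(matches(doc, f, v) for f, v in must):
            continue  # a must clause failed
        if any(matches(doc, f, v) for f, v in must_not):
            continue  # a must_not clause matched
        # should clauses are optional here; we only count how many matched.
        should_matched = sum(matches(doc, f, v) for f, v in should)
        hits.append((doc_id, should_matched))
    return hits

hits = bool_search(
    DOCS,
    must=[("title", "elasticsearch")],
    should=[("content", "elasticsearch")],
    must_not=[("author_id", 111)],
)
print(hits)  # [('1', 0)]
```

Document 1 survives even though its should clause does not match, which mirrors the optional nature of should when a must clause is present.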

query and filter

Example

We currently have three documents, as follows:

{
  "_index": "company",
  "_type": "emp",
  "_id": "2",
  "_score": 1,
  "_source": {
    "address": {
      "country": "china",
      "province": "jiangsu",
      "city": "nanjing"
    },
    "name": "tom",
    "age": 30,
    "join_date": "2016-01-01"
  }
},
{
  "_index": "company",
  "_type": "emp",
  "_id": "1",
  "_score": 1,
  "_source": {
    "name": "jack",
    "age": 27,
    "join_date": "2017-01-01",
    "address": {
      "country": "china",
      "province": "zhejiang",
      "city": "hangzhou"
    }
  }
},
{
  "_index": "company",
  "_type": "emp",
  "_id": "3",
  "_score": 1,
  "_source": {
    "address": {
      "country": "china",
      "province": "shanxi",
      "city": "xian"
    },
    "name": "marry",
    "age": 35,
    "join_date": "2015-01-01"
  }
}

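Now suppose we have a search request: age must be greater than or equal to 30, and join_date must be 2016-01-01.

Let's build a search request that contains both a query and a filter: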

GET /company/emp/_search
{
  "query": {
    "bool": {   // compound query
      "must": [ // clauses that must match
        {
          "match": {
            "join_date": "2016-01-01"
          }
        }
      ],
      "filter": {  // filter clause
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  }
}

Response:

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "company",
        "_type": "emp",
        "_id": "2",
        "_score": 1,
        "_source": {
          "address": {
            "country": "china",
            "province": "jiangsu",
            "city": "nanjing"
          },
          "name": "tom",
          "age": 30,
          "join_date": "2016-01-01"
        }
      }
    ]
  }
}

As you can see, one document that satisfies both conditions was found.
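
The same two conditions can be checked by hand against the three employee documents. The Python sketch below is only an in-memory approximation: it treats the join_date match as exact string equality and the filter as a plain numeric comparison, and the function name `search_employees` is made up for this example.

```python
# Only the fields needed by the two conditions are kept here.
EMPLOYEES = {
    "1": {"name": "jack", "age": 27, "join_date": "2017-01-01"},
    "2": {"name": "tom", "age": 30, "join_date": "2016-01-01"},
    "3": {"name": "marry", "age": 35, "join_date": "2015-01-01"},
}

def search_employees(docs, join_date, min_age):
    """Keep ids whose join_date equals `join_date` and whose age is >= `min_age`."""
    return [
        doc_id
        for doc_id, doc in docs.items()
        if doc["join_date"] == join_date  # the must clause
        and doc["age"] >= min_age         # the filter clause
    ]

result = search_employees(EMPLOYEES, "2016-01-01", 30)
print(result)  # ['2']
```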

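query vs. filter

filter: only filters out the documents that satisfy the condition. It computes no relevance score and has no effect on relevance.
query: computes the relevance of each document with respect to the search condition and sorts the results by that relevance.

In general, use query when the best-matching documents should be returned first. Use filter when you only need to select documents by some condition and do not care about their relevance.

Put a condition in query only if you want documents that match it better to rank higher.
If you do not want a condition to influence the ordering of your documents, put it in filter.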

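query and filter performance

filter does not need to compute relevance scores or sort by them, and it has a built-in cache that automatically caches the most frequently used filter results.
query is the opposite: it has to compute relevance scores and sort by them, and its results cannot be cached.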
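
The difference can be pictured with a toy Python model. It is not how Elasticsearch implements filters, scoring, or caching: the documents are hypothetical, the "score" is a naive term frequency divided by document length, and the cache is a plain dictionary. It only shows why a yes/no result is easy to reuse while a scored, sorted result costs more work.

```python
# Hypothetical documents, invented for this illustration only.
DOCS = {
    1: "es es guide",
    2: "es and hadoop guide",
    3: "hadoop guide",
}

filter_cache = {}        # term -> set of matching document ids
filter_evaluations = 0   # how many times a filter was actually computed

def run_filter(term):
    """Filter: yes/no membership only; the result is cached and reused."""
    global filter_evaluations
    if term not in filter_cache:
        filter_evaluations += 1
        filter_cache[term] = {i for i, text in DOCS.items() if term in text.split()}
    return filter_cache[term]

def run_query(term):
    """Query: compute a toy score for every match, then sort by score."""
    scored = []
    for i, text in DOCS.items():
        tokens = text.split()
        if term in tokens:
            scored.append((i, tokens.count(term) / len(tokens)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

first = run_filter("es")
second = run_filter("es")  # served from the cache, nothing is recomputed
ranked = run_query("es")   # scored and sorted on every call

print(first)                   # {1, 2}
print(filter_evaluations)      # 1
print([i for i, _ in ranked])  # [1, 2]
```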