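Elasticsearch-38-Hands-on Case Study-term filter Search

The earlier posts used throwaway demos to try out the ES APIs. From this post on, we will work through a single case study to use those APIs in more depth, and later implement the same features with the Java API.

Scenario

We use an IT forum as the setting, define its search requirements, and implement them.

Test data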

POST /forum/article/_bulk
{ "index": { "_id": 1 }}
{ "articleID" : "XHDK-A-1293-#fJ3", "userID" : 1, "hidden": false, "postDate": "2017-01-01" }
{ "index": { "_id": 2 }}
{ "articleID" : "KDKE-B-9947-#kL5", "userID" : 1, "hidden": false, "postDate": "2017-01-02" }
{ "index": { "_id": 3 }}
{ "articleID" : "JODL-X-1937-#pV7", "userID" : 2, "hidden": false, "postDate": "2017-01-01" }
{ "index": { "_id": 4 }}
{ "articleID" : "QQPX-R-3956-#aD8", "userID" : 2, "hidden": true, "postDate": "2017-01-02" }

We use the _bulk API to add the data. For now we only add these fields: articleID, userID, hidden, and postDate.

Once that has run, let's look at the mapping that dynamic mapping created for us.

GET /forum/_mapping/article

Response:

{
  "forum": {
    "mappings": {
      "article": {
        "properties": {
          "articleID": {
            "type": "text",
            "fields": {
              "keyword": {  // 1
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "hidden": {
            "type": "boolean"
          },
          "postDate": {
            "type": "date"
          },
          "userID": {
            "type": "long"
          }
        }
      }
    }
  }
}

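Look at the spot marked 1: the type of "articleID" is text, and it contains a nested "articleID.keyword". What is that for?

In newer versions of ES (5.0 and later), a string field picked up by dynamic mapping gets two fields by default. One is the field itself, such as "articleID", which is analyzed. The other is field.keyword, such as "articleID.keyword", which is not analyzed. The keyword sub-field also carries "ignore_above": 256, which means that a value longer than 256 characters is not indexed into the keyword sub-field at all (it is skipped, not truncated).

Using the term filter

term filter/query: the search text is not analyzed; it is matched against the inverted index exactly as you typed it.

Requirement 1: search posts by user ID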

GET /forum/article/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "userID": 1
        }
      }
    }
  }
}

Requirement 2: search posts that are not hidden

GET /forum/article/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "hidden": false
        }
      }
    }
  }
}

Requirement 3: search posts by post date

GET /forum/article/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "postDate": "2017-01-01"
        }
      }
    }
  }
}

Requirement 4: search posts by article ID

GET /forum/article/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "articleID": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}

Response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

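There is not a single hit, even though the document clearly exists. Why?

When the data was indexed, the string was analyzed by default and the inverted index was built from the resulting tokens. term does not analyze the search text, so it looks for the whole string "XHDK-A-1293-#fJ3" as a single term, and no such term exists in the index for the articleID field.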
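One way to check this yourself is the _analyze API, which returns the tokens that a field's analyzer produces for a given string. The request below is a sketch that assumes the forum index and the dynamically mapped articleID field from above. With the default standard analyzer, the ID should come back split into lowercase tokens, roughly xhdk, a, 1293, and fj3, none of which equals the original string.

```
GET /forum/_analyze
{
  "field": "articleID",
  "text": "XHDK-A-1293-#fJ3"
}
```

Pointing the same request at "articleID.keyword" should instead return the whole ID as one unchanged token, which is why a term lookup on that sub-field can match.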

We can search on the keyword sub-field that ES created automatically instead:

GET /forum/article/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "articleID.keyword": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "forum",
        "_type": "article",
        "_id": "1",
        "_score": 1,
        "_source": {
          "articleID": "XHDK-A-1293-#fJ3",
          "userID": 1,
          "hidden": false,
          "postDate": "2017-01-01"
        }
      }
    ]
  }
}

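Now the document is found. There is a catch, though: with ignore_above set to 256, a value longer than 256 characters is not indexed into the keyword sub-field, so an overly long value still would not be found. In that case it is better to rebuild the index and set the mapping by hand.

Delete the index

DELETE /forum

Create the index manually, mapping articleID as not analyzed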

PUT /forum
{
  "mappings": {
    "article":{
      "properties": {
        "articleID":{
          "type": "keyword"
        }
      }
    }
  }
}
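Before reloading the data, it can be worth confirming that the explicit mapping took effect. The request below is the same mapping lookup used earlier. Assuming the PUT above succeeded, articleID should now be reported with "type": "keyword" and no nested keyword sub-field. The other fields will reappear with dynamically assigned types once the documents are indexed again.

```
GET /forum/_mapping/article
```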

Then add the data above again.
Now query by articleID once more:

GET /forum/article/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "articleID": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "forum",
        "_type": "article",
        "_id": "1",
        "_score": 1,
        "_source": {
          "articleID": "XHDK-A-1293-#fJ3",
          "userID": 1,
          "hidden": false,
          "postDate": "2017-01-01"
        }
      }
    ]
  }
}

This time the document is found.

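Summary

term filter: searches by exact value. Numeric, boolean, and date fields support it naturally.
A string field has to be mapped as not_analyzed when the index is created (in newer versions, simply set the type to keyword) before term can be used on it.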
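For readers on an older cluster, the not_analyzed variant mentioned above would look roughly like the sketch below. It assumes a pre-5.0 version, where the string type and the "index": "not_analyzed" setting are still available, and it reuses the forum index and article type from this post. The examples in this post were not run against such a version.

```
PUT /forum
{
  "mappings": {
    "article": {
      "properties": {
        "articleID": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

On 5.0 and later, the "type": "keyword" mapping shown earlier plays the same role.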