Description
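Highlighting marks the wrong characters for a Chinese more_like_this query. For example, querying "项目经理" returns the highlight "高<em>级项目经</em>理(", which is shifted one character to the left of the matching term.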
Steps to reproduce
Create an index that uses the ik_smart tokenizer:
#!/usr/bin/bash
curl -X DELETE "localhost:9201/my_index"
curl -X PUT "localhost:9201/my_index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ik_smart": {
          "type": "custom",
          "tokenizer": "ik_smart"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_ik_smart",
        "position_increment_gap": 1,
        "term_vector": "with_positions_offsets_payloads"
      }
    }
  }
}
'
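To see which tokens and character offsets the analyzer produces for the affected value, the `_analyze` API can be run against the index. This is a minimal sketch, assuming the same host and port as above; the `ES_URL` and `payload` variable names are illustrative and not part of the original report.

```shell
#!/usr/bin/bash
# Assumption: Elasticsearch is reachable at the address used in this report.
ES_URL="${ES_URL:-localhost:9201}"

# Request body for the _analyze API: run my_ik_smart over the affected value.
payload='{
  "analyzer": "my_ik_smart",
  "text": "销售负责人(till 06/2019)/ 前任 用友 高级项目经理(till 03/2020)"
}'

# Each token in the response carries start_offset and end_offset.
curl -s --max-time 10 -X POST "$ES_URL/my_index/_analyze?pretty" \
  -H 'Content-Type: application/json' -d "$payload" \
  || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```

If the offsets reported for 项目经理 do not match its position in the text, that would suggest the shift comes from the tokenizer rather than from the highlighter.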
Insert the documents:
#!/usr/bin/bash
curl -X POST "localhost:9201/my_index/_doc/1" -H 'Content-Type: application/json' -d @- << 'EOF'
{
  "title": [
    "项目经理",
    "ex Mingyuan - 前任 明源福州 销售负责人(till 06/2019)/ 前任 用友 高级项目经理(till 03/2020)",
    "销售负责人(till 06/2019)/ 前任 用友 高级项目经理(till 03/2020)"
  ]
}
EOF
curl -X POST "localhost:9201/my_index/_doc/2" -H 'Content-Type: application/json' -d @- << 'EOF'
{
  "title": [
    "开发工程师",
    "前任 Google 软件工程师经理",
    "现任 Facebook 高级开发工程师"
  ]
}
EOF
curl -X POST "localhost:9201/my_index/_doc/3" -H 'Content-Type: application/json' -d @- << 'EOF'
{
  "title": [
    "数据分析师",
    "前任 IBM 数据分析师",
    "现任 Amazon 数据科学家",
    "现任 Amazon 项目数据科学家"
  ]
}
EOF
Query with more_like_this and highlighting:
#!/usr/bin/bash
curl -X POST "localhost:9201/my_index/_search?pretty" -H 'Content-Type: application/json' -d @- << 'EOF'
{
  "query": {
    "more_like_this": {
      "fields": ["title"],
      "like": "项目经理",
      "min_term_freq": 1,
      "min_doc_freq": 1,
      "analyzer": "my_ik_smart"
    }
  },
  "highlight": {
    "fields": {
      "title": {
        "type": "fvh",
        "fragment_size": 150,
        "number_of_fragments": 3
      }
    }
  }
}
EOF
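As a comparison, the same query can be run with the `plain` highlighter, which re-analyzes the field text at query time instead of reading the stored term vectors. This is only a diagnostic sketch under the same host and index assumptions; `ES_URL` and `payload` are illustrative names, and the result is not part of the original report.

```shell
#!/usr/bin/bash
# Assumption: Elasticsearch is reachable at the address used in this report.
ES_URL="${ES_URL:-localhost:9201}"

# Same more_like_this query as above, with only the highlighter type changed.
payload=$(cat <<'EOF'
{
  "query": {
    "more_like_this": {
      "fields": ["title"],
      "like": "项目经理",
      "min_term_freq": 1,
      "min_doc_freq": 1,
      "analyzer": "my_ik_smart"
    }
  },
  "highlight": {
    "fields": {
      "title": {
        "type": "plain",
        "fragment_size": 150,
        "number_of_fragments": 3
      }
    }
  }
}
EOF
)

curl -s --max-time 10 -X POST "$ES_URL/my_index/_search?pretty" \
  -H 'Content-Type: application/json' -d "$payload" \
  || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```

If the `plain` highlighter marks 项目经理 correctly while `fvh` does not, the problem is more likely in the offsets stored in the term vectors than in the query itself.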
Expected behavior
The term "项目经理" should be highlighted.
Actual behavior
The query returns:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3648179,
    "hits" : [
      {
        "_index" : "my_index",
        "_id" : "1",
        "_score" : 1.3648179,
        "_source" : {
          "title" : [
            "项目经理",
            "ex Mingyuan - 前任 明源福州 销售负责人(till 06/2019)/ 前任 用友 高级项目经理(till 03/2020)",
            "销售负责人(till 06/2019)/ 前任 用友 高级项目经理(till 03/2020)"
          ]
        },
        "highlight" : {
          "title" : [
            "项目经理",
            "ex Mingyuan - 前任 明源福州 销售负责人(till 06/2019)/ 前任 用友 高级项目经理(till 03/2020)",
            "销售负责人(till 06/2019)/ 前任 用友 高<em>级项目经</em>理(till 03/2020)"
          ]
        }
      }
    ]
  }
}
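Because the mapping stores term vectors with offsets, the offsets that the `fvh` highlighter reads can be inspected directly through the `_termvectors` API. This is a minimal sketch, assuming document 1 is indexed as shown above; `ES_URL` and `payload` are illustrative names.

```shell
#!/usr/bin/bash
# Assumption: Elasticsearch is reachable at the address used in this report.
ES_URL="${ES_URL:-localhost:9201}"

# Ask for positions and offsets of every term stored for the title field.
payload='{
  "fields": ["title"],
  "positions": true,
  "offsets": true,
  "payloads": false,
  "term_statistics": false,
  "field_statistics": false
}'

curl -s --max-time 10 -X POST "$ES_URL/my_index/_termvectors/1?pretty" \
  -H 'Content-Type: application/json' -d "$payload" \
  || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```

Comparing the `start_offset` and `end_offset` values listed for 项目经理 with the highlighted span may show whether the stored offsets are already off by one.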
Environment
- Versions: docker.elastic.co/elasticsearch/elasticsearch:8.4.1
- IK plugin installed with: bin/elasticsearch-plugin install --batch https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v8.4.1/elasticsearch-analysis-ik-8.4.1.zip