Description
Describe the bug
When using the unified highlighter on a search that uses match_phrase_prefix on a nested field, no highlights are returned.
Notably, highlights work in all of these slightly different cases:
- Using the unified highlighter with match_phrase_prefix on a non-nested field
- Using the unified highlighter with match on a nested field
- Using the unified highlighter with match_phrase on a nested field
- Using the unified highlighter with match_bool_prefix on a nested field
- Using the plain highlighter with match_phrase_prefix on a nested field
In addition, the failing case works correctly on recent versions of Elasticsearch (I tested on 9.1.0).
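For a quick check against an existing cluster, the failing combination comes down to a single search body. The sketch below only prints that body. The helper name `highlight_query` is made up for illustration, and the field names are the ones used by the reproduction script in this report. Passing `plain` instead of `unified` gives the variant that did return highlights in the tests reported here.

```shell
#!/bin/bash
# highlight_query QUERY_TYPE HIGHLIGHTER
# Prints a nested search body that targets nested_field.description.
# Hypothetical helper; field names mirror the reproduction script.
highlight_query() {
    local query_type="$1"
    local highlighter="$2"
    cat <<EOF
{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "${query_type}": { "nested_field.description": "searchable" }
      }
    }
  },
  "highlight": {
    "type": "${highlighter}",
    "fields": { "nested_field.description": {} }
  }
}
EOF
}

# Combination reported as failing on OpenSearch 3.1.0:
#   highlight_query match_phrase_prefix unified
# Combination reported as returning highlights:
#   highlight_query match_phrase_prefix plain
# Example against a local node (assumes port 9200 and the test_highlighting index):
#   highlight_query match_phrase_prefix unified |
#     curl -s -H "Content-Type: application/json" -d @- \
#       "http://localhost:9200/test_highlighting/_search"
```

Switching the highlighter to `plain` is only a possible stopgap based on the test results above, not a fix for the unified highlighter itself.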
Related component
Search:Query Insights
To Reproduce
Create an executable script with this content:
#!/bin/bash
# OpenSearch/Elasticsearch Highlighting Bug Demonstration Script
# Bug: unified highlighter does not correctly highlight nested fields when match_phrase_prefix is used
# Author: Generated for bug report
# Usage: ./opensearch_highlighting_bug_demo.sh [opensearch|elasticsearch] [version]
set -e
# Default values
ENGINE="opensearch"
VERSION="3.1.0"
CONTAINER_NAME="search_engine_test"
PORT="9200"
# Arrays to track test results
declare -a TEST_RESULTS
declare -a TEST_DESCRIPTIONS
# Parse command line arguments
if [ $# -ge 1 ]; then
    ENGINE="$1"
fi
if [ $# -ge 2 ]; then
    VERSION="$2"
fi
# Validate engine choice
if [[ "$ENGINE" != "opensearch" && "$ENGINE" != "elasticsearch" ]]; then
    echo "Error: Engine must be 'opensearch' or 'elasticsearch'"
    echo "Usage: $0 [opensearch|elasticsearch] [version]"
    exit 1
fi
echo "========================================="
echo "OpenSearch/Elasticsearch Highlighting Bug Demo"
echo "Engine: $ENGINE"
echo "Version: $VERSION"
echo "========================================="
# Function to wait for the search engine to be ready
wait_for_engine() {
    echo "Waiting for $ENGINE to be ready..."
    local max_attempts=30
    local attempt=1
    while [ $attempt -le $max_attempts ]; do
        # Try to get a response
        if curl -s -f "http://localhost:$PORT" > /dev/null 2>&1; then
            echo "$ENGINE is ready!"
            return 0
        fi
        # Check if container is still running on failure
        if [ $attempt -eq 15 ] && ! docker ps --filter "name=$CONTAINER_NAME" --format "{{.Names}}" | grep -q "$CONTAINER_NAME"; then
            echo "ERROR: Container $CONTAINER_NAME stopped running!"
            echo "Container logs:"
            docker logs "$CONTAINER_NAME" 2>/dev/null || echo "No logs available"
            exit 1
        fi
        printf "."
        sleep 2
        ((attempt++))
    done
    echo ""
    echo "Error: $ENGINE failed to start within expected time"
    echo "Container logs:"
    docker logs "$CONTAINER_NAME" 2>/dev/null || echo "No logs available"
    exit 1
}
# Function to check if highlighting worked
check_highlighting() {
    local response="$1"
    local field_name="$2"
    # Check if the response contains highlight data for the specified field
    if echo "$response" | jq -e ".hits.hits[0].highlight[\"$field_name\"]" > /dev/null 2>&1; then
        return 0 # Highlighting worked
    else
        return 1 # No highlighting
    fi
}
# Function to make HTTP requests with error handling and result tracking
make_request() {
    local method="$1"
    local url="$2"
    local data="$3"
    local description="$4"
    local test_type="$5"
    local highlight_field="$6"
    echo "--- $description ---"
    if [ -n "$data" ]; then
        response=$(curl -s -X "$method" -H "Content-Type: application/json" -d "$data" "$url")
    else
        response=$(curl -s -X "$method" "$url")
    fi
    # If this is a test request, check the highlighting and record the result
    if [ -n "$test_type" ] && [ -n "$highlight_field" ]; then
        if check_highlighting "$response" "$highlight_field"; then
            TEST_RESULTS+=("PASS")
            echo "✅ RESULT: Highlighting worked correctly"
        else
            TEST_RESULTS+=("FAIL")
            echo "❌ RESULT: No highlighting found"
        fi
        TEST_DESCRIPTIONS+=("$test_type")
    else
        # For non-test requests, just show success/failure
        if echo "$response" | jq -e '.error' > /dev/null 2>&1; then
            echo "❌ Request failed"
            echo "$response" | jq '.error' 2>/dev/null || echo "$response"
        else
            echo "✅ Request successful"
        fi
    fi
    echo ""
}
# Clean up any existing container
echo "Cleaning up any existing containers..."
docker stop "$CONTAINER_NAME" 2>/dev/null || true
docker rm "$CONTAINER_NAME" 2>/dev/null || true
# Start the appropriate search engine
echo "Starting $ENGINE:$VERSION..."
if [ "$ENGINE" = "opensearch" ]; then
    echo "Starting OpenSearch with security disabled..."
    # For OpenSearch 3.x, we need different configuration
    if [[ "$VERSION" =~ ^3\. ]]; then
        echo "Using OpenSearch 3.x configuration..."
        docker run -d \
            --name "$CONTAINER_NAME" \
            -p "$PORT:9200" \
            -e "discovery.type=single-node" \
            -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=Admin123!" \
            -e "DISABLE_INSTALL_DEMO_CONFIG=true" \
            -e "DISABLE_SECURITY_PLUGIN=true" \
            -e "bootstrap.memory_lock=false" \
            opensearchproject/opensearch:"$VERSION"
    else
        echo "Using OpenSearch 2.x configuration..."
        docker run -d \
            --name "$CONTAINER_NAME" \
            -p "$PORT:9200" \
            -e "discovery.type=single-node" \
            -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=Admin123!" \
            -e "plugins.security.disabled=true" \
            opensearchproject/opensearch:"$VERSION"
    fi
else
    echo "Starting Elasticsearch with security disabled..."
    docker run -d \
        --name "$CONTAINER_NAME" \
        -p "$PORT:9200" \
        -e "discovery.type=single-node" \
        -e "xpack.security.enabled=false" \
        elasticsearch:"$VERSION"
fi
# Wait for the engine to be ready
wait_for_engine
# Create index with nested field mapping
INDEX_NAME="test_highlighting"
MAPPING='{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "content": {
        "type": "text"
      },
      "nested_field": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "text"
          },
          "description": {
            "type": "text"
          }
        }
      }
    }
  }
}'
make_request "PUT" "http://localhost:$PORT/$INDEX_NAME" "$MAPPING" "Creating index with nested field mapping"
# Index a test document
DOCUMENT='{
  "title": "Sample Document Title",
  "content": "This is the main content of the document with some searchable text",
  "nested_field": [
    {
      "name": "First nested item",
      "description": "This is a description for the first nested item with searchable content"
    },
    {
      "name": "Second nested item",
      "description": "Another description for the second nested item with more searchable text"
    }
  ]
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_doc/1" "$DOCUMENT" "Indexing test document"
# Refresh the index to make the document searchable
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_refresh" "" "Refreshing index"
echo ""
echo "========================================="
echo "DEMONSTRATING THE BUG"
echo "========================================="
# Test 1: match_phrase_prefix with nested field using unified highlighter (BUG in OpenSearch, works in Elasticsearch)
echo ""
echo "TEST 1: match_phrase_prefix + nested field + unified highlighter"
QUERY1='{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "match_phrase_prefix": {
          "nested_field.description": "searchable"
        }
      }
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "nested_field.description": {}
    }
  }
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_search" "$QUERY1" "TEST 1: match_phrase_prefix + nested + unified highlighter" "match_phrase_prefix + nested + unified" "nested_field.description"
# Test 2: match_phrase_prefix with non-nested field using unified highlighter (WORKS)
echo ""
echo "TEST 2: match_phrase_prefix + non-nested field + unified highlighter"
QUERY2='{
  "query": {
    "match_phrase_prefix": {
      "content": "searchable"
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "content": {}
    }
  }
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_search" "$QUERY2" "TEST 2: match_phrase_prefix + non-nested + unified highlighter" "match_phrase_prefix + non-nested + unified" "content"
# Test 3: match (not match_phrase_prefix) with nested field using unified highlighter (WORKS)
echo ""
echo "TEST 3: match + nested field + unified highlighter"
QUERY3='{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "match": {
          "nested_field.description": "searchable"
        }
      }
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "nested_field.description": {}
    }
  }
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_search" "$QUERY3" "TEST 3: match + nested + unified highlighter" "match + nested + unified" "nested_field.description"
# Test 4: match_phrase with nested field using unified highlighter (WORKS)
echo ""
echo "TEST 4: match_phrase + nested field + unified highlighter"
QUERY4='{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "match_phrase": {
          "nested_field.description": "searchable content"
        }
      }
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "nested_field.description": {}
    }
  }
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_search" "$QUERY4" "TEST 4: match_phrase + nested + unified highlighter" "match_phrase + nested + unified" "nested_field.description"
# Test 5: match_bool_prefix with nested field using unified highlighter (WORKS)
echo ""
echo "TEST 5: match_bool_prefix + nested field + unified highlighter"
QUERY5='{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "match_bool_prefix": {
          "nested_field.description": "searchable"
        }
      }
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "nested_field.description": {}
    }
  }
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_search" "$QUERY5" "TEST 5: match_bool_prefix + nested + unified highlighter" "match_bool_prefix + nested + unified" "nested_field.description"
# Test 6: match_phrase_prefix with nested field using plain highlighter (WORKS)
echo ""
echo "TEST 6: match_phrase_prefix + nested field + plain highlighter"
QUERY6='{
  "query": {
    "nested": {
      "path": "nested_field",
      "query": {
        "match_phrase_prefix": {
          "nested_field.description": "searchable"
        }
      }
    }
  },
  "highlight": {
    "type": "plain",
    "fields": {
      "nested_field.description": {}
    }
  }
}'
make_request "POST" "http://localhost:$PORT/$INDEX_NAME/_search" "$QUERY6" "TEST 6: match_phrase_prefix + nested + plain highlighter" "match_phrase_prefix + nested + plain" "nested_field.description"
echo ""
echo "========================================="
echo "TEST RESULTS SUMMARY"
echo "========================================="
echo ""
echo "Engine: $ENGINE $VERSION"
echo ""
for i in "${!TEST_RESULTS[@]}"; do
    result="${TEST_RESULTS[$i]}"
    description="${TEST_DESCRIPTIONS[$i]}"
    if [ "$result" = "PASS" ]; then
        echo "✅ PASS: $description"
    else
        echo "❌ FAIL: $description"
    fi
done
# Cleanup function
cleanup() {
    echo "Cleaning up..."
    docker stop "$CONTAINER_NAME" 2>/dev/null || true
    docker rm "$CONTAINER_NAME" 2>/dev/null || true
}
# Ask user if they want to keep the container running
echo "Would you like to keep the $ENGINE container running for further testing? (y/N)"
read -r response
if [[ "$response" =~ ^[Yy]$ ]]; then
    echo "Container '$CONTAINER_NAME' is still running on port $PORT"
    echo "You can access it at: http://localhost:$PORT"
    echo "To stop it later, run: docker stop $CONTAINER_NAME && docker rm $CONTAINER_NAME"
else
    cleanup
    echo "Container cleaned up."
fi
echo ""
echo "Script completed successfully!"
Then run it with elasticsearch or opensearch and a version:
$ ./opensearch_highlighting_bug_demo.sh opensearch 3.1.0
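The script shells out to docker, curl, and jq, so a missing tool shows up as a confusing failure partway through. A minimal preflight sketch, assuming those three are the only external commands needed (the function name `require_cmds` is hypothetical and not part of the script above):

```shell
#!/bin/bash
# require_cmds CMD...
# Returns non-zero and names every missing command on stderr.
require_cmds() {
    local missing=0
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "Missing required command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage before running the demo script:
#   require_cmds docker curl jq || exit 1
```

Without jq in particular, the highlight check in the script would fail for every test, which could be misread as the bug reproducing in all six cases.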
Expected behavior
I expect output indicating that highlights were returned. Here's example successful output from running the script against Elasticsearch 9.1.0:
$ ./opensearch_highlighting_bug_demo.sh elasticsearch 9.1.0
=========================================
OpenSearch/Elasticsearch Highlighting Bug Demo
Engine: elasticsearch
Version: 9.1.0
=========================================
Cleaning up any existing containers...
Starting elasticsearch:9.1.0...
Starting Elasticsearch with security disabled...
7543208dbc107a7d51fe7ccf59995e1942de11d5f5f8cd8fbf90035cc56aa4b5
Waiting for elasticsearch to be ready...
......elasticsearch is ready!
--- Creating index with nested field mapping ---
✅ Request successful
--- Indexing test document ---
✅ Request successful
--- Refreshing index ---
✅ Request successful
=========================================
DEMONSTRATING THE BUG
=========================================
TEST 1: match_phrase_prefix + nested field + unified highlighter
--- TEST 1: match_phrase_prefix + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 2: match_phrase_prefix + non-nested field + unified highlighter
--- TEST 2: match_phrase_prefix + non-nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 3: match + nested field + unified highlighter
--- TEST 3: match + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 4: match_phrase + nested field + unified highlighter
--- TEST 4: match_phrase + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 5: match_bool_prefix + nested field + unified highlighter
--- TEST 5: match_bool_prefix + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 6: match_phrase_prefix + nested field + plain highlighter
--- TEST 6: match_phrase_prefix + nested + plain highlighter ---
✅ RESULT: Highlighting worked correctly
=========================================
TEST RESULTS SUMMARY
=========================================
Engine: elasticsearch 9.1.0
✅ PASS: match_phrase_prefix + nested + unified
✅ PASS: match_phrase_prefix + non-nested + unified
✅ PASS: match + nested + unified
✅ PASS: match_phrase + nested + unified
✅ PASS: match_bool_prefix + nested + unified
✅ PASS: match_phrase_prefix + nested + plain
Would you like to keep the elasticsearch container running for further testing? (y/N)
n
Cleaning up...
search_engine_test
search_engine_test
Container cleaned up.
Script completed successfully!
In contrast, here's the output I get from OpenSearch 3.1.0:
$ ./opensearch_highlighting_bug_demo.sh opensearch 3.1.0
=========================================
OpenSearch/Elasticsearch Highlighting Bug Demo
Engine: opensearch
Version: 3.1.0
=========================================
Cleaning up any existing containers...
Starting opensearch:3.1.0...
Starting OpenSearch with security disabled...
Using OpenSearch 3.x configuration...
ce8cee86e80f28b51dd5efc6a368d441b41eb40ddaa8e6d3447f7e866e719431
Waiting for opensearch to be ready...
....opensearch is ready!
--- Creating index with nested field mapping ---
✅ Request successful
--- Indexing test document ---
✅ Request successful
--- Refreshing index ---
✅ Request successful
=========================================
DEMONSTRATING THE BUG
=========================================
TEST 1: match_phrase_prefix + nested field + unified highlighter
--- TEST 1: match_phrase_prefix + nested + unified highlighter ---
❌ RESULT: No highlighting found
TEST 2: match_phrase_prefix + non-nested field + unified highlighter
--- TEST 2: match_phrase_prefix + non-nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 3: match + nested field + unified highlighter
--- TEST 3: match + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 4: match_phrase + nested field + unified highlighter
--- TEST 4: match_phrase + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 5: match_bool_prefix + nested field + unified highlighter
--- TEST 5: match_bool_prefix + nested + unified highlighter ---
✅ RESULT: Highlighting worked correctly
TEST 6: match_phrase_prefix + nested field + plain highlighter
--- TEST 6: match_phrase_prefix + nested + plain highlighter ---
✅ RESULT: Highlighting worked correctly
=========================================
TEST RESULTS SUMMARY
=========================================
Engine: opensearch 3.1.0
❌ FAIL: match_phrase_prefix + nested + unified
✅ PASS: match_phrase_prefix + non-nested + unified
✅ PASS: match + nested + unified
✅ PASS: match_phrase + nested + unified
✅ PASS: match_bool_prefix + nested + unified
✅ PASS: match_phrase_prefix + nested + plain
Would you like to keep the opensearch container running for further testing? (y/N)
n
Cleaning up...
search_engine_test
search_engine_test
Container cleaned up.
Script completed successfully!
Additional Details
Plugins
None.
Screenshots
None, but see the script output above.
Host/Environment:
- OS: Mac OS X, but we've observed this on Linux (e.g. via managed AWS OpenSearch) as well
- Version: 3.1.0, but it seems to be a bug in all OpenSearch versions
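To check the "all versions" impression more systematically, the demo script can be run once per version and reduced to the one summary line that matters. This is a rough sketch: the function names and the `DEMO_SCRIPT` variable are made up for illustration, and which version tags to pass is left to whoever runs it. It relies only on the summary format the script prints ("PASS: ..." or "FAIL: ...").

```shell
#!/bin/bash
# Path to the reproduction script from this report (override via environment).
DEMO_SCRIPT="${DEMO_SCRIPT:-./opensearch_highlighting_bug_demo.sh}"

# first_test_result
# Reads demo-script output on stdin and prints PASS or FAIL for the
# "match_phrase_prefix + nested + unified" summary line, or nothing if absent.
first_test_result() {
    grep -E '(PASS|FAIL): match_phrase_prefix \+ nested \+ unified$' |
        grep -oE 'PASS|FAIL' |
        head -n 1
}

# check_versions ENGINE VERSION...
# Runs the demo script once per version, answering "n" to its final prompt
# so the container is cleaned up between runs.
check_versions() {
    local engine="$1"
    shift
    local version result
    for version in "$@"; do
        result="$(echo n | "$DEMO_SCRIPT" "$engine" "$version" | first_test_result)"
        echo "$engine $version: ${result:-no result}"
    done
}

# Usage (needs Docker; add whichever other version tags you want to compare):
#   check_versions opensearch 3.1.0
```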
Additional context
None.