opensearch-project · vagimeli · May 8, 2024 · May 8, 2024 · May 10, 2024 · May 13, 2024
@@ -0,0 +1,89 @@
+---
+layout: default
+title: analyzer
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 5
+has_children: false
+has_toc: false
+---
+
+# `analyzer`
+
+The `analyzer` mapping parameter defines the text analysis process applied to a text field during both indexing and searching.
+
+The key functions of the `analyzer` mapping parameter are:
+
+1. **Tokenization:** The analyzer determines how the text is broken down into individual tokens (words, numbers) that can be indexed and searched. Each generated token must not exceed 32,766 bytes to avoid indexing failures.
+
+2. **Normalization:** The analyzer can apply various normalization techniques, such as converting text to lowercase, removing stop words, and stemming or lemmatizing words.
+
+3. **Consistency:** By defining the same analyzer for both indexing and searching, you ensure that the text analysis process is consistent, which helps improve the relevance of search results.
+
+4. **Customization:** OpenSearch allows you to define custom analyzers by specifying the tokenizer, character filters, and token filters to use. This gives you fine-grained control over the text analysis process.
+
+------------
+
+## Example
+
+The following example configuration defines a custom analyzer called `my_custom_analyzer` and applies it to a text field:
+
+```json
+PUT my_index
+{
+  "settings": {
+    "analysis": {
+      "analyzer": {
+        "my_custom_analyzer": {
+          "type": "custom",
+          "tokenizer": "standard",
+          "filter": [
+            "lowercase",
+            "my_stop_filter",
+            "my_stemmer"
+          ]
+        }
+      },
+      "filter": {
+        "my_stop_filter": {
+          "type": "stop",
+          "stopwords": ["the", "a", "and", "or"]
+        },
+        "my_stemmer": {
+          "type": "stemmer",
+          "language": "english"
+        }
+      }
+    }
+  },
+  "mappings": {
+    "properties": {
+      "my_text_field": {
+        "type": "text",
+        "analyzer": "my_custom_analyzer",
+        "search_analyzer": "standard",
+        "search_quote_analyzer": "my_custom_analyzer"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
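+In this example, `my_custom_analyzer` uses the standard tokenizer, converts all tokens to lowercase, applies a custom stop word filter, and then applies an English stemmer. The mapping applies this analyzer to `my_text_field` at index time, uses the `standard` analyzer for search queries, and uses `my_custom_analyzer` again for phrase queries.
+
+To use the custom analyzer for both indexing and searching, specify only the `analyzer` parameter in the field mapping. When no `search_analyzer` is set, the `analyzer` value is also used at search time: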
+
+```json
+"mappings": {
+  "properties": {
+    "my_text_field": {
+      "type": "text",
+      "analyzer": "my_custom_analyzer"
+    }
+  }
+}
+```
+{% include copy.html %}
+
+By configuring the `analyzer` mapping parameter, you can ensure that your text fields are analyzed consistently and in a way that optimizes the relevance of your search results.
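+
+To check how the analyzer defined above tokenizes text, one option is to send sample text to the `_analyze` API. The following is a minimal sketch that assumes the `my_index` index from the preceding example exists; the sample text is hypothetical:
+
+```json
+GET my_index/_analyze
+{
+  "analyzer": "my_custom_analyzer",
+  "text": "The quick foxes are running"
+}
+```
+{% include copy-curl.html %}
+
+The response lists the tokens produced after lowercasing, stop word removal, and stemming, which can help confirm that the filters behave as intended before you index data.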
@@ -0,0 +1,52 @@
+---
+layout: default
+title: boost
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 10
+has_children: false
+has_toc: false
+---
+
+# `boost` 
+
+The `boost` mapping parameter is used to increase or decrease the relevance score of a field during search queries. It allows you to give more or less weight to specific fields when calculating the overall relevance score for a document.
+
+The `boost` parameter is applied as a multiplier to the score of a field. For example, if a field has a `boost` value of `2`, then the score contribution of that field is doubled. Conversely, a `boost` value of `0.5` would halve the score contribution of that field.
+
+-----------
+
+## Example
+
+The following is an example of how you can use the `boost` parameter in an OpenSearch mapping:
+
+```json
+PUT my-index1
+{
+  "mappings": {
+    "properties": {
+      "title": {
+        "type": "text",
+        "boost": 2
+      },
+      "description": {
+        "type": "text",
+        "boost": 1
+      },
+      "tags": {
+        "type": "keyword",
+        "boost": 1.5
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+In this example, the `title` field has a boost of `2`, which means it contributes twice as much to the overall relevance score as the `description` field (which has a boost of `1`). The `tags` field has a boost of `1.5`, so it contributes 1.5 times as much as the `description` field.
+
+The `boost` parameter is particularly useful when you want to give more weight to certain fields that are more important for your use case. For example, you might want to boost the `title` field more than the `description` field, as the title is often a better indicator of the document's relevance.
+
+Note that the `boost` parameter is a multiplicative factor, not an additive one. The score contribution of a field scales in proportion to its boost value, so a large boost on one field can dominate the overall relevance score and outweigh matches in fields with lower boost values.
+
+When using the `boost` parameter, it is recommended to start with small values (1.5 or 2) and test the impact on your search results. Overly high boost values can skew the relevance scores and lead to unexpected or undesirable search results.
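+
+One way to observe the effect of the boost values is to search across the boosted fields and request a score explanation. The following is a sketch that assumes the `my-index1` index from the preceding example; the document content and query text are hypothetical:
+
+```json
+PUT my-index1/_doc/1?refresh=true
+{
+  "title": "Mapping parameters",
+  "description": "An overview of mapping parameters and field types",
+  "tags": ["mapping"]
+}
+
+GET my-index1/_search
+{
+  "explain": true,
+  "query": {
+    "multi_match": {
+      "query": "mapping",
+      "fields": ["title", "description", "tags"]
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+Setting `explain` to `true` returns a breakdown of the score calculation for each hit, which can be used to compare how much each field contributes before and after you adjust the boost values.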
@@ -0,0 +1,98 @@
+---
+layout: default
+title: coerce
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 15
+has_children: false
+has_toc: false
+---
+
+# `coerce`
+
+The `coerce` mapping parameter controls how values are converted to the expected data type of a field during indexing. By using this parameter, you can ensure your data is properly formatted and indexed according to the expected field types, helping to maintain data integrity and improve the accuracy of your search results.
+
+## Examples
+
+Here are some examples for using the `coerce` mapping parameter.
+
+#### Indexing a document with `coerce` enabled
+
+```json
+PUT products
+{
+  "mappings": {
+    "properties": {
+      "price": {
+        "type": "integer",
+        "coerce": true
+      }
+    }
+  }
+}
+
+PUT products/_doc/1
+{
+  "name": "Product A",
+  "price": "19.99"
+}
+```
+{% include copy-curl.html %}
+
+In this example, the `price` field is defined as an `integer` type with `coerce` set to `true`. When indexing the document, the string value `19.99` is coerced to the integer `19`.
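+
+Coercion affects only the indexed value. As a sketch of how to verify this, assuming the `products` index and document from the preceding example, the following requests retrieve the stored document and then search for the coerced value:
+
+```json
+GET products/_doc/1
+
+GET products/_search
+{
+  "query": {
+    "term": {
+      "price": 19
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The `_source` returned by the first request is expected to still contain the original string `"19.99"`, while the search should match the document on the truncated integer value.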
+
+#### Indexing a document with `coerce` disabled
+
+```json
+PUT orders
+{
+  "mappings": {
+    "properties": {
+      "quantity": {
+        "type": "integer",
+        "coerce": false
+      }
+    }
+  }
+}
+
+PUT orders/_doc/1
+{
+  "item": "Widget",
+  "quantity": "10"
+}
+```
+{% include copy-curl.html %}
+
+In this example, the `quantity` field is defined as an `integer` type with `coerce` set to `false`. When indexing the document, the string value `10` is not coerced, and the document is rejected due to the type mismatch. 
+
+#### Setting the index-level coercion setting
+
+```json
+PUT inventory
+{
+  "settings": {
+    "index.mapping.coerce": false
+  },
+  "mappings": {
+    "properties": {
+      "stock_count": {
+        "type": "integer",
+        "coerce": true
+      },
+      "sku": {
+        "type": "keyword"
+      }
+    }
+  }
+}
+
+PUT inventory/_doc/1
+{
+  "sku": "ABC123",
+  "stock_count": "50"
+}
+```
+{% include copy-curl.html %}
+
+In this example, the index-level `index.mapping.coerce` setting is set to `false`, which disables coercion globally. However, the `stock_count` field overrides this setting and enables coercion for that specific field.
@@ -0,0 +1,104 @@
+---
+layout: default
+title: copy_to
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 20
+has_children: false
+has_toc: false
+---
+
+# `copy_to`
+
+The `copy_to` parameter allows you to copy the values of multiple fields into a single field. This can be useful if you often search across multiple fields because you can search the combined field instead.
+
+The field value is copied, not the terms resulting from the analysis process. The original `_source` field remains unmodified, and the same value can be copied to multiple fields using the `copy_to` parameter. However, recursive copying through intermediary fields is not supported; instead, use `copy_to` directly from the originating field to multiple target fields.
+
+For example, if you want to search for products by their name and description, you can use the `copy_to` parameter to copy those values into a single field, as follows:
+
+```json
+PUT my-products-index
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "text",
+        "copy_to": "product_info"
+      },
+      "description": {
+        "type": "text",
+        "copy_to": "product_info" 
+      },
+      "product_info": {
+        "type": "text"
+      },
+      "price": {
+        "type": "float"
+      }
+    }
+  }
+}
+
+PUT my-products-index/_doc/1
+{
+  "name": "Wireless Headphones",
+  "description": "High-quality wireless headphones with noise cancellation",
+  "price": 99.99
+}
+
+PUT my-products-index/_doc/2
+{
+  "name": "Bluetooth Speaker",
+  "description": "Portable Bluetooth speaker with long battery life",
+  "price": 49.99
+}
+```
+{% include copy-curl.html %}
+
+In this example, the values from the `name` and `description` fields are copied into the `product_info` field. You can now search for products by querying the `product_info` field, as follows:
+
+```json
+GET my-products-index/_search
+{
+  "query": {
+    "match": {
+      "product_info": "wireless headphones"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+#### Response
+
+```json
+{
+  "took": 20,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 1,
+      "relation": "eq"
+    },
+    "max_score": 1.9061546,
+    "hits": [
+      {
+        "_index": "my-products-index",
+        "_id": "1",
+        "_score": 1.9061546,
+        "_source": {
+          "name": "Wireless Headphones",
+          "description": "High-quality wireless headphones with noise cancellation",
+          "price": 99.99
+        }
+      }
+    ]
+  }
+}
+```
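+
+As noted earlier, the same value can be copied to more than one field. The following sketch shows one way to do this by providing an array of target fields to `copy_to`; the index name `my-products-index-2` and the `name_search` and `all_text` fields are hypothetical:
+
+```json
+PUT my-products-index-2
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "text",
+        "copy_to": ["name_search", "all_text"]
+      },
+      "description": {
+        "type": "text",
+        "copy_to": "all_text"
+      },
+      "name_search": {
+        "type": "text"
+      },
+      "all_text": {
+        "type": "text"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+Here each source field lists its targets directly rather than chaining through an intermediary field, which matches the restriction on recursive copying described above.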
@@ -0,0 +1,9 @@
+---
+layout: default
+title: doc_values
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 25
+has_children: false
+has_toc: false
+---
@@ -0,0 +1,9 @@
+---
+layout: default
+title: dynamic
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 30
+has_children: false
+has_toc: false
+---
@@ -0,0 +1,9 @@
+---
+layout: default
+title: eager_global_ordinals
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 35
+has_children: false
+has_toc: false
+---
@@ -0,0 +1,9 @@
+---
+layout: default
+title: enabled
+parent: Mapping parameters
+grand_parent: Mapping and field types
+nav_order: 40
+has_children: false
+has_toc: false
+---