diff --git a/_analyzers/character-filters/index.md b/_analyzers/character-filters/index.md
index e6bb79d9d7..ddffd3c9df 100644
--- a/_analyzers/character-filters/index.md
+++ b/_analyzers/character-filters/index.md
@@ -8,16 +8,12 @@ has_toc: false
 
 # Character filters
 
-Character filters process the text before tokenization, modifying or cleaning the input to prepare it for further analysis.
+Character filters process the text before tokenization, modifying, or cleaning the input to prepare it for further analysis.
 
-Unlike token filters, which operate on tokens (words or terms), character filters work on the raw input text before tokenization. They are especially useful for cleaning or transforming structured text with unwanted characters, like HTML tags or special symbols. Character filters help strip or replace these elements, ensuring the text is properly formatted for analysis.
+Unlike token filters, which operate on tokens (words or terms), character filters work on the raw input text before tokenization. They are especially useful for cleaning or transforming structured text with unwanted characters, such as HTML tags or special symbols. Character filters help strip or replace these elements so that text is properly formatted for analysis.
 
 Use cases for character filters include:
 
-## HTML stripping
-Removing HTML tags from content, ensuring only the visible text is indexed. See [HTML stripping]({{site.url}}{{site.baseurl}}/analyzers/html-character-filter) for more information.
-
-## Pattern replacement
-Replacing or removing unwanted characters or patterns in text (e.g., converting hyphens to spaces
-## Custom mappings
-Substituting specific characters or sequences with other values, such as converting currency symbols into their textual equivalents.
+- **HTML stripping:** Removes HTML tags from content so that only the plain text is indexed. See [HTML stripping]({{site.url}}{{site.baseurl}}/analyzers/html-character-filter) for more information.
+- **Pattern replacement:** Replaces or removes unwanted characters or patterns in text, for example, converting hyphens to spaces.
+- **Custom mappings:** Substitutes specific characters or sequences with other values, for example, converting currency symbols into their textual equivalents.
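
The three use cases in the new list can be illustrated with a small, engine-independent sketch. This is only a plain-Python approximation of the ideas, not the analyzer's actual implementation or API: the function names (`strip_html`, `replace_pattern`, `apply_mappings`, `run_character_filters`) and the sample mappings are hypothetical, and a real HTML-stripping filter would likely handle more cases than the simple tag regex used here.

```python
import re


def strip_html(text: str) -> str:
    """Rough stand-in for HTML stripping: drop anything that looks like a tag."""
    return re.sub(r"<[^>]+>", "", text)


def replace_pattern(text: str, pattern: str, replacement: str) -> str:
    """Pattern replacement: substitute every regex match with a fixed string."""
    return re.sub(pattern, replacement, text)


def apply_mappings(text: str, mappings: dict) -> str:
    """Custom mappings: replace each literal key with its mapped value."""
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text


def run_character_filters(text: str) -> str:
    """Apply the three steps to the raw text, before any tokenization happens."""
    text = strip_html(text)
    text = replace_pattern(text, r"-", " ")  # hyphens become spaces
    # Illustrative mappings only; the real set would be chosen per use case.
    text = apply_mappings(text, {"$": "dollar ", "€": "euro "})
    return text


raw = "<p>Price: $5 for a well-known item</p>"
filtered = run_character_filters(raw)
print(filtered)  # Price: dollar 5 for a well known item
```

Each step operates on the whole input string rather than on individual tokens, which mirrors the distinction the page draws between character filters and token filters.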