Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: AntonEliatra <[email protected]>
opensearch-project · Sep 13, 2024 · e373b7b · e373b7b
1 parent 0505d9b
commit e373b7b
Show file tree

Hide file tree

Showing 2 changed files with 5 additions and 6 deletions.
diff --git a/_analyzers/token-filters/cjk-width.md b/_analyzers/token-filters/cjk-width.md
@@ -7,11 +7,11 @@ nav_order: 40
 
 # CJK width token filter
 
-The CJK Width token filter normalizes Chinese, Japanese, and Korean (CJK) tokens by converting full-width ASCII characters to their standard (half-width) ASCII equivalents and half-width katakana characters to their full-width equivalents.
+The `cjk_width` token filter normalizes Chinese, Japanese, and Korean (CJK) tokens by converting full-width ASCII characters to their standard (half-width) ASCII equivalents and half-width katakana characters to their full-width equivalents.
 
-### Converting full-width ASCII Character
+### Converting full-width ASCII characters
 
-In CJK texts, ASCII characters (such as letters and numbers) can appear in full-width form, which occupies the space of two half-width characters. Full-width ASCII characters are typically used in East Asian typography to align with the width of CJK characters. However, for the purpose of indexing and searching, these full-width characters need to be normalized to their standard (half-width) ASCII equivalents.
+In CJK texts, ASCII characters (such as letters and numbers) can appear in full-width form, occupying the space of two half-width characters. Full-width ASCII characters are typically used in East Asian typography for alignment with the width of CJK characters. However, for the purposes of indexing and searching, these full-width characters need to be normalized to their standard (half-width) ASCII equivalents.
 
 The following example illustrates ASCII character normalization:
 
@@ -22,9 +22,8 @@ The following example illustrates ASCII character normalization:
 
 ### Converting half-width katakana characters
 
-The CJK width token filter converts half-width katakana characters to their full-width counterparts, which are the standard form used in Japanese text. This normalization is important for consistency in text processing and searching.
+The `cjk_width` token filter converts half-width katakana characters to their full-width counterparts, which are the standard form used in Japanese text. This normalization, illustrated in the following example, is important for consistency in text processing and searching:
 
-The following example illustrates ASCII converting half-width katakana characters:
 
 ```
         Half-Width katakana:               ｶﾀｶﾅ

diff --git a/_analyzers/token-filters/index.md b/_analyzers/token-filters/index.md
@@ -16,7 +16,7 @@ Token filter | Underlying Lucene token filter|  Description
 [`apostrophe`]({{site.url}}{{site.baseurl}}/analyzers/token-filters/apostrophe/) | [ApostropheFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/tr/ApostropheFilter.html) | In each token containing an apostrophe, the `apostrophe` token filter removes the apostrophe itself and all characters following it. 
 [`asciifolding`]({{site.url}}{{site.baseurl}}/analyzers/token-filters/asciifolding/) | [ASCIIFoldingFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html) | Converts alphabetic, numeric, and symbolic characters.
 `cjk_bigram` | [CJKBigramFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/cjk/CJKBigramFilter.html) | Forms bigrams of Chinese, Japanese, and Korean (CJK) tokens. 
-[`cjk_width`]({{site.url}}{{site.baseurl}}/analyzers/token-filters/cjk-width/) | [CJKWidthFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html) | Normalizes Chinese, Japanese, and Korean (CJK) tokens according to the following rules: <br> - Folds full-width ASCII character variants into the equivalent basic Latin characters. <br> - Folds half-width Katakana character variants into the equivalent Kana characters.
+[`cjk_width`]({{site.url}}{{site.baseurl}}/analyzers/token-filters/cjk-width/) | [CJKWidthFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html) | Normalizes Chinese, Japanese, and Korean (CJK) tokens according to the following rules: <br> - Folds full-width ASCII character variants into their equivalent basic Latin characters. <br> - Folds half-width katakana character variants into their equivalent kana characters.
 `classic` | [ClassicFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/classic/ClassicFilter.html) | Performs optional post-processing on the tokens generated by the classic tokenizer. Removes possessives (`'s`) and removes `.` from acronyms.
 `common_grams` | [CommonGramsFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html) | Generates bigrams for a list of frequently occurring terms. The output contains both single terms and bigrams.
 `conditional` | [ConditionalTokenFilter](https://lucene.apache.org/core/9_10_0/analysis/common/org/apache/lucene/analysis/miscellaneous/ConditionalTokenFilter.html) | Applies an ordered list of token filters to tokens that match the conditions provided in a script.