
Commit

Corrected a typo
darthskyy committed Oct 4, 2024
1 parent 3f034ac commit be8fc95
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/index.html
@@ -782,7 +782,7 @@ <h3>Models Trained from Scratch</h3>

<h3>Pre-trained Language Models</h3>
<p><strong>Effect of Nguni-specific transfer learning.</strong>
- It was the expectation that the Nguni-XLMR model would outperform the other PLMs due to its specialized pre-training on tasks for the Nguni languages however this was not the case. The model performed well but was not the best in any of the tasks. This could be due to the fact that the model was trained on a narrower linguistic scope than the other PLMs and thus did not have the same level of generalization as the other models. The model was also not able to leverage the similarity of the Nguni languages as effectively as expected. Af
+ It was the expectation that the Nguni-XLMR model would outperform the other PLMs due to its specialized pre-training on tasks for the Nguni languages however this was not the case. The model performed well but was not the best in any of the tasks. This could be due to the fact that the model was trained on a narrower linguistic scope than the other PLMs and thus did not have the same level of generalization as the other models. The model was also not able to leverage the similarity of the Nguni languages as effectively as expected.
</p>
<p><strong>Effect of subword tokenization.</strong>
Since the models were adapted from XLM-RoBERTa, they all used a SentencePiece tokenizer which is optimised for subword tokenization. This is suboptimal for our task since its inputs are already subword units. By subword tokenizing individual morphemes, we reduced the model's ability to learn the morphological structure.
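The mismatch described above can be sketched with a toy greedy longest-match subword tokenizer (not the actual XLM-R SentencePiece model; the vocabulary and the isiZulu segmentation shown are illustrative assumptions, not taken from the paper). When a whole morpheme is absent from the subword vocabulary, the tokenizer splits it into smaller pieces, so the model's input tokens no longer align with morpheme boundaries:

```python
def subword_tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position,
    falling back to single characters for out-of-vocabulary spans."""
    pieces = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab or j == i + 1:
                pieces.append(text[i:j])
                i = j
                break
    return pieces

# Hypothetical subword vocabulary that lacks the morpheme "thanda"
# as a single unit.
vocab = {"ngi", "ya", "ba", "tha", "nda"}

# Assumed gold morphological segmentation of the isiZulu word
# "ngiyabathanda": ngi - ya - ba - thanda.
for morpheme in ["ngi", "ya", "ba", "thanda"]:
    print(morpheme, "->", subword_tokenize(morpheme, vocab))
# "thanda" comes out as ['tha', 'nda']: one morpheme becomes two
# input tokens, obscuring the morpheme boundary the model should learn.
```

A real SentencePiece model behaves analogously: segmentation is driven by corpus statistics rather than morphology, so pre-segmented morphemes fed in as inputs may be split further.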
