
Commit

update training procedure
peternutter committed Dec 21, 2023
1 parent 969c0be commit 09fce67
Showing 2 changed files with 3 additions and 1 deletion.
1 change: 1 addition & 0 deletions report/sections/methodology.tex
@@ -27,6 +27,7 @@ \subsection*{Phase 2: Transferring Knowledge via Finetuning}

% Training
Training is performed on the \texttt{curlie-gpt3.5-10k} and \texttt{curlie-gpt4-10k} datasets for a maximum of 100 epochs. We use a 30\% held-out validation split from the \texttt{crowdsourced} dataset to monitor the validation F1 score and stop training if no improvement is observed for 10 epochs; this prevents overfitting to the LLM labels. We search the hyperparameter space using the Bayesian TPE sampler from Optuna~\cite{optuna}, running $\eta=100$ trials with $\tau=10$ startup trials. The hyperparameter values are detailed in Table~\ref{tab:hyperparameters}. The model that performs best on macro F1 on the validation split is chosen for evaluation.
The training loss is defined as the average binary cross-entropy over the 14 classes, with a per-class reweighting factor, set to the negative-to-positive sample ratio, to address class imbalance.
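
As a rough illustration of this loss, the reweighting can be expressed through the pos_weight argument of PyTorch's BCEWithLogitsLoss. This is a minimal sketch assuming a PyTorch training setup; the names make_weighted_bce, labels, logits, and targets are placeholders, not identifiers from the repository.

import torch

def make_weighted_bce(labels: torch.Tensor) -> torch.nn.BCEWithLogitsLoss:
    # labels: binary matrix of shape (num_samples, 14), one column per class.
    num_pos = labels.sum(dim=0)                   # positive samples per class
    num_neg = labels.shape[0] - num_pos           # negative samples per class
    pos_weight = num_neg / num_pos.clamp(min=1)   # negative-to-positive ratio
    # The default 'mean' reduction averages the reweighted BCE over all classes.
    return torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Usage sketch: loss = make_weighted_bce(train_labels)(logits, targets)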

\input{tables/hyperparameters.tex}
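
For concreteness, the search described above could be driven as in the Optuna sketch below. The sampler and study calls are real Optuna APIs; the search space and the train_and_validate helper (train for at most 100 epochs, stop after 10 epochs without validation-F1 improvement, return the best validation macro F1) are hypothetical stand-ins for the actual configuration listed in the hyperparameter table.

import optuna

def objective(trial: optuna.Trial) -> float:
    # Hypothetical search space; the real ranges are in the hyperparameter table.
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    # Placeholder: trains the model with early stopping (patience 10) and
    # returns the best validation macro F1 score.
    return train_and_validate(lr=lr, batch_size=batch_size)

# tau = 10 random startup trials before the TPE model takes over.
sampler = optuna.samplers.TPESampler(n_startup_trials=10)
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(objective, n_trials=100)  # eta = 100 trials
print(study.best_params)  # configuration with the highest validation macro F1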

3 changes: 2 additions & 1 deletion report/sections/summary.tex
@@ -1,3 +1,4 @@
\section{Summary}\label{sec:summary}

We have demonstrated that LLMs can provide cost-effective, high-quality annotations in the setting of multilingual, multilabel website topic classification. Our approach, which involved finetuning a pre-trained Homepage2vec model on LLM-generated labels, resulted in an improvement of 4.3 percentage points in the macro F1 score. Additionally, we are releasing the \texttt{curlie-gpt3.5-10k} and \texttt{curlie-gpt4-10k} datasets \cite{curlie-gpt-10k} with the intention of supporting further research in the open-source community.
We have demonstrated that LLMs can provide cost-effective, high-quality annotations in the setting of multilingual, multilabel website topic classification. Our approach, which involved finetuning a pre-trained Homepage2vec model on LLM-generated labels, resulted in an improvement of 4.3 percentage points in the macro F1 score.
Additionally, the \texttt{curlie-gpt3.5-10k} and \texttt{curlie-gpt4-10k} datasets \cite{curlie-gpt-10k} are being released to aid open-source research.
