Medical Question-Answer Dataset Curation

This repository contains a Jupyter notebook that walks through curating a dataset for training generative AI models in the medical domain. The tutorial uses Label Studio, a data labeling tool, to build a question-answer dataset for medical applications.

Notebooks

Features

  • Label Studio Setup: Step-by-step instructions for setting up Label Studio projects for synthetically generated Q&A datasets.
  • LLM Integration: Integrates with the Label Studio Backend LLM Interactive example.
  • Data Import and Configuration: Importing and configuring datasets in Label Studio for question and answer generation.
  • Fine-tune Llama 3: A notebook for fine-tuning Llama 3 on a Colab T4 GPU, based on the Unsloth demo notebook.
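
As a rough illustration of the data import step, the sketch below wraps question-answer pairs in the JSON task structure that Label Studio accepts on import, where each task is an object with a `data` key. The field names `question` and `answer`, the helper `build_tasks`, and the sample pair are assumptions made for this example. They are not taken from the notebook, and the field names would need to match the variables used in your own labeling configuration.

```python
import json


def build_tasks(qa_pairs):
    """Wrap (question, answer) pairs in Label Studio's JSON task format.

    The "question" and "answer" keys are hypothetical field names; they
    must match the variables referenced by the project's labeling config.
    """
    tasks = []
    for question, answer in qa_pairs:
        question, answer = question.strip(), answer.strip()
        # Skip incomplete pairs so that empty tasks are never imported.
        if not question or not answer:
            continue
        tasks.append({"data": {"question": question, "answer": answer}})
    return tasks


if __name__ == "__main__":
    # Placeholder pair for illustration only, not drawn from the dataset.
    sample = [
        ("What is hypertension?", "Persistently elevated blood pressure."),
        ("   ", "An answer without a question is dropped."),
    ]
    print(json.dumps(build_tasks(sample), indent=2))
```

The printed JSON can be saved to a file and imported into a Label Studio project through the import dialog.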