HodBadichi/Bias-Mitigation-Through-Topic-Aware-Distribution-Matching

Bias Mitigation Through Topic-Aware Distribution Matching - project:

This project consists of two parts:

  1. Topic modeling
  • BERTopic workflow: trains the model -> tunes hyperparameters -> visualizes the results and measures coherence
  • LDA workflow: trains the model -> tunes hyperparameters -> visualizes the results and measures coherence
  2. GAN
  • FrozenBert workflow: trains a pre-trained BERT model on PubMed data
  • Discriminator workflow, using BERT as the embedder
  • Discriminator workflow, using Sentence-BERT as the embedder
  • GAN workflow, using BERT as the embedder

Each workflow is an independent script that generates its data and fixes the project structure if anything is missing. To start a workflow, call the "Run" function that invokes it.
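To illustrate how a workflow can repair the project structure before it starts, here is a minimal sketch. It is not the repository's actual code: the helper `ensure_project_structure`, the signature of `Run`, and the `REQUIRED_DIRS` tuple are hypothetical, and only the folder names `saved_models` and `results` come from the BERTopic example in this README.

```python
from pathlib import Path
from typing import List

# Assumption: these are the output folders a workflow expects.
# "saved_models" and "results" appear in the BERTopic example of this README.
REQUIRED_DIRS = ("saved_models", "results")


def ensure_project_structure(workflow_root: Path) -> List[Path]:
    """Create any missing output folders and return the ones that were created."""
    created = []
    for name in REQUIRED_DIRS:
        folder = workflow_root / name
        if not folder.exists():
            folder.mkdir(parents=True)
            created.append(folder)
    return created


def Run(workflow_root: Path) -> None:
    """Hypothetical entry point: repair the layout, then start the workflow."""
    for folder in ensure_project_structure(workflow_root):
        print(f"created missing folder: {folder}")
    # The real training, tuning, and evaluation steps would be called here.
```

Calling `Run(Path("TopicModeling") / "Bert")` twice is safe in this sketch, because folders that already exist are left untouched.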

Each folder contains a run_on_server.sh file. Use it to run the workflow as a batch job on the Technion 'lambda' server with the command: sbatch -c 2 --gres=gpu:1 run_on_server.sh -o run.out. Tune the hyperparameters in each workflow's hparams_config file as desired. The workflows log to WandB, so run wandb login before starting them.
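As a rough picture of what a hyperparameter configuration can look like, the sketch below groups a few settings in a dataclass and flattens them into a plain dict. Every field name and default value here is an illustrative assumption; the real options are the ones defined in each workflow's hparams_config file.

```python
from dataclasses import asdict, dataclass


@dataclass
class HParams:
    # All fields and defaults are hypothetical, not the repository's values.
    learning_rate: float = 2e-5
    batch_size: int = 16
    max_epochs: int = 10
    gpus: int = 1  # mirrors the single GPU requested with --gres=gpu:1


def as_logging_config(hparams: HParams) -> dict:
    """Flatten the hyperparameters into a plain dict for an experiment logger."""
    return asdict(hparams)


# Override one value for an experiment and keep the other defaults.
hparams = HParams(batch_size=32)
config = as_logging_config(hparams)
print(config)
```

A dict of this shape is the kind of value that can be passed as the `config` argument of `wandb.init`, so each logged run records the settings it was trained with.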

For example, the BERTopic workflow:

  1. Move to TopicModeling\Bert\src
  2. Configure the hyperparameters for your BERTopic experiment in TopicModeling\Bert\src\hparams_config.py
  3. Run sbatch -c 2 --gres=gpu:1 run_on_server.sh -o run.out on the lambda server
  • Results:
  1. Trained topic models are saved in TopicModeling\Bert\saved_models
  2. Visualizations and coherence CSV files are saved in TopicModeling\Bert\results
Note: the project has a requirements file. Run pip install -r requirements.txt to set up the environment.

Enjoy :)
