FMS-DGT

DGT (pronounced "digit") is a framework that enables different algorithms and models to be used to generate synthetic data.


| Setup | Usage |

This is the main repository for DiGiT, our Data Generation and Transformation framework.

Setup

First, clone the repository:

git clone git@github.com:IBM/fms-dgt.git
cd fms-dgt

Next, set up a virtual environment. We recommend Python >=3.10.15 and <3.13.x. Here is how to create and activate one with Python's venv module:

python3 -m venv .venv
source .venv/bin/activate
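Once the environment is active, it can help to confirm that the interpreter falls inside the recommended range. The following is a minimal sketch, not part of FMS-DGT: the helper name `python_version_supported` is hypothetical, and it assumes that "<3.13.x" means any version below 3.13.

```python
import sys

# Bounds copied from the setup text: Python >=3.10.15 and <3.13.x.
# Assumption: "<3.13.x" is read as "any release below 3.13".
MIN_VERSION = (3, 10, 15)
MAX_EXCLUSIVE = (3, 13)


def python_version_supported(version=None):
    """Return True if (major, minor, micro) lies in the recommended range."""
    if version is None:
        version = sys.version_info[:3]
    version = tuple(version[:3])
    return MIN_VERSION <= version and version[:2] < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_supported():
        print(f"Python {current} is inside the recommended range")
    else:
        print(f"Python {current} is outside the recommended range")
```

Running this script with the interpreter from `.venv` reports whether that interpreter matches the recommendation above.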

To install the packages, we recommend the following:

pip install -e ".[all]"

Important

Please install the pre-commit hooks to adhere to the code hygiene standards:

pip install pre-commit
pre-commit install

For each API service you plan to use, add its configuration to a .env file. Copy .env.example to .env and add your keys as follows:

# watsonx [Optional]
WATSONX_API_KEY=<WatsonX key goes here>
WATSONX_PROJECT_ID=<Project env variable>

# OpenAI [Optional]
OPENAI_API_KEY=<OPENAI key goes here>

# Azure OpenAI [Optional]
AZURE_OPENAI_API_KEY=<AZURE OPENAI key goes here>

# Anthropic [Optional]
ANTHROPIC_API_KEY=<ANTHROPIC key goes here>
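Before running anything, you may want to check which of these keys are actually filled in. The sketch below is an illustration only and is not how FMS-DGT itself loads the file: it uses a deliberately simple KEY=VALUE parser, the function names `parse_env_text` and `configured_keys` are hypothetical, and it treats values still wrapped in `<...>` as unfilled placeholders.

```python
from pathlib import Path

# Key names copied from the .env listing above.
KNOWN_KEYS = [
    "WATSONX_API_KEY",
    "WATSONX_PROJECT_ID",
    "OPENAI_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
]


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_keys(values):
    """Return known keys whose value is non-empty and not a <placeholder>."""
    found = []
    for key in KNOWN_KEYS:
        value = values.get(key, "")
        is_placeholder = value.startswith("<") and value.endswith(">")
        if value and not is_placeholder:
            found.append(key)
    return found


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    # Only key names are printed, never the secret values.
    print("Configured keys:", configured_keys(parse_env_text(text)))
```

The script prints key names only, so it is safe to run in a shared terminal.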

Usage

To verify your setup, run the following command, which references a databuilder.

Tip

The default settings assume you have mistral-small3.2 running. Use the following command to start it and keep it loaded for an hour:

ollama run mistral-small3.2 --keepalive "1h" &
python -m fms_dgt.core --task-paths ./tasks/core/simple/logical_reasoning/causal --restart-generation

Caution

You must set WATSONX_API_KEY and WATSONX_PROJECT_ID before using the watsonx API service:

python -m fms_dgt.core --task-paths ./tasks/core/simple/logical_reasoning/causal --restart-generation --config-path configs/core/watsonx_simple.yaml
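The two invocations above differ only in the optional `--config-path` argument. If you launch runs from a script, a small helper can assemble the argument list. This is a sketch under assumptions: `build_command` is a hypothetical name, and it uses only the flags that appear in the commands above.

```python
import shlex


def build_command(task_path, config_path=None, restart=True):
    """Assemble the fms_dgt.core invocation as an argv list."""
    argv = ["python", "-m", "fms_dgt.core", "--task-paths", task_path]
    if restart:
        argv.append("--restart-generation")
    if config_path:
        # Only needed when overriding the default settings, e.g. for watsonx.
        argv += ["--config-path", config_path]
    return argv


if __name__ == "__main__":
    task = "./tasks/core/simple/logical_reasoning/causal"
    # Default settings (local model).
    print(shlex.join(build_command(task)))
    # watsonx configuration file from the example above.
    print(shlex.join(build_command(task, "configs/core/watsonx_simple.yaml")))
```

The returned list can be passed directly to `subprocess.run` if you want to execute the command rather than print it.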

If successful, you should see the outputs of the command in the ./output directory.
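To confirm that a run produced something, you can list the files under the output directory. The sketch below assumes nothing about the file names or formats that FMS-DGT writes; it only walks the directory, and `list_output_files` is a hypothetical helper name.

```python
from pathlib import Path


def list_output_files(output_dir="./output"):
    """Return sorted relative paths of all files under output_dir."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing has been generated yet, or the path is wrong.
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    files = list_output_files()
    if files:
        for name in files:
            print(name)
    else:
        print("No files found under ./output")
```

An empty result after a run suggests the command failed or wrote its results somewhere else.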

The Team

FMS-DGT is currently maintained by Max Crouse, Kshitij Fadnis, Siva Sankalp Patel, and Pavan Kapanipathi.

License

FMS-DGT has an Apache 2.0 license, as found in the LICENSE file.
