This repository contains a minimal, standalone implementation of the Type 2 Diabetes (T2D) Total System Value (TSV) engine. The model simulates the progression of Type 2 Diabetes in a population over time, tracking health outcomes, complications, and the effects of different treatment scenarios.
Quick Links:
- 📚 Installation Guide - Step-by-step for non-R users
- 🔍 Validation Script - Check your setup
- 📊 Example Analysis - Analyse and visualise results
The T2D TSV engine is a systems dynamics model (implemented as a system of ordinary differential equations) that:
- Simulates a population stratified by:
- Diabetes status: No diabetes (D0), Uncontrolled T2D (D1), Controlled T2D (D2)
- Age groups: 18-45 years (a1), 45-65 years (a2), 65+ years (a3)
- Tracks transitions between states over time
- Models complications:
- Macrovascular: e.g. heart disease, stroke
- Microvascular: e.g. kidney disease, neuropathy, retinopathy
- Other: e.g. hypoglycaemia
- Calculates health and economic outcomes:
- Deaths and mortality rates
- Hospitalisations and bed days
- Direct and indirect healthcare costs
- Quality-adjusted life years (QALYs)
- Environmental impacts
- Pre-calibrated parameters: Model parameters are calibrated to match real-world epidemiological data.
- Treatment effects: Ability to model interventions by adjusting transition rates
- Scenario comparison: Compare "status quo" baseline with 4 intervention scenarios
- Long-term projections: Simulate from 2024 to 2060 (36 years)
Estimating_future_burden_T2D__supplementary_data/
├── README.md # This file
├── run_T2D_model.R # Main script to run the model
├── outputs/ # Output data (CSV files)
├── inputs/ # Input data (CSV files)
│ ├── precalibrated_population_2016+.csv # Initial population state
│ ├── precalibrated_params_2016+.csv # Calibrated model parameters
│ ├── static_params.csv # Fixed model parameters
│ ├── static_params_multifactorial.csv # Parameters for multifactorial interventions
│ ├── net_migration_age_breakdown.csv # Migration data by age
│ ├── complication_*.csv # Complication rates, costs, utilities
│ ├── diagnosis_direct_costs.csv # Diagnosis costs
│ ├── routine_*.csv # Routine care costs
│ └── ... (other cost and health data files)
└── source/ # R source code
├── library_loader.R # Loads required R packages
├── load_precalibrated_initial_population.R # Loads initial state
├── load_params.R # Loads model parameters
├── T2D_SDM.R # Systems Dynamics Model (ODEs)
├── T2D_PHA.R # Post-Hoc Analysis (outcomes)
├── preamble.R # Setup script
├── post_hoc_functions.R # Additional analysis functions
└── post_processing_functions.R # Data processing utilities
You need to have R installed on your system (version 4.0 or higher recommended).
-
Windows: Download from https://cran.r-project.org/bin/windows/base/
-
macOS: Download from https://cran.r-project.org/bin/macosx/
-
Linux (Ubuntu/Debian):
sudo apt-get update sudo apt-get install r-base
The model requires the following R packages:
- deSolve - For solving ordinary differential equations (ODEs)
- openxlsx - For writing Excel output files
- ggplot2 - For plotting (used in some functions)
- dplyr - For data manipulation
- tidyr - For data tidying
- purrr - For functional programming
- scales - For scaling functions
- Metrics - For model metrics
- ggpubr - For publication-ready plots
- growthmodels - For Gompertz growth models
Open R or RStudio and run:
install.packages(c("deSolve", "openxlsx", "ggplot2", "dplyr", "tidyr",
"purrr", "scales", "Metrics", "ggpubr", "growthmodels"))The packages will be loaded automatically when you run
run_T2D_model.R. If a package is missing, you'll see an error message.
Install it using:
install.packages("package_name")To verify R is installed correctly, open a terminal/command prompt and type:
R --versionYou should see R version information displayed.
Navigate to the T2D_Engine_Minimal directory and run:
Rscript run_T2D_model.R-
Open R or RStudio
-
Set your working directory to the
T2D_Engine_Minimalfolder:setwd("/path/to/T2D_Engine_Minimal") -
Run the script:
source("run_T2D_model.R")
The script will:
- Load all required libraries and source files
- Run Base case: Status Quo (Do Nothing)
- Baseline scenario with no interventions
- All treatment effects = 1.0 (no change from current rates)
- Run Scenario A: Total Control
- Intervention where diabetes control rates are improved
- Aims to move more people from uncontrolled to controlled T2D
- Run Scenario A1: Total Control & Multifactorial Control
- Same as Scenario A but with multifactorial intervention parameters
- Uses alternative parameter set for enhanced control
- Run Scenario B: Total Control & GLP1
- Combines improved control rates with GLP1 treatment effects
- Includes reduced complication rates from GLP1 therapy
- Run Scenario B1: Total Control & GLP1 & Multifactorial Control
- Most comprehensive scenario combining all interventions
- GLP1 effects with multifactorial control parameters
- Save output files:
outputs/output_status_quo.csv- Base case resultsoutputs/output_scenario_total_control.csv- Scenario A resultsoutputs/output_scenario_total_control_MF.csv- Scenario A1 resultsoutputs/output_scenario_GLP1.csv- Scenario B resultsoutputs/output_scenario_GLP1_MF.csv- Scenario B1 results
- Print summary statistics to the console
- On a typical modern computer: 30 seconds to 2 minutes
- Progress messages will be displayed during the run
The model generates CSV files with one row per year (2024-2061). Key columns include:
time: YearF0: Feeder compartment (individuals <18 years old)D0_a1,D0_a2,D0_a3: No diabetes by age groupD1_a1,D1_a2,D1_a3: Uncontrolled T2D by age groupD2_a1,D2_a2,D2_a3: Controlled T2D by age group
fC_o: Other complicationsfC_Ma: Macrovascular complicationsfC_Mi: Microvascular complications
D_acm_*: All-cause mortality by diabetes status and ageD_e_*: Disease-specific mortality
direct_costs: Direct healthcare costsindirect_costs: Indirect costs (productivity loss)bed_days: Hospital bed dayshospitalisations: Number of hospitalizationsdisutilities: Quality of life loss (QALY)LYL: Life years lost
f_P: Total populationf_D0_aX: Total non-diabetic populationf_D1_aX: Total uncontrolled T2D populationf_D2_aX: Total controlled T2D population
The easiest way to analyze results is using the provided
example_analysis.R script:
# After running the model
Rscript example_analysis.RThis script will: - Calculate key outcome metrics (complications,
deaths, costs, QALYs) for all 5 scenarios - Compare intervention scenarios against
base case quantitatively - Generate
publication-ready visualizations: - outputs/plot_population_by_status.png -
Population trajectories by scenario - outputs/plot_complications_by_type.png - Annual
complications by scenario - outputs/plot_cumulative_complications.png - Cumulative
outcomes comparison
To understand the impact of an intervention:
- Load both output files in R or Excel
- Compare key outcomes:
- Complications: Fewer complications = better
- Deaths: Fewer deaths = better
- Costs: Lower costs = more cost-effective
- QALYs: Higher QALYs (lower disutilities) = better quality of life
# Load output files (all 5 scenarios)
status_quo <- read.csv("outputs/output_status_quo.csv")
scenario_a <- read.csv("outputs/output_scenario_total_control.csv")
scenario_a1 <- read.csv("outputs/output_scenario_total_control_MF.csv")
scenario_b <- read.csv("outputs/output_scenario_GLP1.csv")
scenario_b1 <- read.csv("outputs/output_scenario_GLP1_MF.csv")
# Calculate total complications over entire period
sq_total_complications <- sum(status_quo$fC_Ma + status_quo$fC_Mi + status_quo$fC_o, na.rm = TRUE)
a_total_complications <- sum(scenario_a$fC_Ma + scenario_a$fC_Mi + scenario_a$fC_o, na.rm = TRUE)
b_total_complications <- sum(scenario_b$fC_Ma + scenario_b$fC_Mi + scenario_b$fC_o, na.rm = TRUE)
# Calculate reduction (Scenario B vs Base case)
reduction <- sq_total_complications - b_total_complications
pct_reduction <- (reduction / sq_total_complications) * 100
cat(sprintf("Scenario B complications avoided: %.0f (%.1f%% reduction)\n",
reduction, pct_reduction))
# Plot population by diabetes status over time (Base case vs Scenario B)
library(ggplot2)
library(tidyr)
library(dplyr)
# Prepare data for plotting
sq_long <- status_quo %>%
select(time, f_D0_aX, f_D1_aX, f_D2_aX) %>%
pivot_longer(cols = -time, names_to = "status", values_to = "population") %>%
mutate(scenario = "Base case")
b_long <- scenario_b %>%
select(time, f_D0_aX, f_D1_aX, f_D2_aX) %>%
pivot_longer(cols = -time, names_to = "status", values_to = "population") %>%
mutate(scenario = "Scenario B (GLP1)")
combined <- rbind(sq_long, b_long)
# Plot
ggplot(combined, aes(x = time, y = population, color = status)) +
geom_line(linewidth = 1) +
facet_wrap(~scenario) +
labs(title = "Population by Diabetes Status",
x = "Year", y = "Population",
color = "Status") +
theme_minimal()The model is a compartmental model solved using ordinary differential equations (ODEs). Individuals flow between compartments based on transition rates:
F0 (Age <18) → [aging] → D0_a1 (No diabetes, age 18-45)
↓
D1_a1 (Uncontrolled T2D, age 18-45)
↕
D2_a1 (Controlled T2D, age 18-45)
Similar structures exist for age groups a2 (45-65) and a3 (65+).
- δ (delta): Birth rate (Gompertz function)
- φ (phi): Net migration rate (Gompertz function)
- θ (theta): Aging rate (18→a1, a1→a2, a2→a3)
- ι (iota): T2D incidence rate (D0→D1)
- λ (lambda): Control rate (D1→D2)
- ε (epsilon): Loss of control rate (D2→D1)
- α (alpha): Other complication rate
- β (beta): Microvascular complication rate
- γ (gamma): Macrovascular complication rate
- μ (mu): All-cause mortality rate
Treatment effects are multiplicative factors (η, eta) applied to transition rates:
- η = 1.0: No effect (baseline)
- η < 1.0: Rate decreases (beneficial for bad outcomes)
- η > 1.0: Rate increases (beneficial for good outcomes like control)
For example: - eta_lambda_a1 = 2.0 doubles the rate of achieving
diabetes control in age group 1 - eta_beta_1_a1 = 0.8 reduces
microvascular complications by 20% in uncontrolled T2D, age group 1
The model is pre-calibrated using: - UK population data - Epidemiological studies on T2D prevalence and incidence - Published complication rates - Healthcare cost data - Quality of life (utility) data
To test a different intervention, modify the treatment effects in
run_T2D_model.R:
# Example: Reduce macrovascular complications by 15% (eta = 0.85)
treatment_effects_custom <- c(
eta_i_a1 = 1, eta_i_a2 = 1, eta_i_a3 = 1,
eta_lambda_a1 = 1, eta_lambda_a2 = 1, eta_lambda_a3 = 1,
eta_epsilon_a1 = 1, eta_epsilon_a2 = 1, eta_epsilon_a3 = 1,
eta_alpha_1_a1 = 1, eta_alpha_2_a1 = 1,
eta_alpha_1_a2 = 1, eta_alpha_2_a2 = 1,
eta_alpha_1_a3 = 1, eta_alpha_2_a3 = 1,
eta_beta_1_a1 = 1, eta_beta_2_a1 = 1,
eta_beta_1_a2 = 1, eta_beta_2_a2 = 1,
eta_beta_1_a3 = 1, eta_beta_2_a3 = 1,
# Reduce macrovascular complications
eta_gamma_1_a1 = 0.85, eta_gamma_2_a1 = 0.85,
eta_gamma_1_a2 = 0.85, eta_gamma_2_a2 = 0.85,
eta_gamma_1_a3 = 0.85, eta_gamma_2_a3 = 0.85,
eta_tau_alpha = 1, eta_tau_beta = 1, eta_tau_gamma = 1
)
output_custom <- T2D_TSV_engine(end_year = 2061,
treatment_effects = treatment_effects_custom,
multifactorial = FALSE)
write.csv(output_custom, file = 'output_custom_scenario.csv', row.names = FALSE)To simulate for a different time period, change the end_year variable:
end_year <- 2050 # Simulate through 2050 instead of 2061Note: The start year (2024) is hardcoded in the T2D_TSV_engine
function.
The model supports multifactorial interventions (combined treatment approaches). To use:
output <- T2D_TSV_engine(end_year = 2061,
treatment_effects = treatment_effects_DN,
multifactorial = TRUE) # Set to TRUEThis loads different parameters from
inputs/static_params_multifactorial.csv.
Solution: Install the missing package:
install.packages("package_name")Solution: Make sure you're running the script from the correct
directory. Your working directory should be T2D_Engine_Minimal.
Check with:
getwd()Set with:
setwd("/path/to/T2D_Engine_Minimal")Solution: The ODE solver can be computationally intensive. This is normal. Expected runtime is 30 seconds to 2 minutes. If it takes much longer, you may have limited computational resources.
Solution: These warnings are typically harmless. They occur during post-processing when calculating annual rates from cumulative values for the final year.
Solution: This is expected behavior. The last row contains NA values because annual rates (differences) cannot be calculated for the final year. Exclude the last row when analyzing results. Therefore we solve until 2061 to ensure complete results for the year 2060.
If you encounter issues not covered here:
- Check that all input files are present in the
inputs/directory - Verify that all source files are in the
source/directory - Ensure you have the correct R version (4.0+)
- Check that all required packages are installed
- Language: R (version 4.0 or higher)
- ODE Solver:
deSolve::ode()using LSODA algorithm - Timestep: Annual (1 year)
- Method: Systems dynamics modeling / Compartmental model
- The model solves a system of 45 coupled ordinary differential equations
- Stiff ODE solver (LSODA) automatically switches between stiff and non-stiff methods
- Precision: Relative tolerance = 1e-6, Absolute tolerance = 1e-8 (defaults)
- Open cohort: Net migration and new births are modeled
- Age-dependent rates: All transition rates vary by age group
- No recovery from T2D: Once diagnosed, individuals cannot return to non-diabetic state (D0)
- Memoryless transitions: Markov assumption (future state depends only on current state)
- Homogeneous populations: No further stratification beyond age and diabetes status
The model has been calibrated and validated against:
- UK diabetes prevalence data
- Published T2D incidence rates
- Complication rates from clinical studies
- Mortality data from national statistics
If you use this model in academic work, please cite:
[Manuscript citation to be added after publication]
See LICENSE.md
For questions about the model, please contact: