diff --git a/README.md b/README.md index 872a769..8a140e4 100755 --- a/README.md +++ b/README.md @@ -19,65 +19,41 @@ The method is described in the paper: [SpatialFusion: A lightweight multimodal f You can find detailed documentation at https://uhlerlab.github.io/spatialfusion/ --- +## Prepare SpatialFusion inputs +Before running SpatialFusion, you need to generate unimodal embeddings from: +- spatial transcriptomics data → using **scGPT** +- H&E / whole-slide images → using **UNI** +We provide two ways to generate these, detailed on our [documentation website](https://uhlerlab.github.io/spatialfusion/unimodal-embeddings/). ## Installation -We provide pretrained weights for the **multimodal autoencoder (AE)** and **graph convolutional masked autoencoder (GCN)** under `data/`. - -SpatialFusion depends on **PyTorch** and **DGL**, which have different builds for CPU and GPU systems. You can install it using **pip** or inside a **conda/mamba** environment. - ---- - -### 1. Create mamba environment +### 1. Create virtual environment ```bash mamba create -n spatialfusion python=3.10 -y mamba activate spatialfusion -# Then install GPU or CPU version below -``` - -### 2. Install platform-specific libraries (GPU vs CPU) - -#### GPU (CUDA 12.4) - -```bash -pip install "torch==2.4.1" "torchvision==0.19.1" \ - --index-url https://download.pytorch.org/whl/cu124 -conda install -c dglteam/label/th24_cu124 dgl ``` -**Note:** TorchText issues exist for this version: -[https://github.com/pytorch/text/issues/2272](https://github.com/pytorch/text/issues/2272) — this may affect scGPT. +### 2. Install platform-specific libraries ---- +SpatialFusion depends on PyTorch and DGL, which have different builds for CPU and GPU systems. 
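Because the right PyTorch and DGL wheels depend on whether a GPU is present, the choice can be scripted. The snippet below is a convenience sketch, not part of the documented install steps; it assumes that `nvidia-smi` being on `PATH` is a reasonable proxy for a CUDA 12.4-compatible driver.

```shell
# Pick the torch/DGL wheel sources based on whether an NVIDIA driver is visible.
# Assumption: `nvidia-smi` on PATH implies a CUDA 12.4-compatible setup; adjust
# the cu124 tag if your driver reports a different CUDA version.
if command -v nvidia-smi >/dev/null 2>&1; then
  TORCH_INDEX="https://download.pytorch.org/whl/cu124"
  DGL_WHEELS="https://data.dgl.ai/wheels/torch-2.4/cu124/repo.html"
else
  TORCH_INDEX="https://download.pytorch.org/whl/cpu"
  DGL_WHEELS="https://data.dgl.ai/wheels/torch-2.4/repo.html"
fi
echo "torch index: $TORCH_INDEX"
echo "dgl wheels:  $DGL_WHEELS"
# Then install with:
#   pip install "torch==2.4.1" "torchvision==0.19.1" --index-url "$TORCH_INDEX"
#   pip install dgl -f "$DGL_WHEELS"
```

Either way, the torch and DGL builds must match (both CPU or both cu124), which is why the two variables are set together.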
-#### GPU (CUDA 12.1) — *Recommended if using scGPT* +#### CPU ```bash -pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 \ - --index-url https://download.pytorch.org/whl/cu121 -conda install -c dglteam/label/th21_cu121 dgl - -# Optional: embeddings used by scGPT -pip install --no-cache-dir torchtext==0.18.0 torchdata==0.9.0 +pip install "torch==2.4.1" "torchvision==0.19.1" \ + --index-url https://download.pytorch.org/whl/cpu -# Optional: UNI (H&E embedding model) -pip install timm +pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/repo.html ``` ---- -#### CPU-only +#### GPU (CUDA 12.4) ```bash pip install "torch==2.4.1" "torchvision==0.19.1" \ - --index-url https://download.pytorch.org/whl/cpu -pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/repo.html - -# Optional, used for scGPT -pip install --no-cache-dir torchtext==0.18.0 torchdata==0.9.0 + --index-url https://download.pytorch.org/whl/cu124 -# Optional, used for UNI -pip install timm +pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/cu124/repo.html ``` --- @@ -93,12 +69,9 @@ Includes: `pytest`, `black`, `ruff`, `sphinx`, `matplotlib`, `seaborn`. ```bash git clone https://github.com/uhlerlab/spatialfusion.git + cd spatialfusion/ -pip install -e . -``` -```bash -# Optional contributor extras pip install -e ".[dev,docs]" ``` @@ -278,10 +251,7 @@ Tutorial data is available on Zenodo: If you use SpatialFusion, please cite: -> Broad Institute Spatial Foundation, *SpatialFusion* (2025). -> [https://github.com/broadinstitute/spatialfusion](https://github.com/broadinstitute/spatialfusion) - -Full manuscript citation will be added when available. +> Yates J, Shavakhi M, Choueiri T, Van Allen EM, Uhler C. SpatialFusion: A lightweight multimodal foundation model for pathway-informed spatial niche mapping. _bioRxiv_. 2026. 
doi:10.64898/2026.03.16.712056 --- diff --git a/docs/index.md b/docs/index.md index 2bd4699..2206eff 100644 --- a/docs/index.md +++ b/docs/index.md @@ -7,7 +7,7 @@ -This method is described in the paper (TBD). +The method is described in the paper: [SpatialFusion: A lightweight multimodal foundation model for pathway-informed spatial niche mapping](https://doi.org/10.64898/2026.03.16.712056). **SpatialFusion** is a lightweight foundation model designed to find niches in tissue using a lower-dimensional embedding. It integrates spatial transcriptomics data with histopathology-derived image features into a shared latent representation, and can be applied to paired spatial transcriptomics and whole slide images, or to whole slide images only. diff --git a/docs/installation.md b/docs/installation.md index bac038c..2a4e0b5 100644 --- a/docs/installation.md +++ b/docs/installation.md @@ -1,81 +1,52 @@ # Installation -We provide pretrained weights for the **multimodal autoencoder (AE)** and **graph convolutional masked autoencoder (GCN)** under `data/`. -SpatialFusion depends on **PyTorch** and **DGL**, which have different builds for CPU and GPU systems. You can install it using **pip** or inside a **conda/mamba** environment. - ---- - -### 1. Create mamba environment +### 1. Create virtual environment ```bash mamba create -n spatialfusion python=3.10 -y mamba activate spatialfusion -# Then install GPU or CPU version below ``` -### 2. Install platform-specific libraries (GPU vs CPU) +### 2. Install platform-specific libraries -#### GPU (CUDA 12.4) +SpatialFusion depends on PyTorch and DGL, which have different builds for CPU and GPU systems. 
-```bash -pip install "torch==2.4.1" "torchvision==0.19.1" \ - --index-url https://download.pytorch.org/whl/cu124 -conda install -c dglteam/label/th24_cu124 dgl -``` - -**Note:** TorchText issues exist for this version: -[https://github.com/pytorch/text/issues/2272](https://github.com/pytorch/text/issues/2272) — this may affect scGPT. - ---- - -#### GPU (CUDA 12.1) — *Recommended if using scGPT* +#### CPU ```bash -pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 \ - --index-url https://download.pytorch.org/whl/cu121 -conda install -c dglteam/label/th21_cu121 dgl - -# Optional: embeddings used by scGPT -pip install --no-cache-dir torchtext==0.18.0 torchdata==0.9.0 +pip install "torch==2.4.1" "torchvision==0.19.1" \ + --index-url https://download.pytorch.org/whl/cpu -# Optional: UNI (H&E embedding model) -pip install timm +pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/repo.html ``` ---- - -#### CPU-only +#### GPU (CUDA 12.4) ```bash pip install "torch==2.4.1" "torchvision==0.19.1" \ - --index-url https://download.pytorch.org/whl/cpu -conda install -c dglteam -c conda-forge dgl - -# Optional, used for scGPT -pip install --no-cache-dir torchtext==0.18.0 torchdata==0.9.0 + --index-url https://download.pytorch.org/whl/cu124 -# Optional, used for UNI -pip install timm +pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/cu124/repo.html ``` -> 💡 Replace `cu124` with the CUDA version matching your system (e.g., `cu121`). - --- ### 3. Install SpatialFusion package #### Basic installation — *Recommended for users* ```bash -cd spatialfusion/ -pip install -e . +pip install spatialfusion ``` --- -#### Developer installation - *Recommended for contributors* +#### Install from source - *Recommended for contributors* Includes: `pytest`, `black`, `ruff`, `sphinx`, `matplotlib`, `seaborn`. 
```bash +git clone https://github.com/uhlerlab/spatialfusion.git + cd spatialfusion/ + pip install -e ".[dev,docs]" ``` diff --git a/docs/unimodal-embeddings.md b/docs/unimodal-embeddings.md new file mode 100644 index 0000000..c07a585 --- /dev/null +++ b/docs/unimodal-embeddings.md @@ -0,0 +1,114 @@ +# Generate SpatialFusion inputs + +## Overview + +Before running SpatialFusion, you need to generate unimodal embeddings from: + +- spatial transcriptomics data → using **scGPT** +- H&E / whole-slide images → using **UNI** + + +This step requires a GPU to run efficiently, and we provide two ways to run it. + +## Which workflow should I choose? + +### WDL workflow +Best if you: + +- do not have access to a GPU +- use a platform like Terra + +Launch via Dockstore: + + +### Local / self-managed GPU workflow (this guide) + +Best if you: + +- have access to a GPU machine + +--- + + +The remainder of this guide covers the **local / self-managed GPU workflow**. + +## 1. Requirements + +Before running this step, you will need: + +- a GPU-enabled machine (tested with NVIDIA Tesla T4) +- Docker installed + + +## 2. Gather the required files + +Your inputs should include: + +- `adata`: AnnData (`.h5ad`) used for scGPT embeddings and for the spatial coordinates consumed by UNI. Spatial coordinates are expected in `adata.obsm["spatial"]`. +- `wsi`: whole-slide image / H&E TIFF used to generate UNI image embeddings. TIFF / OME-TIFF format is expected. +- `scgpt_weights`: a directory containing `best_model.pt`, `args.json`, and `vocab.json`. + - Download from +- `uni_weights`: the UNI model weights file `pytorch_model.bin`. + - Request access and download from Mahmood Lab at +- `input_is_log_normalized`: whether your AnnData expression values are already log-normalized. Pass `True` if they are and `False` if they are not. + + +## 3. 
Set local paths + +Pull the public Docker image: + +```bash +docker pull vanallenlab/unimodal-embeddings:v0.1 +``` + +Set local path variables (absolute paths): + +```bash +ADATA=/absolute/path/to/object.h5ad +WSI=/absolute/path/to/image.ome.tif +SCGPT_WEIGHTS_DIR=/absolute/path/to/scgpt +UNI_WEIGHTS=/absolute/path/to/pytorch_model.bin +OUTPUT_DIR=/absolute/path/to/output +# Depends on your data +LOG_NORM="False" +``` + +Notes: + +- `SCGPT_WEIGHTS_DIR` should point to a directory containing `best_model.pt`, `args.json`, and `vocab.json`. + + +## 4. Run embedding generation + +```bash +docker run --rm --gpus all \ + -v "$ADATA":/inputs/object.h5ad \ + -v "$WSI":/inputs/image.ome.tif \ + -v "$SCGPT_WEIGHTS_DIR":/weights/scgpt \ + -v "$UNI_WEIGHTS":/weights/pytorch_model.bin \ + -v "$OUTPUT_DIR":/out \ + vanallenlab/unimodal-embeddings:v0.1 \ + python /app/unimodal-embeddings.py \ + --mode both \ + --adata /inputs/object.h5ad \ + --input-is-log-normalized "$LOG_NORM" \ + --wsi /inputs/image.ome.tif \ + --output-dir /out \ + --scgpt-weights /weights/scgpt \ + --uni-weights /weights/pytorch_model.bin +``` + +## 5. 
Expected outputs +After successful execution, you should see: + +``` +$OUTPUT_DIR/ + ├── scGPT.parquet + └── UNI.parquet +``` + + +## Notes +- This guide covers the most common use case with minimal inputs +- Additional optional parameters are available, see +[`unimodal-embeddings.py`](https://github.com/uhlerlab/spatialfusion/blob/mkdocs-update/workflows/unimodal-embeddings/scripts/unimodal-embeddings.py#L208) diff --git a/mkdocs.yml b/mkdocs.yml index ecfd401..2a7803c 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -18,6 +18,7 @@ plugins: nav: - Home: index.md + - Prepare inputs: unimodal-embeddings.md - Installation: installation.md - Quick Start: quickstart.md - Concepts: concepts.md diff --git a/workflows/unimodal-embeddings/README.md b/workflows/unimodal-embeddings/README.md deleted file mode 100644 index 9ba49b9..0000000 --- a/workflows/unimodal-embeddings/README.md +++ /dev/null @@ -1,93 +0,0 @@ -# Generate unimodal embeddings for Spatial Fusion - -This workflow generates the unimodal embedding inputs used by SpatialFusion: - -- `scGPT.parquet` from spatial transcriptomics data with `scGPT` -- `UNI.parquet` from H&E / whole-slide imaging data with `UNI` - -There are two supported ways to run this step: - -- use the WDL workflow on a platform such as Terra via Dockstore -- run the published Docker image directly on your own GPU machine - -This README describes the second option: running the Docker image and script directly. - -## 1. What you need - -Before running this step, you need: - -- access to a GPU-enabled machine (note: we tested this using a NVIDIA Tesla T4) -- Docker -- your own input data: - - an AnnData `.h5ad` file - - an H&E / whole-slide image in TIFF / OME-TIFF format -- model weights for both embedding models: - - `scGPT` weights - - `UNI` weights - -## 2. Gather the required files - -Your inputs should look like this: - -- `adata`: AnnData (`.h5ad`) used for scGPT embeddings and for the spatial coordinates consumed by UNI. 
Spatial coordinates are expected in `adata.obsm["spatial"]`. -- `wsi`: whole-slide image / H&E TIFF used to generate UNI image embeddings. TIFF / OME-TIFF format is expected. -- `scgpt_weights`: a directory containing `best_model.pt`, `args.json`, and `vocab.json`. -- `uni_weights`: the UNI model weights file `pytorch_model.bin`. -- `input_is_log_normalized`: decide whether your AnnData expression values are already log-normalized. You will pass `True` if they are already log-normalized and `False` if they are not. - -To get the model weights: - -- `scgpt_weights`: download `best_model.pt`, `args.json`, and `vocab.json` from the figshare dataset accompanying *Assessing the limits of zero-shot foundation models in single-cell biology* (DOI: ), then place those three files in one directory. -- `uni_weights`: request access to the UNI2-h weights from Mahmood Lab at , then download `pytorch_model.bin`. - -## 3. Set local paths - -Pull the public Docker image: - -```bash -docker pull vanallenlab/unimodal-embeddings:v0.1 -``` - -Set local path variables for each required input. These should be absolute paths. - -```bash -ADATA=/absolute/path/to/object.h5ad -WSI=/absolute/path/to/image.ome.tif -SCGPT_WEIGHTS_DIR=/absolute/path/to/scgpt -UNI_WEIGHTS=/absolute/path/to/pytorch_model.bin -OUTPUT_DIR=/absolute/path/to/output -``` - -Notes: - -- `SCGPT_WEIGHTS_DIR` should point to a directory containing `best_model.pt`, `args.json`, and `vocab.json`. - -## 4. 
Run the Docker command - -```bash -docker run --rm --gpus all \ - -v "$ADATA":/inputs/object.h5ad \ - -v "$WSI":/inputs/image.ome.tif \ - -v "$SCGPT_WEIGHTS_DIR":/weights/scgpt \ - -v "$UNI_WEIGHTS":/weights/pytorch_model.bin \ - -v "$OUTPUT_DIR":/out \ - vanallenlab/unimodal-embeddings:latest \ - python /app/unimodal-embeddings.py \ - --mode both \ - --adata /inputs/object.h5ad \ - --input-is-log-normalized False \ - --wsi /inputs/image.ome.tif \ - --output-dir /out \ - --scgpt-weights /weights/scgpt \ - --uni-weights /weights/pytorch_model.bin -``` - -This will write the output files to your local machine at: - -- `$OUTPUT_DIR/scGPT.parquet` -- `$OUTPUT_DIR/UNI.parquet` - - -## Notes - -- This README shows the minimal inputs for the common case. The script exposes additional optional parameters for advanced use; see `scripts/unimodal-embeddings.py` for the full CLI.
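The `docker run` command in both versions of this guide mounts five host paths, and a typo in any of them only surfaces after the container starts. A small pre-flight check like the following can catch that earlier; this is a sketch layered on the path variables from step 3, not part of the published workflow, and `check_inputs` is a hypothetical helper name.

```shell
# Pre-flight check for the docker run inputs: verifies each mounted path
# exists and that SCGPT_WEIGHTS_DIR contains the three files the guide
# says it must hold (best_model.pt, args.json, vocab.json).
check_inputs() {
  local ok=0
  [ -f "$ADATA" ]       || { echo "missing adata: $ADATA"; ok=1; }
  [ -f "$WSI" ]         || { echo "missing wsi: $WSI"; ok=1; }
  [ -f "$UNI_WEIGHTS" ] || { echo "missing UNI weights: $UNI_WEIGHTS"; ok=1; }
  for f in best_model.pt args.json vocab.json; do
    [ -f "$SCGPT_WEIGHTS_DIR/$f" ] || { echo "missing scGPT file: $f"; ok=1; }
  done
  [ "$ok" -eq 0 ] && echo "inputs look OK"
  return "$ok"
}
```

Call `check_inputs` after setting `ADATA`, `WSI`, `SCGPT_WEIGHTS_DIR`, `UNI_WEIGHTS`, and `OUTPUT_DIR`; a non-zero exit status means at least one input is missing, and each missing path is printed.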