Commit

docs: update docs
dgcnz committed Oct 25, 2024
1 parent e2d1de5 commit cb05d1b
Showing 6 changed files with 118 additions and 227 deletions.
60 changes: 60 additions & 0 deletions .github/workflows/deploy_book.yaml
@@ -0,0 +1,60 @@
name: deploy-book

# Run this when the master or main branch changes
on:
  push:
    branches:
    - master
    - main
    # If your git repository has the Jupyter Book within some-subfolder next to
    # unrelated files, you can make this run only if a file within that specific
    # folder has been modified.
    #
    paths:
    - docs/src

# This job installs dependencies, builds the book, and pushes it to `gh-pages`
jobs:
  deploy-book:
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    steps:
    - uses: actions/checkout@v3

    # Install dependencies
    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: 3.11

    - name: Install dependencies
      run: |
        pip install -r docs/requirements.txt

    # (optional) Cache your executed notebooks between runs
    # if you have config:
    # execute:
    #   execute_notebooks: cache
    - name: cache executed notebooks
      uses: actions/cache@v3
      with:
        path: _build/.jupyter_cache
        key: jupyter-book-cache-${{ hashFiles('requirements.txt') }}

    # Build the book
    - name: Build the book
      run: |
        jupyter-book build docs/src --path-output docs

    # Upload the book's HTML as an artifact
    - name: Upload artifact
      uses: actions/upload-pages-artifact@v2
      with:
        path: "docs/_build/html"

    # Deploy the book's HTML to GitHub Pages
    - name: Deploy to GitHub Pages
      id: deployment
      uses: actions/deploy-pages@v2
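
The "cache executed notebooks" step above only pays off if the book itself is configured to cache notebook execution, as the workflow comments hint. A minimal config fragment might look like the following (a sketch: the `execute` keys follow the Jupyter Book `_config.yml` schema, but whether this repo's book config lives at `docs/src/_config.yml` is an assumption):

```yaml
# docs/src/_config.yml (hypothetical path)
# Cache executed notebooks so unchanged notebooks are not re-run on every build
execute:
  execute_notebooks: cache
```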
60 changes: 0 additions & 60 deletions cmake/CMakeLists.new.old.txt

This file was deleted.

55 changes: 0 additions & 55 deletions cmake/CMakeLists_old.txt

This file was deleted.

93 changes: 0 additions & 93 deletions cmake/FindTensorRT.cmake

This file was deleted.

1 change: 1 addition & 0 deletions cu124.yaml
@@ -11,3 +11,4 @@ dependencies:
- gcc=13
# nvidia-modelopt requires crypt.h, see: https://github.com/stanford-futuredata/ColBERT/issues/309
- libxcrypt
- poetry
76 changes: 57 additions & 19 deletions docs/src/part1/getting_started.md
@@ -1,41 +1,81 @@
# Getting Started

```{contents}
```

## Project structure

The project is structured as follows:

```
.
├── artifacts # Model weights and scripts I/O
├── build # Build directory (location for cpp executables)
├── cpp # source code for cpp executables
├── detrex # fork of detrex
├── docs # documentation
├── logs
├── notebooks # jupyter notebooks
├── output # [Training] `scripts.train_net` outputs (tensorboard logs, weights, etc)
├── projects # configurations and model definitions
├── scripts # utility scripts
├── src # python source code
├── third-party # third-party c libraries
├── wandb_output # Output from wandb
├── CMakeLists.txt # CMake configuration for cpp executables
├── cu124.yaml # Conda environment file (only system dependencies: cuda, gcc, python)
├── Makefile # Makefile for project scripts
├── poetry.lock # Locked python dependencies
├── pyproject.toml # Poetry configuration
├── README.md
```

The main folders to focus on are `src` and `scripts`, as that is where most of the source code lives.

## Installation

First, make sure the (bold) prerequisites are fulfilled:
- **Conda**
- **Make**
- CMake (for building cpp executables)


Next, let's create our conda environment. This will install the CUDA runtime and libraries, Python, the Poetry dependency manager, and other tooling:

```bash
conda env create -f cu124.yaml
conda activate cu124
```

To avoid TorchInductor and ModelOpt errors when they look for `crypt.h`, point `CPATH` at the environment's include directory:

```bash
conda env config vars set CPATH=$CONDA_PREFIX/include
conda activate cu124
```
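
`CPATH` behaves like `PATH`: a colon-separated list of directories the compiler searches for headers. A small POSIX-shell helper makes the lookup rule concrete (illustrative only; `/opt/conda` is a made-up prefix, not this project's actual environment path):

```bash
# Return success if directory $1 appears as an entry in $CPATH
has_in_cpath() {
  case ":$CPATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

CPATH=/opt/conda/include            # example value (hypothetical prefix)
has_in_cpath /opt/conda/include && echo "include dir is on CPATH"
```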

Installing the Python dependencies requires some manual building (`detrex`, `detectron2`), so we can use the provided Make targets to do it for us:

```bash
make setup_python
make setup_detrex
```

(Optional) If you need the C++ TensorRT runtime and the accompanying benchmark executables, you can build them with the following commands:

```bash
make download_and_build_torchtrt
# To build the `benchmark` executable
make build_cpp
make compile_cpp
```

This will automatically download the necessary files and build the libraries for you.

## Downloading datasets (training-only)

If you have a designated folder for datasets, use it; for the purposes of this tutorial, we'll use `~/datasets`. We'll test with the COCO dataset, so let's download it:

```bash
cd ~/datasets
@@ -51,11 +91,9 @@
unzip train2017.zip
unzip val2017.zip
```
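
The collapsed diff lines above contain the actual download commands. For reference, the COCO 2017 archives follow a predictable URL scheme (assuming the official `images.cocodataset.org` host; the helper name below is ours, not the project's):

```bash
# Construct the download URL for a COCO 2017 archive (train2017, val2017, ...)
coco_zip_url() {
  printf 'http://images.cocodataset.org/zips/%s.zip' "$1"
}

coco_zip_url train2017   # → http://images.cocodataset.org/zips/train2017.zip
```

Each URL can then be fetched with `wget` and unpacked as shown in the block above.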

To point the `detectron2` library to the dataset directory, we need to set the `DETECTRON2_DATASETS` environment variable:

```bash
conda env config vars set DETECTRON2_DATASETS=~/datasets
conda activate cu124
```
