Commit

docs: update docs
dgcnz committed Oct 25, 2024
1 parent e2d1de5 commit cb05d1b
Showing 6 changed files with 118 additions and 227 deletions.
60 changes: 60 additions & 0 deletions .github/workflows/deploy_book.yaml
@@ -0,0 +1,60 @@
name: deploy-book

# Run this when the master or main branch changes
on:
  push:
    branches:
    - master
    - main
    # If your git repository has the Jupyter Book within some-subfolder next to
    # unrelated files, you can make this run only if a file within that specific
    # folder has been modified.
    #
    paths:
    - docs/src

# This job installs dependencies, builds the book, and pushes it to `gh-pages`
jobs:
  deploy-book:
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    steps:
    - uses: actions/checkout@v3

    # Install dependencies
    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: 3.11

    - name: Install dependencies
      run: |
        pip install -r docs/requirements.txt

    # (optional) Cache your executed notebooks between runs
    # if you have config:
    # execute:
    #   execute_notebooks: cache
    - name: cache executed notebooks
      uses: actions/cache@v3
      with:
        path: _build/.jupyter_cache
        key: jupyter-book-cache-${{ hashFiles('requirements.txt') }}

    # Build the book
    - name: Build the book
      run: |
        jupyter-book build docs/src --path-output docs

    # Upload the book's HTML as an artifact
    - name: Upload artifact
      uses: actions/upload-pages-artifact@v2
      with:
        path: "docs/_build/html"

    # Deploy the book's HTML to GitHub Pages
    - name: Deploy to GitHub Pages
      id: deployment
      uses: actions/deploy-pages@v2
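
The "cache executed notebooks" step above only pays off if the book itself is configured to cache notebook execution, as the workflow comments hint. A minimal config fragment might look like the following (a sketch: the `execute` keys follow the Jupyter Book `_config.yml` schema, but whether this repo's book config lives at `docs/src/_config.yml` is an assumption):

```yaml
# docs/src/_config.yml (hypothetical path)
# Cache executed notebooks so unchanged notebooks are not re-run on every build
execute:
  execute_notebooks: cache
```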
60 changes: 0 additions & 60 deletions cmake/CMakeLists.new.old.txt

This file was deleted.

55 changes: 0 additions & 55 deletions cmake/CMakeLists_old.txt

This file was deleted.

93 changes: 0 additions & 93 deletions cmake/FindTensorRT.cmake

This file was deleted.

1 change: 1 addition & 0 deletions cu124.yaml
@@ -11,3 +11,4 @@ dependencies:
- gcc=13
# nvidia-modelopt requires crypt.h, see: https://github.com/stanford-futuredata/ColBERT/issues/309
- libxcrypt
- poetry
76 changes: 57 additions & 19 deletions docs/src/part1/getting_started.md
@@ -1,41 +1,81 @@
# Getting Started

```{contents}
```

## Project structure

The project is structured as follows:

```
.
├── artifacts # Model weights and scripts I/O
├── build # Build directory (location for cpp executables)
├── cpp # source code for cpp executables
├── detrex # fork of detrex
├── docs # documentation
├── logs
├── notebooks # jupyter notebooks
├── output # [Training] `scripts.train_net` outputs (tensorboard logs, weights, etc)
├── projects # configurations and model definitions
├── scripts # utility scripts
├── src # python source code
├── third-party # third-party c libraries
├── wandb_output # Output from wandb
├── CMakeLists.txt # CMake configuration for cpp executables
├── cu124.yaml # Conda environment file (only system dependencies: cuda, gcc, python)
├── Makefile # Makefile for project scripts
├── poetry.lock # Locked python dependencies
├── pyproject.toml # Poetry configuration
├── README.md
```

The main folders to focus on are `src` and `scripts`, as that is where most of the source code lives.

## Installation

First, make sure the (bold) prerequisites are fulfilled:
- **Conda**
- **Make**
- CMake (for building cpp executables)


Next, let's create our conda environment. This will install the CUDA runtime and libraries, Python, the Poetry dependency manager, and other tooling:

```bash
conda env create -f cu124.yaml
conda activate cu124
```

To avoid TorchInductor and ModelOpt errors when they look for `crypt.h`, point `CPATH` at the environment's include directory:

```bash
conda env config vars set CPATH=$CONDA_PREFIX/include
conda activate cu124
```
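
`CPATH` behaves like `PATH`: a colon-separated list of directories the compiler searches for headers. A small POSIX-shell helper makes the lookup rule concrete (illustrative only; `/opt/conda` is a made-up prefix, not this project's actual environment path):

```bash
# Return success if directory $1 appears as an entry in $CPATH
has_in_cpath() {
  case ":$CPATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

CPATH=/opt/conda/include            # example value (hypothetical prefix)
has_in_cpath /opt/conda/include && echo "include dir is on CPATH"
```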

Installing the Python dependencies requires some manual building (`detrex`, `detectron2`), so we can use the provided Make targets to do it for us:

```bash
make setup_python
make setup_detrex
```

(Optional) If you need the C++ TensorRT runtime and the accompanying benchmark executables, you can build them with the following commands:

```bash
make download_and_build_torchtrt
# To build the `benchmark` executable
make build_cpp
make compile_cpp
```

This will automatically download the necessary files and build the libraries for you.

## Downloading datasets (training-only)

If you have a designated folder for datasets, use it; for the purposes of this tutorial, we'll use `~/datasets`. We'll test with the COCO dataset, so let's download it:

```bash
cd ~/datasets
@@ -51,11 +91,9 @@
unzip train2017.zip
unzip val2017.zip
```
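
The collapsed diff lines above contain the actual download commands. For reference, the COCO 2017 archives follow a predictable URL scheme (assuming the official `images.cocodataset.org` host; the helper name below is ours, not the project's):

```bash
# Construct the download URL for a COCO 2017 archive (train2017, val2017, ...)
coco_zip_url() {
  printf 'http://images.cocodataset.org/zips/%s.zip' "$1"
}

coco_zip_url train2017   # → http://images.cocodataset.org/zips/train2017.zip
```

Each URL can then be fetched with `wget` and unpacked as shown in the block above.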

To point the `detectron2` library to the dataset directory, we need to set the `DETECTRON2_DATASETS` environment variable:

```bash
conda env config vars set DETECTRON2_DATASETS=~/datasets
conda activate cu124
```
