496 docs review (#150)

* add denspose and update save dir
* commit two autogenerated files
* changelog edits
* contribute edits
* WIP edits for models
* finish edits for models and separate out densepose
* full list of parks
* config options
* vscode format
* move config guide into tutorials
* extra options
* windows not tested
* tutorial edits
* remove extra nb
* put save path back
* quickstart edits
* vscode formatting
* finish quickstart edits
* train tutorial
* add template section
* logging
* ffpmeg install in readme
* finetuning
* capitalization
* remove ffmpmeg
* date
* add densepose
* tweak
* simplify
* fix densepose video link
* caps and tensorboard
* add help
* remove pythong piece since this is focused on yaml files
* more tweaks
* edit history not change log
* copy edits
* fix changelog
* table
* table
* typo
* alphabetize
* table bug
* table edits
* Simplify save_dir and some directory -> dir renames (#151)
* wip renames
* renames in docs
* readme
* data dir renamme in docs
* rename in code from data_directory to data_dir
* maintaining update
* fix capitalization
* further updates
* tweak
* do not overwrite
* add overwrite save dir
* add overwrite save dir to config
* update configs with all info
* use full train configuration
* only upload if does not exist
* tests for save
* overwrite param
* better set up and test for overwrite
* docs
* update docs with overwrite
* from overwrite_save_dir to overwrite
* missed rename
* remove machine specific from vlc
* unindent so test actually runs
* check for local and cached checkpoints
* should be and
* write out predict config before preds start like we do for train config
* update all configs and use only first 10 digits of hash
* dry run check after save is configured; more robust test
* reorder
* show save directory
* copy edits
* update template
* fix test
* lower case for consistency
* fix test
drivendataorg · Oct 25, 2021 · 6b69a9e
1 parent c8a354d
commit 6b69a9e
Showing 47 changed files with 1,071 additions and 1,344 deletions.
diff --git a/.github/MAINTAINING.md b/.github/MAINTAINING.md
@@ -113,7 +113,7 @@ make publish_models
 
 This will generate a public file name for each model based on the config hash and upload the model weights to the three DrivenData public s3 buckets. This will generate a folder in `zamba/models/official_models/{your_name_name}` that contains the official config as well as reference yaml and json files. You should PR everything in this folder.
 
-Lastly, you need to update the template in `templates`. The template should contain all the same info as the model's `config.yaml`, plus placeholders for `data_directory` and `labels` in `train_config`, and `data_directory`, `filepaths`, and `checkpoint` in `predict_config`.
+Lastly, you need to update the template in `templates`. The template should contain all the same info as the model's `config.yaml`, plus placeholders for `data_dir` and `labels` in `train_config`, and `data_dir`, `filepaths`, and `checkpoint` in `predict_config`.
 
 ### New model checklist
 

diff --git a/HISTORY.md b/HISTORY.md
@@ -1,8 +1,8 @@
-# zamba Changelog
+# `zamba` changelog
 
-## v2 <!-- TODO: add release date as, eg, (2021-10-22)>
+## v2 (2021-10-22)
 
-### Previous Model - Machine Learning Competition
+### Previous model: Machine learning competition
 
 The algorithms used by `zamba` v1 were based on the winning solution from the
 [Pri-matrix Factorization](https://www.drivendata.org/competitions/49/deep-learning-camera-trap-animals/) machine learning
@@ -12,14 +12,14 @@ The core algorithm in `zamba` v1 was a [stacked ensemble](https://en.wikipedia.o
 learning models, whose individual predictions were combined in the second level
 of the stack to form the final prediction.
 
-In v2, the stacked ensemble algorithm from v1 is replaced with three more powerful [single-model options](https://zamba.drivendata.org/docs/models/index.md): `time_distributed`, `slowfast`, and `european`. The new models utilize state-of-the-art image and video classification architectures, and are able to outperform the much more computationally intensive stacked ensemble model.
+In v2, the stacked ensemble algorithm from v1 is replaced with three more powerful [single-model options](../models/index.md): `time_distributed`, `slowfast`, and `european`. The new models utilize state-of-the-art image and video classification architectures, and are able to outperform the much more computationally intensive stacked ensemble model.
 
 ### New geographies and species
 
-`zamba` v2 incorporates data from western Europe (Germany) in additional to locations in central and west Africa. The new data is packaged in the pretrained `european` model, which can predict 11 common European species not present in `zamba` v1.
+`zamba` v2 incorporates data from western Europe (Germany). The new data is packaged in the pretrained `european` model, which can predict 11 common European species not present in `zamba` v1.
 
-`zamba` v2 also incorporates new training data for central and west Africa. `zamba` v1 was primarily focused on species commonly found on savannas. v2 incorporates data from camera traps in jungle ecosystems, adding 13 additional species to the pretrained models for central and west Africa.
+`zamba` v2 also incorporates new training data from 15 countries in central and west Africa, and adds 12 additional species to the pretrained African models.
 
 ### Retraining flexibility
 
-Model training is easier to reproduce in `zamba` v2, so users can finetune a pretrained model using their own data. `zamba` v2 also allows users to retrain a model on completely new labels.
+Model training is made available `zamba` v2, so users can finetune a pretrained model using their own data to improve performance for a specific ecology or set of sites. `zamba` v2 also allows users to retrain a model on completely new species labels.
diff --git a/Makefile b/Makefile
@@ -85,7 +85,7 @@ docs-setup:
 	| sed 's|https://zamba.drivendata.org/docs/||g' \
 	> docs/docs/index.md
 
-	sed 's|https://zamba.drivendata.org/docs/|../|g' HISTORY.md > docs/docs/changelog/index.md
+	sed 's|https://zamba.drivendata.org/docs/|../|g' HISTORY.md > docs/docs/changelog.md
 
 ## Build the static version of the docs
 docs: docs-setup

diff --git a/README.md b/README.md
@@ -11,10 +11,10 @@ https://user-images.githubusercontent.com/46792169/138346340-98ee196a-5ecd-4753-
 
 **`zamba` is a tool built in Python that uses machine learning and computer vision to automatically detect and classify animals in camera trap videos.** You can use `zamba` to:
 
-- Filter out blank videos
 - Identify which species appear in each video
+- Filter out blank videos
 
-The tool is already trained to identify 42 species common to Africa and Europe (as well as blank, or "no species present"). Users can also input their own labeled videos to finetune a model and make predictions for new species or new contexts.
+The models in `zamba` can identify blank videos (where no animal is present) along with 32 species common to Africa and 11 species commmon to Europe. Users can also finetune models using their own labeled videos to then make predictions for new species and/or new ecologies.
 
 `zamba` can be used both as a command-line tool and as a Python package. It is also available as a user-friendly website application, [Zamba Cloud](https://www.zambacloud.com/).
 
@@ -62,8 +62,9 @@ Options:
   --help                Show this message and exit.
 
 Commands:
-  predict  Identify species in a video.
-  train    Train a model on your labeled data.
+  densepose  Run densepose algorithm on videos.
+  predict    Identify species in a video.
+  train      Train a model on your labeled data.
 ```
 
 `zamba` can be used "out of the box" to generate predictions or train a model using your own videos. `zamba` supports the same video formats as FFmpeg, [which are listed here](https://www.ffmpeg.org/general.html#Supported-File-Formats_002c-Codecs-or-Features). Any videos that fail a set of FFmpeg checks will be skipped during inference or training.
@@ -81,10 +82,10 @@ See the [Quickstart](https://zamba.drivendata.org/docs/quickstart/) page or the
 ### Training a model
 
 ```console
-$ zamba train --data-dir path/to/videos --labels path_to_labels.csv
+$ zamba train --data-dir path/to/videos --labels path_to_labels.csv --save_dir my_trained_model
 ```
 
-The newly trained model will be saved to a folder in the current working directory called `zamba_{model_name}`. For example, a model finetuned from the pretrained `time_distributed` model (the default) will be saved in `zamba_time_distributed`. The folder will contain a model checkpoint as well as training configuration, model hyperparameters, and test and validation metrics. Run `zamba train --help` to list all possible options to pass to `train`.
+The newly trained model will be saved to the specified save directory. The folder will contain a model checkpoint as well as training configuration, model hyperparameters, and validation and test metrics. Run `zamba train --help` to list all possible options to pass to `train`.
 
 See the [Quickstart](https://zamba.drivendata.org/docs/quickstart/) page or the user tutorial on [training a model](https://zamba.drivendata.org/docs/train-tutorial/) for more details.
 
@@ -98,4 +99,4 @@ The command is (from the project root):
 $ make tests
 ```
 
-See the docs page on [contributing to `zamba`](https://zamba.drivendata.org/docs/contribute/index.md) for details.
+See the docs page on [contributing to `zamba`](https://zamba.drivendata.org/docs/contribute/index.md) for details.
diff --git a/docs/docs/api-reference/densepose_config.md b/docs/docs/api-reference/densepose_config.md
@@ -0,0 +1,3 @@
+# zamba.models.densepose.config
+
+::: zamba.models.densepose.config
diff --git a/docs/docs/api-reference/densepose_manager.md b/docs/docs/api-reference/densepose_manager.md
@@ -0,0 +1,3 @@
+# zamba.models.densepose.densepose_manager
+
+::: zamba.models.densepose.densepose_manager
diff --git a/docs/docs/changelog/index.md b/docs/docs/changelog/index.md