Commits
265 commits
a1f40fd
FIX label definitions
holmbergius Jun 1, 2023
57dc9fb
Add lion and cougar MiewId support
holmbergius Jun 13, 2023
e2b5b0a
Add panthera_leo mapping
holmbergius Jun 15, 2023
029ca12
adds best model checkpointing
LashaO Jun 18, 2023
51c6ff9
adds optuna sweep script
LashaO Jun 18, 2023
23ff8a9
adds CLI options to sweep scrip
LashaO Jun 18, 2023
c33cd3b
fixes formatting
LashaO Jun 18, 2023
96d245d
adds optuna to requirements
LashaO Jun 19, 2023
179b064
Add more finned species
holmbergius Jun 21, 2023
4603c4a
Adds final trial run checkpointing
LashaO Jun 21, 2023
5e34f4d
adds gradcam progress messages
LashaO Jun 21, 2023
05d616d
adds viewpoint flip for train
LashaO Jun 22, 2023
6a54247
Cross-application to more whales and dolphins
holmbergius Jun 29, 2023
ce98fea
Merge pull request #2 from WildMeOrg/More_finned_species
holmbergius Jul 1, 2023
3f27bda
Disable every-epoch checkpoint
LashaO Jul 6, 2023
8dd2af2
Add old bottlenose_dolphin+fin_dorsal key
holmbergius Jul 8, 2023
4b6a45b
Add whale_fin_fin_dorsal default mapping
holmbergius Jul 9, 2023
392a353
Add orca temporarily
holmbergius Jul 10, 2023
44b244e
fix docstring text
LashaO Jul 11, 2023
af7193c
adds wandb context manager to fix sweep.py logs
LashaO Jul 11, 2023
5df3fbf
adds gitignore entries
LashaO Jul 11, 2023
05e5c12
adds bbox cropping for training
LashaO Jul 13, 2023
376fd4f
adds bbox cropping for training fixes
LashaO Jul 13, 2023
047108f
Update default_dataset.py
tsubramanian Jul 15, 2023
be9a37c
Disables horizontal flip augmentation by default
LashaO Jul 18, 2023
78b6a17
Adds modifications and scripts for error analysis
LashaO Jul 18, 2023
1d9ba04
adds new format config
LashaO Jul 18, 2023
453f2ed
Adds wandb trial and config logging for sweep.py
LashaO Jul 20, 2023
c4c8c0e
Adds backwards compatibility for old config files
LashaO Jul 21, 2023
97f137d
Modifies eval_fn to also return cmc
LashaO Jul 24, 2023
104da46
fixes test.py references
LashaO Jul 24, 2023
62e09e6
Adds parameter defaults to config
LashaO Jul 26, 2023
720b27e
Refactors visualization scripts
LashaO Jul 28, 2023
87ebd14
Adds groupwise evaluation to test.py
LashaO Jul 28, 2023
3e001ee
Updates sweep.py min/max lr coefficients
LashaO Aug 1, 2023
1bc07f8
Adds config saving for of a run in results
LashaO Aug 1, 2023
b078ef8
Fixes data truncation in test.py
LashaO Aug 1, 2023
821e538
Adds safety when loading out-of-bounds bboxes
LashaO Aug 1, 2023
427bde7
Fix visualization image output format
LashaO Aug 1, 2023
90a165a
Increases default augmentation rotation limit
LashaO Aug 1, 2023
12e134b
Adds exact img path loading option from coco file
LashaO Aug 1, 2023
7281d93
adjusts sweep.py lr coefficients
LashaO Aug 1, 2023
5a1209d
Adds grayscale to rgb conversion in plugin dataset
LashaO Aug 2, 2023
1ff10dc
Adds support for coco image_uuid available
LashaO Aug 2, 2023
1b721d1
Fixed visualization bad imports
LashaO Aug 2, 2023
d9e9e21
updates checkpoint_dir path construction
LashaO Aug 7, 2023
bcb79ed
fix groupwise eval when using image full pathj
LashaO Aug 7, 2023
95e5c42
fixes inactive shuffle for train dataloader
LashaO Aug 7, 2023
b2ffae2
Adds info print
LashaO Aug 8, 2023
24b8740
Defaults inference dataloader num_workers to 0
LashaO Aug 8, 2023
e8eee10
Adds default checkpoint path entry when saving config
LashaO Aug 8, 2023
2822260
Adds vectorized map calculation code(not used yet)
LashaO Aug 10, 2023
ce0acac
Reduces gradcam render batch size
LashaO Aug 16, 2023
0b2e3e8
Reduces gradcam render batch size
LashaO Aug 16, 2023
0de8b6f
Rename torch.cat for backwards compatibility
LashaO Aug 22, 2023
9756b78
Updates train.py to save the exact config used
LashaO Aug 22, 2023
c75553c
Makes test.py n_classes inferred from weights
LashaO Aug 22, 2023
bb30128
add autocast to _plugin inference
LashaO Aug 22, 2023
a2daab9
Makes test.py num_workers configurable
LashaO Aug 22, 2023
56ec9b6
Removes redundant prints from test.py
LashaO Aug 22, 2023
3eb6c2c
Update _plugin.py
tsubramanian Aug 22, 2023
701bf37
Fix name_orig field in preprocessing
LashaO Aug 23, 2023
44647ca
adds helpers for rotated bbox cropping
LashaO Aug 28, 2023
0f2bba9
add autocast by default for training inference
LashaO Aug 30, 2023
75e9f11
Increases jpeg compression for match vis results
LashaO Aug 30, 2023
5986423
Adds handling for oriented bounding boxes
LashaO Aug 30, 2023
8340b69
Modifies plugin dataset to use WBIA chips
LashaO Aug 30, 2023
2f1220f
Adds preprocessing merge safety
LashaO Sep 5, 2023
7a1e8f0
Display data stats with processed name key
LashaO Sep 11, 2023
7a100ec
Fixes negative bbox crop
LashaO Sep 14, 2023
213d925
Treat name+species as new ID when filtering
LashaO Sep 18, 2023
9cfda7c
Fixes training split stats printouts
LashaO Sep 18, 2023
7214118
Adds filter info printouts
LashaO Sep 18, 2023
b04989a
Adds sub-center arcface with dynamic margins head
LashaO Sep 18, 2023
adc972b
Makes unused engine loss_module param optional
LashaO Sep 18, 2023
83a3d50
Switches to vectorized eval stat calculation
LashaO Sep 19, 2023
1dd5b6e
Adds auto margin init for dynamic margin head eval
LashaO Sep 19, 2023
fb5a126
Disable run results directory overwrite
LashaO Sep 22, 2023
82ad628
Fixed gradcam eval metric compatibility
LashaO Sep 27, 2023
dc636e9
Fixes data filtering for grouped eval stats
LashaO Sep 29, 2023
9cef484
fixes subcenter arcface init
LashaO Oct 3, 2023
7fa7e4a
Adds option to preprocess and cache training imgs
LashaO Oct 4, 2023
1dfa51f
Fix test.py compatibility with subcenter arcface
LashaO Oct 4, 2023
101b66d
Updates plugin mappings for multi-species model
LashaO Oct 5, 2023
bdd3118
Adds species keys to plugin
LashaO Oct 5, 2023
d421c5c
Update sweep.py
LashaO Oct 6, 2023
4ae38fb
Add orca bodies
holmbergius Oct 7, 2023
462d856
Add fin whale bodies
holmbergius Oct 7, 2023
e3ae4fc
Adds configurable n_trials to sweep.py
LashaO Oct 7, 2023
6f770b8
fix test.py group eval crash
LashaO Oct 7, 2023
8f34dfa
Adds rightwhale model mappings
LashaO Oct 7, 2023
21bda64
Adds group eval small subset safety catch
LashaO Oct 7, 2023
3756b89
Add humpback fluke key
holmbergius Oct 8, 2023
e788ba9
Fix stat printouts
LashaO Oct 8, 2023
4be2bac
Add legacy right whale mapping
holmbergius Oct 10, 2023
1fdded7
adds gradcam methods
LashaO Oct 10, 2023
c67c9f7
Merge branch 'main' of https://github.com/LashaO/wbia-plugin-miew-id
LashaO Oct 10, 2023
f22fbeb
make dynamic arcface max margin configurable
LashaO Oct 13, 2023
7b4eb6b
Refactor eval code for nonsquare distmat support
LashaO Oct 13, 2023
fa337c5
Refactor preprocess functions
LashaO Oct 13, 2023
253c9d0
fix match vis index display
LashaO Oct 13, 2023
ab8091b
fix redundant function arguement
LashaO Oct 13, 2023
4b46316
Fix compatibility for datasets without species key
LashaO Oct 13, 2023
1af4e26
Add more related Flukebook classes
holmbergius Oct 14, 2023
d7ad793
Adds hard exception for diverging model outputs
LashaO Oct 18, 2023
f052c23
Merge branch 'main' of https://github.com/LashaO/wbia-plugin-miew-id
LashaO Oct 18, 2023
73dab0e
Rename variables for calculate_matches
LashaO Oct 18, 2023
82d12fd
Modify self-match proofing for one-vs-all eval
LashaO Oct 18, 2023
62d9fc9
Update eval functions to support masking
LashaO Oct 20, 2023
7d64c2a
Fix eval_fn
LashaO Oct 20, 2023
f3f7100
Add bryde's whale
holmbergius Oct 21, 2023
d564d07
Add sei whales
holmbergius Oct 21, 2023
c2a5c90
Add rough-toothed dolphin
holmbergius Oct 22, 2023
bbf56af
Fix mAP calculation
LashaO Oct 23, 2023
cda6755
Update _plugin.py for 4 bigcats
tsubramanian Oct 25, 2023
62d6430
Update _plugin.py
tsubramanian Oct 25, 2023
b140e18
Add pilot whales and African carnivore species
holmbergius Oct 25, 2023
c61004c
Adding some species flukes
holmbergius Oct 28, 2023
038de29
Additional species mapping for wild dogs and leopards
holmbergius Nov 3, 2023
31dc45d
Update README.md
LashaO Nov 8, 2023
f513157
Updates default_config
LashaO Nov 8, 2023
48756e2
added temperature scaling
sei-cabidi Oct 11, 2023
6a44edc
separated temperature from model params in config and added citation …
sei-cabidi Oct 11, 2023
178d669
refactored temperature scaling
sei-cabidi Oct 19, 2023
87e4a2c
added swa changes
sei-cabidi Nov 8, 2023
9453257
Add exception handling for sweep trials
LashaO Nov 9, 2023
ce15d22
Update default_config
LashaO Nov 9, 2023
0227c93
Updates readme.md
LashaO Nov 9, 2023
bcd2fdc
Update .gitignore
LashaO Nov 9, 2023
f178abb
Fixes output calculation for visualize
LashaO Nov 9, 2023
43cbfff
swa cleaned. code refactor. remove redundancies
sei-cabidi Nov 17, 2023
d4861e4
bug cleanup
sei-cabidi Nov 17, 2023
4f1c9be
Add alt for sperm whale fluke
holmbergius Nov 18, 2023
e7545c1
Add another hyena definition
holmbergius Nov 20, 2023
7128839
Fixes visualization texts
LashaO Nov 20, 2023
cea2bc0
Makes number of arcface subcenters configurable
LashaO Nov 21, 2023
ce93759
modifies sweep.py
LashaO Nov 28, 2023
a8076c8
added temperature scaling
sei-cabidi Oct 11, 2023
4ce5d54
separated temperature from model params in config and added citation …
sei-cabidi Oct 11, 2023
c0e0f81
refactored temperature scaling
sei-cabidi Oct 19, 2023
7f48ed7
added swa changes
sei-cabidi Nov 8, 2023
8d2f922
swa cleaned. code refactor. remove redundancies
sei-cabidi Nov 17, 2023
f939aac
bug cleanup
sei-cabidi Nov 17, 2023
24bf9ce
small changes
sei-cabidi Nov 29, 2023
971183f
fix merge conflicts
sei-cabidi Nov 29, 2023
f09a3ee
small merge fixes
sei-cabidi Nov 29, 2023
90816e3
cleaned readme, moved calibration metrics to new branch
sei-cabidi Nov 29, 2023
26b695d
updated default config to support swa
sei-cabidi Nov 29, 2023
7dfa3af
cosmetic fix
sei-cabidi Nov 29, 2023
11e4078
cleaned configs
sei-cabidi Nov 29, 2023
a2e6873
cleaned test.py
sei-cabidi Nov 29, 2023
2cb9925
deleted zeno integration
sei-cabidi Nov 29, 2023
1589446
Adds show_image helper function
LashaO Nov 30, 2023
f52f7f1
Updates old code snippets in test.py
LashaO Nov 30, 2023
88ea064
Removes dead import
LashaO Nov 30, 2023
bea7394
Makes config swa params backwards compatible
LashaO Nov 30, 2023
4c43bb8
Merge pull request #3 from sei-cabidi/main
LashaO Nov 30, 2023
c75a083
Fixes variable name typo
LashaO Nov 30, 2023
9746f5d
Updates sweep.py with conditional parameters
LashaO Nov 30, 2023
ea9ec53
Fixes accidental revert of loss margin loading
LashaO Dec 2, 2023
8c3ff32
Remove redundant logit extraction in eval_fn
LashaO Dec 2, 2023
0558bd7
Temorarily disable train set metrics for speedup
LashaO Dec 2, 2023
3c177cf
Temporarily disables printing of train scores
LashaO Dec 2, 2023
2257daa
Makeover for image preprocessing
LashaO Dec 2, 2023
695f572
Fixed preprocessing with force_apply flag
LashaO Dec 3, 2023
280bb24
Fixes sweep.py conditional parameter search
LashaO Dec 5, 2023
ecb47d8
Fix sweep.py low default number of epochs
LashaO Dec 9, 2023
d1b49ec
Changes model checkpointing to best only
LashaO Dec 11, 2023
b946f9a
Update _plugin.py
tsubramanian Dec 14, 2023
79a71f1
Add nine species carnivore model part 2
holmbergius Dec 15, 2023
309c028
Add cross-application targets for 9 cats model
holmbergius Dec 15, 2023
7242d8e
Removes unused loss import
LashaO Dec 16, 2023
9b1af47
fix 9 cat mappings
holmbergius Dec 17, 2023
9dca344
More 9cat reference fixes
holmbergius Dec 17, 2023
7e66725
Adds backwards compatibility case for config load
LashaO Dec 20, 2023
f2f560a
Converts image caching to use torch resize
LashaO Dec 20, 2023
24c382a
Adds scripts folder
LashaO Dec 20, 2023
a0d37cb
Add carchorodon carcharias
holmbergius Dec 21, 2023
8de9253
Updates model refernce for lions and cougars
LashaO Dec 28, 2023
bdd9759
split data -- first version
Jan 29, 2024
0ce1fcf
fix path
Jan 29, 2024
ece18b7
fix stats
Jan 29, 2024
1736035
Add missing wild dog key
holmbergius Jan 30, 2024
96502f7
split code update
Feb 1, 2024
8dfd8d9
cleanup and refactoring
Feb 6, 2024
89d592d
Import fix
Feb 7, 2024
7e2c478
Changes plugin single render call to return images without gradcam ov…
LashaO Feb 7, 2024
ff7ef30
Updates function comment
LashaO Feb 7, 2024
ca0b5e5
Add lion body matching with 9cat model
holmbergius Feb 9, 2024
68a36ba
Add lioness and lion_general
holmbergius Feb 9, 2024
8c5481e
Add beaked whale and Gervais
holmbergius Feb 10, 2024
107b506
Giraffe miewid is as added
tsubramanian Feb 10, 2024
664fb8d
Add giraffespotter species classes
holmbergius Feb 11, 2024
8c2d1f4
Fix yaml file name
holmbergius Feb 11, 2024
3bfdedd
Create model_config.yaml
tsubramanian Feb 12, 2024
0ae2502
Create model_bin_config.json
tsubramanian Feb 12, 2024
21a7d1e
Rename model_config.yaml to model_config.json
tsubramanian Feb 12, 2024
a72c86b
Update model_config.json
tsubramanian Feb 12, 2024
2501516
Update model_bin_config.json
tsubramanian Feb 12, 2024
3ac7f2a
Update _plugin.py
tsubramanian Feb 12, 2024
620ee48
Update _plugin.py
tsubramanian Feb 12, 2024
c199eb1
Dynamic config file load is included
tsubramanian Feb 13, 2024
dd381e8
Dynamic config file load is included
tsubramanian Feb 13, 2024
7e08c74
Merge pull request #4 from kwadraterry/split_data
LashaO Feb 13, 2024
f749e13
Merge pull request #5 from WildMeOrg/dynamic-config
LashaO Feb 14, 2024
6947cfd
add zipiidae sp
holmbergius Feb 17, 2024
30ad746
Updates .gitignore
LashaO Feb 20, 2024
415e4f8
Merge branch 'main' of https://github.com/LashaO/wbia-plugin-miew-id
LashaO Feb 20, 2024
ecc33ee
Adds render comment
LashaO Feb 20, 2024
cb018ab
Update scipy requirement
LashaO Feb 22, 2024
9e4b52d
Fix key Error
tsubramanian Feb 23, 2024
ccc7f34
Adds sea turtle model mappings
LashaO Mar 8, 2024
6057a9e
Dynamic Key load is added
tsubramanian Mar 10, 2024
7ed7ef0
Updates IoT model version
LashaO Mar 12, 2024
3906147
Fix test visualize
LashaO Mar 14, 2024
749dd65
Adds checkpoint url return for the match result
LashaO Mar 15, 2024
51c992e
Merge branch 'main' into fix_key_error
LashaO Mar 15, 2024
5f3f127
Merge pull request #6 from WildMeOrg/fix_key_error
LashaO Mar 15, 2024
417fb97
Revert "Fix key Error"
LashaO Mar 15, 2024
59b3139
Merge pull request #7 from WildMeOrg/revert-6-fix_key_error
LashaO Mar 15, 2024
42c4e5f
Revert "Revert "Fix key Error""
LashaO Mar 15, 2024
a493442
Merge pull request #8 from WildMeOrg/revert-7-revert-6-fix_key_error
LashaO Mar 15, 2024
d832dcb
Adds batched distance matrix fnction
LashaO Apr 4, 2024
79680d7
Adds swim backbone compatibility
LashaO Apr 4, 2024
1a31111
Adds species field key to DataLoader
LashaO Apr 4, 2024
be8665e
Adds latest epoch full checkpointing
LashaO Apr 4, 2024
255625b
Merge pull request #9 from WildMeOrg/fix_key_error
LashaO Apr 4, 2024
c3eb6a9
adds groupwise stats and logging during val
LashaO Apr 4, 2024
0983227
Refactor for circular import
LashaO Apr 4, 2024
81415cc
fix imports
LashaO Apr 4, 2024
5e5779b
fixes
LashaO Apr 4, 2024
3c20b40
Merge pull request #10 from WildMeOrg/groupwise-logging
LashaO Apr 6, 2024
f664585
Fix group eval fallback case
LashaO Apr 8, 2024
a32c10f
Updates gitignore
pilmer Jun 28, 2024
520f2b2
Dev docs (#11)
LashaO Jul 6, 2024
7fa72ec
Removes redundant function arguement
LashaO Aug 5, 2024
a8ae4fa
Fix variable unpacking
LashaO Aug 15, 2024
c9a174b
Update README.md
LashaO Sep 3, 2024
019cc8f
Fixes evaluate checkpoint loading
LashaO Sep 17, 2024
55e9157
Adds model config to trainer
LashaO Oct 22, 2024
c6be944
Adds finetuning capability and example
LashaO Oct 31, 2024
b1e492d
Update README.md
LashaO Dec 3, 2024
80fd0d6
Update README.md
LashaO Dec 3, 2024
c76d3ea
Adds PAIR-X Visualizations (#12)
lshrack Mar 25, 2025
b1cc25b
Fix PairX colorspace
LashaO Mar 25, 2025
baf6e52
fix checkpoint loading map location
LashaO Jul 23, 2025
c480a43
adds evaluator option to return embeddings
LashaO Jul 29, 2025
2ae3513
Update README.md
JasonWildMe Sep 10, 2025
16d9eb7
Update README.md
JasonWildMe Sep 10, 2025
316688c
adds resume from checkpoint capability
LashaO Jan 8, 2026
21 changes: 17 additions & 4 deletions .gitignore
@@ -1,6 +1,19 @@
wbia_tbd/data/
wbia_tbd/wandb/
wbia_miew_id/data/
wbia_miew_id/wandb/
*.pyc
wbia_tbd/runs/
wbia_miew_id/runs/
.env
TODO.md
TODO.md
.DS_store
wbia_miew_id/*.ipynb
wbia_miew_id/*.pkl
wbia_miew_id/.ipynb_checkpoints*
wbia_miew_id/configs/*
!wbia_miew_id/configs/default_config_new.yaml
*.csv
*.db
*.png
wbia_miew_id/splits/
wbia_miew_id/helpers/split/configs/config_*.yaml
wbia_miew_id.egg*
wbia_miew_id/examples/beluga_example_miewid/*
4 changes: 4 additions & 0 deletions .gitmodules
@@ -0,0 +1,4 @@
[submodule "wbia_miew_id/visualization/pairx"]
path = wbia_miew_id/visualization/pairx
url = https://github.com/pairx-explains/pairx.git
branch = dev-branch
104 changes: 104 additions & 0 deletions .ipynb_checkpoints/README-checkpoint.md
@@ -0,0 +1,104 @@

# WILDBOOK IA - MIEW-ID Plugin

A plugin for matching and interpreting embeddings for wildlife identification.


## Setup

` pip install -r requirements.txt `

To enable Weights and Biases logging, set the following environment variables:
```
WANDB_API_KEY={your_wanb_api_key}
WANDB_MODE={'online'/'offline'}
```
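
As a rough illustration of how a script might read these two variables before turning logging on, consider the sketch below. The helper name `wandb_logging_enabled` is hypothetical and not part of this repository; only `WANDB_API_KEY` and `WANDB_MODE` come from the text above, and the assumption that offline mode needs no key should be checked against the Weights and Biases documentation.

```python
import os


def wandb_logging_enabled(env=None):
    """Return True when the environment looks ready for W&B logging.

    Hypothetical helper for illustration; not part of wbia_miew_id.
    """
    env = os.environ if env is None else env
    mode = env.get("WANDB_MODE", "online")
    if mode == "offline":
        # Assumption: offline runs are stored locally, so no API key is needed.
        return True
    # Online logging needs a non-empty API key.
    return mode == "online" and bool(env.get("WANDB_API_KEY"))


# Example: check a hand-built environment instead of the real one.
print(wandb_logging_enabled({"WANDB_MODE": "offline"}))
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to try without touching the real environment.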

## Training
You can create a new line in a code block in markdown by using two spaces at the end of the line followed by a line break. Here's an example:

```
cd wbia_miew_id
python train.py
```

## Data files

The data is expected to be in the coco JSON format. Paths to data files and the image directory are defined in the config YAML file.

The beluga data can be downloaded from [here](https://cthulhu.dyn.wildme.io/public/datasets/beluga-model-data.zip).

## Configuration file

A config file path can be set by:
`python train.py --config {path_to_config}`

- `exp_name`: Name of the experiment
- `project_name`: Name of the project
- `checkpoint_dir`: Directory for storing training checkpoints
- `comment`: Comment text for the experiment
- `viewpoint_list`: List of viewpoint values to keep for all subsets.
- `data`: Subfield for data-related settings
- `images_dir`: Directory containing all of the dataset images
- `use_full_image_path`: Overrides `images_dir` for path construction and instead uses the absolute path given in the `file_path` field of each `images` entry in the COCO JSON. In that case, `images_dir` can be set to `null`
- `crop_bbox`: Whether to use the `bbox` field of JSON annotations to crop the images. The crops will also be adjusted for rotation if the `theta` field is present for the annotations
- `preprocess_images`: Pre-applies cropping and resizing, and caches the images for training
- `train`: Data parameters regarding the train set used in train.py
- `anno_path`: Path to the JSON file containing the annotations
- `n_filter_min`: Minimum number of samples per name (individual) to keep that individual in the set. Names under the threshold will be discarded
- `n_subsample_max`: Maximum number of samples per name to keep for the training set. Annotations for names over the threshold will be randomly subsampled once at the start of training
- `val`: Data parameters regarding the validation set used in train.py
- `anno_path`
- `n_filter_min`
- `n_subsample_max`
- `test`: Data parameters regarding the test set used in test.py
- `anno_path`
- `n_filter_min`
- `n_subsample_max`
- `checkpoint_path`: Path to model checkpoint to test
- `eval_groups`: Attributes for which to group the testing sets. For example, the value of `['viewpoint']` will create subsets of the test set for each unique value of the viewpoint and run one-vs-all evaluation for each subset separately. The value can be a list - `[['species', 'viewpoint']]` will run evaluation separately for each species+viewpoint combination. `['species', 'viewpoint']` will run grouped eval for each species, and then for each viewpoint. The corresponding fields to be grouped should be present under `annotation` entries in the COCO file. Can be left as `null` to do eval for the full test set.
- `name_keys`: List of keys used for defining a unique name (individual). Fields from multiple keys will be combined to form the final representation of a name. A common use-case is `name_keys: ['name', 'viewpoint']` for treating each name + viewpoint combination as a unique individual
- `image_size`:
- Image height to resize to
- Image width to resize to
- `engine`: Subfields for engine-related settings
- `num_workers`: Number of workers for data loading (default: 0)
- `train_batch_size`: Batch size for training
- `valid_batch_size`: Batch size for validation
- `epochs`: Number of training epochs
- `seed`: Random seed for reproducibility
- `device`: Device to be used for training
- `use_wandb`: Whether to use Weights and Biases for logging
- `use_swa`: Whether to use SWA during training
- `scheduler_params`: Subfields for learning rate scheduler parameters
- `lr_start`: Initial learning rate
- `lr_max`: Maximum learning rate
- `lr_min`: Minimum learning rate
- `lr_ramp_ep`: Number of epochs to ramp up the learning rate
- `lr_sus_ep`: Number of epochs to sustain the maximum learning rate
- `lr_decay`: Rate of learning rate decay per epoch
- `model_params`: Dictionary containing model-related settings
- `model_name`: Name of the model backbone architecture
- `use_fc`: Whether to use a fully connected layer after backbone extraction
- `fc_dim`: Dimension of the fully connected layer
- `dropout`: Dropout rate
- `loss_module`: Loss function module
- `s`: Scaling factor for the loss function
- `margin`: Margin for the loss function
- `pretrained`: Whether to use a pretrained model backbone
- `n_classes`: Number of classes in the training dataset, used for loading checkpoint
- `swa_params`: Subfields for SWA training
- `swa_lr`: SWA learning rate
- `swa_start`: Epoch number to begin SWA training
- `test`: Subfields for plugin-related settings
- `fliplr`: Whether to perform horizontal flipping during testing
- `fliplr_view`: List of viewpoints to apply horizontal flipping
- `batch_size`: Batch size for plugin inference
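
The difference between the flat and nested forms of `eval_groups` described above can be illustrated with a small standalone sketch. The function `split_by_groups` and the sample records are hypothetical; the repository's own grouping code may differ.

```python
from collections import defaultdict


def split_by_groups(annotations, eval_groups):
    """Return {(keys, values): [annotations]} for each entry of eval_groups.

    A string entry groups by one field; a list entry groups by the
    combination of its fields. Hypothetical illustration only.
    """
    subsets = {}
    for entry in eval_groups:
        keys = [entry] if isinstance(entry, str) else list(entry)
        buckets = defaultdict(list)
        for anno in annotations:
            label = tuple(anno[k] for k in keys)
            buckets[label].append(anno)
        for label, members in buckets.items():
            subsets[(tuple(keys), label)] = members
    return subsets


annos = [
    {"name": "a", "species": "beluga_whale", "viewpoint": "up"},
    {"name": "b", "species": "beluga_whale", "viewpoint": "left"},
    {"name": "c", "species": "orca", "viewpoint": "left"},
]

# Flat form: one pass per species, then one pass per viewpoint.
flat = split_by_groups(annos, ["species", "viewpoint"])
# Nested form: one pass per species+viewpoint combination.
nested = split_by_groups(annos, [["species", "viewpoint"]])
print(len(flat), len(nested))
```

With these three sample records, the flat form yields two species subsets plus two viewpoint subsets, while the nested form yields three combined subsets.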

## Testing
`python test.py --config {path_to_config} --visualize`

The `--visualize` flag is optional and will produce top 5 match results for each individual in the test set, along with gradcam visualizations.

The parameters for the test set are defined under `data.test` of the config.yaml file.
6 changes: 6 additions & 0 deletions .ipynb_checkpoints/Untitled-checkpoint.ipynb
@@ -0,0 +1,6 @@
{
"cells": [],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
187 changes: 165 additions & 22 deletions README.md
@@ -1,12 +1,15 @@

# WILDBOOK IA - ID Plugin
# WILDBOOK IA - MiewID Plugin

A plugin for re-identification of wildlife individuals using learned embeddings.
A plugin for matching and interpreting embeddings for wildlife identification.


## Setup

` pip install -r requirements.txt `
```
pip install -r requirements.txt
pip install -e .
```

To enable Weights and Biases logging, set the following environment variables:
@@ -15,17 +18,129 @@ WANDB_API_KEY={your_wanb_api_key}
WANDB_MODE={'online'/'offline'}
```

## Training
You can create a new line in a code block in markdown by using two spaces at the end of the line followed by a line break. Here's an example:
## Multispecies Models

Model specs and a dataset overview can be found on the model card pages for the [Multispecies-v2 model](https://huggingface.co/conservationxlabs/miewid-msv2) and the [Multispecies-v3 model](https://huggingface.co/conservationxlabs/miewid-msv3).

### Pretrained Model Embeddings Extraction

```
import numpy as np
from PIL import Image
import torch
import torchvision.transforms as transforms
from transformers import AutoModel

# Load the pretrained MiewID model from the Hugging Face Hub
model_tag = "conservationxlabs/miewid-msv2"
model = AutoModel.from_pretrained(model_tag, trust_remote_code=True)

def generate_random_image(height=440, width=440, channels=3):
    # Build a random RGB image to stand in for a real photo
    random_image = np.random.randint(0, 256, (height, width, channels), dtype=np.uint8)
    return Image.fromarray(random_image)

random_image = generate_random_image()

# Resize to the model input size and normalize with ImageNet statistics
preprocess = transforms.Compose([
    transforms.Resize((440, 440)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(random_image)
input_batch = input_tensor.unsqueeze(0)  # add a batch dimension

# Run inference without tracking gradients
with torch.no_grad():
    output = model(input_batch)

print(output)
print(output.shape)

```
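
Embeddings like the one printed above are typically compared with a similarity measure. A minimal sketch using cosine similarity on plain Python lists follows; the helper `cosine_similarity` and the toy vectors are illustrative assumptions, and the repository may use a different distance internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (illustrative helper)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two model outputs.
emb_query = [0.2, 0.1, 0.7]
emb_gallery = [0.2, 0.1, 0.7]
print(cosine_similarity(emb_query, emb_gallery))
```

A tensor returned by the model could be converted with `output[0].tolist()` before being passed to a helper like this one.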

### Pretrained Model Evaluation
```
import torch
from wbia_miew_id.evaluate import Evaluator
from transformers import AutoModel

# Load the pretrained model that is passed to the evaluator below
model = AutoModel.from_pretrained("conservationxlabs/miewid-msv2", trust_remote_code=True)

evaluator = Evaluator(
    device=torch.device('cuda'),
    seed=0,
    anno_path='beluga_example_miewid/benchmark_splits/test.csv',
    name_keys=['name'],
    viewpoint_list=None,
    use_full_image_path=True,
    images_dir=None,
    image_size=(440, 440),
    crop_bbox=True,
    valid_batch_size=12,
    num_workers=8,
    eval_groups=[['species', 'viewpoint']],
    fliplr=False,
    fliplr_view=[],
    n_filter_min=2,
    n_subsample_max=10,
    model_params=None,
    checkpoint_path=None,
    model=model,
    visualize=False,
    visualization_output_dir='beluga_example_visualizations'
)
```
cd wbia_tbd
python train.py

## Example Usage

### Example dataset download

```
cd wbia_miew_id
python examples/download_example.py
```

### Training

```
python train.py --config=examples/beluga_example_miewid/benchmark_model/miew_id.msv2_all.yaml
```

### Evaluation

```
python evaluate.py --config=examples/beluga_example_miewid/benchmark_model/miew_id.msv2_all.yaml
```

The optional `--visualize` flag produces the top 5 match results for each individual in the test set, along with GradCAM visualizations.
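
To make "top 5 match results" concrete, the sketch below ranks candidates for one query from a row of distances while skipping the query itself, as in a one-vs-all setup. The function `top_k_matches` and the distance values are hypothetical and are not taken from the repository's evaluation code.

```python
def top_k_matches(distances, query_index, k=5):
    """Return indices of the k nearest items, skipping the query itself.

    `distances` is one row of a distance matrix (smaller means more similar).
    Hypothetical illustration only.
    """
    candidates = [(d, i) for i, d in enumerate(distances) if i != query_index]
    candidates.sort()  # ascending distance, ties broken by index
    return [i for _, i in candidates[:k]]


# Distances from annotation 0 to every annotation, including itself.
row = [0.0, 0.9, 0.2, 0.5, 0.7, 0.1, 0.3]
print(top_k_matches(row, query_index=0))
```

Excluding the query index matters because an annotation always has distance zero to itself and would otherwise appear as its own best match.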

### Data Splitting, Training, and Evaluation Using Python Bindings

Demo notebooks are available in the [examples directory](https://github.com/WildMeOrg/wbia-plugin-miew-id/tree/main/wbia_miew_id/examples).

## Data files

The data is expected to be in the coco JSON format. Paths to data files and the image directory are defined in the config YAML file.
### Example dataset

The data is expected to be in CSV or COCO JSON format.

[Recommended] The CSV beluga data can be downloaded from [here](https://cthulhu.dyn.wildme.io/public/datasets/beluga_example_miewid.tar.gz).

The COCO beluga data can be downloaded from [here](https://cthulhu.dyn.wildme.io/public/datasets/beluga-model-data.zip).

### Expected CSV data format

- `theta`: Bounding box rotation in radians
- `viewpoint`: Viewpoint of the individual facing the camera. Used for calculating per-viewpoint stats or separating individuals based on viewpoint
- `name`: Individual ID
- `file_name`: File name
- `species`: Species name. Used for calculating per-species stats
- `file_path`: Full path to images
- `x, y, w, h`: Bounding box coordinates

|theta |viewpoint |name |file_name|species|file_path|x |y |w |h |
|--------------|--------------------------------|-----|---------|-------|---------|-----|--------------------------------------------------------------------------------------------------------------------|----|---|
|0 |up |1030 |000000006040.jpg|beluga_whale|/datasets/beluga-440/000000006040.jpg|0 |0 |162 |440|
|0 |up |1030 |000000006043.jpg|beluga_whale|/datasets/beluga-440/000000006043.jpg|0 |0 |154 |440|
|0 |up |508 |000000006044.jpg|beluga_whale|/datasets/beluga-440/000000006044.jpg|0 |0 |166 |440|
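
A quick way to sanity-check a CSV file against the columns listed above is sketched below with the standard library. The loader `load_annotations` and the idea that every listed column is mandatory are assumptions made for illustration; the repository's own loader may accept fewer columns.

```python
import csv
import io

# Column names taken from the format description above.
EXPECTED_COLUMNS = [
    "theta", "viewpoint", "name", "file_name", "species",
    "file_path", "x", "y", "w", "h",
]


def load_annotations(handle):
    """Read annotation rows and convert numeric fields (illustrative helper)."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        for key in ("theta", "x", "y", "w", "h"):
            row[key] = float(row[key])  # theta in radians, bbox in pixels
        rows.append(row)
    return rows


# One row copied from the example table, held in memory instead of a file.
sample = io.StringIO(
    "theta,viewpoint,name,file_name,species,file_path,x,y,w,h\n"
    "0,up,1030,000000006040.jpg,beluga_whale,"
    "/datasets/beluga-440/000000006040.jpg,0,0,162,440\n"
)
rows = load_annotations(sample)
print(rows[0]["name"], rows[0]["species"], rows[0]["w"], rows[0]["h"])
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the in-memory sample.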


## Configuration file

@@ -36,33 +151,46 @@ A config file path can be set by:
- `project_name`: Name of the project
- `checkpoint_dir`: Directory for storing training checkpoints
- `comment`: Comment text for the experiment
- `viewpoint_list`: List of viewpoint values to keep for all subsets.
- `data`: Subfield for data-related settings
- `images_dir`: Directory containing all of the dataset images
- `train_anno_path`: Path to the JSON file containing training annotations
- `val_anno_path`: Path to the JSON file containing validation annotations
- `viewpoint_list`: List of viewpoints to use.
- `train_n_filter_min`: Minimum number of samples per name (individual) to keep for the training set. Names under the threshold will be discarded.
- `val_n_filter_min`: Minimum number of samples per name (individual) to keep for the validation set. Names under the threshold will be discarded
- `train_n_subsample_max`: Maximum number of samples per name to keep for the training set. Annotations of names above the threshold will be randomly subsampled during loading
- `val_n_subsample_max`: Maximum number of samples per name to keep for the validation set. Annotations of names above the threshold will be randomly subsampled during loading
- `name_keys`: List of keys used for defining a unique name (individual). Fields from multiple keys will be combined to form the final representation of a name. Common use-case is `name_keys: ['name', 'viewpoint']` for treating each name + viewpoint combination as unique
- `use_full_image_path`: Overrides `images_dir` for path construction and instead uses the absolute path given in the `file_path` field of each `images` entry in the COCO JSON. In that case, `images_dir` can be set to `null`
- `crop_bbox`: Whether to use the bounding box metadata to crop the images. The crops will also be adjusted for rotation if the `theta` field is present for the annotations
- `preprocess_images`: Pre-applies cropping and resizing, and caches the images for training
- `train`: Data parameters regarding the train set used in train.py
- `anno_path`: Path to the JSON file containing the annotations
- `n_filter_min`: Minimum number of samples per name (individual) to keep that individual in the set. Names under the threshold will be discarded
- `n_subsample_max`: Maximum number of samples per name to keep for the training set. Annotations for names over the threshold will be randomly subsampled once at the start of training
- `val`: Data parameters regarding the validation set used in train.py
- `anno_path`
- `n_filter_min`
- `n_subsample_max`
- `test`: Data parameters regarding the test set used in test.py
- `anno_path`
- `n_filter_min`
- `n_subsample_max`
- `checkpoint_path`: Path to model checkpoint to test
- `eval_groups`: Attributes for which to group the testing sets. For example, the value of `['viewpoint']` will create subsets of the test set for each unique value of the viewpoint and run one-vs-all evaluation for each subset separately. The value can be a list - `[['species', 'viewpoint']]` will run evaluation separately for each species+viewpoint combination. `['species', 'viewpoint']` will run grouped eval for each species, and then for each viewpoint. The corresponding fields to be grouped should be present under `annotation` entries in the COCO file. Can be left as `null` to do eval for the full test set.
- `name_keys`: List of keys used for defining a unique name (individual). Fields from multiple keys will be combined to form the final representation of a name. A common use-case is `name_keys: ['name', 'viewpoint']` for treating each name + viewpoint combination as a unique individual
- `image_size`:
- Image height to resize to
- Image width to resize to

- `engine`: Subfields for engine-related settings
- `num_workers`: Number of workers for data loading (default: 0)
- `train_batch_size`: Batch size for training
- `valid_batch_size`: Batch size for validation
- `epochs`: Number of training epochs
- `seed`: Random seed for reproducibility
- `device`: Device to be used for training
- `loss_module`: Loss function module
- `use_wandb`: Whether to use Weights and Biases for logging
- `use_swa`: Whether to use SWA during training
- `scheduler_params`: Subfields for learning rate scheduler parameters
- `lr_start`: Initial learning rate
- `lr_max`: Maximum learning rate
- `lr_min`: Minimum learning rate
- `lr_ramp_ep`: Number of epochs to ramp up the learning rate
- `lr_sus_ep`: Number of epochs to sustain the maximum learning rate
- `lr_decay`: Rate of learning rate decay per epoch
- `model_params`: Dictionary containing model-related settings
- `model_name`: Name of the model backbone architecture
- `use_fc`: Whether to use a fully connected layer after backbone extraction
@@ -72,12 +200,27 @@ A config file path can be set by:
- `s`: Scaling factor for the loss function
- `margin`: Margin for the loss function
- `pretrained`: Whether to use a pretrained model backbone
- `n_classes`: Number of classes in the training dataset, used for loading checkpoint
- `n_classes`: Number of classes in the training dataset, used for loading checkpoint
- `swa_params`: Subfields for SWA training
- `swa_lr`: SWA learning rate
- `swa_start`: Epoch number to begin SWA training
- `test`: Subfields for plugin-related settings
- `fliplr`: Whether to perform horizontal flipping during testing
- `fliplr_view`: List of viewpoints to apply horizontal flipping
- `batch_size`: Batch size for plugin inference
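
The `scheduler_params` entries listed above suggest a ramp-up, sustain, then decay shape. The sketch below shows one common way such a schedule is computed (linear ramp, flat plateau, exponential decay toward `lr_min`). This shape is an assumption, not a copy of the repository's scheduler, and the numeric values are placeholders.

```python
def learning_rate(epoch, lr_start, lr_max, lr_min, lr_ramp_ep, lr_sus_ep, lr_decay):
    """Learning rate for a given epoch under an assumed ramp/sustain/decay shape."""
    if epoch < lr_ramp_ep:
        # Linear ramp from lr_start up to lr_max.
        return lr_start + (lr_max - lr_start) * epoch / lr_ramp_ep
    if epoch < lr_ramp_ep + lr_sus_ep:
        # Hold the maximum learning rate.
        return lr_max
    # Exponential decay from lr_max toward lr_min.
    decay_steps = epoch - lr_ramp_ep - lr_sus_ep
    return lr_min + (lr_max - lr_min) * lr_decay ** decay_steps


# Placeholder values, not the repository defaults.
params = dict(
    lr_start=1e-5, lr_max=1e-3, lr_min=1e-6,
    lr_ramp_ep=4, lr_sus_ep=2, lr_decay=0.8,
)
for epoch in range(10):
    print(epoch, learning_rate(epoch, **params))
```

Printing the value per epoch like this is a cheap way to check a new set of scheduler values before starting a long training run.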

## Notes

This is an initial commit which includes training, inference and WBIA integration capabilities. Release of additional features is underway.
## Citation
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.13647526.svg)](https://doi.org/10.5281/zenodo.13647526)
```bibtex
@misc{WildMe2023,
author = {Otarashvili, Lasha},
title = {MiewID},
year = {2023},
url = {https://github.com/WildMeOrg/wbia-plugin-miew-id},
doi = {10.5281/zenodo.13647526},
}
```

## Copyright
Copyright Conservation X Labs 2025.
