# HyperAnimal: Identity Hypersphere Guided Synthetic Datasets Generation for Individual Animal Identification

This repository contains the source code to reproduce the paper *HyperAnimal: Identity Hypersphere Guided Synthetic Datasets Generation for Individual Animal Identification*.
Download the generated synthetic animal datasets (2K identities × 10 images) from the paper:

- Synthetic dataset for Red panda
- Synthetic dataset for Giant panda
- Synthetic dataset for Amur tiger
Download the pretrained HyperAnimal diffusion model weights for different species:
- Pretrained HyperAnimal for Red panda
- Pretrained HyperAnimal for Giant panda
- Pretrained HyperAnimal for Amur tiger
Download the pretrained individual animal identification models trained on synthetic HyperAnimal data:
- Pretrained Identification Models for Red panda
- Pretrained Identification Models for Giant panda
- Pretrained Identification Models for Amur tiger
## Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd HyperAnimal
   ```

2. Create and activate the conda environment:

   ```bash
   conda env create -n hyperanimal -f environment.yml
   conda activate hyperanimal
   ```
## Data Preparation

- Place unlabeled real red panda images in `data/redpanda/`.
- The pre-extracted training embeddings are provided in `data/redpanda_embeddings/`.
- Identity embeddings extractor:
  - Download from identity embeddings
  - Save to `models/identification/weights/rpd43.pth`
- Autoencoder weights:
  - Download the pre-trained encoder/decoder weights from Encoder and decoder model weights
  - Save to `models/autoencoder/vq_f8_encoder.pt` and `models/autoencoder/vq_f8_decoder.pt`
  - Note: The pre-trained autoencoder weights originally come from the `ffhq256` LDM of Rombach et al. Its `VQModelInterface` submodule has been manually extracted and split into separate encoder and decoder models, since the encoder is only used during training and the decoder is only needed for sampling.
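As an illustration of the encoder/decoder split described in the note above, a minimal sketch that partitions a checkpoint's state dict by key prefix; the `encoder.`/`decoder.` prefixes and the commented `torch` calls are assumptions about the checkpoint layout, not code from this repository:

```python
# Sketch: split a VQ autoencoder checkpoint into separate encoder/decoder
# state dicts. The "encoder."/"decoder." key prefixes are an assumption
# about the checkpoint layout, not taken from this repository.

def split_vq_state_dict(state_dict):
    """Partition a state dict by its 'encoder.'/'decoder.' key prefixes."""
    encoder_sd, decoder_sd = {}, {}
    for key, tensor in state_dict.items():
        if key.startswith("encoder."):
            encoder_sd[key[len("encoder."):]] = tensor
        elif key.startswith("decoder."):
            decoder_sd[key[len("decoder."):]] = tensor
        # other keys (e.g. the quantizer) are dropped here
    return encoder_sd, decoder_sd

if __name__ == "__main__":
    # With torch, one would load/save real checkpoints, e.g.:
    #   sd = torch.load("vq_f8.ckpt")["state_dict"]
    #   enc, dec = split_vq_state_dict(sd)
    #   torch.save(enc, "models/autoencoder/vq_f8_encoder.pt")
    #   torch.save(dec, "models/autoencoder/vq_f8_decoder.pt")
    demo = {"encoder.conv_in.weight": 1, "decoder.conv_out.bias": 2}
    print(split_vq_state_dict(demo))
```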
## Training

1. Make sure that the `dataset: redpanda_FDIE` option is set and that the paths in the corresponding subconfiguration `configs/dataset/redpanda_FDIE.yaml` point to the training images and pre-extracted embeddings.

2. Start training:

   ```bash
   python main.py
   ```

   Trained models will be saved in `outputs/rp_f1/checkpoints/`.
## Sampling

1. Download the pretrained HyperAnimal models (including the `.hydra` folder) and place them in `outputs/rp_f1/checkpoints/`.

2. Generate synthetic identity contexts and save them in `data/contexts/syn_2000.npy`:

   ```bash
   python create_sample_identity_contexts.py
   ```
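Conceptually, this step produces synthetic identity embeddings on the identity hypersphere. A minimal sketch of one plausible approach, drawing Gaussian vectors and normalizing them to unit length, is shown below; the embedding dimension (512) and the sampling scheme are assumptions, not the script's actual implementation:

```python
# Sketch: create synthetic identity contexts on the unit hypersphere.
# The embedding dimension (512) is an assumption; the output path matches
# the step above.
import os
import numpy as np

def sample_identity_contexts(n_identities, dim, seed=0):
    """Draw Gaussian vectors and project them onto the unit hypersphere."""
    rng = np.random.default_rng(seed)
    contexts = rng.standard_normal((n_identities, dim))
    contexts /= np.linalg.norm(contexts, axis=1, keepdims=True)
    return contexts.astype(np.float32)

if __name__ == "__main__":
    contexts = sample_identity_contexts(2000, 512)
    os.makedirs("data/contexts", exist_ok=True)
    np.save("data/contexts/syn_2000.npy", contexts)
```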
3. Configure sampling parameters in `configs/sample_rp.yaml`, including:
   - Path to the trained model
   - Path to the contexts file
   - Number of identities
   - Images per identity
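For orientation, such a config might look like the hypothetical sketch below; every key name and value here is an assumption — consult the shipped `configs/sample_rp.yaml` for the actual schema:

```yaml
# Hypothetical sketch only -- key names are assumptions, not the repo's schema.
model_checkpoint: outputs/rp_f1/checkpoints/last.ckpt  # path to trained model
contexts_file: data/contexts/syn_2000.npy              # path to contexts file
num_identities: 2000                                   # identities to sample
images_per_identity: 10                                # images per identity
```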
4. Generate samples:

   ```bash
   python sample.py
   ```

   The samples are saved under `samples/` as identity blocks, e.g. a 4×4 grid block of 512×512 images.

5. Split the identity blocks into individual images:

   ```bash
   python split_identity_blocks.py
   ```

   The resulting individual images are saved in `samples/`.
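The block-splitting step can be sketched as follows, assuming each block is a single 2048×2048 image array holding a 4×4 grid of 512×512 tiles (the grid and tile sizes follow the description above); this is an illustration, not the repository's `split_identity_blocks.py`:

```python
# Sketch: cut one identity block (a 4x4 grid of 512x512 images) into 16
# individual images. Grid and tile sizes follow the README description.
import numpy as np

def split_identity_block(block, tile=512, grid=4):
    """Split a (grid*tile, grid*tile, C) image array into grid*grid tiles."""
    expected = (grid * tile, grid * tile)
    assert block.shape[:2] == expected, "unexpected block size"
    tiles = []
    for row in range(grid):
        for col in range(grid):
            tiles.append(block[row * tile:(row + 1) * tile,
                               col * tile:(col + 1) * tile])
    return tiles
```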
## Identification Model Training and Testing

With the code provided under `reid/`, training and testing of the six identification models is started via:

```bash
# Prepare data splits
python prepare_gallery_query.py
python prepare_train_val.py

# Train models
./train.sh

# Evaluate models
./test.sh
```
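The gallery/query preparation can be sketched as below. Placing one image per identity in the gallery and the rest in the query set is a common re-identification convention; whether `prepare_gallery_query.py` uses exactly this split is an assumption:

```python
# Sketch: split images per identity into gallery and query sets.
# One gallery image per identity, remainder as queries -- a common ReID
# convention; the repo's prepare_gallery_query.py may differ.
from collections import defaultdict

def make_gallery_query(image_paths):
    """image_paths: list of (identity_id, path). Returns (gallery, query)."""
    by_id = defaultdict(list)
    for identity, path in image_paths:
        by_id[identity].append(path)
    gallery, query = [], []
    for identity, paths in by_id.items():
        gallery.append((identity, paths[0]))
        query.extend((identity, p) for p in paths[1:])
    return gallery, query
```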
**Important:** To reproduce the values in Tables 3, 9, and 10:
1. Download the six pretrained identification models from identification models and place them in:
   - `./reid/model/rp`
   - `./reid/model/gp`
   - `./reid/model/atrw`

2. Download the test data from Test data and place it in:
   - `./reid/test_data/redpanda-test`
   - `./reid/test_data/iPanda-test`
   - `./reid/test_data/atrw-test`
3. Execute the `test_tb3.sh` script in `reid/`:

   ```bash
   cd HyperAnimal/reid
   bash test_tb3.sh
   ```

   Results (mAP and CMC values) will be saved to `result.txt`. These values correspond to Tables 3, 9, and 10 in the paper. For reference, the same results are also included in the `result.txt` file shipped with the pretrained identification models.
## Repository Structure

```
HyperAnimal/
├── configs/                 # Configuration YAML files
├── data/                    # Training images and embeddings
│   ├── embeddings/          # Identity embeddings for training
│   ├── contexts/            # Synthetic identity embeddings for sampling
│   └── redpanda/            # Red panda dataset for training
├── models/                  # PyTorch model architectures
│   ├── autoencoder/         # Autoencoder models
│   ├── diffusion/           # DDPM implementation
│   └── identification/      # Identification model weights
├── outputs/                 # Trained model checkpoints
├── samples/                 # Generated samples
├── utils/                   # Utility modules and scripts
├── reid/                    # Animal identification training code
├── main.py                  # Main training script
├── sample.py                # Sampling script
├── create_sample_identity_contexts.py  # Context generation
├── split_identity_blocks.py            # Sample processing
├── extract_identity_embeddings.py      # Embedding extraction
└── environment.yml          # Conda environment specification
```
