chore: link to HF models #60

Merged 2 commits on Apr 18, 2024
README.md: 109 changes (75 additions & 34 deletions)
# clip-eval

Welcome to `clip-eval`, a repository for benchmarking text-to-image models **on your own data**!

> Evaluate your own (or Hugging Face) text-to-image embedding models, such as [CLIP][openai/clip-vit-large-patch14-336] from OpenAI, against your own (or Hugging Face) datasets to estimate how well a model will perform on your classification dataset.

## Installation
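
If you are installing from source, a minimal sketch could look like the following (this assumes a standard `pyproject.toml`-based package; the exact install command may differ):

```
# Hedged sketch: install clip-eval from a local clone of the repository.
git clone https://github.com/encord-team/text-to-image-eval.git
cd text-to-image-eval
python -m pip install .
```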

To use datasets sourced from Encord, set the `ENCORD_SSH_KEY_PATH` environment variable to the location of your Encord SSH key file:

```
export ENCORD_SSH_KEY_PATH=<path_to_the_encord_ssh_key_file>
```


## CLI Usage

### Embeddings Generation
To build embeddings, run the CLI command `clip-eval build`.
This command allows you to interactively select the model and dataset combinations on which to build the embeddings.

Alternatively, you can choose known (model, dataset) pairs using the `--model-dataset` option. For example:

```
clip-eval build --model-dataset clip/plants
```
To evaluate models, use the CLI command `clip-eval evaluate`.
This command enables interactive selection of model and dataset combinations for evaluation.

Alternatively, you can specify known (model, dataset) pairs using the `--model-dataset` option. For example:

```
clip-eval evaluate --model-dataset clip/plants
```
This command allows you to visualise the reduction of embeddings from two models on the same dataset.
The animations will be saved at the location specified by the environment variable `CLIP_EVAL_OUTPUT_PATH`.
By default, this path corresponds to the repository directory.
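
For instance, a minimal sketch of producing an animation and redirecting its output (the subcommand name `clip-eval animate` and the output directory are illustrative assumptions):

```
# Assumed to follow the same `clip-eval <verb>` pattern as build and evaluate.
export CLIP_EVAL_OUTPUT_PATH=~/clip-eval-animations
clip-eval animate
```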


## Datasets

This repository contains classification datasets sourced from [Hugging Face](https://huggingface.co/datasets) and [Encord](https://app.encord.com/projects).

> ⚠️ Currently, only image and image group datasets are supported, with potential for future expansion to include video datasets.

| Dataset Title | Implementation | HF Dataset |
| :------------------------ | :------------------------------ | :----------------------------------------------------------------------------------- |
| Alzheimer-MRI | [Hugging Face][hf-dataset-impl] | [Falah/Alzheimer_MRI][Falah/Alzheimer_MRI] |
| chest-xray-classification | [Hugging Face][hf-dataset-impl] | [trpakov/chest-xray-classification][trpakov/chest-xray-classification] |
| LungCancer4Types | [Hugging Face][hf-dataset-impl] | [Kabil007/LungCancer4Types][Kabil007/LungCancer4Types] |
| plants | [Hugging Face][hf-dataset-impl] | [sampath017/plants][sampath017/plants] |
| skin-cancer | [Hugging Face][hf-dataset-impl] | [marmal88/skin_cancer][marmal88/skin_cancer] |
| sports-classification | [Hugging Face][hf-dataset-impl] | [HES-XPLAIN/SportsImageClassification][HES-XPLAIN/SportsImageClassification] |
| rsicd                     | [Encord][encord-dataset-impl]   | <span style="color: red">\*</span> Requires an SSH key and access to the Encord project       |
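
For example, any dataset title above can be paired with a model title in the `--model-dataset` option (a sketch that assumes the `clip` model listed in the Models section below):

```
clip-eval build --model-dataset clip/Alzheimer-MRI
clip-eval evaluate --model-dataset clip/Alzheimer-MRI
```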

### Add a Dataset from a Known Source

You can find the explicit schema in `sources/dataset-definition-schema.json`.

Check out the declarations of known sources at `clip_eval.dataset.types` and refer to the existing dataset definitions in the `sources/datasets` folder for guidance.
Below is an example of a dataset definition for the [plants](https://huggingface.co/datasets/sampath017/plants) dataset sourced from Hugging Face:

```json
{
    "dataset_type": "HFDataset",
    "title": "plants",
    "title_in_source": "sampath017/plants"
}
```

For datasets sourced from Encord, a different set of fields is required.

### Add a Dataset Source

Expanding the dataset sources involves two key steps:

1. Create a dataset class that inherits from `clip_eval.dataset.Dataset` and specifies the input requirements for extracting data from the new source.
This class should encapsulate the necessary logic for fetching and processing dataset elements.
2. Generate a dataset definition in JSON format and save it in the `sources/datasets` folder, following the guidelines outlined in the previous section.
Alternatively, you can programmatically add a dataset, which will be available only for the current session, using the `register_dataset()` method of the `clip_eval.dataset.DatasetProvider` class.

Here is an example of how to register a dataset from Hugging Face using Python code:

```python
from clip_eval.dataset import DatasetProvider, Split
from clip_eval.dataset.types import HFDataset

# Illustrative sketch: build the dataset definition and register it for the
# current session. The exact constructor arguments and call signature may
# differ (a `Split` value may also be required); see `clip_eval.dataset`.
DatasetProvider.register_dataset(HFDataset(title="plants", title_in_source="sampath017/plants"))
```

To permanently remove a dataset, simply delete the corresponding JSON file stored in the `sources/datasets` folder.
This action removes the dataset from the list of available datasets in the CLI, disabling the option to create any further embeddings using its data.
However, all embeddings previously built on that dataset will remain intact and available for other tasks such as evaluation and animation.
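
For example, a hedged one-liner, assuming the plants definition lives at `sources/datasets/plants.json`:

```
# Removes the plants dataset definition from the list of available datasets.
rm sources/datasets/plants.json
```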


## Models

This repository contains models sourced from [Hugging Face](https://huggingface.co/models), [OpenCLIP](https://github.com/mlfoundations/open_clip) and local implementations based on OpenCLIP models.

_TODO_: Add more prose about the differences between the implementations.

### Hugging Face Models

| Model Title | Implementation | HF Model |
| :--------------- | :---------------------------- | :--------------------------------------------------------------------------------------------- |
| apple | [OpenCLIP][open-model-impl] | [apple/DFN5B-CLIP-ViT-H-14][apple/DFN5B-CLIP-ViT-H-14] |
| bioclip | [OpenCLIP][open-model-impl] | [imageomics/bioclip][imageomics/bioclip] |
| eva-clip | [OpenCLIP][open-model-impl] | [BAAI/EVA-CLIP-8B-448][BAAI/EVA-CLIP-8B-448] |
| vit-b-32-laion2b | [OpenCLIP][open-model-impl]   | [laion/CLIP-ViT-B-32-laion2B-s34B-b79K][laion/CLIP-ViT-B-32-laion2B-s34B-b79K]                  |
| clip | [Hugging Face][hf-model-impl] | [openai/clip-vit-large-patch14-336][openai/clip-vit-large-patch14-336] |
| fashion | [Hugging Face][hf-model-impl] | [patrickjohncyh/fashion-clip][patrickjohncyh/fashion-clip] |
| plip | [Hugging Face][hf-model-impl] | [vinid/plip][vinid/plip] |
| pubmed | [Hugging Face][hf-model-impl] | [flaviagiammarino/pubmed-clip-vit-base-patch32][flaviagiammarino/pubmed-clip-vit-base-patch32] |
| rsicd | [Hugging Face][hf-model-impl] | [flax-community/clip-rsicd][flax-community/clip-rsicd] |
| siglip_large | [Hugging Face][hf-model-impl] | [google/siglip-large-patch16-256][google/siglip-large-patch16-256] |
| siglip_small | [Hugging Face][hf-model-impl] | [google/siglip-base-patch16-224][google/siglip-base-patch16-224] |
| street | [Hugging Face][hf-model-impl] | [geolocal/StreetCLIP][geolocal/StreetCLIP] |
| tinyclip | [Hugging Face][hf-model-impl] | [wkcn/TinyCLIP-ViT-40M-32-Text-19M-LAION400M][wkcn/TinyCLIP-ViT-40M-32-Text-19M-LAION400M] |
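
For example, embeddings for two of the models above can be built on the same dataset and then compared with `clip-eval evaluate` (the specific pairs are illustrative):

```
clip-eval build --model-dataset clip/plants
clip-eval build --model-dataset siglip_small/plants
```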

### Locally Trained Models

| Model Title | Implementation | Weights |
| :----------- | :-------------------------------- | :------ |
| rsicd-encord | [LocalOpenCLIP][local-model-impl] | - |

### Add a Model from a Known Source

You can find the explicit schema in `sources/model-definition-schema.json`.

Check out the declarations of known sources at `clip_eval.model.types` and refer to the existing model definitions in the `sources/models` folder for guidance.
Below is an example of a model definition for the [clip](https://huggingface.co/openai/clip-vit-large-patch14-336) model sourced from Hugging Face:

```json
{
    "model_type": "ClosedCLIPModel",
    "title": "clip",
    "title_in_source": "openai/clip-vit-large-patch14-336"
}
```

Additionally, for models sourced from OpenCLIP, the optional `pretrained` field may be specified.
### Add a Model Source

Expanding the model sources involves two key steps:

1. Create a model class that inherits from `clip_eval.model.Model` and specifies the input requirements for loading models from the new source.
This class should encapsulate the necessary logic for processing model elements and generating embeddings.
2. Generate a model definition in JSON format and save it in the `sources/models` folder, following the guidelines outlined in the previous section.
Alternatively, you can programmatically add a model, which will be available only for the current session, using the `register_model()` method of the `clip_eval.model.ModelProvider` class.

Here is an example of how to register a model from Hugging Face using Python code:

```python
from clip_eval.model import ModelProvider
from clip_eval.model.types import ClosedCLIPModel

# Illustrative sketch: build the model definition and register it for the
# current session. The exact constructor arguments and call signature may
# differ; see `clip_eval.model` for the actual API.
ModelProvider.register_model(ClosedCLIPModel(title="clip", title_in_source="openai/clip-vit-large-patch14-336"))
```

To permanently remove a model, simply delete the corresponding JSON file stored in the `sources/models` folder.
This action removes the model from the list of available models in the CLI, disabling the option to create any further embeddings with it.
However, all embeddings previously built with that model will remain intact and available for other tasks such as evaluation and animation.


## Set Up the Development Environment

1. Create the virtual environment, add dev dependencies and set up pre-commit hooks.
2. Set the `ENCORD_SSH_KEY_PATH` environment variable if you intend to work with Encord datasets:

   ```
   export ENCORD_SSH_KEY_PATH=<path_to_the_encord_ssh_key_file>
   ```


## Contributing

Contributions are welcome!
Please feel free to open an issue or submit a pull request with your suggestions.
### Adding Dataset Sources

To contribute by adding dataset sources, follow these steps:

1. Store the file containing the new dataset class implementation in the `clip_eval/dataset/types` folder.
Don't forget to add a reference to the class in the `__init__.py` file in the same folder.
This ensures that the new dataset type is accessible by default for all dataset definitions, eliminating the need to explicitly state the `module_path` field for datasets from such a source.
### Adding Model Sources

To contribute by adding model sources, follow these steps:

1. Store the file containing the new model class implementation in the `clip_eval/model/types` folder.
Don't forget to add a reference to the class in the `__init__.py` file in the same folder.
This ensures that the new model type is accessible by default for all model definitions, eliminating the need to explicitly state the `module_path` field for models from such a source.
2. Open a pull request with the necessary changes. Make sure to include tests validating that model loading, processing and embedding generation are working as expected.
3. Document the addition of the model source, providing details on its structure, usage, and any specific considerations or instructions for integration.
This ensures that users have clear guidance on how to leverage the new model source effectively.

[Falah/Alzheimer_MRI]: https://huggingface.co/datasets/Falah/Alzheimer_MRI
[trpakov/chest-xray-classification]: https://huggingface.co/datasets/trpakov/chest-xray-classification
[Kabil007/LungCancer4Types]: https://huggingface.co/datasets/Kabil007/LungCancer4Types
[sampath017/plants]: https://huggingface.co/datasets/sampath017/plants
[marmal88/skin_cancer]: https://huggingface.co/datasets/marmal88/skin_cancer
[HES-XPLAIN/SportsImageClassification]: https://huggingface.co/datasets/HES-XPLAIN/SportsImageClassification
[apple/DFN5B-CLIP-ViT-H-14]: https://huggingface.co/apple/DFN5B-CLIP-ViT-H-14
[imageomics/bioclip]: https://huggingface.co/imageomics/bioclip
[openai/clip-vit-large-patch14-336]: https://huggingface.co/openai/clip-vit-large-patch14-336
[BAAI/EVA-CLIP-8B-448]: https://huggingface.co/BAAI/EVA-CLIP-8B-448
[patrickjohncyh/fashion-clip]: https://huggingface.co/patrickjohncyh/fashion-clip
[vinid/plip]: https://huggingface.co/vinid/plip
[flaviagiammarino/pubmed-clip-vit-base-patch32]: https://huggingface.co/flaviagiammarino/pubmed-clip-vit-base-patch32
[flax-community/clip-rsicd]: https://huggingface.co/flax-community/clip-rsicd
[google/siglip-large-patch16-256]: https://huggingface.co/google/siglip-large-patch16-256
[google/siglip-base-patch16-224]: https://huggingface.co/google/siglip-base-patch16-224
[geolocal/StreetCLIP]: https://huggingface.co/geolocal/StreetCLIP
[wkcn/TinyCLIP-ViT-40M-32-Text-19M-LAION400M]: https://huggingface.co/wkcn/TinyCLIP-ViT-40M-32-Text-19M-LAION400M
[laion/CLIP-ViT-B-32-laion2B-s34B-b79K]: https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K
[open-model-impl]: https://github.com/encord-team/text-to-image-eval/blob/main/clip_eval/model/types/open_clip_model.py
[hf-model-impl]: https://github.com/encord-team/text-to-image-eval/blob/main/clip_eval/model/types/hugging_face_clip.py
[local-model-impl]: https://github.com/encord-team/text-to-image-eval/blob/main/clip_eval/model/types/local_clip_model.py
[hf-dataset-impl]: https://github.com/encord-team/text-to-image-eval/blob/main/clip_eval/dataset/types/hugging_face.py
[encord-dataset-impl]: https://github.com/encord-team/text-to-image-eval/blob/main/clip_eval/dataset/types/encord_ds.py