From bc49bb4f20e3e1910c8a42bd003771f3f7e26d1a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Frederik=20Hvilsh=C3=B8j?=
<93145535+frederik-encord@users.noreply.github.com>
Date: Wed, 24 Apr 2024 10:04:30 +0200
Subject: [PATCH] chore: add known results (#66)
---
README.md | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 80 insertions(+)
diff --git a/README.md b/README.md
index 1424035..90d3325 100644
--- a/README.md
+++ b/README.md
@@ -94,6 +94,86 @@ By default, this path corresponds to the repository directory.
+## Some Example Results
+
+One example of where `tti-eval` is useful is testing different open-source models against different open-source datasets within a specific domain.
+Below, we focus on the medical domain. We evaluate nine different models, three of which are domain-specific.
+The models are evaluated against four different medical datasets. Note that links to all models and datasets are provided further down this page.
+
+
+_Figure 1: Linear probe accuracy across four different medical datasets. General-purpose models are colored green, while models trained for the medical domain are colored red._
+
+The raw numbers from the experiment are listed in the tables below.
+
+### Weighted KNN Accuracy
+
+| Model/Dataset | Alzheimer-MRI | LungCancer4Types | chest-xray-classification | skin-cancer |
+| :--------------- | :-----------: | :--------------: | :-----------------------: | :---------: |
+| apple | 0.6777 | 0.6633 | 0.9687 | 0.7985 |
+| bioclip | 0.8952 | 0.7800 | 0.9771 | 0.7961 |
+| clip | 0.6986 | 0.6867 | 0.9727 | 0.7891 |
+| plip | 0.8021 | 0.6767 | 0.9599 | 0.7860 |
+| pubmed | 0.8503 | 0.5767 | 0.9725 | 0.7637 |
+| siglip_large | 0.6908 | 0.6533 | 0.9695 | 0.7947 |
+| siglip_small | 0.6992 | 0.6267 | 0.9646 | 0.7780 |
+| tinyclip | 0.7389 | 0.5900 | 0.9673 | 0.7589 |
+| vit-b-32-laion2b | 0.7559 | 0.5967 | 0.9654 | 0.7738 |
+
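+For intuition, here is a minimal sketch of how a weighted-KNN score like the ones above can be computed from precomputed embeddings with scikit-learn. The arrays and the choice of `k` are purely illustrative and not `tti-eval`'s internal implementation.
+
+```python
+# Illustrative sketch only: weighted KNN accuracy over precomputed embeddings.
+# The arrays are hypothetical (image embeddings plus integer class labels);
+# this is not tti-eval's internal implementation.
+import numpy as np
+from sklearn.neighbors import KNeighborsClassifier
+
+
+def weighted_knn_accuracy(
+    train_emb: np.ndarray, train_labels: np.ndarray,
+    val_emb: np.ndarray, val_labels: np.ndarray, k: int = 11,
+) -> float:
+    # weights="distance" lets closer neighbors count more, i.e. a *weighted* KNN.
+    knn = KNeighborsClassifier(n_neighbors=k, weights="distance", metric="cosine")
+    knn.fit(train_emb, train_labels)
+    return float(knn.score(val_emb, val_labels))
+```
+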
+### Zero-shot Accuracy
+
+| Model/Dataset | Alzheimer-MRI | LungCancer4Types | chest-xray-classification | skin-cancer |
+| :--------------- | :-----------: | :--------------: | :-----------------------: | :---------: |
+| apple | 0.4460 | 0.2367 | 0.7381 | 0.3594 |
+| bioclip | 0.3092 | 0.2200 | 0.7356 | 0.0431 |
+| clip | 0.4857 | 0.2267 | 0.7381 | 0.1955 |
+| plip | 0.0104 | 0.2267 | 0.3873 | 0.0797 |
+| pubmed | 0.3099 | 0.2867 | 0.7501 | 0.1127 |
+| siglip_large | 0.4876 | 0.3000 | 0.5950 | 0.0421 |
+| siglip_small | 0.4102 | 0.0767 | 0.7381 | 0.1541 |
+| tinyclip | 0.2526 | 0.2533 | 0.7313 | 0.1113 |
+| vit-b-32-laion2b | 0.3594 | 0.1533 | 0.7378 | 0.1228 |
+
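+Zero-shot accuracy can be sketched in a few lines once image embeddings and one text embedding per class prompt are available. The function below only illustrates the general recipe (cosine similarity followed by an argmax), not `tti-eval`'s exact prompting setup.
+
+```python
+# Illustrative sketch only: zero-shot accuracy from precomputed embeddings.
+# `image_emb` (N x D) and `class_text_emb` (C x D) are hypothetical arrays of
+# L2-normalized image embeddings and one text embedding per class prompt.
+import numpy as np
+
+
+def zero_shot_accuracy(
+    image_emb: np.ndarray, class_text_emb: np.ndarray, labels: np.ndarray
+) -> float:
+    # On normalized vectors, cosine similarity reduces to a dot product.
+    sims = image_emb @ class_text_emb.T  # (N, C) similarity matrix
+    predictions = sims.argmax(axis=1)    # best-matching class prompt per image
+    return float((predictions == labels).mean())
+```
+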
+---
+
+### Image-to-image Retrieval
+
+| Model/Dataset | Alzheimer-MRI | LungCancer4Types | chest-xray-classification | skin-cancer |
+| :--------------- | :-----------: | :--------------: | :-----------------------: | :---------: |
+| apple | 0.4281 | 0.2786 | 0.8835 | 0.6437 |
+| bioclip | 0.4535 | 0.3496 | 0.8786 | 0.6278 |
+| clip | 0.4247 | 0.2812 | 0.8602 | 0.6347 |
+| plip | 0.4406 | 0.3174 | 0.8372 | 0.6289 |
+| pubmed | 0.4445 | 0.3022 | 0.8621 | 0.6228 |
+| siglip_large | 0.4232 | 0.2743 | 0.8797 | 0.6466 |
+| siglip_small | 0.4303 | 0.2613 | 0.8660 | 0.6348 |
+| tinyclip | 0.4361 | 0.2833 | 0.8379 | 0.6098 |
+| vit-b-32-laion2b | 0.4378 | 0.2934 | 0.8551 | 0.6189 |
+
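+There is more than one way to score image-to-image retrieval; a common choice is the fraction of the top-k nearest images that share the query's class (precision@k). The sketch below uses that definition for illustration only; it is not necessarily the exact metric behind the table above.
+
+```python
+# Illustrative sketch only: precision@k over same-class nearest neighbors.
+# The exact retrieval metric used for the table above may differ.
+import numpy as np
+from sklearn.neighbors import NearestNeighbors
+
+
+def retrieval_precision_at_k(emb: np.ndarray, labels: np.ndarray, k: int = 10) -> float:
+    nn = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(emb)
+    _, idx = nn.kneighbors(emb)              # the first neighbor is the query itself (distance 0)
+    neighbor_labels = labels[idx[:, 1:]]     # drop the self-match, keep the top-k others
+    return float((neighbor_labels == labels[:, None]).mean())
+```
+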
+---
+
+### Linear Probe Accuracy
+
+| Model/Dataset | Alzheimer-MRI | LungCancer4Types | chest-xray-classification | skin-cancer |
+| :--------------- | :-----------: | :--------------: | :-----------------------: | :---------: |
+| apple | 0.5482 | 0.5433 | 0.9362 | 0.7662 |
+| bioclip | 0.6139 | 0.6600 | 0.9433 | 0.7933 |
+| clip | 0.5547 | 0.5700 | 0.9362 | 0.7704 |
+| plip | 0.5469 | 0.5267 | 0.9261 | 0.7630 |
+| pubmed | 0.5482 | 0.5400 | 0.9278 | 0.7269 |
+| siglip_large | 0.5286 | 0.5200 | 0.9496 | 0.7697 |
+| siglip_small | 0.5449 | 0.4967 | 0.9327 | 0.7606 |
+| tinyclip | 0.5651 | 0.5733 | 0.9280 | 0.7484 |
+| vit-b-32-laion2b | 0.5684 | 0.5933 | 0.9302 | 0.7578 |
+
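+A linear probe simply trains a linear classifier on top of frozen embeddings. The sketch below shows the idea with scikit-learn's logistic regression; the solver, regularization, and data splits are illustrative and may differ from `tti-eval`'s setup.
+
+```python
+# Illustrative sketch only: a linear probe on frozen embeddings.
+# Arrays are hypothetical; tti-eval's exact training setup may differ.
+import numpy as np
+from sklearn.linear_model import LogisticRegression
+
+
+def linear_probe_accuracy(
+    train_emb: np.ndarray, train_labels: np.ndarray,
+    val_emb: np.ndarray, val_labels: np.ndarray,
+) -> float:
+    probe = LogisticRegression(max_iter=1000)
+    probe.fit(train_emb, train_labels)  # embeddings stay frozen; only the probe is trained
+    return float(probe.score(val_emb, val_labels))
+```
+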
+---
+
+
+
## Datasets
This repository contains classification datasets sourced from [Hugging Face](https://huggingface.co/datasets) and [Encord](https://app.encord.com/projects).