
Releases: keras-team/keras-hub

v0.23.0

21 Oct 17:34 · 2dc4d39

Summary:

New Models:

We've integrated a range of cutting-edge models, each designed to tackle specific challenges in its domain (a loading sketch follows the list):

  • Cell2Sentence: A single-cell biology model built on the Gemma-2 architecture, designed to interpret complex biological data.

  • T5Gemma: A new encoder-decoder model, ideal for sequence-to-sequence tasks like translation and summarization.

  • PARSeq: An end-to-end, ViT-based model for scene text recognition (STR), excelling at reading text in natural images.

  • D-FINE: A high-performance, real-time object detection model.

  • DepthAnythingV2: A monocular depth estimation (MDE) model trained on a combination of synthetic labeled data and real-world unlabeled images.

  • Qwen3 MoE: The largest language models in the Qwen series, built on a Mixture-of-Experts (MoE) architecture for enhanced performance and efficiency.

  • MobileNetV5: A state-of-the-art vision encoder specifically designed for high-efficiency AI on edge devices.

  • SmolLM3: A compact yet powerful language model excelling in reasoning, long-context understanding, and multilingual capabilities.
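
A minimal sketch of loading one of these models through KerasHub's high-level task API. The preset handle below is hypothetical, not a confirmed name; see keras.io/keras_hub for the published preset list:

```python
import keras_hub

# Load a generative model by preset handle. "smollm3_3b_en" is a
# hypothetical handle used here for illustration only.
causal_lm = keras_hub.models.CausalLM.from_preset("smollm3_3b_en")
print(causal_lm.generate("The sky is blue because", max_length=64))
```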

Improvements & Enhancements

This update also includes several key improvements to enhance the platform's stability, compatibility, and flexibility:

  • export_to_transformers: You can now export trained models, tokenizers, and configurations directly into the Hugging Face Transformers format using export_to_transformers. This feature is currently available for Gemma models, with support for more architectures coming soon (see the sketch after this list).
  • OpenVINO Backend Support: We've integrated OpenVINO inference support, enabling optimized inference for Mistral, Gemma, and GPT-2 models.
  • Bidirectional Attention Mask: Gemma models now support a bidirectional attention mask, enabling more effective fine-tuning on tasks that require understanding the full context of a sequence.
  • CLIP & SD3 Model Refactor: The CLIP and Stable Diffusion 3 models have been refactored to improve numerical stability. Updated checkpoints are now available to ensure seamless and reliable performance.

What's Changed

New Contributors

Full Changelog: v0.22.2...v0.23.0

v0.23.0.dev0

20 Oct 23:35 · 4de2ff6

Pre-release

What's Changed

New Contributors

Full Changelog: v0.22.2...v0.23.0.dev0

v0.22.2

12 Sep 15:31 · f4b648d

New Model: VaultGemma

VaultGemma is a 1-billion-parameter, 26-layer, text-only decoder model trained with sequence-level differential privacy (DP).
Derived from Gemma 2, its architecture notably drops the norms after the Attention and MLP blocks and uses full attention for all layers, rather than alternating with local sliding attention.
The pretrained model is available with a 1024-token sequence length.
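
A minimal loading sketch; the preset handle below is hypothetical, so check the published model card for the exact name:

```python
import keras_hub

# "vault_gemma_1b_en" is an illustrative handle, not a confirmed preset
# name. The pretrained model uses a 1024-token context.
vault_gemma = keras_hub.models.CausalLM.from_preset("vault_gemma_1b_en")
print(vault_gemma.generate("Differential privacy means", max_length=128))
```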

What's Changed

Full Changelog: v0.22.1...v0.22.2

v0.22.1

15 Aug 18:59 · 56ba520

What's Changed

Full Changelog: v0.22.0...v0.22.1

v0.22.0

14 Aug 18:01

Summary:

New Models:

We've integrated a range of cutting-edge models, each designed to tackle specific challenges in its domain (a loading sketch follows the list):

  • Gemma 3 270M: Released the Gemma 3 270M-parameter base and instruction-tuned models: an 18-layer, text-only architecture designed for hyper-efficient AI, particularly task-specific fine-tuning.

  • Qwen3: A powerful, large-scale multilingual language model, excelling in various natural language processing tasks, from text generation to complex reasoning.

  • DeiT: Data-efficient Image Transformers (DeiT), specifically designed to train Vision Transformers effectively with less data, making high-performance visual models more accessible.

  • HGNetV2: An advanced version of the Hybrid-Grouped Network, known for its efficient architecture in computer vision tasks, particularly optimized for performance on diverse hardware.

  • DINOV2: A state-of-the-art Self-Supervised Vision Transformer, enabling the learning of robust visual representations without relying on explicit labels, ideal for foundation models.

  • ESM & ESM2: Evolutionary Scale Modeling (ESM & ESM2), powerful protein language models used for understanding protein sequences and structures, with ESM2 offering improved capabilities for bioinformatics research.

Improvements & Enhancements

This update also includes several key improvements to enhance the platform's stability, compatibility, and flexibility:

  • Python 3.10 Minimum Support: Updated the minimum supported Python version to 3.10, ensuring compatibility with the latest libraries and features.
  • Gemma Conversion (Keras to SafeTensors): Added a new conversion script to effortlessly convert Gemma models from Keras format to Hugging Face's Safetensor format.
  • Gemma3 Conversion Script: Added conversion script for Gemma3 models, streamlining their integration into the Hugging Face ecosystem.
  • ViT Non-Square Image Support: Enhanced the Vision Transformer (ViT) model to accept non-square images as input, providing greater flexibility for various computer vision applications (see the sketch after this list).
  • LLM Left Padding Method: Added support for left padding in our LLM padding methods, offering more control and compatibility for specific model architectures and inference requirements.
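
A sketch of the non-square ViT input support, assuming ViTBackbone's constructor takes the typical ViT-Base hyperparameters shown below; check the API docs for the exact argument names:

```python
import numpy as np
import keras_hub

# Build a ViT backbone on a non-square input. The hyperparameters are
# typical ViT-Base values, shown for illustration.
backbone = keras_hub.models.ViTBackbone(
    image_shape=(224, 448, 3),  # height and width no longer need to match
    patch_size=16,
    num_layers=12,
    num_heads=12,
    hidden_dim=768,
    mlp_dim=3072,
)
features = backbone(np.random.uniform(size=(1, 224, 448, 3)))
```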

What's Changed

Complete list of all the changes included in this release.

New Contributors

Full Changelog: v0.21.1...v0.22.0

For detailed documentation and usage guides, please refer to https://keras.io/keras_hub/

v0.22.0.dev0

13 Aug 18:32 · 4e3435f

Pre-release

What's Changed

New Contributors


v0.21.1

03 Jun 23:28 · c019e50

Summary:

  • Added comprehensive docstrings to QwenCausalLM, resolved integration test issues for Keras-IO, and added coverage tracking for Keras-Hub.

What's Changed

Full Changelog: v0.21.0...v0.21.1

v0.21.0

28 May 19:07 · 933efe6

Summary

  • New Models.

    • Xception: Added Xception architecture for image classification tasks.
    • Qwen: Added the Qwen2.5 large language models, with presets for base and instruction-tuned variants ranging from 0.5 to 72 billion parameters.
    • Qwen MoE: Added a transformer-based Mixture-of-Experts (MoE) decoder-only language model whose base variant activates 2.7B parameters at runtime.
    • Mixtral: Added Mixtral, a generative sparse Mixture-of-Experts (MoE) LLM, with pretrained and instruction-tuned variants having 7 billion activated parameters.
    • Moonshine: Added Moonshine, a speech recognition task model.
    • CSPNet: Added Cross Stage Partial Network (CSPNet) classification task model.
    • Llama3: Added support for Llama 3.1 and 3.2.
  • Added sharded weight support to KerasPresetSaver and KerasPresetLoader, defaulting to a 10GB maximum shard size (see the sketch below).
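
A sketch of sharded preset saving; the max_shard_size argument (in GB) is an assumption based on the 10GB default described above, and "gemma_2b_en" is an existing preset used for illustration:

```python
import keras_hub

# Save a large model as a sharded preset. The max_shard_size argument
# (in GB) is an assumption based on the 10GB default described above.
model = keras_hub.models.CausalLM.from_preset("gemma_2b_en")
model.save_to_preset("./gemma_preset", max_shard_size=5)

# from_preset transparently reassembles the shards on load.
reloaded = keras_hub.models.CausalLM.from_preset("./gemma_preset")
```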

What's Changed

New Contributors

Full Changelog: v0.20.0...v0.21.0

v0.20.0

03 Apr 23:48 · d907fed

What's Changed

New Contributors

Full Changelog: v0.19.3...v0.20.0

v0.20.0.dev1

03 Apr 19:11 · 50807f2

Pre-release

What's Changed

Full Changelog: v0.20.0.dev0...v0.20.0.dev1