v0.23.0

Latest

@sachinprasadhs released this 21 Oct 17:34
2dc4d39

Summary:

New Models:

This release adds the following models, each targeting a specific domain:

  • Cell2Sentence: A single-cell, biology-aware model built on the Gemma-2 architecture, designed to interpret complex biological data.

  • T5Gemma: A new encoder-decoder model, ideal for sequence-to-sequence tasks like translation and summarization.

  • PARSeq: An end-to-end, ViT-based model for scene text recognition (STR), excelling at reading text in natural images.

  • D-FINE: A high-performance, real-time object detection model.

  • DepthAnythingV2: A monocular depth estimation (MDE) model trained on a combination of synthetic labeled data and real-world unlabeled images.

  • Qwen3 MoE: The largest language model in the Qwen series, using a Mixture-of-Experts (MoE) architecture for enhanced performance and efficiency.

  • MobileNetV5: A state-of-the-art vision encoder specifically designed for high-efficiency AI on edge devices.

  • SmolLM3: A compact yet powerful language model excelling in reasoning, long-context understanding, and multilingual capabilities.

Improvements & Enhancements

This release also improves export compatibility, backend support, and numerical stability:

  • export_to_transformers: Trainable models, tokenizers, and configurations can now be exported directly to the Hugging Face Transformers format. This is currently available for Gemma models, with support for more architectures planned.
  • OpenVINO Backend Support: Mistral, Gemma, and GPT-2 models can now run optimized inference on the OpenVINO backend.
  • Bidirectional Attention Mask: Gemma models now support a bidirectional attention mask, enabling more effective fine-tuning on tasks that require understanding the full context of a sequence.
  • CLIP & SD3 Model Refactor: The CLIP and Stable Diffusion 3 models have been refactored to improve numerical stability, and updated checkpoints are available.
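
To illustrate what the bidirectional attention mask item changes conceptually, here is a minimal sketch in plain Python. It is not KerasHub's implementation or API: the function names (`causal_mask`, `bidirectional_mask`, `show`) and the 4-token example are hypothetical, chosen only to contrast the two masking schemes.

```python
# Conceptual sketch only; not KerasHub's implementation or API.
# mask[i][j] is True when query position i may attend to key position j.


def causal_mask(padding_mask):
    """Each real token attends only to itself and earlier real tokens."""
    n = len(padding_mask)
    return [
        [padding_mask[i] and padding_mask[j] and j <= i for j in range(n)]
        for i in range(n)
    ]


def bidirectional_mask(padding_mask):
    """Each real token attends to every real token, earlier or later."""
    return [[q and k for k in padding_mask] for q in padding_mask]


def show(mask):
    """Render a mask as rows of '1' (visible) and '.' (hidden)."""
    return "\n".join(" ".join("1" if m else "." for m in row) for row in mask)


# Hypothetical 4-token sequence whose last position is padding.
padding = [True, True, True, False]
causal = causal_mask(padding)
bidirectional = bidirectional_mask(padding)

print("causal:")
print(show(causal))
print("bidirectional:")
print(show(bidirectional))
```

With the causal mask, the first token cannot see the tokens that follow it. With the bidirectional mask, every non-padding token sees the whole sequence, which is the property that the item above describes as useful for tasks needing full-sequence context.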

What's Changed

New Contributors

Full Changelog: v0.22.2...v0.23.0