Releases: keras-team/keras-hub
v0.11.0
Summary
This release has no major feature updates, but changes where our source code is held. Source code is now split into src/ and api/ directories, with an explicit API surface similar to core Keras.
When a PR adds or removes API symbols, run ./shell/api_gen.sh to update the autogenerated api/ files. See our contributing guide.
What's Changed
- Change the order of importing `keras` by @james77777778 in #1596
- Add backend info to HF model card by @SamanehSaadat in #1599
- Bump required kagglehub version to 0.2.4 by @SamanehSaadat in #1600
- Bump `bert_tiny_en_uncased_sst2` classifier version by @SamanehSaadat in #1602
- Allow a task preprocessor to be an argument in from_preset by @SamanehSaadat in #1603
- API Generation by @sampathweb in #1608
- Update readme with some recent changes by @mattdangerw in #1575
- Bump the python group with 2 updates by @dependabot in #1611
- Version bump 0.11.0.dev0 by @mattdangerw in #1615
- Unexport models from the 0.11 release by @mattdangerw in #1614
- Version bump 0.11.0 by @mattdangerw in #1616
New Contributors
- @james77777778 made their first contribution in #1596
Full Changelog: v0.10.0...v0.11.0
v0.10.0
Summary
- Added support for `Task` (`CausalLM` and `Classifier`) saving and loading, which allows uploading `Task`s.
- Added basic Model Card for Hugging Face upload.
- Added support for a `positions` array in our `RotaryEmbedding` layer.
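To picture what an explicit `positions` array changes, here is a small dependency-free sketch of rotary position embeddings. It is not the KerasNLP implementation: the function name `rotary_embed`, the half-split pairing of features, and the `max_wavelength` default are all assumptions made for illustration.

```python
import math


def rotary_embed(vectors, positions=None, max_wavelength=10000.0):
    """Rotate feature pairs of each vector by a position-dependent angle.

    Illustrative only: `vectors` is a list of equal-length lists with an even
    feature count; `positions` defaults to 0, 1, 2, ...
    """
    if positions is None:
        positions = list(range(len(vectors)))
    rotated = []
    for vec, pos in zip(vectors, positions):
        half = len(vec) // 2
        out = [0.0] * len(vec)
        for i in range(half):
            # Pair i rotates at frequency max_wavelength ** (-i / half).
            angle = pos * max_wavelength ** (-i / half)
            x1, x2 = vec[i], vec[i + half]
            out[i] = x1 * math.cos(angle) - x2 * math.sin(angle)
            out[i + half] = x1 * math.sin(angle) + x2 * math.cos(angle)
        rotated.append(out)
    return rotated


tokens = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
default = rotary_embed(tokens)                    # implicit positions 0 and 1
shifted = rotary_embed(tokens, positions=[5, 6])  # same tokens, later slots
```

Passing positions explicitly is useful when the tokens being embedded do not start at index 0, for example when decoding one token at a time against a cache.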
What's Changed
- 0.9 is out, nightly should be a preview of 0.10 now by @mattdangerw in #1570
- Do the reverse embedding in the same dtype as the input embedding by @mattdangerw in #1548
- Add support for positions array in `keras_nlp.layers.RotaryEmbedding` layer by @tirthasheshpatel in #1571
- Support Task Saving/Loading by @SamanehSaadat in #1547
- Improve error handling for non-keras model loading attempts by @SamanehSaadat in #1577
- Add Model Card for Hugging Face Upload by @SamanehSaadat in #1578
- Add Saving Tests by @SamanehSaadat in #1590
- Improve error handling for missing TensorFlow dependency in keras_nlp. by @SamanehSaadat in #1585
- Fix Keras import by @sampathweb in #1593
- Check kagglehub version before upload by @SamanehSaadat in #1594
- Version bump to 0.10.0.dev0 by @SamanehSaadat in #1595
- Version bump 0.10.0.dev1 by @SamanehSaadat in #1601
- Version bump to 0.10.0.dev2 by @SamanehSaadat in #1604
- Version bump to 0.10.0 by @SamanehSaadat in #1606
Full Changelog: v0.9.3...v0.10.0
v0.9.3
Patch release with fixes for Llama and Mistral saving.
What's Changed
- Fix saving bug for untied weights with keras 3.2 by @mattdangerw in #1568
- Version bump for dev release by @mattdangerw in #1569
- Version bump 0.9.3 by @mattdangerw in #1572
Full Changelog: v0.9.2...v0.9.3
v0.9.2
Summary
- Initial release of CodeGemma.
- Bump to a Gemma 1.1 version without download issues on Kaggle.
What's Changed
- Fix `print_fn` issue in task test by @SamanehSaadat in #1563
- Update presets for code gemma by @mattdangerw in #1564
- version bump 0.9.2.dev0 by @mattdangerw in #1565
- Version bump 0.9.2 by @mattdangerw in #1566
Full Changelog: v0.9.1...v0.9.2
v0.9.1
Patch fix for bug with stop_token_ids.
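As a rough picture of what a `stop_token_ids` argument does, the sketch below ends a generation loop as soon as any one of several stop tokens is produced. The helper `generate_until_stop` and the toy `count_up` model are hypothetical and not part of KerasNLP; keeping the stop token in the output is a choice made for this sketch only.

```python
def generate_until_stop(next_token_fn, prompt_ids, max_length, stop_token_ids=()):
    """Append tokens until `max_length` is reached or a stop token appears."""
    ids = list(prompt_ids)
    stop = set(stop_token_ids)
    while len(ids) < max_length:
        token = next_token_fn(ids)
        ids.append(token)
        if token in stop:
            break  # Any one of the stop tokens ends generation.
    return ids


# A toy "model" that always emits the previous token plus one.
def count_up(ids):
    return ids[-1] + 1


print(generate_until_stop(count_up, [1], max_length=10, stop_token_ids=(4, 7)))
# [1, 2, 3, 4]
```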
What's Changed
- Fix the new stop_token_ids argument by @mattdangerw in #1558
- Fix tests with the "auto" default for stop token ids by @mattdangerw in #1559
- Version bump for 0.9.1 by @mattdangerw in #1560
Full Changelog: v0.9.0...v0.9.1
v0.9.0
The 0.9.0 release adds new models, hub integrations, and general usability improvements.
Summary
- Added the Gemma 1.1 release.
- Added the Llama 2, BLOOM and ELECTRA models.
- Expose new base classes. Allow `from_preset()` on base classes.
  - `keras_nlp.models.Backbone`
  - `keras_nlp.models.Task`
  - `keras_nlp.models.Classifier`
  - `keras_nlp.models.CausalLM`
  - `keras_nlp.models.Seq2SeqLM`
  - `keras_nlp.models.MaskedLM`
- Some initial features for uploading to model hubs.
  - `backbone.save_to_preset`, `tokenizer.save_to_preset`, `keras_nlp.upload_preset`.
  - `from_preset` and `upload_preset` now work with the Hugging Face Models Hub.
  - More features (task saving, lora saving), and full documentation coming soon.
- Numerical fixes for the Gemma model at mixed_bfloat16 precision. Thanks unsloth for catching!
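One way to picture `from_preset()` on a base class is as a lookup that returns an instance of whichever subclass owns the requested preset. The toy classes below are a sketch of that idea only. The per-class `presets` dictionary and the subclass search are assumptions, and the release notes do not describe how KerasNLP resolves the class.

```python
class Task:
    """Toy base class; subclasses declare the presets they can load."""

    presets = {}  # Overridden per subclass: preset name -> config dict.

    def __init__(self, preset, **kwargs):
        self.preset = preset
        self.kwargs = kwargs

    @classmethod
    def from_preset(cls, preset, **kwargs):
        # Called on a concrete class that owns the preset: build it directly.
        if preset in cls.__dict__.get("presets", {}):
            return cls(preset, **kwargs)
        # Called on a base class: search subclasses for one owning the preset.
        for subclass in cls.__subclasses__():
            try:
                return subclass.from_preset(preset, **kwargs)
            except ValueError:
                continue
        raise ValueError(f"Unknown preset {preset!r} for {cls.__name__}")


class CausalLM(Task):
    pass


class GPT2CausalLM(CausalLM):
    presets = {"gpt2_base_en": {}}


class LlamaCausalLM(CausalLM):
    presets = {"llama2_7b_en": {}}


model = CausalLM.from_preset("gpt2_base_en", dtype="bfloat16")
print(type(model).__name__)  # GPT2CausalLM
```

In this sketch, calling `from_preset` on a concrete class with another model's preset raises `ValueError` instead of silently returning a different class.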
```python
import keras_nlp

# Llama 2. Needs Kaggle consent and login, see https://github.com/Kaggle/kagglehub
causal_lm = keras_nlp.models.LlamaCausalLM.from_preset(
    "llama2_7b_en",
    dtype="bfloat16",  # Run at half precision for inference.
)
causal_lm.generate("Keras is a", max_length=128)

# Base class usage.
keras_nlp.models.Classifier.from_preset("bert_base_en", num_classes=2)
keras_nlp.models.Tokenizer.from_preset("gemma_2b_en")
keras_nlp.models.CausalLM.from_preset("gpt2_base_en", dtype="mixed_bfloat16")
```
What's Changed
- Add dtype arg to Gemma HF conversion script by @nkovela1 in #1452
- Fix gemma testing import by @mattdangerw in #1462
- Add docstring for PyTorch conversion script install instructions by @nkovela1 in #1471
- Add an annotation to tests that need kaggle auth by @mattdangerw in #1470
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Bump the master version to 0.9 by @mattdangerw in #1473
- Pin to TF 2.16 RC0 by @sampathweb in #1478
- Fix gemma rms_normalization's use of epsilon by @cpsauer in #1472
- Add `FalconBackbone` by @SamanehSaadat in #1475
- CI - Add kaggle creds to pull model by @sampathweb in #1459
- bug in example for ReversibleEmbedding by @TheCrazyT in #1484
- doc fix for constrastive sampler by @mattdangerw in #1488
- Remove broken link to masking and padding guide by @mattdangerw in #1487
- Fix a typo in causal_lm_preprocessors by @SamanehSaadat in #1489
- Fix dtype accessors of tasks/backbones by @mattdangerw in #1486
- Auto-labels 'gemma' on 'gemma' issues/PRs. by @shmishra99 in #1490
- Add BloomCausalLM by @abuelnasr0 in #1467
- Remove the bert jupyter conversion notebooks by @mattdangerw in #1492
- Add `FalconTokenizer` by @SamanehSaadat in #1485
- Add `FalconPreprocessor` by @SamanehSaadat in #1498
- Rename 176B presets & Add other presets into bloom_presets.py by @abuelnasr0 in #1496
- Add bloom presets by @abuelnasr0 in #1501
- Create workflow for auto assignment of issues and for stale issues by @sachinprasadhs in #1495
- Update requirements to TF 2.16 by @sampathweb in #1503
- Expose Task and Backbone by @mattdangerw in #1506
- Clean up and add our gemma conversion script by @mattdangerw in #1493
- Don't auto-update JAX GPU by @sampathweb in #1507
- Keep rope at float32 precision by @grasskin in #1497
- Bump the python group with 2 updates by @dependabot in #1509
- Fixes for the LLaMA backbone + add dropout by @tirthasheshpatel in #1499
- Add `LlamaPreprocessor` and `LlamaCausalLMPreprocessor` by @tirthasheshpatel in #1511
- Always run the rotary embedding layer in float32 by @tirthasheshpatel in #1508
- CI: Fix psutil - Remove install of Python 3.9 and alias of python3 by @sampathweb in #1514
- Update gemma_backbone.py for sharding config. by @qlzh727 in #1491
- Docs/modelling layers by @mykolaskrynnyk in #1502
- Standardize docstring by @sachinprasadhs in #1516
- Support tokenization of special tokens for word_piece_tokenizer by @abuelnasr0 in #1397
- Upload Model to Kaggle by @SamanehSaadat in #1512
- Add scoring mode to MistralCausalLM by @RyanMullins in #1521
- Add Mistral Instruct V0.2 preset by @tirthasheshpatel in #1520
- Add Tests for Kaggle Upload Validation by @SamanehSaadat in #1524
- Add presets for Electra and checkpoint conversion script by @pranavvp16 in #1384
- Allow saving / loading from Huggingface Hub preset by @Wauplin in #1510
- Stop on multiple end tokens by @grasskin in #1518
- Fix doc: `mistral_base_en` -> `mistral_7b_en` by @asmith26 in #1528
- Add lora example to GemmaCausalLM docstring by @SamanehSaadat in #1527
- Add LLaMA Causal LM with 7B presets by @tirthasheshpatel in #1526
- Add task base classes; support out of tree library extensions by @mattdangerw in #1517
- Doc fixes by @mattdangerw in #1530
- Run the LLaMA and Mistral RMS Layer Norm in float32 by @tirthasheshpatel in #1532
- Adds score API to GPT-2 by @RyanMullins in #1533
- increase pip timeout to 1000s to avoid connection resets by @sampathweb in #1535
- Adds the score API to LlamaCausalLM by @RyanMullins in #1534
- Implement compute_output_spec() for tokenizers with vocabulary. by @briango28 in #1523
- Remove staggler type annotiations by @mattdangerw in #1536
- Always run SiLU activation in float32 for LLaMA and Mistral by @tirthasheshpatel in #1540
- Bump the python group with 2 updates by @dependabot in #1538
- Disallow saving to preset from keras 2 by @SamanehSaadat in #1545
- Fix the rotary embedding computation in LLaMA by @tirthasheshpatel in #1544
- Fix re-compilation bugs by @mattdangerw in #1541
- Fix preprocessor from_preset bug by @mattdangerw in #1549
- Fix a strange issue with preprocessing layer output types by @mattdangerw in #1550
- Fix lowercase bug in wordpiece tokenizer by @abuelnasr0 in #1543
- Small docs updates by @mattdangerw in #1553
- Add a few new preset for gemma by @mattdangerw in #1556
- Remove the dev prefix for 0.9.0 release by @mattdangerw in #1557
New Contributors
- @cpsauer made their first contribution in #1472
- @SamanehSaadat made their first contribution in #1475
- @TheCrazyT made their first contribution in #1484
- @shmishra99 made their first contribution in #1490
- @sachinprasadhs made their first contribution in #1495
- @mykolaskrynnyk made their first contribution in #1502
- @RyanMullins made their first contribution in #1521
- @Wauplin made their first contribution in #1510
- @asmith26 made their first contribution in #1528
- @briango28 made their first contribution in #1523
Full Changelog: v0.8.2...v0.9.0
v0.8.2
Summary
- Mistral fixes for dtype and memory usage. #1458
What's Changed
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Version bump for dev release by @mattdangerw in #1474
Full Changelog: v0.8.1...v0.8.2.dev0
v0.8.1
Minor fixes to Kaggle Gemma assets.
What's Changed
- Update to the newest version of Gemma on Kaggle by @mattdangerw in #1454
- Dev release 0.8.1.dev0 by @mattdangerw in #1456
- 0.8.1 version bump by @mattdangerw in #1457
Full Changelog: v0.8.0...v0.8.1
v0.8.0
The 0.8.0 release focuses on generative LLM features in KerasNLP.
Summary
- Added the `Mistral` and `Gemma` models.
- Allow passing `dtype` directly to backbone and task constructors.
- Add a settable `sequence_length` property to all preprocessing layers.
- Added `enable_lora()` to the backbone class for parameter efficient fine-tuning.
- Added layer attributes to backbone models for easier access to model internals.
- Added `AlibiBias` layer.
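For readers new to LoRA, the arithmetic behind parameter efficient fine-tuning can be sketched in plain Python: the frozen kernel `W` is left untouched and a trainable low-rank product `A @ B` is added on top. This is a generic illustration of the technique, not KerasNLP's code. The helper names and the `alpha / rank` scaling are assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, frozen_w, lora_a, lora_b, alpha=1.0):
    """Compute x @ W + (alpha / rank) * x @ A @ B with W left untouched."""
    rank = len(lora_b)  # B has shape (rank, out_dim).
    base = matmul(x, frozen_w)
    update = matmul(matmul(x, lora_a), lora_b)
    scale = alpha / rank
    return [[b + scale * u for b, u in zip(br, ur)] for br, ur in zip(base, update)]


x = [[1.0, 2.0]]                      # One input with 2 features.
w = [[1.0, 0.0], [0.0, 1.0]]          # Frozen 2x2 kernel.
a = [[1.0], [1.0]]                    # Trainable 2x1 down-projection.
b_zero = [[0.0, 0.0]]                 # Trainable 1x2 up-projection, zero-init.
print(lora_forward(x, w, a, b_zero))  # [[1.0, 2.0]]
```

With `B` initialized to zeros the adapted layer starts out identical to the frozen one, and only the `rank * (in_dim + out_dim)` entries of `A` and `B` need gradients.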
```python
import keras_nlp

# Pass dtype to a model.
causal_lm = keras_nlp.models.MistralCausalLM.from_preset(
    "mistral_instruct_7b_en",
    dtype="bfloat16",
)
# Settable sequence length property.
causal_lm.preprocessor.sequence_length = 128
# Lora API (enable_lora() lives on the backbone, per the summary above).
causal_lm.backbone.enable_lora(rank=4)
# Easy layer attributes.
for layer in causal_lm.backbone.transformer_layers:
    print(layer.count_params())
```
What's Changed
- Fix test for recent keras 3 change by @mattdangerw in #1400
- Pass less state to jax generate function by @mattdangerw in #1398
- Add llama tokenizer by @mattdangerw in #1401
- Add Bloom Model by @abuelnasr0 in #1382
- Try fixing tests by @mattdangerw in #1411
- Revert "Pass less state to jax generate function (#1398)" by @mattdangerw in #1412
- Bloom tokenizer by @abuelnasr0 in #1403
- Update black formatting by @mattdangerw in #1415
- Add Alibi bias layer by @abuelnasr0 in #1404
- Pin to `tensorflow-hub 0.16.0` to fix CI error by @sampathweb in #1420
- Update TF Text and remove TF Hub deps by @sampathweb in #1423
- Pin Jax Version in GPU CI by @sampathweb in #1430
- Add Bloom preprocessor by @abuelnasr0 in #1424
- Add layer attributes for all functional models by @mattdangerw in #1421
- Allow setting dtype per model by @mattdangerw in #1431
- Add a Causal LM model for Mistral by @tirthasheshpatel in #1429
- Fix bart by @mattdangerw in #1434
- Add a settable property for sequence_length by @mattdangerw in #1437
- Add dependabot to update GH Actions and Python dependencies by @pnacht in #1380
- Bump the github-actions group with 1 update by @dependabot in #1438
- Add 7B presets for Mistral by @tirthasheshpatel in #1436
- Update byte_pair_tokenizer.py to close merges file properly by @divyashreepathihalli in #1440
- bump version to 0.8 by @mattdangerw in #1441
- Update our sampler documentation to reflect usage by @mattdangerw in #1444
- Add Gemma model by @mattdangerw in #1448
- Version bump for dev release by @mattdangerw in #1449
- Version bump to 0.8.0 by @mattdangerw in #1450
New Contributors
- @dependabot made their first contribution in #1438
- @divyashreepathihalli made their first contribution in #1440
Full Changelog: v0.7.0...v0.8.0
v0.17.0.dev0
Summary
- 📢 KerasNLP and KerasCV are now becoming KerasHub 📢. KerasCV and KerasNLP have been consolidated into the KerasHub package.
- Models available now in KerasHub are albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
- A new preprocessor flow has been added for vision and audio models
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and `self` in calls to `super()` by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create `Backbone` base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix `value_dim` in `TransformerDecoder`'s cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose `token_embedding` as a Backbone Property by @abheesht17 in #676
- Move `from_preset` to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add `AlbertClassifier` by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to `GPT2Preprocessor` by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once end_t...