
Releases: knaw-huc/loghi-htr

v2.0.4

05 Jun 10:19
2bd9abd
Version 2.0.4 released

2.0.3

26 Apr 07:31
2bd9abd

Release Notes for Loghi-HTR Version 2.0.3

Date: 2024-04-26

Overview

This release introduces a minor improvement to the error handling mechanism when dealing with text partitions.

Additional Improvements

  • Improved Error Handling: The software now raises a ValueError if a specified partition contains no valid text lines. This update ensures that users receive a specific and actionable error message early in the processing pipeline, preventing unnecessary processing and clarifying the exact nature of the issue. This improvement is critical for those working with large and varied text datasets where partition integrity is crucial.
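
The check described above can be pictured with a minimal sketch. The function name `load_partition` and the exact error message are hypothetical and only illustrate the idea of failing early; they are not taken from the Loghi-HTR source.

```python
def load_partition(name, lines):
    """Return the usable text lines of a partition, or fail early.

    Hypothetical helper: a line counts as "valid" here if it is non-empty
    after stripping whitespace.
    """
    valid = [line.strip() for line in lines if line.strip()]
    if not valid:
        # Fail before any expensive processing starts.
        raise ValueError(f"Partition '{name}' contains no valid text lines")
    return valid
```

Raising at load time means a misconfigured partition is reported before any model is built or trained.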

Contributors

  • @TimKoornstra: Enhanced the error handling mechanism in this version, ensuring the software provides more accurate feedback to users during data preparation phases.

Full Changelog: 2.0.2...2.0.3

2.0.2

18 Apr 12:31
a8947ec

Release Notes for Loghi-HTR Version 2.0.2

Date: 2024-04-18

Overview

Version 2.0.2 enhances Loghi-HTR with improved TensorFlow strategy automation, API error handling, memory management, and security updates.

Major Updates

  • Automated TensorFlow Distribution Strategy: Adjusts TensorFlow strategy based on the GPU count, optimizing for various hardware configurations.
  • Enhanced API Error Handling:
    • Issues a 400 response code for input validation errors, improving API reliability.
    • Handles zero-byte images to prevent processing errors.
  • Memory Management Overhaul: Addresses a memory leak in API by shifting to numpy objects for better garbage collection.
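
As a rough illustration of the input validation described above, the sketch below maps an uploaded payload to a status code. The function `validate_image_upload` and its messages are assumptions made for this example; the actual API code may be structured differently.

```python
def validate_image_upload(data):
    """Return (status_code, message) for an uploaded image payload.

    Hypothetical helper: rejects missing and zero-byte payloads with 400
    so they never reach the prediction queue.
    """
    if data is None:
        return 400, "missing image"
    if len(data) == 0:
        return 400, "image is empty (zero bytes)"
    return 200, "ok"
```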

Additional Improvements

  • Gunicorn Security Update: Upgraded to the latest version as per security advisory CVE-2024-1135.
  • Bug Fix - assert_cardinality Warning: Resolved dataset processing warnings to ensure smoother operations.

Docker Image

The Docker image for version 2.0.2 can be obtained using:

docker pull loghi/docker.htr:2.0.2

Contributors

  • @TimKoornstra: Implemented all updates and fixes in this release.

Full Changelog: 2.0.1...2.0.2

2.0.1

10 Apr 14:01
718a115

Release Notes for Loghi-HTR Version 2.0.1

Date: 2024-04-10

Overview

Version 2.0.1 of Loghi-HTR corrects a bug in the CTC loss calculation and updates the README to document a configuration file that models require.

Major Updates

  • CTC Loss Calculation Bug Fix: Addressed an issue affecting the accuracy of the CTC loss calculation under specific dataset and batch size conditions, ensuring more reliable model training outcomes.

Additional Improvements

  • README Update for Model Configuration: The README now includes an important note on the necessity of a config.json file within the LOGHI_MODEL_PATH, specifically for the channels key, to ensure model compatibility and optimal performance.
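
A quick way to verify a model directory before starting the API is sketched below. It assumes that `channels` is a top-level key in config.json; the real file layout may differ, and `read_channels` is a hypothetical helper, not part of Loghi-HTR.

```python
import json
import os


def read_channels(model_path):
    """Read the 'channels' value from <model_path>/config.json.

    Assumes a top-level "channels" key (an assumption for this sketch).
    """
    config_path = os.path.join(model_path, "config.json")
    if not os.path.isfile(config_path):
        raise FileNotFoundError(f"No config.json found in {model_path}")
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    if "channels" not in config:
        raise KeyError("config.json is missing the 'channels' key")
    return int(config["channels"])
```

It could be called with the directory that LOGHI_MODEL_PATH points to, for example `read_channels(os.environ["LOGHI_MODEL_PATH"])`.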

Contributors

  • @TimKoornstra: Key contributions to bug resolution and documentation improvements.

Full Changelog: 2.0.0...2.0.1

2.0.0

04 Apr 12:06

Release Notes for Loghi-HTR Version 2.0.0

Date: 2024-04-04

Overview

Version 2.0.0 of Loghi-HTR is a major release of our handwritten text recognition software, with changes spanning data processing, model architecture, user interaction, and system efficiency. Key updates include visualization tools for in-depth model analysis, a modular code structure that is easier to navigate, and a second version of the API designed for higher performance and better resource management. GPU handling, data loading, and augmentation have been reworked for better performance. Configuration handling and logging have been revamped to be more user-friendly, custom learning rate schedules have been added, and overall code quality has improved. Several features and arguments are deprecated to streamline operations and prepare for future development.

Major Updates

  • Modular Code Structure: Significantly improved organization with functions grouped into subfolders within the src directory, aiding in maintainability.
  • API v2:
    • Improved support of gunicorn. This changes how the API should be started. For reference, check the example scripts in the src/api directory.
    • API refactored for efficiency. Key enhancements include:
      • Simplified queue system for faster processing.
      • New /health and /ready endpoints to monitor overall API and process status.
      • Optional user login through SimpleSecurity integration.
      • Separate decoding process for better GPU utilization.
    • See the updated README for detailed API changes and instructions.
  • Robust Logging: Streamlined logging with a more structured system, comprehensive validation logs including metric tables, and execution timers.
  • Improved Configuration Handling:
    • Run Loghi using a configuration file (--config_file) for greater flexibility.
    • Command-line arguments override config file settings for easy adjustments.
    • Revamped config.json structure for improved readability.
  • Enhanced Visualizations:
    • Time-step prediction visualizer: Highlights the top-3 most probable characters considered by the model at each time-step.
    • Filter activations visualizer: Shows how convolutional layers respond to input images and random noise, enabling analysis of different model architectures.
    • PDF combiner: Creates a single-sheet export of all generated visualizations.
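
The precedence rule for configuration described above (file values first, command-line arguments on top) can be sketched with argparse. Only `--config_file` comes from these notes; the other arguments, their defaults, and the flat JSON layout are assumptions for illustration and do not reflect the real config.json structure.

```python
import argparse
import json


def parse_args(argv):
    """Parse arguments so that CLI values override config-file values."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file", default=None)
    # Illustrative arguments with illustrative defaults.
    parser.add_argument("--batch_size", type=int, default=4)
    parser.add_argument("--epochs", type=int, default=40)

    # First pass: only find out whether a config file was given.
    known, _ = parser.parse_known_args(argv)
    if known.config_file:
        with open(known.config_file, encoding="utf-8") as f:
            # File values replace the built-in defaults.
            parser.set_defaults(**json.load(f))

    # Second pass: anything given explicitly on the command line wins.
    return parser.parse_args(argv)
```

With a file containing `{"batch_size": 16, "epochs": 10}`, calling `parse_args(["--config_file", path, "--epochs", "5"])` keeps `batch_size` at 16 from the file and takes `epochs` as 5 from the command line.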

Additional Improvements

  • Custom Learning Rate Schedule: Supports warmup, exponential decay, and linear decay.
  • GPU Handling Refinements
  • Revamped Data Loaders and Augmentations:
    • Data management classes refactored (DataLoader is now DataManager).
    • Data augmentations performed on the GPU for significant performance boost.
  • Code Quality Enhancements: Code simplifications, bug fixes, and improvements.
  • User Experience Improvements: The vis_arg_parser aligns with loghi-htr for a familiar command-line experience.
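
To make the learning rate schedule options above concrete, here is a small framework-free sketch of linear warmup followed by either exponential or linear decay. The function name, parameters, and exact formulas are assumptions for illustration; the schedule shipped with Loghi-HTR may define them differently.

```python
def learning_rate(step, base_lr, warmup_steps, total_steps,
                  decay="exponential", decay_rate=0.99):
    """Return the learning rate for a given training step (illustrative)."""
    # Warmup: ramp linearly up to base_lr over the first warmup_steps steps.
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps

    t = step - warmup_steps  # steps elapsed since warmup ended
    if decay == "exponential":
        return base_lr * decay_rate ** t
    if decay == "linear":
        remaining = max(total_steps - warmup_steps, 1)
        return base_lr * max(0.0, 1.0 - t / remaining)
    raise ValueError(f"Unknown decay type: {decay}")
```

Warmup keeps early updates small while the weights are still far from a useful region; the decay phase then reduces the step size as training progresses.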

Deprecations (Effective May 2024)

Several Loghi-HTR arguments are being deprecated to streamline functionality and improve the user experience. The changes and the reasoning behind them are summarized below:

  • --do_train: Future training processes will be initiated through a more flexible method by providing a train_list. This change allows for a more intuitive setup for training sessions.

  • --do_inference: Inference will be activated by supplying an inference_list, simplifying the command line interface and making it more intuitive to perform inferences.

  • --use_mask: Masking will be enabled by default, removing the need for explicit command-line toggling and reflecting the common use case directly in the application's behavior.

  • --no_auto: This argument will be removed to streamline the command line options, as auto-correction or similar functionalities will be incorporated more seamlessly into the application's logic.

  • --height: The height parameter will be inferred automatically from the VGSL specification, simplifying model configuration and ensuring consistency across model inputs.

  • --channels: Like height, the number of channels will be automatically inferred from the VGSL specification, reducing the need for manual specification and potential errors.

  • --output_charlist: The character list will be saved to output/charlist.txt by default, standardizing output file locations and reducing command line clutter.

  • --config_file_output: Configuration details will be saved to output/config.json by default, aligning with the standardized approach for output management.

  • --thaw: With models being saved with all layers thawed by default, this argument becomes unnecessary, simplifying model saving and loading processes.

  • --existing_model: The use of --existing_model will be replaced by the --model argument, streamlining the process of loading or creating models.

Additionally, we are phasing out support for the classic .pb-style TensorFlow SavedModel format. Starting May 2024, Loghi-HTR will automatically convert any model loaded in the old .pb format to the new .keras format and save the converted model to the specified output/model-name directory. This change keeps Loghi-HTR aligned with the latest model format.

Docker Image

The Docker image for version 2.0.0 can be obtained using the following command:

docker pull loghi/docker.htr:2.0.0

Important Notes

  • Due to the significant changes, please test your workflows thoroughly and report issues.
  • We have strived for a smooth update, but some disruptions may occur. If you encounter problems, please open an issue on the project's GitHub repository.

Contributors

  • @TimKoornstra: A major force behind this release, Tim contributed to several key areas including the main refactor and organization of files, the introduction of an improved learning rate schedule, enhancements in argument handling and configuration, the development of API v2, and numerous quality-of-life and code quality improvements. His contributions have been instrumental in shaping the direction and capabilities of Loghi-HTR 2.0.0.

  • @Thelukepet: Contributed to revamping visualization files and played a pivotal role in the V1 DataGenerator and Data Augmentation Revamp on GPU. These contributions have significantly improved data handling and model visualization capabilities.

  • @MMaas3: Made a notable first contribution by enhancing security features. This addition is crucial for the secure and reliable operation of Loghi-HTR.


Full Changelog: 1.3.12...2.0.0

1.3.12

22 Mar 09:06
96aefa9

Release Notes for Loghi-HTR Version 1.3.12

Date: 2024-03-22

Overview

Version 1.3.12 of Loghi-HTR introduces several enhancements and bug fixes to improve data loading, augmentation, model handling, and confidence score calculations.

Enhancements

  • DataLoader Improvements:
    • The DataLoader now skips lines that are empty after stripping, ensuring cleaner data processing.
  • Random JPEG Augmentation Adjustments:
    • The --random_jpeg augmentation has been adjusted to be less extreme, providing more realistic augmentations.
  • Existing Model Channel Resetting:
    • When using the --existing_model option, the channels are now always reset to ensure consistent model behavior.

Bug Fixes

  • Confidence Score Clamping:
    • Fixed a bug where confidence scores could exceed 1 due to precision errors. All confidence scores are now clamped to the range [0, 1]. A warning is logged whenever this clamping occurs.
  • SavedModel Format Conversion:
    • The SavedModel format is now converted and saved to the new .keras format in the output/model.name directory. Starting from May 2024, the legacy format will only be usable for inference.
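
The confidence clamping described above amounts to a few lines. The sketch below is an illustration under assumed names (`clamp_confidence`, the log message), not the project's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def clamp_confidence(value):
    """Clamp a confidence score to [0, 1], logging when clamping happens."""
    clamped = min(1.0, max(0.0, value))
    if clamped != value:
        # Floating-point rounding can push a score slightly out of range.
        logger.warning("Confidence %r outside [0, 1]; clamped to %r",
                       value, clamped)
    return clamped
```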

Contributors

  • @rvankoert: Responsible for the DataLoader improvements, random JPEG augmentation adjustments, and existing model channel resetting.
  • @TimKoornstra: Responsible for the confidence score clamping and SavedModel format conversion.

Full Changelog: 1.3.8...1.3.12

1.3.8

19 Jan 10:18
6c80561

Release Notes for Loghi-HTR Version 1.3.8

Date: 2024-01-19

Overview

Version 1.3.8 of Loghi-HTR introduces new features and updates that improve testing procedures, handle Out-of-Vocabulary (OOV) words more effectively, and refine the data normalization and validation processes.

New Features

  • Enabling Test List Usage:
    • Added functionality to use a test_list for streamlined testing procedures.
  • OOV Vocabulary Implementation:
    • Implemented handling for Out-of-Vocabulary (OOV) words.
    • Replaced [UNK] tokens with � (a less common character), enabling it to be counted as a single character in Character Error Rate (CER) calculations.
  • Outputting Results to File:
    • Validation and test results can now be outputted to a .csv file in the output folder.
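
A small worked example shows why the [UNK] replacement above matters for CER. The sketch uses a plain Levenshtein distance and defines CER as edit distance divided by ground-truth length; both are standard choices assumed here, not code taken from Loghi-HTR, and the sample strings are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def cer(truth, prediction):
    """Character Error Rate relative to the ground-truth length."""
    return edit_distance(truth, prediction) / max(len(truth), 1)


truth = "naïve"
raw = "na[UNK]ve"                        # OOV token left as five characters
clean = raw.replace("[UNK]", "\ufffd")   # same single character as shown above

print(cer(truth, raw))    # 1.0: the token costs five edits
print(cer(truth, clean))  # 0.2: the token costs one substitution
```

Left as literal text, one unknown character is penalized as five errors; after replacement it is penalized as one.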

Enhancements

  • Data Normalization and Validation Process Updates:
    • Separated validation and evaluation datasets for more precise control:
      • validation_dataset: Used with the --do_validate option, not normalized.
      • evaluation_dataset: Used for evaluation during training, undergoes normalization.
  • Default OOV Token Settings:
    • OOV tokens are enabled by default for testing and validation, but not for training and evaluation.

Bug Fixes

  • General Stability and Performance Enhancements:
    • Addressed various minor issues to improve overall system stability and performance.

Contributors

  • @TimKoornstra: Responsible for the implementation of OOV vocabulary handling, test list functionality, and enhancements in data normalization and validation processes.

Full Changelog: 1.3.7...1.3.8

1.3.6

08 Dec 14:18
966b89c

Release Notes for Loghi-HTR Version 1.3.6

Date: 2023-12-08

Overview

Version 1.3.6 of Loghi-HTR introduces new features for enhanced usability and performance, including Conda support and a Prometheus endpoint for queue monitoring. This release also contains important enhancements to the config file and a significant bug fix in the API.

New Features

  • Conda Support with environment.yml: An environment.yml file has been added to facilitate easy setup of Conda environments. Users can initialize their environment with:

    conda env create -f environment.yml
    conda activate loghi-htr
  • Prometheus Endpoint for Queue Monitoring: A Prometheus endpoint is now available in the API at the route "/prometheus" (GET method), enabling monitoring of the length of queues for better system performance insights.

Enhancements

  • Config File Enhancements:
    • The url-code containing the GitHub link (https://github.com/knaw-huc/loghi) has been added to the config file.
    • The model_name has also been included in the config file for improved model management.

Bug Fixes

  • API Concurrency Bug Fix: Addressed a concurrency issue in the API where simultaneous instances might try to create the same folder, leading to potential crashes. This fix ensures more stable API operations.
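
The release notes do not say how the race was fixed. One common way to make folder creation safe when several workers target the same path is `os.makedirs` with `exist_ok=True`, sketched below with a hypothetical helper name.

```python
import os


def ensure_output_dir(path):
    """Create path (and any parents) if needed; safe to call concurrently.

    With exist_ok=True, a worker that loses the race to create the folder
    does not raise FileExistsError.
    """
    os.makedirs(path, exist_ok=True)
    return path
```

This avoids the check-then-create pattern (`if not os.path.exists(...)` followed by `os.mkdir(...)`), where two processes can both pass the check before either creates the folder.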

Contributors

  • @rvankoert: Contributed to adding the GitHub link and model_name to the config file.
  • @MMaas3: Responsible for creating the environment.yml and the Prometheus endpoint.
  • @TimKoornstra: Fixed the concurrency bug in the API.

Full Changelog: 1.3.0...1.3.6

1.3.0

14 Nov 12:19
7c1d8ac

Release Notes for Loghi-HTR Version 1.3.0

Date: 2023-11-14

Overview

In version 1.3.0, we've introduced significant improvements, including enhanced normalization features for CER and CER lower, a simplified confidence interval, and various API enhancements. Fixes have been made to the ResidualBlock implementation and freezing mechanism, and models now automatically save in the new .keras file format. Several changes have also been made to the API to improve usability and performance.

New Features

  • Normalization for CER and CER Lower: Added functionality to normalize for Character Error Rate (CER) and its lower case version using the --normalization_file argument. This update also displays the ground truth and prediction in a normalized form.
  • Simplified Confidence Interval: Introduced a more straightforward method for calculating confidence intervals.

Enhancements

  • Model File Format: Models now automatically get saved in the new .keras file format, while still supporting loading of both .pb and .keras files.

Bug Fixes

  • ResidualBlock Implementation Fix: Addressed an issue where saving a model and then continuing training was not working properly.
  • ResidualBlock Freezing Fix: Corrected the freezing of convolutional layers in the residual blocks with --freeze_conv_layers.

API Specific Changes

  • Environment Variable Simplifications: Removed the necessity of LOGHI_INPUT_CHANNELS and LOGHI_CHARLIST_PATH environment variables, which are now read directly from the model's config.json and charlist.txt respectively.
  • Reduced OOM Errors: Enhanced batch processing to split recursively on Out-Of-Memory (OOM) errors, failing only the problematic image instead of the entire batch.
  • Improved Image Padding: Adjusted image padding during processing for better alignment with training, marginally improving confidence and output.
  • Dynamic Model Switching in API: Introduced the ability to switch models during an API call using the "model" field, though it's advised to use caution as it can slow down inference.
  • Error Output for Failed Predictions: Text line images that fail during prediction are now outputted to LOGHI_OUTPUT_PATH/group_id/identifier.error with the specific error message.
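
The recursive splitting on OOM errors described above can be sketched as follows. `predict_with_split` and `predict_fn` are hypothetical names, and Python's built-in `MemoryError` stands in for whatever out-of-memory exception the real backend raises (with TensorFlow that would typically be `tf.errors.ResourceExhaustedError`).

```python
def predict_with_split(batch, predict_fn):
    """Run predict_fn on a batch, halving the batch on out-of-memory errors.

    Returns (results, failures), where failures is a list of
    (item, error_message) pairs for items that fail even on their own.
    """
    if not batch:
        return [], []
    try:
        return list(predict_fn(batch)), []
    except MemoryError as exc:  # stand-in for the backend's OOM error
        if len(batch) == 1:
            # Only this single item is reported as failed.
            return [], [(batch[0], str(exc))]
        mid = len(batch) // 2
        left_ok, left_failed = predict_with_split(batch[:mid], predict_fn)
        right_ok, right_failed = predict_with_split(batch[mid:], predict_fn)
        return left_ok + right_ok, left_failed + right_failed
```

Each failure pair carries the error message, which is the kind of information that could be written to a per-image .error file.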

Contributors

  • @Thelukepet: Major contributions to normalization for CER and CER lower, and the simplified confidence interval.
  • @TimKoornstra: Significant contributions across various aspects including bug fixes, API enhancements, and overall improvements.

Full Changelog: 1.2.10...1.3.0

1.2.10

30 Oct 08:03
dc94fcf

Release Notes for Loghi-HTR Version 1.2.3 -> 1.2.10

Date: 2023-10-27

Overview

Version 1.2.10 brings a suite of bug fixes, API improvements, and functional enhancements. This update focuses on increased flexibility in model training, improved normalization processes, added image processing augmentations, and enhanced multi-GPU support in the API.

New Features

  • Normalization File: Introduced the --normalization_file argument, replacing --normalize. Characters to be replaced can now be specified in a JSON file, where the key is the character to replace, and the value is the replacement character. This facilitates training models with a focus on reducing or changing uncommon characters to more common, similar ones.
    Example:

    {
      "ø": "o",
      "æ": "ae"
    }
  • Sauvola and Otsu Binarizations: Reintroduced these methods for image preprocessing.

  • New Image Augmentations: Added blur (--do_blur) and invert (--do_invert) augmentations.

  • Silent Training: Added --training_verbosity_mode with options [auto, 0, 1, 2] for controlling training output verbosity.
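
The effect of a normalization file like the example above can be sketched in a few lines. The function names are hypothetical, and the space collapsing follows the behaviour listed under Bug Fixes in these notes; the real implementation may apply replacements in a different order or with extra rules.

```python
import json
import re


def load_normalization(path):
    """Load a {character: replacement} mapping from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def normalize_text(text, mapping):
    """Apply each replacement, then collapse runs of spaces to one space."""
    for old, new in mapping.items():
        text = text.replace(old, new)
    return re.sub(r" {2,}", " ", text)


print(normalize_text("søren  æble", {"ø": "o", "æ": "ae"}))  # soren aeble
```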

API Enhancements

  • Multi-GPU Support: Enhanced to support multiple GPUs.
  • Improved Logging: Added process IDs for easier debugging and enhanced overall logging.
  • GPU Usage: Improved GPU usage handling, including the use of mixed_float16 policy if supported.
  • Image Preparation: Moved all image preparation tasks to a dedicated worker.
  • Batch Queue Size: Adjusted the prepared max queue size to reflect batches instead of individual images.
  • API Response Codes: Updated to align more closely with Laypa and Tooling standards.

Bug Fixes

  • Freezing Layers Capitalization: Fixed an issue where capitalization was ignored when setting layers to non-trainable.
  • Charlist Inference: The charlist is no longer required when replacing the final layer; it can now be inferred from texts.
  • Normalization in Multi-Character Spaces: When a normalization file is used, all multi-character spaces are now replaced with a single space.
  • Config.json githash: Ensured that githash in the config.json file does not contain spaces.
  • Existing Model Naming: Fixed an issue where setting a new name for an existing model using --existing_model and --model_name didn't work as expected.
  • Random Shear in Augmentation: Corrected an issue where the random shear augmentation incorrectly applied elastic transform.
  • Prediction Padding Value: Resolved a bug in prediction due to an incorrect padding value.
  • Multiprocessing with Gunicorn/Flask: Moved multiprocessing outside of the gunicorn/flask apps.

Docker Image

The Docker image for version 1.2.10 can be obtained using the following command:

docker pull loghi/docker.htr:1.2.10

Contributors

  • @MMaas3: Key contributions in reintroducing binarizations, adding extra augmentations, and fixing the githash issue in the config.json file.
  • @rvankoert: Significant work in addressing the capitalization issue for freezing layers and the charlist inference.
  • @TimKoornstra: Major contributions across various aspects including enhancements, bug fixes, and general improvements in the application.

Full Changelog: 1.2.3...1.2.10