0.7.0

@github-actions released this 14 Jul 16:21
· 9 commits to devel since this release

We are pleased to announce the release of bdikit version 0.7.0.
This version adds a built-in MCP server for integration with AI agents and new primitives for value matching.

Below is a list of the main changes included in this release:

New Feature

  • MCP Server: Introduced an integrated MCP server to enable interaction with BDI-Kit via AI assistants and agent frameworks. (#122)
  • Evaluation of Schema and Value Matches: Added methods to evaluate schema and value matches. (#118)
  • Contextual Matching Support: Enabled users to attach contextual information to source or target datasets to improve matching quality. (#117)
  • Caching for Schema Matching: Implemented a caching mechanism to avoid recomputing expensive match operations. (#119)
  • Value Matching Caching: Added caching support for value matching functions to avoid redundant computations, and enhanced the caching mechanism for schema matching. (#124)
  • Numeric Mapping Support: Introduced a numeric transformer primitive to handle numeric conversions during value matching. (#112)
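
The caching items above are about not recomputing expensive match operations. These notes do not say how BDI-Kit keys or stores its cache, so the following is only a generic sketch of the idea in plain Python: results are memoized under a hash of the inputs and the method name. The names `cached_match`, `_fingerprint`, and `toy_matcher` are hypothetical and are not part of the bdikit API.

```python
import hashlib
import json

# Hypothetical in-memory cache; not BDI-Kit's actual implementation.
_CACHE = {}
CALLS = {"count": 0}


def _fingerprint(source_columns, target_columns, method):
    """Build a stable key from the inputs that determine the result."""
    payload = json.dumps(
        {"source": source_columns, "target": target_columns, "method": method},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def toy_matcher(source_columns, target_columns):
    """Stand-in for an expensive matcher: pairs names equal ignoring case."""
    CALLS["count"] += 1
    targets = {t.lower(): t for t in target_columns}
    return [(s, targets[s.lower()]) for s in source_columns if s.lower() in targets]


def cached_match(source_columns, target_columns, method="toy"):
    """Return the stored result when the same inputs and method were seen before."""
    key = _fingerprint(source_columns, target_columns, method)
    if key not in _CACHE:
        _CACHE[key] = toy_matcher(source_columns, target_columns)
    return _CACHE[key]


first = cached_match(["Age", "Sex"], ["age", "gender"])
second = cached_match(["Age", "Sex"], ["age", "gender"])  # served from the cache
```

Including the method name in the key keeps results from different matchers apart; a real cache would also need to account for changes in the underlying data values, which this sketch ignores.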

Enhancement

  • Match Filling Across Matchers: Improved schema matcher consistency by filling in missing matches across methods. (#115)
  • MaxValSim Compatibility: Updated the max_val_sim method to support newer schema and value matcher APIs.
  • Sorting and Completion of Matches: Matches are now sorted to ensure uniformity across outputs.
  • Magneto as Default Matcher: Set Magneto as the default schema matcher.
  • Valentine Matching Refinement: Ensured strict one-to-one matching in Valentine-based matchers.
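
Strict one-to-one matching means each source attribute is paired with at most one target attribute, and each target with at most one source. These notes do not describe how the Valentine-based matchers enforce this, so the sketch below uses a simple greedy pass over similarity scores purely as an illustration. The function `one_to_one` and the scores are hypothetical. The result is also sorted, in the spirit of the sorting item above.

```python
def one_to_one(scored_pairs):
    """Greedily keep the best-scoring pairs so no attribute is used twice.

    scored_pairs: iterable of (source, target, score) tuples.
    """
    used_sources, used_targets, kept = set(), set(), []
    # Highest score first; ties are broken by name so the result is deterministic.
    for source, target, score in sorted(
        scored_pairs, key=lambda p: (-p[2], p[0], p[1])
    ):
        if source in used_sources or target in used_targets:
            continue
        used_sources.add(source)
        used_targets.add(target)
        kept.append((source, target, score))
    # Sort by source name so repeated runs produce the same ordering.
    return sorted(kept, key=lambda p: p[0])


candidates = [
    ("age", "age_at_diagnosis", 0.9),
    ("age", "days_to_birth", 0.6),
    ("dob", "age_at_diagnosis", 0.7),
    ("dob", "days_to_birth", 0.5),
]
matches = one_to_one(candidates)
```

A greedy pass is easy to follow but does not always maximize the total score; an assignment solver would be the alternative when that matters.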

API Change

  • LLM Method Renaming: Renamed LLM-based methods for consistency and improved clarity. (#121)
  • Unification of Value Matching Output: Standardized outputs from value matching methods to align with schema matching formats. (#116) By default, the match_values() and rank_value_matches() functions now return a single DataFrame instead of a list of DataFrames. Note: This change is backward-incompatible.
  • Terminology Update: Renamed 'columns' to 'attributes' in the outputs of various methods to ensure consistent terminology across the toolkit. (#123)
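
Code written against the earlier list-of-DataFrames output may need adjusting. These notes do not give the column names of the unified output, so the sketch below only illustrates the change in shape, using plain lists of dictionaries to stay dependency-free: several per-attribute tables become one long table in which each row names its attribute pair. The keys `source_attribute`, `target_attribute`, `source_value`, and `target_value`, and the helper `unify`, are hypothetical.

```python
def unify(per_attribute_tables):
    """Flatten one-table-per-attribute results into a single long table."""
    rows = []
    for table in per_attribute_tables:
        for row in table["rows"]:
            rows.append(
                {
                    "source_attribute": table["source_attribute"],
                    "target_attribute": table["target_attribute"],
                    **row,
                }
            )
    return rows


# Old shape (illustrative): one table per matched attribute pair.
old_style = [
    {
        "source_attribute": "sex",
        "target_attribute": "gender",
        "rows": [
            {"source_value": "M", "target_value": "male"},
            {"source_value": "F", "target_value": "female"},
        ],
    },
    {
        "source_attribute": "status",
        "target_attribute": "vital_status",
        "rows": [{"source_value": "A", "target_value": "alive"}],
    },
]

# New shape (illustrative): a single table covering every attribute pair.
unified = unify(old_style)
sex_rows = [r for r in unified if r["source_attribute"] == "sex"]
```

With a single table, per-attribute results are recovered by filtering on the attribute columns instead of indexing into a list.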

Fix

  • NaN Handling in Value Matching: Skipped NaN values in target attributes to avoid runtime errors.
  • Numeric String Handling: Improved parsing of numbers represented as strings in matching pipelines.
  • GitHub Actions Compatibility: Updated GitHub Action dependencies to maintain compatibility with Python signing workflows.
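
The two value-matching fixes above concern missing values and numbers stored as strings. As a rough illustration of that kind of preprocessing, and not BDI-Kit's actual code, the sketch below drops `None` and NaN entries and then parses the remaining values as numbers where possible. The helpers `drop_missing` and `parse_number` are hypothetical.

```python
import math


def drop_missing(values):
    """Skip None and float NaN entries."""
    return [
        v
        for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]


def parse_number(value):
    """Return a float for numbers and numeric strings, otherwise None."""
    if isinstance(value, bool):
        return None  # bool is a subclass of int; treat it as non-numeric
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value.strip())
        except ValueError:
            return None
    return None


target_values = ["12", " 3.5 ", float("nan"), None, "n/a", 7]
present = drop_missing(target_values)
parsed = [parse_number(v) for v in present]
numeric = [x for x in parsed if x is not None]
```

Filtering missing values before parsing keeps NaN out of later comparisons, where it would otherwise compare unequal to everything, including itself.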

Documentation

  • Table-to-Table Harmonization: Added a comprehensive example demonstrating how to harmonize entire tables. (#120)
  • Quick-Start Guide: Created a quick-start example to help new users begin using BDI-Kit more easily. (#111)
  • Numeric Mapper Documentation: Documented the numeric transformer primitive for numeric data integration.
  • Versioned UI and Docs: Enhanced the documentation site with version selectors and links. (#110)