We are pleased to announce the release of bdikit version 0.7.0.
Highlights include a built-in MCP server for integration with AI agents and new primitives for value matching.
Below is a list of the main changes included in this release:
New Feature
- MCP Server: Introduced an integrated MCP server to enable interaction with BDI-Kit via AI assistants and agent frameworks. (#122)
- Evaluation of Schema and Value Matches: Added methods to evaluate schema and value matches. (#118)
- Contextual Matching Support: Enabled users to attach contextual information to source or target datasets to improve matching quality. (#117)
- Caching for Schema Matching: Implemented a caching mechanism to avoid recomputing expensive match operations. (#119)
- Value Matching Caching: Added caching support for value matching functions to improve performance and reduce redundant computations. Also enhanced the caching mechanism for schema matching. (#124)
- Numeric Mapping Support: Introduced a numeric transformer primitive to handle numeric conversions during value matching. (#112)
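To illustrate the caching idea behind the two caching items above, the sketch below memoizes an expensive matching call by hashing the contents of its inputs. It is not BDI-Kit's actual implementation: `table_key`, `cached_match`, `slow_matcher`, and the dict-of-lists table representation are hypothetical and chosen only to keep the example self-contained.

```python
import hashlib
import json

_cache = {}  # (source_hash, target_hash, method) -> cached result


def table_key(table):
    """Hash a table given as {column name: list of values}."""
    payload = json.dumps(table, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_match(source, target, matcher, method="demo"):
    """Run matcher(source, target) only when the inputs were not seen before."""
    key = (table_key(source), table_key(target), method)
    if key not in _cache:
        _cache[key] = matcher(source, target)
    return _cache[key]


calls = {"count": 0}


def slow_matcher(source, target):
    """Stand-in for an expensive matcher: pairs identically named columns."""
    calls["count"] += 1
    return [(name, name) for name in source if name in target]


src = {"age": [1, 2], "sex": ["M", "F"]}
tgt = {"age": [3], "gender": ["female"]}

first = cached_match(src, tgt, slow_matcher)
second = cached_match(src, tgt, slow_matcher)  # served from the cache
```

Keying the cache on content rather than object identity means a modified table produces a new key. The cached object is returned as-is, so callers that mutate results would need to copy them first.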
Enhancement
- Match Filling Across Matchers: Improved schema matcher consistency by filling in missing matches across methods. (#115)
- MaxValSim Compatibility: Updated the max_val_sim method to support newer schema and value matcher APIs.
- Sorting and Completion of Matches: Matches are now sorted to ensure uniformity across outputs.
- Magneto as Default Matcher: Set Magneto as the default schema matcher.
- Valentine Matching Refinement: Ensured strict one-to-one matching in Valentine-based matchers.
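The strict one-to-one constraint mentioned for the Valentine-based matchers can be pictured with a small sketch. The greedy selection below is one common way to enforce that constraint and is an assumption made for illustration; the source does not state which strategy BDI-Kit uses, and `one_to_one` and the sample scores are hypothetical.

```python
def one_to_one(candidates):
    """Keep the highest-scoring pairs so that each source and each target
    attribute appears at most once (greedy selection)."""
    used_sources, used_targets, kept = set(), set(), []
    # Visit candidates from the highest score to the lowest.
    for source, target, score in sorted(
        candidates, key=lambda c: c[2], reverse=True
    ):
        if source not in used_sources and target not in used_targets:
            kept.append((source, target, score))
            used_sources.add(source)
            used_targets.add(target)
    return kept


candidates = [
    ("Gender", "sex", 0.90),
    ("Gender", "gender_identity", 0.70),
    ("Sex", "sex", 0.85),
    ("Age", "age_at_diagnosis", 0.60),
]
matches = one_to_one(candidates)
# "Sex" stays unmatched because its only candidate target is already taken.
```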
API Change
- LLM Method Renaming: Renamed LLM-based methods for consistency and improved clarity. (#121)
- Unification of Value Matching Output: Standardized outputs from value matching methods to align with schema matching formats. (#116) By default, the match_values() and rank_value_matches() functions now return a single DataFrame instead of a list of DataFrames. Note: This change is backward-incompatible.
- Terminology Update: Renamed 'columns' to 'attributes' in the outputs of various methods to ensure consistent terminology across the toolkit. (#123)
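Code written against the earlier list-of-DataFrames output may need a small adapter. The sketch below uses only pandas and a hand-built stand-in for the new single-DataFrame result; the column names (`source_attribute`, `target_attribute`, `source_value`, `target_value`) are assumptions for illustration and should be checked against the actual output of your installed version.

```python
import pandas as pd

# Hypothetical stand-in for the single DataFrame returned in 0.7.0.
# The column names below are assumed, not taken from the release notes.
combined = pd.DataFrame(
    {
        "source_attribute": ["Gender", "Gender", "Stage"],
        "target_attribute": ["sex", "sex", "tumor_stage"],
        "source_value": ["M", "F", "IIA"],
        "target_value": ["male", "female", "Stage IIA"],
    }
)

# Rebuild the old shape: one DataFrame per (source, target) attribute pair.
per_pair = [
    group.reset_index(drop=True)
    for _, group in combined.groupby(
        ["source_attribute", "target_attribute"], sort=False
    )
]
```

Passing `sort=False` keeps the groups in order of first appearance, which helps when downstream code relies on the order of the original list.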
Fix
- NaN Handling in Value Matching: Skipped NaN values in target attributes to avoid runtime errors.
- Numeric String Handling: Improved parsing of numbers represented as strings in matching pipelines.
- GitHub Actions Compatibility: Updated GitHub Action dependencies to maintain compatibility with Python signing workflows.
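The NaN and numeric-string fixes above concern input cleaning. As a minimal, standard-library sketch of that kind of handling (not the code used in BDI-Kit), the hypothetical helper `parse_number` converts numeric strings to floats and signals missing or non-numeric values with `None` so they can be skipped. Removing commas as thousands separators is an assumption of this example.

```python
import math


def parse_number(value):
    """Return value as a float, or None if it is missing or not numeric."""
    if isinstance(value, str):
        # Assumption: commas are thousands separators, as in "1,200".
        value = value.strip().replace(",", "")
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None  # None, empty strings, and text such as "n/a"
    return None if math.isnan(number) else number


raw = ["42", " 3.5 ", "1,200", float("nan"), None, "n/a", 7]
# Skip anything that could not be interpreted as a number.
clean = [n for n in map(parse_number, raw) if n is not None]
```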
Documentation
- Table-to-Table Harmonization: Added a comprehensive example demonstrating how to harmonize entire tables. (#120)
- Quick-Start Guide: Created a quick-start example to help new users begin using BDI-Kit more easily. (#111)
- Numeric Mapper Documentation: Added documentation for the numeric transformer primitive used in numeric data integration.
- Versioned UI and Docs: Enhanced the documentation site with version selectors and links. (#110)