23 Dec 15:05

IsaacWarren

35fe92c

2025.12.1 Latest

What's Changed

Handle tuple column selection in df.loc by @ehsantn in #968
Fix Conda Build Pipeline by @scott-routledge2 in #966
Skip tests to unblock PR CI by @scott-routledge2 in #967
Avoid double cast for integer division by @ehsantn in #970
Support negative n in df.head() by @ehsantn in #972
2025.12 Release Notes by @scott-routledge2 in #971
Fix slicing with negative step by @ehsantn in #973
Keep input timestamp unit in Series.dt.tz_localize by @ehsantn in #974
Fix testing issues on Nightly by @scott-routledge2 in #969
Cast binary op result to output type by @ehsantn in #976
Fix output Series.name for pd.to_datetime() by @ehsantn in #977
Support concurrent async messages to the same rank in shuffle by @scott-routledge2 in #975
Skip Iceberg DDL tests on PR CI by @scott-routledge2 in #980
TPCH benchmarking improvements by @scott-routledge2 in #964
Avoid using arrow types for to_datetime by @scott-routledge2 in #983
Upgrade to Numba 0.63.1 by @ehsantn in #978
AWS titan-text eol by @IsaacWarren in #984
Fixes for TPCH scripts by @scott-routledge2 in #986
Fix overflow in parquet read cardinality est by @scott-routledge2 in #988
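
The `df.head()` change listed above follows the pandas convention that a negative `n` returns every row except the last `|n|`. A minimal sketch of that convention on a plain Python list; the `head` helper is hypothetical and only illustrates the semantics, not Bodo's implementation:

```python
def head(rows, n=5):
    """Return the first n rows; for negative n, all rows except the last |n|."""
    # Python slicing follows the same convention for negative bounds.
    return rows[:n]


rows = list(range(10))
first_three = head(rows, 3)        # the first three rows
all_but_last_three = head(rows, -3)  # everything except the last three rows
```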

Full Changelog: 2025.12...2025.12.1

Contributors

ehsantn, IsaacWarren, and scott-routledge2

10 Dec 20:23

scott-routledge2

2025.12

71f3530

2025.12

Bodo 2025.12 Release (Date: 12/11/2025)

🎉 Highlights

This release, we are excited to add Join Filters in the plan optimizer, significantly improving performance on real workloads. We also improve Bodo's timezone support and fix several minor bugs.

✨ New Features

Support datetime.datetime in query plans.
Validate the repl argument of Series.str.replace the same way Pandas does.
Improve concat output order.
Support Series.take().
Support timezones in to_datetime.

🏎️ Performance Improvements

Add Join Filters to remove rows with keys that won’t match a join key as early as possible.
Box/unbox date arrays using Arrow.
Box/unbox time arrays using Arrow.
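
The idea behind join filters can be sketched in a few lines: collect the keys present on one side of the join and use them to discard non-matching rows on the other side before any further work is done. The function and sample data below are hypothetical and run on plain Python dicts; a real optimizer would typically push such a filter down into the scan and might use a compact structure (for example min/max bounds or a Bloom filter) rather than an exact set.

```python
def join_with_filter(probe_rows, build_rows, key):
    """Inner join that filters the probe side on build-side keys first."""
    # Collect the set of join keys that exist on the build side.
    build_keys = {row[key] for row in build_rows}
    # Join filter: drop probe rows whose key cannot match as early as possible.
    filtered = [row for row in probe_rows if row[key] in build_keys]

    # Plain hash join on the rows that survived the filter.
    index = {}
    for row in build_rows:
        index.setdefault(row[key], []).append(row)
    joined = [{**p, **b} for p in filtered for b in index[p[key]]]
    return joined, len(filtered)


orders = [
    {"cust": 1, "total": 10},
    {"cust": 2, "total": 20},
    {"cust": 3, "total": 30},
]
customers = [{"cust": 1, "name": "a"}, {"cust": 3, "name": "c"}]
joined, rows_kept = join_with_filter(orders, customers, "cust")
```

Here the order for customer 2 is dropped before the join runs, so only two probe rows reach the join itself.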

🐛 Bug Fixes

Avoid hang in scattering BodoSeries.
Fix explode bug for Arrow large list type.
Allow int64/uint64 mismatch in some internal data structures.
Fix for ArrowExtensionArray iloc indexing.
Fix timezone in convert_dtypes.
Fix for nested join.
Fix the remove-unused-columns pass to keep column references alive after the DuckDB version upgrade.
Better support for operations that result in empty dataframes.

Full Changelog: 2025.11.2...2025.12

25 Nov 16:57

scott-routledge2

2025.11.2

8aa0fec

2025.11.2

What's Changed

Pipeline reordering. by @DrTodd13 in #933
NaN/NA handling. by @DrTodd13 in #926
2025.11 Release notes by @IsaacWarren in #936
Numba 0.62.1 upgrade by @scott-routledge2 in #934
Add metrics for Iceberg Read/Parquet Read by @scott-routledge2 in #935
More fixes for narwhals tests. by @DrTodd13 in #937
Filter manifest files based on partition summaries by @scott-routledge2 in #938
Pin Numba to 0.62 by @scott-routledge2 in #940
Increase timeout for BodoSQL smoke test by @IsaacWarren in #939
Support np.ufunc calls on BodoSeries by @ehsantn in #941
Todd/prune fix by @DrTodd13 in #942
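
As a rough illustration of filtering manifest files by partition summaries: if each manifest records the lower and upper bound of its partition values, any manifest whose range cannot overlap the query's filter range can be skipped without being opened. The `ManifestSummary` class, its fields, and the paths below are assumptions made for this sketch, not Bodo's or Iceberg's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class ManifestSummary:
    path: str   # hypothetical manifest file path
    lower: int  # smallest partition value recorded for the manifest
    upper: int  # largest partition value recorded for the manifest


def prune_manifests(manifests, lo, hi):
    """Keep manifests whose partition range overlaps the filter range [lo, hi]."""
    return [m for m in manifests if m.upper >= lo and m.lower <= hi]


manifests = [
    ManifestSummary("m0.avro", 2019, 2020),
    ManifestSummary("m1.avro", 2021, 2023),
    ManifestSummary("m2.avro", 2024, 2025),
]
# Filter "year BETWEEN 2022 AND 2024": only m1 and m2 can contain matches.
to_read = [m.path for m in prune_manifests(manifests, 2022, 2024)]
```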

Full Changelog: 2025.11.1...2025.11.2

Contributors

ehsantn, DrTodd13, and 2 other contributors

19 Nov 18:57

IsaacWarren

2025.11.1

555d285

2025.11.1

What's Changed

Support groupby.size() (with no value cols) by @scott-routledge2 in #904
Don't redistribute data twice in example by @IsaacWarren in #912
Fix test_basic_iceberg_read path issue on nightly by @ehsantn in #916
CTE column pruning by @DrTodd13 in #910
First attempt at adding Narwhals to test suite. by @DrTodd13 in #918
Avoid JIT imports in BodoSQL C++ backend by @ehsantn in #907
Support passing BodoDataFrames to BodoSQL C++ backend without extra plan executions by @ehsantn in #919
Timezone Support by @scott-routledge2 in #915
Support creating empty DataFrames by @ehsantn in #921
Fixes for test_unique narwhals test by @scott-routledge2 in #920
Change param name to not conflict on windows. by @DrTodd13 in #924
Expose pandas.Timestamp in bodo.pandas by @scott-routledge2 in #922
Fix BodoSQLContext use inside JIT by @ehsantn in #928
Fix drop_duplicates() for non-trivial Indexes by @ehsantn in #925
Run Narwhals tests on a single worker and fix issues by @scott-routledge2 in #923
BSE-5174: Duckdb Planner Upgrade by @IsaacWarren in #917
Support specifying Glue Catalog in pd.read_sql_table by @scott-routledge2 in #929
BSE-5206: Fix windows duckdb by @IsaacWarren in #930
Support passing BodoSQL as an arg to JIT by @scott-routledge2 in #932
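
For readers unfamiliar with `groupby.size()`: it counts rows per group and therefore needs only the key column, which is why it works even when no value columns are selected. A minimal stdlib sketch of that behavior, using a hypothetical `groupby_size` helper on plain dicts:

```python
from collections import Counter


def groupby_size(rows, key):
    """Count the rows in each group, using only the key column."""
    return Counter(row[key] for row in rows)


rows = [{"city": "NYC"}, {"city": "SF"}, {"city": "NYC"}]
sizes = groupby_size(rows, "city")
```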

Full Changelog: 2025.11.0...2025.11.1

Contributors

ehsantn, DrTodd13, and 2 other contributors

05 Nov 17:49

IsaacWarren

2025.11.0

c447d22

2025.11.0

What's Changed

Add release notes for 2025.10.1 by @scott-routledge2 in #890
Fix docker release files symlink target by @IsaacWarren in #894
TPCH improvements. by @DrTodd13 in #893
Support TPC-H Q5 in BodoSQL C++ backend by @ehsantn in #886
Support filesystem Iceberg catalog in BodoSQL C++ backend by @ehsantn in #895
Support Iceberg filter/project/limit in BodoSQL C++ backend by @ehsantn in #898
Skip pandas ddp example by @IsaacWarren in #892
Improvements based on Narwhals tests. by @DrTodd13 in #897
Capture API usage. by @DrTodd13 in #896
Add all duckdb timestamp types by @IsaacWarren in #900
Call tokenize with just a tokenizer by @IsaacWarren in #901
Support join filters in BodoSQL C++ backend by @ehsantn in #902
Finetune on Iceberg Data by @IsaacWarren in #903
Update demo notebook by @scott-routledge2 in #908
Fix two narwhals issues. by @DrTodd13 in #906
Upgrade to arrow 22 by @scott-routledge2 in #905
Remove Python 3.9 in a few places by @ehsantn in #911
Add iceberg marker to test by @ehsantn in #913

Full Changelog: 2025.10.2...2025.11.0

Contributors

ehsantn, DrTodd13, and 2 other contributors

21 Oct 14:35

IsaacWarren

2025.10.2

df9a6b5

2025.10.2

What's Changed

Remove reindex check in publish_binary scripts by @scott-routledge2 in #889
Refactor Guides Docs by @scott-routledge2 in #882
BSE-5132: prepare_dataset by @IsaacWarren in #873

Full Changelog: 2025.10.1...2025.10.2

Contributors

IsaacWarren and scott-routledge2

20 Oct 16:22

scott-routledge2

2025.10.1

3f28e18

2025.10.1

What's Changed

Speed up TPCH. by @DrTodd13 in #856
Convert to BodoDataFrame/BodoSeries on fallback by @scott-routledge2 in #855
Add support for running str.match in arrow compute. by @DrTodd13 in #865
Add C++ backend for BodoSQL by @ehsantn in #861
Support Parquet read in BodoSQL C++ backend by @ehsantn in #866
Avoid converting output to DF lib to fix dev docs test by @ehsantn in #867
BSE-5119: torch trainer by @IsaacWarren in #846
Overload array dunder method to convert BodoDataFrames of floats/ints to ndarrays with the correct dtype by @scott-routledge2 in #868
Support join in BodoSQL C++ backend by @ehsantn in #869
Initial filter support in BodoSQL C++ backend by @ehsantn in #871
Skip slice test by @IsaacWarren in #872
Initial groupby support for BodoSQL C++ backend by @ehsantn in #874
Combine chunks before passing table to arrow_table_to_bodo by @scott-routledge2 in #877
Support large string types in AI functions by @scott-routledge2 in #878
First batch of narwhals support. by @DrTodd13 in #875
Fix copy elision compile error on Mac by @ehsantn in #883
Initial sort support for BodoSQL C++ backend by @ehsantn in #884
Todd/rmod fix by @DrTodd13 in #876
Distributed training example by @IsaacWarren in #879
Skip artifactory upload except for platform package by @scott-routledge2 in #885

Full Changelog: 2025.10...2025.10.1

Contributors

ehsantn, DrTodd13, and 2 other contributors

03 Oct 19:09

scott-routledge2

2025.10

de6d7ba

2025.10

Bodo 2025.10 Release (Date: 10/03/2025)

🎉 Highlights

This release, we are excited to significantly improve the responsiveness of Bodo DataFrames with lazy JIT imports, optimize performance with Common Table Expressions (CTEs), as well as upgrade to Arrow 21.

✨ New Features

Getting the length of a BodoDataFrame or BodoSeries now returns a lazily evaluated BodoScalar.
Add support for subset argument to drop_duplicates.
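
The notion of a lazily evaluated scalar can be sketched as a wrapper that defers a computation until its value is actually needed and lets binary operations build new deferred values. The `LazyScalar` class below is a hypothetical stand-in written for illustration; it is not Bodo's `BodoScalar` and makes no claim about its API.

```python
class LazyScalar:
    """Hypothetical stand-in for a lazily evaluated scalar (not Bodo's class)."""

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable producing the value
        self._cached = None
        self._evaluated = False

    def get(self):
        # Run the deferred computation at most once, then reuse the result.
        if not self._evaluated:
            self._cached = self._compute()
            self._evaluated = True
        return self._cached

    def __add__(self, other):
        # Binary operations build a new lazy value instead of forcing this one.
        rhs = other.get if isinstance(other, LazyScalar) else (lambda: other)
        return LazyScalar(lambda: self.get() + rhs())

    def __int__(self):
        return int(self.get())


calls = []


def count_rows():
    calls.append("executed")  # record each time the expensive work runs
    return 1000


n_rows = LazyScalar(count_rows)
padded = n_rows + 24          # nothing has been executed yet
calls_before = list(calls)
value = int(padded)           # converting to int forces evaluation
```

Deferring the value this way gives an optimizer the chance to fold the scalar into a larger plan instead of executing it immediately.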

🏎️ Performance Improvements

Support lazy BodoScalar binary operations for better optimizations.
Recognize duplicate computations in execution trees and execute them only once using Common Table Expressions (CTEs).
Support internal gather/scatter calls without JIT for faster response times.
Support Iceberg read/write without JIT import for faster response times.
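
Recognizing duplicate computations amounts to evaluating structurally identical subtrees of a plan only once and reusing the result. The sketch below assumes a toy plan representation (nested tuples with made-up operator names) and a hypothetical `evaluate` function; it illustrates the general technique rather than Bodo's optimizer.

```python
def evaluate(node, cache, stats):
    """Evaluate a plan tree, computing structurally identical subtrees once."""
    if node in cache:
        return cache[node]  # duplicate subtree: reuse the earlier result
    op, *args = node
    if op == "scan":
        stats["scans"] += 1  # count how often the source is actually read
        result = list(args[0])
    elif op == "filter_gt":
        child, threshold = args
        result = [x for x in evaluate(child, cache, stats) if x > threshold]
    elif op == "concat":
        left, right = args
        result = evaluate(left, cache, stats) + evaluate(right, cache, stats)
    else:
        raise ValueError(f"unknown operator: {op}")
    cache[node] = result
    return result


scan = ("scan", (1, 5, 9))
shared = ("filter_gt", scan, 4)
plan = ("concat", shared, shared)  # the same subtree appears twice

stats = {"scans": 0}
output = evaluate(plan, {}, stats)
```

Although the filtered scan appears twice in the plan, the source is read only once.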

⚙️ Dependency Changes

Upgraded Arrow dependency to 21.0.

18 Sep 20:23

scott-routledge2

2025.9

888788d

2025.9

Bodo 2025.9 Release (Date: 09/18/2025)

🎉 Highlights

This release, we are excited to significantly improve the import time of the Bodo package, as well as introduce new features like Series.where support and lazy BodoScalars.

✨ New Features

Bodo DataFrames now imports the JIT compiler lazily only when necessary, which reduces import time substantially.
Support for Series.where().
Series reductions such as “sum” or “max” now produce a BodoScalar that is evaluated lazily and can be used in some operations such as Series.where() and filter expressions without execution.
Optimized support for “not in series” cases like df[~df.A.isin(df.B)] using anti-join.
Support for bodo.pandas uses inside JIT functions.
Anthropic models used through AWS Bedrock now use Anthropic’s messages API to support newer versions of Claude.
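
The "not in series" pattern `df[~df.A.isin(df.B)]` is logically an anti-join: keep the rows whose key has no match on the other side. A minimal sketch on plain Python data, with a hypothetical `anti_join` helper (null handling, which a real engine has to define, is ignored here):

```python
def anti_join(left_rows, right_values, key):
    """Keep the left rows whose key does not appear among right_values."""
    right_set = set(right_values)  # build the lookup once for O(1) membership tests
    return [row for row in left_rows if row[key] not in right_set]


left = [{"A": 1}, {"A": 2}, {"A": 3}, {"A": 4}]
right = [2, 4, 6]
unmatched = anti_join(left, right, "A")
```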

🐛 Bug Fixes

Fix for join non-equi condition keys that are not part of the output.
Fix for Series expression with non-range Indexes.

🏎️ Performance Improvements

Improved the initialization time for cfuncs used to accelerate user-defined functions in Series.map and DataFrame.apply calls.

⚙️ Dependency Changes

Added upper bound to Numba dependency to avoid issues with version 0.62.

28 Aug 18:53

IsaacWarren

2025.8.2

6ea7566

2025.8.2

New Features

Support for AWS Bedrock backend for llm_generate and embed.
Support passing user-defined functions that return scalars to groupby.agg and groupby.apply.
Support renaming DataFrame columns using df.columns = [...] syntax.
Add the map_partition_with_state API to DataFrame, which performs a one-time initialization of state on each worker; that state can then be used to map batches of rows from a DataFrame to a new DataFrame.
Added JIT fallback to Bodo DataFrames such that operations not supported natively in DataFrames can use the equivalent operation from Bodo engine.
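
The pattern behind mapping with state is to pay an expensive setup cost once per worker and then reuse the resulting state for every batch that worker processes. The sketch below simulates a single worker with a hypothetical `map_batches_with_state` helper; the function name, argument order, and sample data are assumptions for illustration and do not reflect the signature of Bodo's `map_partition_with_state`.

```python
def map_batches_with_state(batches, init_state, func):
    """Initialize state once, then apply func(state, batch) to every batch."""
    state = init_state()  # one-time setup, e.g. loading a model or opening a client
    return [func(state, batch) for batch in batches]


init_calls = []


def load_lookup():
    init_calls.append(1)  # track how many times the setup runs
    return {"a": 1, "b": 2}


def encode(lookup, batch):
    # Map each value in the batch through the shared lookup table.
    return [lookup[value] for value in batch]


encoded = map_batches_with_state([["a", "b"], ["b", "b"], ["a"]], load_lookup, encode)
```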

Performance Improvements

Improve Series.quantile/describe performance.
Improve the performance of fetching row counts for Parquet datasets.
Improve package import time and worker spin-up time substantially.

Bug Fixes

Fix a crash with llm_generate and embed in Jupyter Notebooks/when an asyncio executor is already running.
Fix OpenAI environment variables not being sent to workers.
Fix bug in loss computation when fitting LogisticRegression in parallel.
Fix a crash when running map/apply on large numbers of workers.

Releases: bodo-ai/Bodo
