| 1 | -# blaze-v4.0.0
| 2 | -
| 3 | -## New features
| 4 | -* supports spark3.0/3.1/3.2/3.3/3.4/3.5.
| 5 | -* supports integrating with Apache Celeborn.
| 6 | -* supports native ORC input format.
| 7 | -* supports bloom filter join introduced in spark 3.5.
| 8 | -* supports forceShuffledHashJoin for running tpch/tpcds benchmarks.
| 9 | -* new supported native expression/functions: year, month, day, md5.
| 10 | -
| 11 | -## Bug fixes
| 12 | -* add missing UDTF.terminate() invokes.
| 13 | -* fix NPE while executing some native spark physical plans.
| 14 | -
| 15 | -## Performance
| 16 | -* use custom implemented hash table for faster joining, supporting SIMD, bulk searching, memory prefetching, etc.
| 17 | -* improve shuffle write performance.
| 18 | -* reuse FSDataInputStream for same input file.
| 1 | +# blaze-v4.0.1: |
| 2 | + |
| 3 | +# New Features
| 4 | + |
| 5 | +* Initial support for the ORC input file format.
| 6 | +* Initial support for the RSS framework and the Apache Celeborn shuffle service.
| 7 | + |
| 8 | +# Improvements
| 9 | + |
| 10 | +* Optimize AggExec by implementing columnar-based aggregation.
| 11 | +* Use a custom hashmap implementation for aggregation.
| 12 | +* Support specialized count(0).
| 13 | +* Optimize bloom filter by reusing the same bloom filter within an executor.
| 14 | +* Optimize bloom filter by supporting shrinking. |
| 15 | +* Optimize reading parquet files by supporting parallel reading. |
| 16 | +* Improve spill file deletion logic.
| 17 | + |
| 18 | +# Bug fixes |
| 19 | + |
| 20 | +* Fix file-not-found error for paths containing URL-encoded characters.
| 21 | +* Fix ScalaReflectionException thrown by the HashAggregate convert job.
| 22 | +* Fix pruning error while reading parquet files with multiple row groups. |
| 23 | +* Fix incorrect number of tasks due to missing shuffleOrigin. |
| 24 | +* Fix record batch creation error when hash joining with empty input.
| 25 | + |
| 26 | +# Other |
| 27 | +* Upgrade datafusion/arrow dependency to v42/v53. |
| 28 | +* Replace gxhash with foldhash for better compatibility on some hardware.
| 29 | +* Other minor improvements and fixes.
| 30 | + |
| 31 | +# PRs |
| 32 | +* AggExec: implement columnar accumulator states. by @richox in https://github.com/kwai/blaze/pull/646 |
| 33 | +* Bump bigdecimal from 0.4.5 to 0.4.6 by @dependabot in https://github.com/kwai/blaze/pull/638 |
| 34 | +* Bump bytes from 1.7.2 to 1.8.0 by @dependabot in https://github.com/kwai/blaze/pull/625 |
| 35 | +* Bump bytes from 1.8.0 to 1.9.0 by @dependabot in https://github.com/kwai/blaze/pull/671 |
| 36 | +* Bump object_store from 0.11.0 to 0.11.1 by @dependabot in https://github.com/kwai/blaze/pull/622 |
| 37 | +* Bump sonic-rs from 0.3.13 to 0.3.14 by @dependabot in https://github.com/kwai/blaze/pull/623 |
| 38 | +* Bump sonic-rs from 0.3.14 to 0.3.16 by @dependabot in https://github.com/kwai/blaze/pull/647 |
| 39 | +* Bump tempfile from 3.13.0 to 3.14.0 by @dependabot in https://github.com/kwai/blaze/pull/641 |
| 40 | +* Bump tokio from 1.40.0 to 1.41.0 by @dependabot in https://github.com/kwai/blaze/pull/629 |
| 41 | +* Bump tokio from 1.41.0 to 1.41.1 by @dependabot in https://github.com/kwai/blaze/pull/642 |
| 42 | +* Bump tokio from 1.41.0 to 1.41.1 by @dependabot in https://github.com/kwai/blaze/pull/676 |
| 43 | +* Bump uuid from 1.10.0 to 1.11.0 by @dependabot in https://github.com/kwai/blaze/pull/618 |
| 44 | +* Create RecordBatch with num_rows option to avoid bhj error caused by empty output_schema by @wForget in https://github.com/kwai/blaze/pull/683 |
| 45 | +* Fix build on windows by @wForget in https://github.com/kwai/blaze/pull/666 |
| 46 | +* Fix file not found for path with url encoded character by @wForget in https://github.com/kwai/blaze/pull/679 |
| 47 | +* Followup to #674, add -r for rm by @wForget in https://github.com/kwai/blaze/pull/681 |
| 48 | +* Introduce base blaze sql test suite by @wForget in https://github.com/kwai/blaze/pull/674 |
| 49 | +* [BLAZE-287][FOLLOWUP] Use JavaUtils#newConcurrentHashMap to speed up ConcurrentHashMap#computeIfAbsent by @SteNicholas in https://github.com/kwai/blaze/pull/615 |
| 50 | +* [BLAZE-573][FOLLOWUP] Bump Spark from 3.4.3 to 3.4.4 by @SteNicholas in https://github.com/kwai/blaze/pull/640 |
| 51 | +* [BLAZE-627] Make ORC and Parquet format detection more generic by @dixingxing0 in https://github.com/kwai/blaze/pull/628 |
| 52 | +* [BLAZE-664] Bump Celeborn version from 0.5.1 to 0.5.2 by @SteNicholas in https://github.com/kwai/blaze/pull/665 |
| 53 | +* [MINOR] Avoid NPE when native lib is not found by @wForget in https://github.com/kwai/blaze/pull/668 |
| 54 | +* add new blaze logo by @richox in https://github.com/kwai/blaze/pull/633 |
| 55 | +* chore: Make spotless plugin happy by @zuston in https://github.com/kwai/blaze/pull/653 |
| 56 | +* code refactoring by @richox in https://github.com/kwai/blaze/pull/658 |
| 57 | +* code refactoring by @richox in https://github.com/kwai/blaze/pull/677 |
| 58 | +* doc: update tpc-h benchmark result by @richox in https://github.com/kwai/blaze/pull/614 |
| 59 | +* fix Hashaggregate convert job throw ScalaReflectionException by @leizhang5s in https://github.com/kwai/blaze/pull/637 |
| 60 | +* fix pruning error while reading parquet files with multiple row groups by @richox in https://github.com/kwai/blaze/pull/616 |
| 61 | +* fix running error for Spark 3.2.0 and 3.2.1 by @XorSum in https://github.com/kwai/blaze/pull/602 |
| 62 | +* fix(shuffle): Progagate shuffle origin to native exchange exec to make AQE rebalance valid by @zuston in https://github.com/kwai/blaze/pull/663 |
| 63 | +* fix(spill): Delete spill file when dropping for rust FileSpill by @zuston in https://github.com/kwai/blaze/pull/660 |
| 64 | +* fix(spill): Explicitly delete spill file for FileBasedSpillBuf after release by @zuston in https://github.com/kwai/blaze/pull/654 |
| 65 | +* improve NativeOrcScan by @richox in https://github.com/kwai/blaze/pull/631 |
| 66 | +* improve memory management by @richox in https://github.com/kwai/blaze/pull/621 |
| 67 | +* improvement: Add numOfPartitions metrics for exchange exec to align with vanilla spark by @zuston in https://github.com/kwai/blaze/pull/669 |
| 68 | +* optimize bloom filter by @richox in https://github.com/kwai/blaze/pull/620 |
| 69 | +* parquet reading improvements by @richox in https://github.com/kwai/blaze/pull/650 |
| 70 | +* release version v4.0.0 by @richox in https://github.com/kwai/blaze/pull/613 |
| 71 | +* replace gxhash with foldhash by @richox in https://github.com/kwai/blaze/pull/624 |
| 72 | +* supports specialized count(0) by @richox in https://github.com/kwai/blaze/pull/619 |
| 73 | +* tpcd benchmarkrunner : add orc format support by @leizhang5s in https://github.com/kwai/blaze/pull/639 |
| 74 | +* update to datafusion-v42 by @richox in https://github.com/kwai/blaze/pull/574 |
| 75 | +* use custom implemented hashmap for aggregation by @richox in https://github.com/kwai/blaze/pull/617 |