Merge branch 'branch-23.08' into main [skip ci] #1319

Merged
NvTimLiu merged 128 commits into NVIDIA:main
Aug 10, 2023

Conversation

NvTimLiu
Collaborator

@NvTimLiu NvTimLiu commented Aug 8, 2023

Merge branch 'branch-23.08' into main

Change version to v23.08.0

Signed-off-by: Tim Liu [email protected]

NOTE: merge this PR via "Create a merge commit"

nvauto and others added 30 commits May 23, 2023 09:47
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
* Init version 23.08.0-SNAPSHOT

Signed-off-by: Peixin Li <[email protected]>

* include doc update

Signed-off-by: Peixin Li <[email protected]>

---------

Signed-off-by: Peixin Li <[email protected]>
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
…p ci] [bot] (NVIDIA#1166)

* Update submodule cudf to 097b828c21772a2399e9bae0d4f8c7234cc0f456

Signed-off-by: spark-rapids automation <[email protected]>

* Update submodule cudf to 9a0f87c320c5322bc88732dde3c4147792d607e0

Signed-off-by: spark-rapids automation <[email protected]>

* Update submodule cudf to fd13c877e10e9fa69fa63ec7cd5b64bf6a6805d5

Signed-off-by: spark-rapids automation <[email protected]>

* Update submodule cudf to a03da13ceb294db766c2bc6ade400471584656ba

Signed-off-by: spark-rapids automation <[email protected]>

* Update submodule cudf to 126fa3515b22315e145bf8b921f4869e98665499

Signed-off-by: spark-rapids automation <[email protected]>

* Update submodule cudf to 37f76c820ddf833a80b4ce706b9c3e84908e51ff

Signed-off-by: spark-rapids automation <[email protected]>

* Update submodule cudf to 5b3e3abeaacc6069bc6ea9d92ebc28408b82ff37

Signed-off-by: spark-rapids automation <[email protected]>

---------

Signed-off-by: spark-rapids automation <[email protected]>
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
Signed-off-by: spark-rapids automation <[email protected]>
[auto-merge] bot-auto-merge-branch-23.06 to branch-23.08 [skip ci] [bot]
nvdbaranec and others added 4 commits August 1, 2023 20:02
* Back port spark-specific murmur32 hash code from cudf.

* Run pre-commit to format files. We were behind a bit.

* Update pre-commit config to 16.0.1 to match cudf. Re-ran formatting.

* Change jni bindings to use the spark-rapids-jni implementation of murmur hash instead of the cudf version. Brought over cpp and java tests.

* Documentation fix.

* Fix cpp tests to actually call the spark_rapids_jni murmur hash.

* First pass at xxhash64. cpp tests passing.

* Improve cpp tests - null cases and more floating point edge cases.

* Add Java tests.

* Moved murmur32 hash implementation from cudf to spark-rapids-jni

Signed-off-by: db <[email protected]>

* PR review changes.

* Fix copyright data in Hash.java

* Enable 32 bit decimal hash test.

* Implement xxhash64 on the gpu

Signed-off-by: db <[email protected]>

* Add missing newlines.

* PR review changes.

* Remove default xxhash64 class constructor.  Remove unused parameter (row index) from remaining constructor.

* Merge thirdparty/cudf from 23.08

* Basic bloom filter support. c++ side only. Could use some more tests.

* More tests.

* Rectify thirdparty/cudf

* Java bindings and tests.

* Add more tests and general cleanup.

Signed-off-by: db <[email protected]>

* End-of-file formatting.

* More end-of-file formatting.

* Merge thirdparty/cudf

* Add bloom_filter_put benchmark.  Fixed several benchmark build breakages.

* Submodule update

* Fix small issue from cudf merge.

* Wave of PR review feedback.

* Add static versions of put() and probe() that take bloom filter components instead of an instance. Change BloomFilterInterfaces to take a BaseDeviceMemoryBuffer instead of a DeviceMemoryBuffer. Handle some exception cases. Reordered some function parameter lists for consistency/cleanliness.

* Change an Exception to a Throwable.

* Produce big-endian swizzled bloom filters from the GPU. Change the BloomFilter class to be more restrictive about bloom filter bit sizes: must always be a multiple of 64 bits.

* Change bloom filter Java functions to use a long for bloomFilterBits. Handle nulls in the c++ code: build will ignore null input values and probe will return null for any null input value.

* Java tests for build/probe with null inputs.

* Rework BloomFilter interface to wrap the entire Spark bloom filter buffer as an opaque cudf Scalar.

* Doc updates. Add checking to the merge function to verify all input bloom filters have matching num_hashes and num_longs parameters.

* Re-enable Java merge tests. Update benchmarks.

* Change bloom filter list_scalar type to be uint8. Add an additional interface for probing directly from a buffer. Improve error checking in unpacking code.

* Add a note and reference to an issue for removing the package/bounce workaround for certain Scalar accessors.

* Eof newline.

---------

Signed-off-by: db <[email protected]>
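
For reviewers skimming the bloom filter commits above, the stated rules can be modeled in a few lines: the bit size must be a multiple of 64, build ignores null inputs, probe yields null for null inputs, and merge requires matching num_hashes and num_longs. The following is a minimal CPU-only sketch under stated assumptions, not the spark-rapids-jni implementation. The class `SketchBloomFilter`, the helper `_mix64`, and all method names are hypothetical, the hash is a stand-in rather than the Spark-compatible one, and the big-endian serialization only illustrates one plausible reading of "big-endian swizzled".

```python
from typing import List, Optional, Sequence

MASK64 = (1 << 64) - 1


def _mix64(x: int, salt: int) -> int:
    """Stand-in 64-bit mixer (assumption); NOT the Spark-compatible hash."""
    x = (x + salt) & MASK64
    x ^= x >> 33
    x = (x * 0xFF51AFD7ED558CCD) & MASK64
    x ^= x >> 33
    return x


class SketchBloomFilter:
    """Toy model of the rules in the commit messages; hypothetical API."""

    def __init__(self, num_hashes: int, num_bits: int) -> None:
        if num_hashes <= 0:
            raise ValueError("num_hashes must be positive")
        # Rule from the commits: bit size must be a multiple of 64.
        if num_bits <= 0 or num_bits % 64 != 0:
            raise ValueError("bit size must be a positive multiple of 64")
        self.num_hashes = num_hashes
        self.num_longs = num_bits // 64
        self.words = [0] * self.num_longs  # one int per 64-bit word

    def _positions(self, value: int) -> List[int]:
        # Double hashing with the stand-in mixer to pick num_hashes bits.
        h1 = _mix64(value & MASK64, 0x9E3779B97F4A7C15)
        h2 = _mix64(value & MASK64, 0xC2B2AE3D27D4EB4F)
        num_bits = self.num_longs * 64
        return [((h1 + i * h2) & MASK64) % num_bits
                for i in range(1, self.num_hashes + 1)]

    def put(self, values: Sequence[Optional[int]]) -> None:
        for v in values:
            if v is None:  # build ignores null input values
                continue
            for pos in self._positions(v):
                self.words[pos // 64] |= 1 << (pos % 64)

    def probe(self, values: Sequence[Optional[int]]) -> List[Optional[bool]]:
        out: List[Optional[bool]] = []
        for v in values:
            if v is None:  # null in, null out
                out.append(None)
            else:
                out.append(all((self.words[p // 64] >> (p % 64)) & 1
                               for p in self._positions(v)))
        return out

    def merge(self, others: Sequence["SketchBloomFilter"]) -> None:
        # Verify every input matches before mutating anything.
        for o in others:
            if (o.num_hashes != self.num_hashes
                    or o.num_longs != self.num_longs):
                raise ValueError("num_hashes and num_longs must match")
        for o in others:
            self.words = [a | b for a, b in zip(self.words, o.words)]

    def to_bytes(self) -> bytes:
        # Assumed layout: each 64-bit word written big-endian.
        return b"".join(w.to_bytes(8, "big") for w in self.words)
```

Validating all merge inputs before OR-ing any words keeps the filter unchanged when a mismatch is found, which is one reasonable way to satisfy the "verify all input bloom filters" check described above.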
@NvTimLiu
Collaborator Author

NvTimLiu commented Aug 8, 2023

build

@pxLi
Collaborator

pxLi commented Aug 8, 2023

We will update cudf submodule to released tag when available later this week.

This change is for in-advance commit history review, thanks!

@pxLi pxLi marked this pull request as draft August 8, 2023 02:18
Member

@jlowe jlowe left a comment


Other than the pending cudf submodule commit, lgtm.

Collaborator

@pxLi pxLi left a comment


Please upmerge the PR to latest, and merge the change via "Create a merge commit".

@pxLi
Collaborator

pxLi commented Aug 10, 2023

Feel free to [skip ci] for this change; the RC build and test will cover the same. Thanks!

@NvTimLiu NvTimLiu marked this pull request as ready for review August 10, 2023 01:43
@NvTimLiu NvTimLiu changed the title Merge branch 'branch-23.08' into main Merge branch 'branch-23.08' into main [skip ci] Aug 10, 2023
@NvTimLiu
Collaborator Author

build

@NvTimLiu NvTimLiu merged commit 73fcd5c into NVIDIA:main Aug 10, 2023
1 check passed
9 participants