
Add GpuConv operator for the conv 10<->16 expression #8925

Merged: 27 commits merged into NVIDIA:branch-23.10 on Aug 25, 2023

Conversation

@gerashegalov (Collaborator) commented Aug 3, 2023:

Contributes to #8511

The POC only supports 10/16 <-> 10/16 radix conversions. Without overflow checks, it is guaranteed to produce results identical to the CPU only for the following (a sanity-check sketch follows the list):

  • decimal strings not longer than 18 characters
  • hexadecimal strings not longer than 15 characters
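
(These bounds follow from 64-bit arithmetic: the largest 18-digit decimal value, 10^18 - 1, and the largest 15-digit hexadecimal value, 16^15 - 1 = 2^60 - 1, both fit in a signed 64-bit long, so no overflow is possible at those widths.)

As an illustration, a spark-shell sanity check at exactly those widths might look like the following (a sketch, not code from this PR; assumes the plugin jar is on the classpath with spark.rapids.sql.enabled=true):

scala> // 18 decimal digits and 15 hex digits: guaranteed identical to the CPU
scala> Seq("999999999999999999").toDF("v").selectExpr("conv(v, 10, 16)", "v").show(false)
scala> Seq("FFFFFFFFFFFFFFF").toDF("v").selectExpr("conv(v, 16, 10)", "v").show(false)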

Signed-off-by: Gera Shegalov <[email protected]>

@gerashegalov added the task label (Work required that improves the product but is not user facing) on Aug 3, 2023
@gerashegalov self-assigned this on Aug 3, 2023
@gerashegalov linked an issue on Aug 3, 2023 that may be closed by this pull request
@gerashegalov changed the title from "[WIP] Add GpuConv operator for the conv expression" to "Add GpuConv operator for the conv expression" on Aug 12, 2023
@gerashegalov mentioned this pull request on Aug 17, 2023
@gerashegalov marked this pull request as ready for review on August 18, 2023 18:55
@gerashegalov (Collaborator, Author):

build

@gerashegalov changed the title from "Add GpuConv operator for the conv expression" to "Add GpuConv operator for the conv expression [databricks]" on Aug 18, 2023
@gerashegalov (Collaborator, Author):

build

    case _ =>
      willNotWorkOnGpu(because = "only literal 10 or 16 for from_base and to_base are supported")
  }
  if (SQLConf.get.ansiEnabled) {
A collaborator commented on the diff above:

ANSI mode for conv only shows up in Spark 3.4.0+ (https://issues.apache.org/jira/browse/SPARK-42427), and whether the expression is in ANSI mode should come from expr, not directly from SQLConf.get.ansiEnabled.
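
For illustration, a 3.4.0+ shim could consult the flag captured by the expression itself. A minimal sketch, assuming the Spark 3.4 Conv expression carries the ansiEnabled field added by SPARK-42427 (the val name conv for the wrapped Catalyst expression is hypothetical):

// Sketch only: read the ANSI flag the expression captured at analysis time,
// rather than the live session conf at tag time.
conv match {
  case c: Conv if c.ansiEnabled =>
    willNotWorkOnGpu(because = "the GPU has no overflow checking")
  case _ => // non-ANSI conv remains a GPU candidate
}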

      willNotWorkOnGpu(because = "only literal 10 or 16 for from_base and to_base are supported")
  }
  if (SQLConf.get.ansiEnabled) {
    willNotWorkOnGpu(because = " the GPU has no overflow checking.")
A collaborator commented on the diff above:

I'm not sure we can enable this by default. Even in 3.1.x Spark checks for overflow when encoding the value, and will return -1 if it sees an overflow. We do not do that.

https://github.com/apache/spark/blob/61e034807dc555d7ceadc534fcb0c82d50fc8719/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala#L56-L59

The ANSI mode fix in 3.4.0 only added throwing an exception instead of returning -1.

The following was run on Spark 3.3.0, first with the RAPIDS plugin enabled and then, after disabling it, on the CPU:

scala> val df = Seq("9223372036854775807", "-9223372036854775808", "9223372036854775808", "-9223372036854775809", "10223372036854775807", "-10223372036854775808", null).toDF
scala> df.repartition(1).selectExpr("conv(value, 10, 16)", "value").show(false)
+-------------------+---------------------+
|conv(value, 10, 16)|value                |
+-------------------+---------------------+
|7FFFFFFFFFFFFFFF   |9223372036854775807  |
|8000000000000000   |-9223372036854775808 |
|8000000000000000   |9223372036854775808  |
|7FFFFFFFFFFFFFFF   |-9223372036854775809 |
|8DE0B6B3A763FFFF   |10223372036854775807 |
|721F494C589C0000   |-10223372036854775808|
|null               |null                 |
+-------------------+---------------------+
scala> spark.conf.set("spark.rapids.sql.enabled", false)
scala> df.repartition(1).selectExpr("conv(value, 10, 16)", "value").show(false)
+-------------------+---------------------+
|conv(value, 10, 16)|value                |
+-------------------+---------------------+
|7FFFFFFFFFFFFFFF   |9223372036854775807  |
|FFFFFFFFFFFFFFFF   |-9223372036854775808 |
|8000000000000000   |9223372036854775808  |
|FFFFFFFFFFFFFFFF   |-9223372036854775809 |
|8DE0B6B3A763FFFF   |10223372036854775807 |
|FFFFFFFFFFFFFFFF   |-10223372036854775808|
|null               |null                 |
+-------------------+---------------------+

gerashegalov (author) replied:

I am aware of the -1-equivalent output, i.e. conversion to the string representation of 18446744073709551615 in to_base when to_base is positive, and to -1 when to_base is negative (a case where we already fall back to the CPU).

>>> spark.createDataFrame([('-1',), ('18446744073709551615',), ('18446744073709551616',)], 'a string').selectExpr('conv(a, 10, 10)').show()
+--------------------+
|     conv(a, 10, 10)|
+--------------------+
|18446744073709551615|
|18446744073709551615|
|18446744073709551615|
+--------------------+

My reasoning is that since Spark does not allow distinguishing whether 18446744073709551615 results from the overflow check or from the original data, it does not really matter. However, it is true that a customer may have a process in place that treats 18446744073709551615 as uniquely due to an overflow, and filters those rows out with output != '18446744073709551615'.
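
To make the clamp concrete, here is a minimal standalone sketch (not Spark's actual NumberConverter code; positive inputs only) of the behavior under discussion:

// Any value that does not fit in an unsigned 64-bit long collapses to
// 18446744073709551615, which is exactly why the output is ambiguous.
def convClamped(num: String, fromBase: Int, toBase: Int): String = {
  val uMax = (BigInt(1) << 64) - 1         // 18446744073709551615
  val v = BigInt(num, fromBase).min(uMax)  // clamp instead of failing
  v.toString(toBase).toUpperCase
}

convClamped("9223372036854775808", 10, 16)  // 8000000000000000 (in range)
convClamped("99999999999999999999", 10, 16) // FFFFFFFFFFFFFFFF (clamped)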

So we can disable it by default for unlimited StringType, and safely enable it for StringType(length) with lengths guaranteed not to overflow.

The collaborator replied:

I agree that is probably the reason Spark added an ANSI mode, because it really is ambiguous. Seeing FFFFFFFFFFFFFFFF in the output when converting to hex is ambiguous: you just don't know whether an overflow happened without further processing.

For me, I don't really want to enable this by default if it is only partially done, but I can see an argument for allowing it. @sameerz @mattf do you have an opinion on whether we should put this in with conv enabled by default while we work on a better long-term solution? Would it be okay to put it in with conv disabled by default, plus some docs so users know how to enable it despite the potential incompatibilities?

@revans2 (Collaborator) commented Aug 21, 2023:

I also ran some performance tests because I was nervous about the use of regular expressions to implement this. I am less concerned now: not because the regular expressions are good, but because the CPU code is really bad too.

I generated 1 billion rows with two long columns. One column I just cast to a string and the other I converted to base 16, then wrote the data out to Parquet.
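
A sketch of that setup (illustrative only, not the exact code used; the column names b and c and the output path match the queries below, everything else is assumed):

// spark-shell: ~1 billion rows, a decimal string column and a hex string
// column, written to Parquet for the benchmark queries below.
spark.range(1000000000L)
  .selectExpr(
    "CAST(id AS STRING) AS b",
    "conv(CAST(id AS STRING), 10, 16) AS c")
  .write.mode("overwrite").parquet("/data/tmp/CONV_IN")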

I then ran the following to get a baseline for reading in the data and doing a min/max on it:

spark.time(spark.read.parquet("/data/tmp/CONV_IN").selectExpr("MIN(b)", "MAX(b)", "MIN(c)", "MAX(c)").show(false))

The median of 5 runs for the CPU was 56.607 seconds and the GPU was 10.487 seconds (the GPU, an A6000, is about 5.4x faster than my old 6-core, 12-thread desktop CPU).

I then ran the following to understand the cost of conv

spark.time(spark.read.parquet("/data/tmp/CONV_IN").selectExpr("MIN(conv(b, 10, 16))", "MAX(conv(b, 16, 10))", "MIN(conv(c, 10, 16))", "MAX(conv(c, 10, 10))").show(false))

The CPU took 568.396 seconds (I only ran it once) and the GPU took 183.554 seconds (again, only one run; the GPU was 100% utilized for the entire query and started to throttle from heat).

That results in a difference of 511.789 seconds (568.396 - 56.607) for the CPU runs and 173.067 seconds (183.554 - 10.487) for the GPU runs, so the GPU is about 3x faster than the CPU for conv. Not great, but not so bad that I think we need to block it. A custom kernel should speed this up massively compared to the CPU, because it is clear the CPU code is not well optimized.

@gerashegalov (Collaborator, Author):

@revans2 Thanks for doing the measurements. This PR is meant as a stepping stone to prevent CPU fallbacks for the cases that libcudf can already support. I will work on the custom kernel as a follow-on.

revans2 previously approved these changes Aug 23, 2023
@revans2 (Collaborator) commented Aug 23, 2023:

build

revans2 previously approved these changes Aug 23, 2023
@gerashegalov (Collaborator, Author):

build

@gerashegalov (Collaborator, Author):

build

@gerashegalov changed the title from "Add GpuConv operator for the conv expression [databricks]" to "Add GpuConv operator for the conv expression" on Aug 24, 2023
@gerashegalov (Collaborator, Author):

build

@pxLi (Collaborator) commented Aug 25, 2023:

build

@gerashegalov changed the title from "Add GpuConv operator for the conv expression" to "Add GpuConv operator for the conv 10<->16 expression" on Aug 25, 2023
@gerashegalov merged commit f68d30f into NVIDIA:branch-23.10 on Aug 25, 2023
26 of 27 checks passed
@gerashegalov deleted the gerashegalov/issue8511 branch on August 25, 2023 14:12
Development

Successfully merging this pull request may close these issues.

[FEA] Support conv function