Open
Labels: ? - Needs Triage (need team to review and classify), bug (something isn't working)
Description
Describe the bug
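from_json produces different output on the GPU than on the CPU when a field declared as an array of structs contains a non-struct element (here, "c2": [19]). The CPU returns null for the enclosing data struct, while the GPU nulls only c2 and keeps c1.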
Steps/Code to reproduce bug
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row
// Schema: c2 is declared as an array of structs, but the input below supplies [19] (an array holding a number).
val st = new StructType().add("c1", LongType).add("c2", ArrayType(new StructType().add("c3", LongType).add("c4", StringType)))
val df2 = Seq("""{"data": {"c2": [19], "c1": 123456}}""").toDF("c0")
// First run: GPU (spark.rapids.sql.enabled=true comes from the spark-shell command).
df2.select(from_json($"c0", new StructType().add("data", st))).show
// Second run: the same query on the CPU.
spark.conf.set("spark.rapids.sql.enabled", "false")
df2.select(from_json($"c0", new StructType().add("data", st))).show
GPU:
+----------------+
|   from_json(c0)|
+----------------+
|{{123456, null}}|
+----------------+
CPU:
+-------------+
|from_json(c0)|
+-------------+
|       {null}|
+-------------+
Spark-shell command:
spark-shell --master local[2] \
  --conf spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation,org.apache.spark.sql.catalyst.optimizer.ConstantFolding \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.sql.queryExecutionListeners=org.apache.spark.sql.rapids.ExecutionPlanCaptureCallback \
  --conf spark.rapids.sql.explain=ALL \
  --conf spark.rapids.sql.test.isFoldableNonLitAllowed=true \
  --conf spark.rapids.sql.csv.read.decimal.enabled=true \
  --conf spark.rapids.sql.format.avro.enabled=true \
  --conf spark.rapids.sql.format.avro.read.enabled=true \
  --conf spark.rapids.sql.format.hive.text.write.enabled=true \
  --conf spark.rapids.sql.format.json.enabled=true \
  --conf spark.rapids.sql.format.json.read.enabled=true \
  --conf spark.rapids.sql.incompatibleDateFormats.enabled=true \
  --conf spark.rapids.sql.python.gpu.enabled=true \
  --conf spark.rapids.sql.rowBasedUDF.enabled=true \
  --conf spark.rapids.sql.window.collectList.enabled=true \
  --conf spark.rapids.sql.window.collectSet.enabled=true \
  --conf spark.rapids.sql.window.range.byte.enabled=true \
  --conf spark.rapids.sql.window.range.short.enabled=true \
  --conf spark.rapids.sql.expression.Ascii=true \
  --conf spark.rapids.sql.expression.Conv=true \
  --conf spark.rapids.sql.expression.GetJsonObject=true \
  --conf spark.rapids.sql.expression.JsonToStructs=true \
  --conf spark.rapids.sql.expression.StructsToJson=true \
  --conf spark.rapids.sql.exec.CollectLimitExec=true \
  --conf spark.rapids.sql.exec.FlatMapCoGroupsInPandasExec=true \
  --conf spark.rapids.sql.exec.WindowInPandasExec=true \
  --conf spark.rapids.sql.hasExtendedYearValues=false \
  --conf spark.unsafe.exceptionOnMemoryLeak=true \
  --conf spark.sql.session.timeZone=UTC
Expected behavior
The GPU output should match the CPU output, i.e. {null} for this input.
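To check the expectation programmatically rather than by comparing the printed tables, something like the following could be run in the same spark-shell session. This is only a sketch: it assumes the `spark` session provided by spark-shell and the `df2` and `st` definitions from the reproduction steps, and `collectWith` is a hypothetical helper name introduced here for illustration.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.StructType

// Hypothetical helper: run the same from_json query with the RAPIDS
// plugin toggled on or off and collect the rows to the driver.
// Assumes `spark`, `df2` and `st` are already defined in the session.
def collectWith(gpuEnabled: Boolean): Seq[Row] = {
  spark.conf.set("spark.rapids.sql.enabled", gpuEnabled.toString)
  df2.select(from_json(col("c0"), new StructType().add("data", st)))
    .collect()
    .toSeq
}

val gpuRows = collectWith(gpuEnabled = true)
val cpuRows = collectWith(gpuEnabled = false)

// Row equality is structural, so the two sequences can be compared directly.
if (gpuRows == cpuRows) println("GPU and CPU outputs match")
else println(s"Mismatch: GPU=$gpuRows CPU=$cpuRows")
```

While the bug is present, this is expected to report a mismatch. Once it is fixed, both runs should return the same rows.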
Environment details (please complete the following information)
- Environment location: [Local mode]
Additional context
This is a further test case related to #10901, which has already been fixed.