
[AutoSparkUT]"SPARK-6743: no columns from cache" in SQLQuerySuite failed #14098

@GaryShen2008

Description

Describe the bug
"SPARK-6743: no columns from cache" in SQLQuerySuite failed as below, when using "spark.sql.cache.serializer", "com.nvidia.spark.ParquetCachedBatchSerializer".

java.lang.AssertionError: The number of columns and the number of types don't match Table{columns=[ColumnVector{rows=3, type=INT32, nullCount=Optional.empty, offHeap=(ID: 29673 7fb0625de6f0)}, ColumnVector{rows=3, type=INT32, nullCount=Optional.empty, offHeap=(ID: 29674 7fb0620e0d90)}, ColumnVector{rows=3, type=INT32, nullCount=Optional.empty, offHeap=(ID: 29675 7fb0620e0de0)}], cudfTable=140395536096672, rows=3} []
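
For context, spark.sql.cache.serializer is a static SQL conf, so it has to be set before the SparkSession starts (e.g. via --conf, as in the spark-shell command at the end of this report). A minimal sketch of a session configured this way; the builder settings here are illustrative, not taken from the failing test harness:

import org.apache.spark.sql.SparkSession

// Assumes the rapids-4-spark jar is on the classpath; a sketch only,
// not the exact harness configuration.
val spark = SparkSession.builder()
  .master("local[2]")
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
  .config("spark.sql.cache.serializer",
    "com.nvidia.spark.ParquetCachedBatchSerializer")
  .getOrCreate()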

Steps/Code to reproduce bug

The code below reproduces a similar failure. The exception is not the same, but it still appears to be a bug related to the same UT case (sketched below for reference).
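
For reference, the UT in question in Spark's SQLQuerySuite looks roughly like this (paraphrased; the repro below mirrors its data and query):

test("SPARK-6743: no columns from cache") {
  Seq(
    (83, 0, 38),
    (26, 0, 79),
    (43, 81, 24)
  ).toDF("a", "b", "c").createOrReplaceTempView("cachedData")

  spark.catalog.cacheTable("cachedData")
  withSQLConf(SQLConf.CROSS_JOINS_ENABLED.key -> "true") {
    checkAnswer(
      sql("SELECT t1.b FROM cachedData, cachedData t1 GROUP BY t1.b"),
      Row(0) :: Row(81) :: Nil)
  }
}
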
Spark-shell

// spark-shell imports spark.implicits._ automatically, so toDF is available.
val testData = Seq(
  (83, 0, 38),
  (26, 0, 79),
  (43, 81, 24)
).toDF("a", "b", "c")

// Create a temp view and cache it, so the configured cache serializer is used.
testData.createOrReplaceTempView("cachedData")
spark.catalog.cacheTable("cachedData")

// The query below is a cross join, so enable cross joins explicitly.
spark.conf.set("spark.sql.crossJoin.enabled", "true")

spark.sql("SELECT t1.b FROM cachedData, cachedData t1 GROUP BY t1.b").collect()

On the GPU, the collect() fails with the exception below:

java.lang.ArrayIndexOutOfBoundsException: 0
	at com.nvidia.spark.rapids.GpuColumnVectorFromBuffer.from(GpuColumnVectorFromBuffer.java:79)
	at com.nvidia.spark.rapids.GpuColumnVectorFromBuffer.from(GpuColumnVectorFromBuffer.java:49)
	at com.nvidia.spark.rapids.spill.SpillableColumnarBatchFromBufferHandle$.$anonfun$apply$1(SpillFramework.scala:772)
	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:30)
	at com.nvidia.spark.rapids.spill.SpillableColumnarBatchFromBufferHandle$.apply(SpillFramework.scala:770)
	at com.nvidia.spark.rapids.SpillableColumnarBatch$.apply(SpillableColumnarBatch.scala:392)
	at org.apache.spark.sql.rapids.execution.SerializeConcatHostBuffersDeserializeBatch.$anonfun$batch$2(GpuBroadcastExchangeExec.scala:106)
	at com.nvidia.spark.rapids.NvtxId.apply(NvtxRangeWithDoc.scala:84)
	at org.apache.spark.sql.rapids.execution.SerializeConcatHostBuffersDeserializeBatch.$anonfun$batch$1(GpuBroadcastExchangeExec.scala:89)
	at scala.Option.getOrElse(Option.scala:189)

On the CPU (run without --conf spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer), the query succeeds and returns:

Array([0], [81])

Spark-shell command:

/data/spark-3.3.0-bin-hadoop3/bin/spark-shell \
  --master local[2] \
  --conf spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation,org.apache.spark.sql.catalyst.optimizer.ConstantFolding \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.sql.queryExecutionListeners=org.apache.spark.sql.rapids.ExecutionPlanCaptureCallback \
  --conf spark.rapids.sql.explain=ALL \
  --conf spark.rapids.sql.test.isFoldableNonLitAllowed=true \
  --conf spark.rapids.sql.csv.read.decimal.enabled=true \
  --conf spark.rapids.sql.format.avro.enabled=true \
  --conf spark.rapids.sql.format.avro.read.enabled=true \
  --conf spark.rapids.sql.format.hive.text.write.enabled=true \
  --conf spark.rapids.sql.format.json.enabled=true \
  --conf spark.rapids.sql.format.json.read.enabled=true \
  --conf spark.rapids.sql.incompatibleDateFormats.enabled=true \
  --conf spark.rapids.sql.python.gpu.enabled=true \
  --conf spark.rapids.sql.rowBasedUDF.enabled=true \
  --conf spark.rapids.sql.window.collectList.enabled=true \
  --conf spark.rapids.sql.window.collectSet.enabled=true \
  --conf spark.rapids.sql.window.range.byte.enabled=true \
  --conf spark.rapids.sql.window.range.short.enabled=true \
  --conf spark.rapids.sql.expression.Ascii=true \
  --conf spark.rapids.sql.expression.Conv=true \
  --conf spark.rapids.sql.expression.GetJsonObject=true \
  --conf spark.rapids.sql.expression.JsonToStructs=true \
  --conf spark.rapids.sql.expression.StructsToJson=true \
  --conf spark.rapids.sql.exec.CollectLimitExec=true \
  --conf spark.rapids.sql.exec.FlatMapCoGroupsInPandasExec=true \
  --conf spark.rapids.sql.exec.WindowInPandasExec=true \
  --conf spark.rapids.sql.hasExtendedYearValues=false \
  --conf spark.unsafe.exceptionOnMemoryLeak=true \
  --conf spark.sql.session.timeZone=UTC \
  --conf spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer
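
Most of those confs come from the UT harness. Presumably (an untested assumption) a much smaller set is enough to reproduce, since only the plugin and the cache serializer are directly involved:

/data/spark-3.3.0-bin-hadoop3/bin/spark-shell \
  --master local[2] \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer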

Expected behavior
The GPU run should succeed without throwing an exception, matching the CPU result (Array([0], [81])).

Environment details (please complete the following information)

  • Environment location: [Spark 3.3.0, Local mode]

Labels

    ? - Needs Triage, bug
