Join operator lost data due to decimal join key. #11980

@beliefer

Description

Backend

VL (Velox)

Bug description

There are two tables: A and B.

-- The main schema of A
CREATE TABLE A (
  invoiceid DECIMAL(20,0),
  time TIMESTAMP,
  update_time TIMESTAMP,
  dt STRING,
  hour STRING)
USING orc
PARTITIONED BY (dt, hour)
LOCATION 'hdfs://test/A'
TBLPROPERTIES (
  'transient_lastDdlTime'='1718612895')
-- The main schema of B.
CREATE TABLE B (
  `invoiceid` DECIMAL(20,0),
  `status` STRING,
  `update_time` TIMESTAMP,
  `dt` STRING,
  `hour` STRING)
USING orc
PARTITIONED BY (dt, hour)
LOCATION 'hdfs://test/B'
TBLPROPERTIES (
  'transient_lastDdlTime'='1718612913')

-- The ORC file schema in hdfs://test/A

Type: struct<invoiceid:decimal(38,18), time:timestamp, update_time:timestamp>

-- The ORC file schema in hdfs://test/B

Type: struct<invoiceid:decimal(38,18), status:string, update_time:timestamp>

-- The query.
SELECT count(*)
FROM (
  SELECT c.invoiceid AS invoice_id,
         c.orderid AS order_id,
         c.time AS order_time,
         c.update_time AS update_time
  FROM
      (SELECT *
       FROM
           (SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY invoiceid
                                      ORDER BY update_time DESC) AS rn
            FROM A
            WHERE dt >= '20260421') b
       WHERE rn = 1) c
  JOIN
      (SELECT *
       FROM B
       WHERE dt >= '20260421'
           AND status = 7) d ON c.invoiceid = d.invoiceid
);

The root cause

The scan operator for A falls back to vanilla Spark's Scan hive because time and update_time are TIMESTAMP columns.
The scan operator for B is Gluten's NativeScan hive.

Left side: Scan hive (vanilla) → DECIMAL(20,0) rows → RowToVeloxColumnar → DECIMAL(20,0) columnar
Right side: NativeScan hive (Velox) → may be widened internally while reading the ORC file → DECIMAL(38,18) columnar

→ Inconsistent join key types on the two sides → key matching fails in the Velox hash join → 0 rows output
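The mismatch can be sketched with a toy columnar model (the names below are illustrative, not real Velox APIs): each decimal vector carries a (precision, scale) tag plus unscaled integer values, and a join that hashes the raw unscaled keys finds no matches when the two sides were encoded at different scales.

```python
from dataclasses import dataclass

@dataclass
class DecimalVector:
    # Toy stand-in for a columnar decimal vector; not a real Velox type.
    precision: int
    scale: int
    unscaled: list[int]  # logical value = unscaled / 10**scale

def from_values(values, precision, scale):
    # Encode logical values at the given scale, as each scan side would.
    return DecimalVector(precision, scale, [int(v * 10**scale) for v in values])

def naive_hash_join(left: DecimalVector, right: DecimalVector):
    # Joins on the raw unscaled key, ignoring the scale tag -- the failure mode.
    build = set(right.unscaled)
    return [k for k in left.unscaled if k in build]

ids = [101, 202, 303]
left = from_values(ids, precision=20, scale=0)    # RowToVeloxColumnar side: DECIMAL(20,0)
right = from_values(ids, precision=38, scale=18)  # NativeScan side: the file's decimal(38,18)

print(len(naive_hash_join(left, right)))  # 0 rows: same logical ids, different unscaled keys
print(len(naive_hash_join(left, left)))   # 3 rows when both sides share one type
```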
                                                                                                                                                                      
The problem lies in RowToVeloxColumnarExec itself: when converting rows to columnar, the Velox vector is constructed strictly according to the DECIMAL(20,0) declared in the Spark schema, without widening the precision. When NativeScan reads the Hive ORC file, Velox normalizes the decoded type internally; for example, DECIMAL(20,0) may be mapped to the file's 128-bit DECIMAL(38,18) representation, i.e. Velox's ShortDecimal/LongDecimal encoding.

The result is that the plan-level types are consistent (both sides are DECIMAL(20,0)), but the type tags of the actual columnar vectors are inconsistent, so the key-comparison logic of the Velox HashJoin takes the type-mismatch branch and the join outputs 0 rows.
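For contrast, SQL equality on decimals is value-based, so a correct join must normalize the two representations before comparing keys. Python's stdlib decimal module illustrates the expected semantics: the same invoice id at scale 0 and scale 18 compares equal as a value even though the underlying representations differ, which is exactly the normalization the vector-level key comparison is missing here.

```python
from decimal import Decimal

# The same logical invoice id, as each scan side would materialize it.
a = Decimal("12345")                     # table-declared DECIMAL(20,0)
b = Decimal("12345.000000000000000000")  # ORC file's decimal(38,18)

print(a == b)                        # True: SQL equality compares values
print(a.as_tuple() == b.as_tuple())  # False: the representations differ
```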

Gluten version

Gluten-1.5

Spark version

Spark-3.5.x

Spark configurations

No response

System information

No response

Relevant logs
