Skip to content
Open
22 changes: 21 additions & 1 deletion include/substrait-mlir/Dialect/Substrait/IR/SubstraitAttrs.td
Original file line number Diff line number Diff line change
Expand Up @@ -290,10 +290,30 @@ def Substrait_ParametrizedAttributes {
/// Attribute of one of the currently supported atomic types.
def Substrait_AtomicAttribute : AnyAttrOf<Substrait_SimpleAttributes.attrs#Substrait_ParametrizedAttributes.attrs>;

def Substrait_ListAttr : Substrait_Attr<"List", "list", [TypedAttrInterface]> {
let summary = "Substrait list attribute";
let description = [{
This attribute represents a list of atomic attributes, parameterized with a ListType.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the restriction to atomic element types intentional? If so, we should probably add a TODO (here and to the type if it has the same restriction). Note that we can have "lists of lists" and "structs of lists of structs of lists" etc. in Substrait.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add TODO. I think better to do in a separate PR.

}];
// TODO: Extend to list of container attributes.
// TODO: Implement custom parser and printer.
let parameters = (ins "mlir::ArrayAttr":$value, "ListType":$type);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we typically use Substrait_ListType here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no wait: I'm pretty sure that ListType is the syntax I should use in this file, not Substrait_ListType. I did the same thing for my other attributes that referenced types, i.e. VarCharAttr has a parameter called VarCharType, NOT Substrait_VarCharType.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the other syntax doesn't work? Maybe we've been doing it wrong the whole time? I don't have the details in my head and haven't checked, so maybe there is a reason why we are doing it this way...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://mlir.llvm.org/docs/DefiningDialects/AttributesAndTypes/#parameters

Substrait_ListType is not AttrParameter or TypeParameter, but rather a TypeDef. Therefore, based on my understanding of what the above link says, we have to use the "raw c++Type string instead", which in this case would be ListType.

(Other syntax does not work btw, I tried)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, cool, thanks for confirming.

let assemblyFormat = [{ `<` $value `,` $type `>` }];
let genVerifyDecl = 1;
}

def Substrait_ContainerAttributes {
list<Attr> attrs = [
Substrait_ListAttr, // List
];
}

def Substrait_ContainerAttribute : AnyAttrOf<Substrait_ContainerAttributes.attrs>;

/// Attribute of one of the currently supported atomic or container types.
def Substrait_ExpressionAttribute :
AnyAttrOf<[
// TODO: add container attributes here once we have them.
Substrait_ContainerAttribute,
Substrait_AtomicAttribute,
]>;

Expand Down
23 changes: 21 additions & 2 deletions include/substrait-mlir/Dialect/Substrait/IR/SubstraitTypes.td
Original file line number Diff line number Diff line change
Expand Up @@ -180,14 +180,33 @@ def Substrait_AtomicTypes {
/// One of the currently supported atomic types.
def Substrait_AtomicType :
AnyTypeOf<Substrait_SimpleTypes.types#Substrait_ParametrizedTypes.types>;

def Substrait_ListType : Substrait_Type<"List", "list"> {
let summary = "Substrait list type";
let description = [{
This type represents a list type, where type is atomic.
}];
// TODO: Extend to list of container attributes.
let parameters = (ins Substrait_AtomicType:$type);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: if the restriction to atomic types is intentional, then add a TODO. Adding support for arbitrary nesting of lists and structs is probably not trivial and it may be a good strategy to do it in two steps.

let assemblyFormat = [{ `<` $type `>` }];
let genVerifyDecl = 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is possible and most certainly preferrable to verify this using tablegen (more). Take a look at TupleOf in CommonTypeConstraints.td upstream. (NestedTupleOf will then be helpful as inspiration for nested Substrait structs, lists, and maps...). One problem here is that we don't see in this file which types can be used, plus we now have to maintain two lists in two different files, which risk to run out of sync at some point.

}

def Substrait_ContainerTypes {
list<Type> types = [
Substrait_ListType, // List
];
}

/// Any container type, i.e., structs, maps, lists, and nestings thereof.
def Substrait_ContainerType : NestedTupleOf<Substrait_AtomicTypes.types>;
def Substrait_ContainerType : AnyTypeOf<
Substrait_ContainerTypes.types # [NestedTupleOf<Substrait_AtomicTypes.types>]
>;

/// Currently supported expression types.
def Substrait_ExpressionTypes {
list<Type> types =
Substrait_AtomicTypes.types;
Substrait_AtomicTypes.types#Substrait_ContainerTypes.types;
// TODO: add container types here once we have them.
}

Expand Down
41 changes: 41 additions & 0 deletions lib/Dialect/Substrait/IR/Substrait.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,29 @@ LogicalResult mlir::substrait::IntervalDaySecondAttr::verify(
return success();
}

LogicalResult mlir::substrait::ListAttr::verify(
llvm::function_ref<mlir::InFlightDiagnostic()> emitError, ArrayAttr value,
ListType listType) {

mlir::Type expectedType =
listType.getType(); // The atomic type the list is parameterized with

for (mlir::Attribute attr : value) {
auto typedAttr = mlir::dyn_cast<mlir::TypedAttr>(attr);
if (!typedAttr) {
return emitError()
<< "'ListAttr' values must be typed attributes, but got: " << attr;
}

if (typedAttr.getType() != expectedType) {
return emitError() << "mismatched element type in 'ListAttr': expected "
<< expectedType << ", but got " << typedAttr.getType();
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultra-nit: blank line after the loop body.

Suggested change
}
}


return success();
}

LogicalResult mlir::substrait::VarCharAttr::verify(
llvm::function_ref<mlir::InFlightDiagnostic()> emitError, StringAttr value,
VarCharType type) {
Expand Down Expand Up @@ -144,6 +167,24 @@ LogicalResult mlir::substrait::DecimalType::verify(
return success();
}

LogicalResult mlir::substrait::ListType::verify(
llvm::function_ref<mlir::InFlightDiagnostic()> emitError, mlir::Type type) {
// ListType must be parameterized with an atomic type.
if (!mlir::isa<mlir::IntegerType, mlir::Float32Type, mlir::Float64Type,
substrait::StringType, substrait::BinaryType,
substrait::TimestampType, substrait::TimestampTzType,
substrait::DateType, substrait::TimeType,
substrait::IntervalYearMonthType,
substrait::IntervalDaySecondType, substrait::UUIDType,
substrait::FixedCharType, substrait::VarCharType,
substrait::FixedBinaryType, substrait::DecimalType>(type))
return emitError()
<< "'ListType' must be parameterized with an atomic type, but got: "
<< type;

return mlir::success();
}

// Count the number of digits in an APInt in base 10.
static size_t countDigits(const APInt &value) {
llvm::SmallVector<char> buffer;
Expand Down
70 changes: 61 additions & 9 deletions lib/Target/SubstraitPB/Export.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,9 @@ class SubstraitExporter {
std::unique_ptr<pb::Any> exportAny(StringAttr attr);
FailureOr<std::unique_ptr<NamedStruct>>
exportNamedStruct(Location loc, ArrayAttr fieldNames, TupleType relationType);
FailureOr<std::unique_ptr<Expression::Literal>>
exportAttribute(Attribute value,
std::function<InFlightDiagnostic()> emitError);
FailureOr<std::unique_ptr<pb::Message>> exportOperation(Operation *op);
FailureOr<std::unique_ptr<proto::Type>> exportType(Location loc,
mlir::Type mlirType);
Expand Down Expand Up @@ -431,6 +434,20 @@ SubstraitExporter::exportType(Location loc, mlir::Type mlirType) {
return std::move(type);
}

// Handle list type.
if (auto listType = llvm::dyn_cast<ListType>(mlirType)) {
// TODO(ingomueller): support other nullability modes.
auto listTypeProto = std::make_unique<proto::Type::List>();
listTypeProto->set_nullability(
Type_Nullability::Type_Nullability_NULLABILITY_REQUIRED);
listTypeProto->set_allocated_type(
exportType(loc, listType.getType()).value().release());

auto type = std::make_unique<proto::Type>();
type->set_allocated_list(listTypeProto.release());
return std::move(type);
}

// Handle tuple types.
if (auto tupleType = llvm::dyn_cast<TupleType>(mlirType)) {
auto structType = std::make_unique<proto::Type::Struct>();
Expand Down Expand Up @@ -947,17 +964,18 @@ SubstraitExporter::exportOperation(FilterOp op) {
return rel;
}

FailureOr<std::unique_ptr<Expression>>
SubstraitExporter::exportOperation(LiteralOp op) {
FailureOr<std::unique_ptr<Expression::Literal>>
SubstraitExporter::exportAttribute(
Attribute value, std::function<InFlightDiagnostic()> emitError) {
// Build `Literal` message depending on type.
Attribute value = op.getValue();
mlir::Type literalType = getAttrType(value);
auto literal = std::make_unique<Expression::Literal>();

// `IntegerType`s.
if (auto intType = dyn_cast<IntegerType>(literalType)) {
if (!intType.isSigned())
op->emitOpError("has integer value with unsupported signedness");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change (here and two times below)?

Maybe the intention was to only emit an error message below in exportOperation(LiteralOp). However, we have more information here and I think it's better to emit the more precise message here (and none below).

Copy link
Collaborator Author

@dshaaban01 dshaaban01 Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I created a function called exportAttribute(Attribute value) that contains all of the logic that was previously in exportOperation(LiteralOp).

Then I altered exportOperation(LiteralOp op) to look like this.

FailureOr<std::unique_ptr<Expression>>
SubstraitExporter::exportOperation(LiteralOp op) {
  // Build `Literal` message depending on type.
  Attribute value = op.getValue();
  auto literal = exportAttribute(value);

  if (failed(literal))
    return op->emitOpError("has unsupported value");

  // Build `Expression` message.
  auto expression = std::make_unique<Expression>();
  expression->set_allocated_literal(literal->release());

  return expression;
}

Since exportAttribute(Attribute value) does not have access to the op, I can't emit an error message via the op. And not sure how to pass the error strings into failure() failures. Wanted to talk to you about this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Three patterns come to mind: (1) Use InFlightDiagnostics but maybe you have the same problem of not having an op; (2) provide a Location loc argument and emit the error at that location; or (3) provide an emitError function argument and use that to emit the error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay will do. And do you agree with the restructuring? Should I similarly implement exportAttribute(Attribute value) ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, cool. Yeah, I think it's a good idea to cut this mega function into pieces :) Thanks!

// has integer value with unsupported signedness
return emitError() << " has integer value with unsupported signedness";
switch (intType.getWidth()) {
case 1:
literal->set_boolean(mlir::cast<IntegerAttr>(value).getSInt());
Expand All @@ -976,7 +994,8 @@ SubstraitExporter::exportOperation(LiteralOp op) {
literal->set_i64(mlir::cast<IntegerAttr>(value).getSInt());
break;
default:
op->emitOpError("has integer value with unsupported width");
// has integer value with unsupported width
return emitError() << " has integer value with unsupported width";
}
}
// `FloatType`s.
Expand All @@ -990,7 +1009,7 @@ SubstraitExporter::exportOperation(LiteralOp op) {
literal->set_fp64(mlir::cast<FloatAttr>(value).getValueAsDouble());
break;
default:
op->emitOpError("has float value with unsupported width");
return emitError() << " float value with unsupported width";
}
}
// `StringType`.
Expand Down Expand Up @@ -1067,12 +1086,45 @@ SubstraitExporter::exportOperation(LiteralOp op) {
decimal->set_precision(decimalType.getPrecision());
decimal->set_value(res);
literal->set_allocated_decimal(decimal.release());
} else
op->emitOpError("has unsupported value");
} // `ListType`.
else if (auto listType = dyn_cast<ListType>(literalType)) {
auto list = std::make_unique<::substrait::proto::Expression_Literal_List>();
auto listAttr = mlir::cast<ListAttr>(value);
for (mlir::Attribute attr : listAttr.getValue()) {

FailureOr<std::unique_ptr<Expression::Literal>> listElement =
exportAttribute(attr, emitError);
if (failed(listElement))
return emitError() << "Failed to export list element attribute: "
<< attr;

auto *literalExpr = dyn_cast<Expression_Literal>(listElement->get());
if (!literalExpr)
return emitError()
<< "Exported list element is not a valid Expression_Literal: "
<< attr;

list->add_values()->CopyFrom(*literalExpr);
}
literal->set_allocated_list(list.release());
} else {
return emitError() << " has unsupported value";
}
return literal;
}

FailureOr<std::unique_ptr<Expression>>
SubstraitExporter::exportOperation(LiteralOp op) {
// Build `Literal` message depending on type.
Attribute value = op.getValue();
auto literal = exportAttribute(value, [&]() { return op->emitOpError(); });

if (failed(literal))
return op->emitOpError("has unsupported value");

// Build `Expression` message.
auto expression = std::make_unique<Expression>();
expression->set_allocated_literal(literal.release());
expression->set_allocated_literal(literal->release());

return expression;
}
Expand Down
Loading
Loading