mlir: Add Enzyme ops removal on structured control flow #2200

Open
wants to merge 9 commits into
base: main
from

Pangoraw
Contributor

@Pangoraw Pangoraw commented Dec 18, 2024

TODO:

  • scf: support non-constant iterations Cache<f32> -> tensor<?xf32>.
  • scf: push/pop only once if a value is pushed multiple times.
  • Cache of tensor (nested for).
  • passes: add an option to the enzyme pass to try removing enzyme ops after generating the function. This should help with higher-order differentiation. Ref: MLIR: post optimization pipeline #2214.
  • scf: graph min-cut.
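
The scf items above concern how cached values travel from the forward loop to the reverse loop. As a rough standalone illustration (plain C++ with `std::vector`, no MLIR; `gradWithStack` and `gradWithBuffer` are hypothetical names, not Enzyme API), a push/pop cache inside a loop with a known trip count behaves the same as a buffer indexed by the iteration counter and read back in reverse:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Forward loop: x <- x * x, repeated n times. The reverse pass needs the
// value of x at the start of every iteration, so that value is cached.

// Variant 1: stack-like cache, one push per forward iteration and one pop
// per reverse iteration.
double gradWithStack(double x, std::size_t n) {
  std::vector<double> cache;
  for (std::size_t i = 0; i < n; ++i) {
    cache.push_back(x); // "push"
    x = x * x;
  }
  double dx = 1.0; // seed: derivative of the result w.r.t. itself
  for (std::size_t i = 0; i < n; ++i) {
    double saved = cache.back(); // "pop"
    cache.pop_back();
    dx = dx * 2.0 * saved;
  }
  return dx;
}

// Variant 2: buffer sized by the trip count. The forward loop writes slot i,
// the reverse loop reads slot n - 1 - i, and no push/pop is needed.
double gradWithBuffer(double x, std::size_t n) {
  std::vector<double> buffer(n);
  for (std::size_t i = 0; i < n; ++i) {
    buffer[i] = x;
    x = x * x;
  }
  double dx = 1.0;
  for (std::size_t i = 0; i < n; ++i)
    dx = dx * 2.0 * buffer[n - 1 - i];
  return dx;
}
```

For x = 1.5 and n = 3 both variants return 136.6875, which matches 8 * 1.5^7. With a non-constant trip count the buffer size is only known at run time, which is presumably where the dynamic tensor<?xf32> form comes in.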

return mlir::enzyme::CacheInfo::batchType(mlir::ShapedType::kDynamic);
}

mlir::Type mlir::enzyme::CacheInfo::batchType(int64_t dim) {
Member

so there is already an Enzyme Autodiff Type interface, which should have a method for batching (and if not that would probably be the right place for this)

jumerckx and others added 7 commits December 21, 2024 10:13
This still requires changes in the tablegen-generated derivative files. For example, createForwardModeTangent in MulFOpFwdDerivative could be altered like this:
```
  LogicalResult createForwardModeTangent(Operation *op0, OpBuilder &builder, MGradientUtils *gutils) const
  {
    auto op = cast<arith::MulFOp>(op0);
    if (gutils->width != 1) {
      auto newop = gutils->getNewFromOriginal(op0);
      for (auto res : newop->getResults()) {
        res.setType(mlir::RankedTensorType::get({gutils->width}, res.getType()));
      }
    }
    gutils->eraseIfUnused(op);
    if (gutils->isConstantInstruction(op))
      return success();
    mlir::Value res = nullptr;
    if (!gutils->isConstantValue(op->getOperand(0)))
    {
      auto dif = gutils->invertPointerM(op->getOperand(0), builder);
      {
        mlir::Value itmp = ({
          // Computing MulFOp
          auto fwdarg_0 = dif;
          dif.dump();
          // TODO: gutils->makeBatched(...)
          auto fwdarg_1 = gutils->getNewFromOriginal(op->getOperand(1));
          builder.create<arith::MulFOp>(op.getLoc(), fwdarg_0, fwdarg_1);
        });
        itmp.dump();
        if (!res)
          res = itmp;
        else
        {
          auto operandType = cast<AutoDiffTypeInterface>(res.getType());
          res = operandType.createAddOp(builder, op.getLoc(), res, itmp);
        }
      }
    }
    if (!gutils->isConstantValue(op->getOperand(1)))
    {
      auto dif = gutils->invertPointerM(op->getOperand(1), builder);
      {
        mlir::Value itmp = ({
          // Computing MulFOp
          auto fwdarg_0 = dif;
          dif.dump();
          auto fwdarg_1 = gutils->getNewFromOriginal(op->getOperand(0));
          builder.create<arith::MulFOp>(op.getLoc(), fwdarg_0, fwdarg_1);
        });
        if (!res)
          res = itmp;
        else
        {
          auto operandType = cast<AutoDiffTypeInterface>(res.getType());
          res = operandType.createAddOp(builder, op.getLoc(), res, itmp);
        }
      }
    }
    assert(res);
    gutils->setDiffe(op->getResult(0), res, builder);
    return success();
  }
```
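
The generated code above applies the product rule, d(a*b) = da*b + db*a, and skips constant operands. With width != 1 each tangent carries `width` entries while the primal operands stay scalar, so the primal presumably has to be broadcast across the batch before the multiply (the `makeBatched` TODO). A standalone sketch of that arithmetic, with `std::vector` standing in for the batched tensor; `mulTangent` and the use of `std::optional` to mark a constant operand are assumptions for illustration, not Enzyme API:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using Batch = std::vector<double>; // one tangent value per batch entry

// Tangent of a * b for a batch of `width` tangents. An empty optional marks
// a constant operand, mirroring the isConstantValue checks above.
Batch mulTangent(double a, const std::optional<Batch> &da,
                 double b, const std::optional<Batch> &db,
                 std::size_t width) {
  Batch res(width, 0.0);
  if (da) {
    assert(da->size() == width);
    for (std::size_t i = 0; i < width; ++i)
      res[i] += (*da)[i] * b; // scalar primal b is broadcast over the batch
  }
  if (db) {
    assert(db->size() == width);
    for (std::size_t i = 0; i < width; ++i)
      res[i] += (*db)[i] * a; // scalar primal a is broadcast over the batch
  }
  return res;
}
```

Unlike the generated code, which asserts that at least one operand is active, this sketch returns an all-zero batch when both operands are constant.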
@Pangoraw Pangoraw marked this pull request as ready for review January 3, 2025 14:44
if (!gutils->isConstantValue(prev))
gutils->addToDiffe(prev, post, builder);
auto numIters = getConstantNumberOfIterations(forOp);
Value inductionVariable; // [0, N[ counter
Member

nit: presumably N]

@@ -27,7 +27,11 @@ getFunctionTypeForClone(mlir::FunctionType FTy, DerivativeMode mode,
for (auto &&[Ty, returnPrimal, returnShadow, activity] : llvm::zip(
FTy.getResults(), returnPrimals, returnShadows, ReturnActivity)) {
if (returnPrimal) {
RetTypes.push_back(Ty);
if (width != 1) {
Member

this shouldn't be modified, since width only applies to the derivative, not the primal return
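
A sketch of that rule with a toy type model in place of MLIR types: the primal return keeps its original type, and only the shadow return gains a leading batch dimension when width != 1. `ToyType`, `batched` and `returnTypes` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a tensor type: an empty shape means a scalar.
struct ToyType {
  std::vector<std::int64_t> shape;
  std::string element;
};

// Prepend a batch dimension, leaving the type untouched for width == 1.
ToyType batched(const ToyType &t, std::int64_t width) {
  if (width == 1)
    return t;
  ToyType out = t;
  out.shape.insert(out.shape.begin(), width);
  return out;
}

// Return types for one original result: width affects the shadow only.
std::vector<ToyType> returnTypes(const ToyType &ty, bool returnPrimal,
                                 bool returnShadow, std::int64_t width) {
  std::vector<ToyType> out;
  if (returnPrimal)
    out.push_back(ty); // primal: never batched
  if (returnShadow)
    out.push_back(batched(ty, width)); // shadow: batched when width != 1
  return out;
}
```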

@@ -232,6 +240,11 @@ FunctionOpInterface CloneFunctionWithReturns(

{
auto &blk = NewF.getFunctionBody().front();
if (width != 1) {
Member

similarly this seems wrong?

assert(width == 1 && "unsupported width != 1");
return self;
Type getShadowType(Type self, int64_t width) const {
return batchType(self, width);
Member

in a separate PR, it may be worthwhile to switch getShadowType and the like to take an ArrayRef<int64_t> of indices to batch on (@jumerckx did something similar when adding batched differentiation broadcast earlier)
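
One possible shape for that change, sketched on bare shapes rather than MLIR types. It assumes the list holds the batch sizes to prepend, which is only one reading of "indices to batch on"; `shadowShape` is a hypothetical name and `std::vector` stands in for `ArrayRef<int64_t>`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build the shadow shape by prepending every batch dimension, in order, to
// the primal shape. An empty batch list leaves the shape unchanged.
std::vector<std::int64_t>
shadowShape(const std::vector<std::int64_t> &shape,
            const std::vector<std::int64_t> &batchDims) {
  std::vector<std::int64_t> out(batchDims);
  out.insert(out.end(), shape.begin(), shape.end());
  return out;
}
```

Under this reading, the current single-width form corresponds to a one-element batch list.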

Member

@wsmoses wsmoses left a comment

looks good, though there's some unrelated batch stuff here that probably shouldn't be here (maybe leftover from debugging)

3 participants