In-place *_jacvec! assumes square jacobian #184

Open · danielwe opened this issue May 17, 2022 · 6 comments

@danielwe

  1. auto_jacvec! and num_jacvec! only work when the output and input dimensions are the same, i.e., the Jacobian is square. The issue is this line (and the equivalent for num_jacvec!), which should probably use both the length and eltype of dy.

  2. Manual cache allocation works but is surprisingly hard to get right; the implicit prescription in the README gets the wrong tag type for the duals (a concrete sketch follows the MWE below): https://github.com/JuliaDiff/SparseDiffTools.jl#jacobian-vector-and-hessian-vector-products

  3. The corresponding lines in JacVec should similarly be fixed:

    cache2 = Dual{typeof(ForwardDiff.Tag(DeivVecTag(),eltype(x))),eltype(x),1}.(x, ForwardDiff.Partials.(tuple.(x)))

  4. Besides, shouldn't auto_jacvec(!) be upstreamed to ForwardDiff? It's strange to have to pull in a sparsity-related package to get this functionality. See ForwardDiff.jl#319, "Calculating J_f(x) . y efficiently".

MWE:

using SparseDiffTools

function f!(dx, x)
    dx[1] = x[2] + x[3]
    dx[2] = x[1]
end

x = randn(3)
v = randn(3)
dx = zeros(2)

auto_jacvec!(dx, f!, x, v)
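
For concreteness, here is a rough sketch of manual cache allocation that makes the MWE above work, assuming the cache-accepting method auto_jacvec!(dy, f, x, v, cache1, cache2) and the typeof(ForwardDiff.Tag(DeivVecTag(), ...)) tag construction that the package applies internally; since eltype(x) == eltype(dx) here, a single tag type is used for both caches. This is a sketch of the workaround, not the package's documented recipe:

using SparseDiffTools, ForwardDiff
using SparseDiffTools: DeivVecTag

# Tag type matching what auto_jacvec! uses internally
T = typeof(ForwardDiff.Tag(DeivVecTag(), eltype(x)))

# cache1 holds the input duals (x seeded with tangent v); cache2 holds the output duals.
# The key point: cache2 is sized and typed from dx, not from x, so the 2×3 Jacobian is fine.
cache1 = ForwardDiff.Dual{T, eltype(x), 1}.(x, ForwardDiff.Partials.(tuple.(v)))
cache2 = ForwardDiff.Dual{T, eltype(dx), 1}.(dx, ForwardDiff.Partials.(tuple.(dx)))

auto_jacvec!(dx, f!, x, v, cache1, cache2)  # dx ≈ J_f(x) * v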
@ChrisRackauckas
Member

I think manual cache allocation is what needs to be done, because that's how the output vector would then be defined. Does that make this a documentation issue?

Besides, shouldn't auto_jacvec(!) be upstreamed to ForwardDiff? Strange to have to pull in a sparse-related package to get this functionality. JuliaDiff/ForwardDiff.jl#319

Possibly, or it should be an AbstractDifferentation.jl interface piece, and all AD packages should then adopt that interface.

@danielwe
Author

danielwe commented May 20, 2022

A documentation fix would definitely make it easier to preallocate, which is useful regardless, but I think it's possible to make the allocating version work too. Just replace cache2 = similar(cache1) with

    cache2 = Dual{typeof(ForwardDiff.Tag(DeivVecTag(),eltype(dy))),eltype(dy),1}.(dy, ForwardDiff.Partials.(tuple.(dy)))

Here's the adaptation I landed on for my own use. I factored out the cache allocation into a function that can be used separately, making preallocation easy. If upstreamed to ForwardDiff this would of course turn into something like a ForwardDiff.JVPConfig type.

import ForwardDiff

struct MyTag end

function jvp!(f!, dy, x, v, duals=nothing)
    if duals === nothing
        duals = jvpduals(dy, x, v)
    else # jvpduals returns initialized duals, so this init is only needed when preallocated
        duals.x .= ForwardDiff.Dual{_tagtype(x), eltype(x), 1}.(
            x, ForwardDiff.Partials.(tuple.(v))
        )
    end
    f!(duals.dy, duals.x)
    dy .= ForwardDiff.partials.(duals.dy, 1)
    return dy
end

function jvpduals(dy, x, v)
    dualdy = ForwardDiff.Dual{_tagtype(dy), eltype(dy), 1}.(
        dy, ForwardDiff.Partials.(tuple.(dy))
    )
    # This correctly initializes dualx for computing J_f(x) * v
    dualx = ForwardDiff.Dual{_tagtype(x), eltype(x), 1}.(
        x, ForwardDiff.Partials.(tuple.(v))
    )
    return (; dy=dualdy, x=dualx)
end

_tagtype(x) = typeof(ForwardDiff.Tag(MyTag(), eltype(x)))
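
For illustration, a usage sketch against the non-square MWE from the top of the thread, reusing f!, x, and v from that snippet together with the jvp!/jvpduals definitions above:

dy = zeros(2)               # output length 2, input length 3: the Jacobian is 2×3

jvp!(f!, dy, x, v)          # allocating path: the duals are built internally by jvpduals

duals = jvpduals(dy, x, v)  # or preallocate the dual caches once ...
jvp!(f!, dy, x, v, duals)   # ... and reuse them across calls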

@ChrisRackauckas
Member

That does seem like a good fix. Mind opening a PR?

@danielwe
Author

danielwe commented Jun 3, 2022

Been procrastinating on this, since my own version is still evolving along with my project (including things like an allocation-free 5-arg accumulating jvp and consistent treatment of in-place vs. immutable versions). I also notice that the landscape of linear operator types is in rapid flux at the moment, at least on the SciML end of things (stoked for https://github.com/SciML/SciMLOperators.jl 🔥). So unless an improvement here is urgent for anyone else, I'll hold off a little and come back when it's clearer how things should look.

@ChrisRackauckas
Member

There's been a lot of motion in this ocean in the last few weeks. Indeed, this change might want to wait until the end of the month, when SciMLOperators is complete and registered.

@gdalle
Member

gdalle commented Jun 27, 2024

Sparse differentiation is now part of DifferentiationInterface, with automatic cache allocation and (tested) support for non-square Jacobians.
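
For anyone landing here today, a rough sketch of the DifferentiationInterface route for the non-square MWE, using the pushforward operator (i.e., the JVP); the tuple-of-tangents form, the prepare_pushforward/pushforward argument order, and AutoForwardDiff as the backend are assumptions based on recent DI versions, so check the DI documentation for the exact current API:

using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

f!(dy, x) = (dy[1] = x[2] + x[3]; dy[2] = x[1]; nothing)

x, v = randn(3), randn(3)
dy = zeros(2)  # 2×3 Jacobian, non-square

backend = AutoForwardDiff()
prep = prepare_pushforward(f!, dy, backend, x, (v,))  # cache allocation handled by DI
ty = pushforward(f!, dy, prep, backend, x, (v,))      # ty[1] ≈ J_f(x) * v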
