Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automatic differentiation not working in advanced examples? #387

Open
JianghuiDu opened this issue Apr 20, 2021 · 13 comments
Open

automatic differentiation not working in advanced examples? #387

JianghuiDu opened this issue Apr 20, 2021 · 13 comments

Comments

@JianghuiDu
Copy link

JianghuiDu commented Apr 20, 2021

I'm trying to find an example of how to use automatic differentiation in more complex ODE rather than those vanilla examples in the tutorials. Simple ODEs are quite easy to figure out, but I always have the trouble with the Dual Number error in complex ODE models. Seems my ODE functions are always not generic enough. And I'd like to learn how to make it work.

I was trying to modify the Beeler-Reuter Model (the CPU version) in the advanced tutorial, as the original code doesn't work with autodiff=true since it only accepts Float. I finally manage to make it work, but I have many problems that I don't understand. Here is the modified working code:

using DifferentialEquations, Sundials, ForwardDiff

const v0 = -84.624
const v1 = 10.0
const C_K1 = 1.0f0
const C_x1 = 1.0f0
const C_Na = 1.0f0
const C_s = 1.0f0
const D_Ca = 0.0f0
const D_Na = 0.0f0
const g_s = 0.09f0
const g_Na = 4.0f0
const g_NaC = 0.005f0
const ENa = 50.0f0 + D_Na
const γ = 0.5f0
const C_m = 1.0f0

# make this parametric
mutable struct BeelerReuterCpu{T} <: Function
    t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)

    C::Array{T,2}   # intracellular calcium concentration
    M::Array{T,2}     # sodium current activation gate (m)
    H::Array{T,2}    # sodium current inactivation gate (h)
    J::Array{T,2}     # sodium current slow inactivaiton gate (j)
    D::Array{T,2}     # calcium current activaiton gate (d)
    F::Array{T,2}     # calcium current inactivation gate (f)
    XI::Array{T,2}    # inward-rectifying potassium current (iK1)
    Δu::Array{T,2}    # place-holder for the Laplacian

    function BeelerReuterCpu(u0::Array{T,2}, diff_coef::T) where {T}
        self = new{T}()

        ny, nx = size(u0)
        self.t = 0.0
        self.diff_coef = diff_coef

        self.C = fill(0.0001f0, (ny, nx))
        self.M = fill(0.01f0, (ny, nx))
        self.H = fill(0.988f0, (ny, nx))
        self.J = fill(0.975f0, (ny, nx))
        self.D = fill(0.003f0, (ny, nx))
        self.F = fill(0.994f0, (ny, nx))
        self.XI = fill(0.0001f0, (ny, nx))
        self.Δu = zeros(ny, nx)

        return self
    end
end


# 5-point stencil
function laplacian(Δu, u) 

    n1, n2 = size(u)

    # internal nodes
    for j = 2:n2-1
        for i = 2:n1-1
            @inbounds Δu[i, j] = u[i+1, j] + u[i-1, j] + u[i, j+1] + u[i, j-1] - 4 * u[i, j]
        end
    end

    # left/right edges
    for i = 2:n1-1
        @inbounds Δu[i, 1] = u[i+1, 1] + u[i-1, 1] + 2 * u[i, 2] - 4 * u[i, 1]
        @inbounds Δu[i, n2] = u[i+1, n2] + u[i-1, n2] + 2 * u[i, n2-1] - 4 * u[i, n2]
    end

    # top/bottom edges
    for j = 2:n2-1
        @inbounds Δu[1, j] = u[1, j+1] + u[1, j-1] + 2 * u[2, j] - 4 * u[1, j]
        @inbounds Δu[n1, j] = u[n1, j+1] + u[n1, j-1] + 2 * u[n1-1, j] - 4 * u[n1, j]
    end

    # corners
    @inbounds Δu[1, 1] = 2 * (u[2, 1] + u[1, 2]) - 4 * u[1, 1]
    @inbounds Δu[n1, 1] = 2 * (u[n1-1, 1] + u[n1, 2]) - 4 * u[n1, 1]
    @inbounds Δu[1, n2] = 2 * (u[2, n2] + u[1, n2-1]) - 4 * u[1, n2]
    @inbounds Δu[n1, n2] = 2 * (u[n1-1, n2] + u[n1, n2-1]) - 4 * u[n1, n2]
end

@inline function rush_larsen(g, α, β, Δt)
    inf = α /+ β)
    τ = 1.0f0 /+ β)
    return clamp(g + (g - inf) * expm1(-Δt / τ), 0.0f0, 1.0f0)
end

function update_M_cpu(g, v, Δt)
    # the condition is needed here to prevent NaN when v == 47.0
    α = isapprox(v, 47.0f0) ? 10.0f0 : -(v + 47.0f0) / (exp(-0.1f0 * (v + 47.0f0)) - 1.0f0)
    β = (40.0f0 * exp(-0.056f0 * (v + 72.0f0)))
    return rush_larsen(g, α, β, Δt)
end

function update_H_cpu(g, v, Δt)
    α = 0.126f0 * exp(-0.25f0 * (v + 77.0f0))
    β = 1.7f0 / (exp(-0.082f0 * (v + 22.5f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_J_cpu(g, v, Δt)
    α = (0.55f0 * exp(-0.25f0 * (v + 78.0f0))) / (exp(-0.2f0 * (v + 78.0f0)) + 1.0f0)
    β = 0.3f0 / (exp(-0.1f0 * (v + 32.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_D_cpu(g, v, Δt)
    α = γ * (0.095f0 * exp(-0.01f0 * (v - 5.0f0))) / (exp(-0.072f0 * (v - 5.0f0)) + 1.0f0)
    β = γ * (0.07f0 * exp(-0.017f0 * (v + 44.0f0))) / (exp(0.05f0 * (v + 44.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_F_cpu(g, v, Δt)
    α = γ * (0.012f0 * exp(-0.008f0 * (v + 28.0f0))) / (exp(0.15f0 * (v + 28.0f0)) + 1.0f0)
    β = γ * (0.0065f0 * exp(-0.02f0 * (v + 30.0f0))) / (exp(-0.2f0 * (v + 30.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_XI_cpu(g, v, Δt)
    α = (0.0005f0 * exp(0.083f0 * (v + 50.0f0))) / (exp(0.057f0 * (v + 50.0f0)) + 1.0f0)
    β = (0.0013f0 * exp(-0.06f0 * (v + 20.0f0))) / (exp(-0.04f0 * (v + 20.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_C_cpu(g, d, f, v, Δt)
    ECa = D_Ca - 82.3f0 - 13.0278f0 * log(g)
    kCa = C_s * g_s * d * f
    iCa = kCa * (v - ECa)
    inf = 1.0f-7 * (0.07f0 - g)
    τ = 1.0f0 / 0.07f0
    return g + (g - inf) * expm1(-Δt / τ)
end

function update_gates_cpu(u, XI, M, H, J, D, F, C, Δt)
    # let Δt = Float32(Δt) # remove the Let 
        n1, n2 = size(u)
        for j = 1:n2
            for i = 1:n1
                v = u[i, j]

                XI[i, j] = update_XI_cpu(XI[i, j], v, Δt)
                M[i, j] = update_M_cpu(M[i, j], v, Δt)
                H[i, j] = update_H_cpu(H[i, j], v, Δt)
                J[i, j] = update_J_cpu(J[i, j], v, Δt)
                D[i, j] = update_D_cpu(D[i, j], v, Δt)
                F[i, j] = update_F_cpu(F[i, j], v, Δt)

                C[i, j] = update_C_cpu(C[i, j], D[i, j], F[i, j], v, Δt)
            end
        end
    # end
end

# iK1 is the inward-rectifying potassium current
function calc_iK1(v)
    ea = exp(0.04f0 * (v + 85.0f0))
    eb = exp(0.08f0 * (v + 53.0f0))
    ec = exp(0.04f0 * (v + 53.0f0))
    ed = exp(-0.04f0 * (v + 23.0f0))
    return 0.35f0 * (
        4.0f0 * (ea - 1.0f0) / (eb + ec) +
        0.2f0 * (isapprox(v, -23.0f0) ? 25.0f0 : (v + 23.0f0) / (1.0f0 - ed))
    )
end

# ix1 is the time-independent background potassium current
function calc_ix1(v, xi)
    ea = exp(0.04f0 * (v + 77.0f0))
    eb = exp(0.04f0 * (v + 35.0f0))
    return xi * 0.8f0 * (ea - 1.0f0) / eb
end

# iNa is the sodium current (similar to the classic Hodgkin-Huxley model)
function calc_iNa(v, m, h, j)
    return C_Na * (g_Na * m^3 * h * j + g_NaC) * (v - ENa)
end

# iCa is the calcium current
function calc_iCa(v, d, f, c)
    ECa = D_Ca - 82.3f0 - 13.0278f0 * log(c)    # ECa is the calcium reversal potential
    return C_s * g_s * d * f * (v - ECa)
end

function update_du_cpu(du, u, XI, M, H, J, D, F, C)
    n1, n2 = size(u)

    for j = 1:n2
        for i = 1:n1
            v = u[i, j]

            # calculating individual currents
            iK1 = calc_iK1(v)
            ix1 = calc_ix1(v, XI[i, j])
            iNa = calc_iNa(v, M[i, j], H[i, j], J[i, j])
            iCa = calc_iCa(v, D[i, j], F[i, j], C[i, j])

            # # total current
            I_sum = iK1 + ix1 + iNa + iCa

            # # the reaction part of the reaction-diffusion equation
            du[i, j] = -I_sum / C_m
        end
    end
end

function (f::BeelerReuterCpu)(du, u, p, t)
    Δt = t - f.t

    if Δt != 0 || t == 0
        update_gates_cpu(
            u,
            eltype(u).(f.XI), # type conversion
            eltype(u).(f.M),
            eltype(u).(f.H),
            eltype(u).(f.J),
            eltype(u).(f.D),
            eltype(u).(f.F),
            eltype(u).(f.C),
            eltype(u)(Δt),
        )
        f.t = t
    end

    laplacian(eltype(u).(f.Δu), u) # type conversion

    # calculate the reaction portion
    update_du_cpu(du, u, f.XI, f.M, f.H, f.J, f.D, f.F, f.C)

    # ...add the diffusion portion
    du .+= f.diff_coef .* f.Δu
    du
end

N = 10;
u0 = fill(v0, (N, N));
u0[5:6,5:6] .= v1;   # a small square in the middle of the domain

using Plots
heatmap(u0)

deriv_cpu = BeelerReuterCpu(u0, 1.0);
prob = ODEProblem(deriv_cpu, u0, (0.0, 5.0))
# Lower tolerances to show the methods converge to the same value
@time sol=solve(prob, TRBDF2(autodiff = true), saveat = 100.0)

The main changes are :

  1. Make the struct BeelerReuterCpu parametric.
  2. In the (f::BeelerReuterCpu)(du, u, p, t) function, when calling update_gates_cpu and laplacian functions, the aruguments are converted to the type of u , like eltype(u).(f.Δu).

I made these changes otherwise I get the Dual Number error

ERROR: LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{SciMLBase.UJacobianWrapper{ODEFunction{true, BeelerReuterCpu{Float64}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Float64, SciMLBase.NullParameters}, Float64}, Float64, 12})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:760
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50

My questions are:

  1. why type conversion is needed when calling update_gates_cpu and laplacian but not update_du_cpu inside (f::BeelerReuterCpu)(du, u, p, t)?
  2. I used parametric type in BeelerReuterCpu to make the type consistent between u and the fields in BeelerReuterCpu . I assume that's necessary for the Dual Number to work, but obviously that's not enough. I suppose that's because the struct is not created dynamically? Is there a way to do that.
  3. In the Documentation it is said DiffEqBase.dualcache is needed to avoid the Dual Number error when using cache. The cache example given there was very straightforward. But I wonder what exactly is considered as cache. Does it include any intermediary calculation variables?

I hope there could be an example of complex ODE with automatic differentiation. All the complex ODE examples in the tutorials and benchmarks only accept Float and therefore doesn't work for automatic differentiation.
Thanks!

@JianghuiDu JianghuiDu changed the title automatic differentiation in advanced examples? automatic differentiation not working in advanced examples? Apr 20, 2021
@ChrisRackauckas
Copy link
Member

why type conversion is needed when calling update_gates_cpu and laplacian but not update_du_cpu inside (f::BeelerReuterCpu)(du, u, p, t)?

It shouldn't be needed. The code should be improved to not require it.

I used parametric type in BeelerReuterCpu to make the type consistent between u and the fields in BeelerReuterCpu . I assume that's necessary for the Dual Number to work, but obviously that's not enough. I suppose that's because the struct is not created dynamically? Is there a way to do that.

That's the right idea.

In the Documentation it is said DiffEqBase.dualcache is needed to avoid the Dual Number error when using cache. The cache example given there was very straightforward. But I wonder what exactly is considered as cache. Does it include any intermediary calculation variables?

Yes, you'll need to essentially do this trick at a larger scale for all of the caches in here. You're on the right track.

@JianghuiDu
Copy link
Author

I used parametric type in BeelerReuterCpu to make the type consistent between u and the fields in BeelerReuterCpu . I assume that's necessary for the Dual Number to work, but obviously that's not enough. I suppose that's because the struct is not created dynamically? Is there a way to do that.

That's the right idea.

Can you point to any documents that discuss this? I have no clue...

In the Documentation it is said DiffEqBase.dualcache is needed to avoid the Dual Number error when using cache. The cache example given there was very straightforward. But I wonder what exactly is considered as cache. Does it include any intermediary calculation variables?

Yes, you'll need to essentially do this trick at a larger scale for all of the caches in here. You're on the right track.

I don't see where caches are created. Does something like Δu[i, j] = u[i+1, j] + u[i-1, j] + u[i, j+1] + u[i, j-1] - 4 * u[i, j] actually generates cache? I thought it doesn't.

@ChrisRackauckas
Copy link
Member

It's using a cache Δu. How was that matrix created? That should be something dispatching between Dual and not Dual.

@JianghuiDu
Copy link
Author

JianghuiDu commented Apr 20, 2021

So basically all the arrays in the struct are caches? I thought it refers to temporary caches allocated during calculations.

mutable struct BeelerReuterCpu{T} <: Function
    t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)

    C::Array{T,2}   # intracellular calcium concentration
    M::Array{T,2}     # sodium current activation gate (m)
    H::Array{T,2}    # sodium current inactivation gate (h)
    J::Array{T,2}     # sodium current slow inactivaiton gate (j)
    D::Array{T,2}     # calcium current activaiton gate (d)
    F::Array{T,2}     # calcium current inactivation gate (f)
    XI::Array{T,2}    # inward-rectifying potassium current (iK1)
    Δu::Array{T,2}    # place-holder for the Laplacian

@ChrisRackauckas
Copy link
Member

Yes, each of those need to be dual caches. Those are all used as temporary storage.

@JianghuiDu
Copy link
Author

I created all the caches and pass them as parameters, and now it works.

However, I had trouble with the log functions in the code ( inside calc_iCa and update_C_cpu).

First, I came across the Fixing NaN/Inf Issues in ForwardDiff and I had to set NANSAFE_MODE_ENABLED=true. That removed the NaN problem.

But after that the log functions still throw errors (which does not happen if autodiff=false). Obviously the arguments are negative. I'm not sure if this is a ForwardDiff problem or the negative numbers were generated by the ODE solver because of inappropriate time stepping. I reduced the tolerance levels but that didn't help.

So my workaround was to add small positive numbers inside like in log(x+1e-10) instead. But that's not ideal and I wonder if there are other ways to deal with this problem. The working code is attached behind.

ERROR: DomainError with -4.7521268120395714e-12:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
Stacktrace:
  [1] throw_complex_domainerror(f::Symbol, x::Float64)
    @ Base.Math .\math.jl:33
  [2] log(x::Float64)
    @ Base.Math .\special\log.jl:285
  [3] log
    @ ~\.julia\packages\ForwardDiff\QOqCN\src\dual.jl:203 [inlined]
# this is where the log function is called:
  [4] calc_iCa(v::ForwardDiff.Dual{ForwardDiff.Tag{SciMLBase.UJacobianWrapper{ODEFunction{true, BeelerReuterCpu{Float64}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Float64, NTuple{8, DiffEqBase.DiffCache{Matrix{Float64}, Matrix{ForwardDiff.Dual{nothing, Float64, 12}}}}}, Float64}, Float64, 12}, d::ForwardDiff.Dual{ForwardDiff.Tag{SciMLBase.UJacobianWrapper{ODEFunction{true, BeelerReuterCpu{Float64}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Float64, NTuple{8, DiffEqBase.DiffCache{Matrix{Float64}, Matrix{ForwardDiff.Dual{nothing, Float64, 12}}}}}, Float64}, Float64, 12}, f::ForwardDiff.Dual{ForwardDiff.Tag{SciMLBase.UJacobianWrapper{ODEFunction{true, BeelerReuterCpu{Float64}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Float64, NTuple{8, DiffEqBase.DiffCache{Matrix{Float64}, Matrix{ForwardDiff.Dual{nothing, Float64, 12}}}}}, Float64}, Float64, 12}, c::ForwardDiff.Dual{ForwardDiff.Tag{SciMLBase.UJacobianWrapper{ODEFunction{true, BeelerReuterCpu{Float64}, LinearAlgebra.UniformScaling{Bool},
Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Float64, NTuple{8, DiffEqBase.DiffCache{Matrix{Float64}, Matrix{ForwardDiff.Dual{nothing, Float64, 12}}}}}, Float64}, Float64, 12})
using DifferentialEquations, Sundials, ForwardDiff, Plots
using DiffEqBase: get_tmp, dualcache

const v0 = -84.624
const v1 = 10.0
const C_K1 = Float64(1.0f0)
const C_x1 = Float64(1.0f0)
const C_Na = Float64(1.0f0)
const C_s = Float64(1.0f0)
const D_Ca = Float64(0.0f0)
const D_Na = Float64(0.0f0)
const g_s = Float64(0.09f0)
const g_Na = Float64(4.0f0)
const g_NaC = Float64(0.005f0)
const ENa = Float64(50.0f0) + D_Na
const γ = Float64(0.5f0)
const C_m = Float64(1.0f0)


mutable struct BeelerReuterCpu{T} <: Function
    t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)


    function BeelerReuterCpu(u0::Array{T,2}, diff_coef::T) where {T}
        self = new{T}()

        ny, nx = size(u0)
        self.t = zero(T)
        self.diff_coef = diff_coef

        return self
    end
end


# 5-point stencil
function laplacian(Δu, u)

    n1, n2 = size(u)

    # internal nodes
    for j = 2:n2-1
        for i = 2:n1-1
            @inbounds Δu[i, j] = u[i+1, j] + u[i-1, j] + u[i, j+1] + u[i, j-1] - 4 * u[i, j]
        end
    end

    # left/right edges
    for i = 2:n1-1
        @inbounds Δu[i, 1] = u[i+1, 1] + u[i-1, 1] + 2 * u[i, 2] - 4 * u[i, 1]
        @inbounds Δu[i, n2] = u[i+1, n2] + u[i-1, n2] + 2 * u[i, n2-1] - 4 * u[i, n2]
    end

    # top/bottom edges
    for j = 2:n2-1
        @inbounds Δu[1, j] = u[1, j+1] + u[1, j-1] + 2 * u[2, j] - 4 * u[1, j]
        @inbounds Δu[n1, j] = u[n1, j+1] + u[n1, j-1] + 2 * u[n1-1, j] - 4 * u[n1, j]
    end

    # corners
    @inbounds Δu[1, 1] = 2 * (u[2, 1] + u[1, 2]) - 4 * u[1, 1]
    @inbounds Δu[n1, 1] = 2 * (u[n1-1, 1] + u[n1, 2]) - 4 * u[n1, 1]
    @inbounds Δu[1, n2] = 2 * (u[2, n2] + u[1, n2-1]) - 4 * u[1, n2]
    @inbounds Δu[n1, n2] = 2 * (u[n1-1, n2] + u[n1, n2-1]) - 4 * u[n1, n2]
end

@inline function rush_larsen(g, α, β, Δt)
    inf = α /+ β)
    τ = 1.0f0 /+ β)
    return clamp(g + (g - inf) * expm1(-Δt / τ), 0.0f0, 1.0f0)
end

function update_M_cpu(g, v, Δt)
    # the condition is needed here to prevent NaN when v == 47.0
    α = isapprox(v, 47.0f0) ? 10.0f0 : -(v + 47.0f0) / (exp(-0.1f0 * (v + 47.0f0)) - 1.0f0)
    β = (40.0f0 * exp(-0.056f0 * (v + 72.0f0)))
    return rush_larsen(g, α, β, Δt)
end

function update_H_cpu(g, v, Δt)
    α = 0.126f0 * exp(-0.25f0 * (v + 77.0f0))
    β = 1.7f0 / (exp(-0.082f0 * (v + 22.5f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_J_cpu(g, v, Δt)
    α = (0.55f0 * exp(-0.25f0 * (v + 78.0f0))) / (exp(-0.2f0 * (v + 78.0f0)) + 1.0f0)
    β = 0.3f0 / (exp(-0.1f0 * (v + 32.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_D_cpu(g, v, Δt)
    α = γ * (0.095f0 * exp(-0.01f0 * (v - 5.0f0))) / (exp(-0.072f0 * (v - 5.0f0)) + 1.0f0)
    β = γ * (0.07f0 * exp(-0.017f0 * (v + 44.0f0))) / (exp(0.05f0 * (v + 44.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_F_cpu(g, v, Δt)
    α = γ * (0.012f0 * exp(-0.008f0 * (v + 28.0f0))) / (exp(0.15f0 * (v + 28.0f0)) + 1.0f0)
    β = γ * (0.0065f0 * exp(-0.02f0 * (v + 30.0f0))) / (exp(-0.2f0 * (v + 30.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_XI_cpu(g, v, Δt)
    α = (0.0005f0 * exp(0.083f0 * (v + 50.0f0))) / (exp(0.057f0 * (v + 50.0f0)) + 1.0f0)
    β = (0.0013f0 * exp(-0.06f0 * (v + 20.0f0))) / (exp(-0.04f0 * (v + 20.0f0)) + 1.0f0)
    return rush_larsen(g, α, β, Δt)
end

function update_C_cpu(g, d, f, v, Δt)
    ECa = D_Ca - 82.3f0 - 13.0278f0 * log(g+1e-11)
    kCa = C_s * g_s * d * f
    iCa = kCa * (v - ECa)
    inf = 1.0f-7 * (0.07f0 - g)
    τ = 1.0f0 / 0.07f0
    return g + (g - inf) * expm1(-Δt / τ)
end

function update_gates_cpu(u, XI, M, H, J, D, F, C, Δt)
    # let Δt = eltype(u).(Δt)
    n1, n2 = size(u)
    for j = 1:n2
        for i = 1:n1
            v = u[i, j]

            XI[i, j] = update_XI_cpu(XI[i, j], v, Δt)
            M[i, j] = update_M_cpu(M[i, j], v, Δt)
            H[i, j] = update_H_cpu(H[i, j], v, Δt)
            J[i, j] = update_J_cpu(J[i, j], v, Δt)
            D[i, j] = update_D_cpu(D[i, j], v, Δt)
            F[i, j] = update_F_cpu(F[i, j], v, Δt)

            C[i, j] = update_C_cpu(C[i, j], D[i, j], F[i, j], v, Δt)
        end
    end
    # end
end

# iK1 is the inward-rectifying potassium current
function calc_iK1(v)
    ea = exp(0.04f0 * (v + 85.0f0))
    eb = exp(0.08f0 * (v + 53.0f0))
    ec = exp(0.04f0 * (v + 53.0f0))
    ed = exp(-0.04f0 * (v + 23.0f0))
    return 0.35f0 * (
        4.0f0 * (ea - 1.0f0) / (eb + ec) +
        0.2f0 * (isapprox(v, -23.0f0) ? 25.0f0 : (v + 23.0f0) / (1.0f0 - ed))
    )
end

# ix1 is the time-independent background potassium current
function calc_ix1(v, xi)
    ea = exp(0.04f0 * (v + 77.0f0))
    eb = exp(0.04f0 * (v + 35.0f0))
    return xi * 0.8f0 * (ea - 1.0f0) / eb
end

# iNa is the sodium current (similar to the classic Hodgkin-Huxley model)
function calc_iNa(v, m, h, j)
    return C_Na * (g_Na * m^3 * h * j + g_NaC) * (v - ENa)
end

# iCa is the calcium current
function calc_iCa(v, d, f, c)
    ECa = D_Ca - 82.3f0 - 13.0278f0 * log(c+1e-11)    # ECa is the calcium reversal potential
    return C_s * g_s * d * f * (v - ECa)
end

function update_du_cpu(du, u, XI, M, H, J, D, F, C)
    n1, n2 = size(u)

    for j = 1:n2
        for i = 1:n1
            v = u[i, j]

            # calculating individual currents
            iK1 = calc_iK1(v)
            ix1 = calc_ix1(v, XI[i, j])
            iNa = calc_iNa(v, M[i, j], H[i, j], J[i, j])
            iCa = calc_iCa(v, D[i, j], F[i, j], C[i, j])

            # # total current
            I_sum = iK1 + ix1 + iNa + iCa

            # # the reaction part of the reaction-diffusion equation
            du[i, j] = -I_sum / C_m
        end
    end
end


function (f::BeelerReuterCpu)(du, u, p, t)
    XI_cache,M_cache,H_cache,J_cache,D_cache,F_cache,C_cache,Δu_cache=p
        XI_cache=DiffEqBase.get_tmp(XI_cache, u)
        M_cache=DiffEqBase.get_tmp(M_cache, u)
        H_cache=DiffEqBase.get_tmp(H_cache, u)
        J_cache=DiffEqBase.get_tmp(J_cache, u)
        D_cache=DiffEqBase.get_tmp(D_cache, u)
        F_cache=DiffEqBase.get_tmp(F_cache, u)
        C_cache=DiffEqBase.get_tmp(C_cache, u)
        Δu_cache=DiffEqBase.get_tmp(Δu_cache, u)

    Δt = t - f.t

    if Δt != eltype(u)(0) || t == eltype(u)(0)
        update_gates_cpu(u, XI_cache, M_cache, H_cache, J_cache, D_cache, F_cache, C_cache, Δt)
        f.t = t
    end

    laplacian(Δu_cache, u)

    # # calculate the reaction portion
    update_du_cpu(du, u, XI_cache, M_cache, H_cache, J_cache, D_cache, F_cache, C_cache)

    # # ...add the diffusion portion
    du .+= f.diff_coef .* Δu_cache
end


function odetest(u0,solver)
    T=eltype(u0)
    ny,nx = size(u0)
    C_cache = DiffEqBase.dualcache(fill(convert(T,0.0001f0), (ny, nx)))
    M_cache = DiffEqBase.dualcache(fill(convert(T,0.01f0), (ny, nx)))
    H_cache = DiffEqBase.dualcache(fill(convert(T,0.988f0), (ny, nx)))
    J_cache = DiffEqBase.dualcache(fill(convert(T,0.975f0), (ny, nx)))
    D_cache = DiffEqBase.dualcache(fill(convert(T,0.003f0), (ny, nx)))
    F_cache = DiffEqBase.dualcache(fill(convert(T,0.994f0), (ny, nx)))
    XI_cache = DiffEqBase.dualcache(fill(convert(T,0.0001f0), (ny, nx)))
    Δu_cache = DiffEqBase.dualcache(zeros(T,ny, nx))
    p = (C_cache,M_cache,H_cache,J_cache,D_cache,F_cache,XI_cache,Δu_cache)
    deriv_cpu = BeelerReuterCpu(u0, 1.0);
    prob = ODEProblem(deriv_cpu, u0, (0.0, 50.),p)
    @time sol = solve(prob, solver, saveat = 100.0)
    deriv_cpu.t = 0.0
    heatmap(sol[end])
end    

N = 50;
u0 = fill(v0, (N, N));
u0[23:27,23:27] .= v1;   # a small square in the middle of the domain

odetest(u0,CVODE_BDF(linear_solver=:GMRES))
odetest(u0,TRBDF2(autodiff = true))
odetest(u0,TRBDF2(linsolve=LinSolveGMRES(),autodiff=false))
odetest(u0,KenCarp4(linsolve=LinSolveGMRES(),autodiff=false))

@ChrisRackauckas
Copy link
Member

The model might just not be that stable. decrease the tolerance?

@JianghuiDu
Copy link
Author

JianghuiDu commented Apr 21, 2021

Decreasing reltol and abstol down to 1e-15 doesn't make a difference. However, decreasing dtmax does make the negative number in ERROR: DomainError with (a negative number) smaller , like with dtmax = 1e-5 this negative number is comparable to the rounding error (1e-15). But still it's not possible to make it positive even with dtmax = 1e-15 . I tried KenCarp4, ABDF2, TRBDF2 and the error persists. None have the problem if autodiff is turned off.

@JianghuiDu
Copy link
Author

JianghuiDu commented Apr 21, 2021

Also, it seems the DiffEqBase.dualcache can only be passed as parameters p to the ODE function (f::BeelerReuterCpu)(du, u, p, t), and cannot go inside the definition of mutable struct BeelerReuterCpu that construct the OED function. Is that correct?

@ChrisRackauckas
Copy link
Member

They can go into the struct.

@JianghuiDu
Copy link
Author

JianghuiDu commented Apr 23, 2021

I can't get it work to put the cache inside the struct.
If I only add a cache inside like

mutable struct BeelerReuterCpu{T} <: Function
    t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)
    C_cache::DiffEqBase.DiffCache
    M_cache::DiffEqBase.DiffCache
    H_cache::DiffEqBase.DiffCache
    J_cache::DiffEqBase.DiffCache
    D_cache::DiffEqBase.DiffCache
    F_cache::DiffEqBase.DiffCache
    XI_cache::DiffEqBase.DiffCache
    Δu_cache::DiffEqBase.DiffCache


    function BeelerReuterCpu(u0::Array{T,2}, diff_coef::T) where {T}
        self = new{T}()

        ny, nx = size(u0)
        self.t = zero(T)
        self.diff_coef = diff_coef
        self.C_cache = DiffEqBase.dualcache(fill(convert(T,0.0001), (ny, nx)))
        self.M_cache = DiffEqBase.dualcache(fill(convert(T,0.01), (ny, nx)))
        self.H_cache = DiffEqBase.dualcache(fill(convert(T,0.988), (ny, nx)))
        self.J_cache = DiffEqBase.dualcache(fill(convert(T,0.975), (ny, nx)))
        self.D_cache = DiffEqBase.dualcache(fill(convert(T,0.003), (ny, nx)))
        self.F_cache = DiffEqBase.dualcache(fill(convert(T,0.994), (ny, nx)))
        self.XI_cache = DiffEqBase.dualcache(fill(convert(T,0.0001), (ny, nx)))
        self.Δu_cache = DiffEqBase.dualcache(zeros(T,ny, nx))
        
        return self
    end
end

and do

function (f::BeelerReuterCpu)(du, u, p, t)
# something
    f.XI_cache=DiffEqBase.get_tmp(f.XI_cache, u)
#something
end

in the function body then the type mismatches since the right hand side becomes Matrix{Float64} while the left hand side remains DiffEqBase.dualcache.

So I though may be I need both the original arrays and their caches in the struct like

mutable struct BeelerReuterCache{T}
    C_cache::DiffEqBase.DiffCache
    M_cache::DiffEqBase.DiffCache
    H_cache::DiffEqBase.DiffCache
    J_cache::DiffEqBase.DiffCache
    D_cache::DiffEqBase.DiffCache
    F_cache::DiffEqBase.DiffCache
    XI_cache::DiffEqBase.DiffCache
    Δu_cache::DiffEqBase.DiffCache

    function BeelerReuterCache(u0::Array{T,2}) where{T}
        self = new{T}()
        ny,nx = size(u0)
        self.C_cache = DiffEqBase.dualcache(fill(convert(T,0.0001), (ny, nx)))
        self.M_cache = DiffEqBase.dualcache(fill(convert(T,0.01), (ny, nx)))
        self.H_cache = DiffEqBase.dualcache(fill(convert(T,0.988), (ny, nx)))
        self.J_cache = DiffEqBase.dualcache(fill(convert(T,0.975), (ny, nx)))
        self.D_cache = DiffEqBase.dualcache(fill(convert(T,0.003), (ny, nx)))
        self.F_cache = DiffEqBase.dualcache(fill(convert(T,0.994), (ny, nx)))
        self.XI_cache = DiffEqBase.dualcache(fill(convert(T,0.0001), (ny, nx)))
        self.Δu_cache = DiffEqBase.dualcache(zeros(T,ny, nx))

        return self
    end
    
end

# BeelerReuterCache(zeros(10,10))

mutable struct BeelerReuterCpu{T} <: Function
    t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)
    cache::BeelerReuterCache

    C::Array{T,2}
    M::Array{T,2}
    H::Array{T,2}
    J::Array{T,2}
    D::Array{T,2}
    F::Array{T,2}
    XI::Array{T,2}
    Δu::Array{T,2}


    function BeelerReuterCpu(u0::Array{T,2}, diff_coef::T) where {T}
        self = new{T}()

        ny, nx = size(u0)
        self.t = zero(T)
        self.diff_coef = diff_coef
        self.cache = BeelerReuterCache(u0)

        self.C =fill(convert(T,0.0001), (ny, nx))
        self.M =fill(convert(T,0.01), (ny, nx))
        self.H = fill(convert(T,0.988), (ny, nx))
        self.J = fill(convert(T,0.975), (ny, nx))
        self.D = fill(convert(T,0.003), (ny, nx))
        self.F = fill(convert(T,0.994), (ny, nx))
        self.XI = fill(convert(T,0.0001), (ny, nx))
        self.Δu = zeros(T,ny, nx)

        return self
    end
end

but that doesn't work either with

function (f::BeelerReuterCpu)(du, u, p, t)
# something
    f.XI=DiffEqBase.get_tmp(f.cache.XI_cache, u)
#something
end

Now the classic error MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(ff), Float64}, Float64, 12}) comes back.

It seems I need something to convert the cache to the original array and backwards too?

Anyway, what is the advantage of creating functions in this way like function (f::BeelerReuterCpu)(du, u, p, t) where you can access the parameters and caches directly using f. compared to say just define a separate function f(du,u,p,t) and pass the parameters and caches through p using p.?

Any help is greatly appreciated.

@ChrisRackauckas
Copy link
Member

C_cache::DiffEqBase.DiffCache

that should be a parameterized type.

@JianghuiDu
Copy link
Author

I realized that in XI=DiffEqBase.get_tmp(f.XI_cache, u) the left hand side doesn't need to be allocated. I was trying to allocate for that and that was the problem.

Now this works

mutable struct BeelerReuterCpu{T,chunk_size} <: Function
    t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)
    C_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    M_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    H_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    J_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    D_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    F_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    XI_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    Δu_cache::DiffEqBase.DiffCache{Array{T,2},Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}


    function BeelerReuterCpu(u0::Array{T,2}, diff_coef::T,::Val{chunk_size}) where {T,chunk_size}
        self = new{T,chunk_size}()

        ny, nx = size(u0)
        self.t = zero(T)
        self.diff_coef = diff_coef
        self.C_cache = DiffEqBase.dualcache(fill(convert(T,0.0001), (ny, nx)),Val{chunk_size})
        self.M_cache = DiffEqBase.dualcache(fill(convert(T,0.01), (ny, nx)),Val{chunk_size})
        self.H_cache = DiffEqBase.dualcache(fill(convert(T,0.988), (ny, nx)),Val{chunk_size})
        self.J_cache = DiffEqBase.dualcache(fill(convert(T,0.975), (ny, nx)),Val{chunk_size})
        self.D_cache = DiffEqBase.dualcache(fill(convert(T,0.003), (ny, nx)),Val{chunk_size})
        self.F_cache = DiffEqBase.dualcache(fill(convert(T,0.994), (ny, nx)),Val{chunk_size})
        self.XI_cache = DiffEqBase.dualcache(fill(convert(T,0.0001), (ny, nx)),Val{chunk_size})
        self.Δu_cache = DiffEqBase.dualcache(zeros(T,ny, nx),Val{chunk_size})

        return self
    end
end


function (f::BeelerReuterCpu)(du, u, p, t)
        XI=DiffEqBase.get_tmp(f.XI_cache, u)
        M=DiffEqBase.get_tmp(f.M_cache, u)
        H=DiffEqBase.get_tmp(f.H_cache, u)
        J=DiffEqBase.get_tmp(f.J_cache, u)
        D=DiffEqBase.get_tmp(f.D_cache, u)
        F=DiffEqBase.get_tmp(f.F_cache, u)
        C=DiffEqBase.get_tmp(f.C_cache, u)
        Δu=DiffEqBase.get_tmp(f.Δu_cache, du)

    Δt = t - f.t

    if Δt != eltype(u)(0) || t == eltype(u)(0)
        update_gates_cpu(u, XI, M, H, J, D, F, C, Δt)
        f.t = t
    end

    laplacian(Δu, u)

    # # calculate the reaction portion
    update_du_cpu(du, u, XI, M, H, J, D, F, C)

    # # ...add the diffusion portion
    du .+= f.diff_coef .* Δu
end

Also this immutable version of the function struct with mutable cache struct

mutable struct BeelerReuterCpuCache{T,chunk_size}
    t::T
    C::DiffEqBase.DiffCache{Array{T,2},
                            Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    M::DiffEqBase.DiffCache{Array{T,2},
                            Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    H::DiffEqBase.DiffCache{Array{T,2},
                            Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    J::DiffEqBase.DiffCache{Array{T,2},
                            Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    D::DiffEqBase.DiffCache{Array{T,2},
                            Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    F::DiffEqBase.DiffCache{Array{T,2},
                            Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    XI::DiffEqBase.DiffCache{Array{T,2},
                             Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
    Δu::DiffEqBase.DiffCache{Array{T,2},
                             Array{ForwardDiff.Dual{nothing,T,chunk_size},2}}
end

struct BeelerReuterCpu{T,chunk_size}
    #     t::T              # the last timestep time to calculate Δt
    diff_coef::T      # the diffusion-coefficient (coupling strength)
    cache::BeelerReuterCpuCache{T,chunk_size}
end

function BeelerReuterCpu(u0::Array{T,2}, diff_coef::T,
                         ::Val{chunk_size}) where {T,chunk_size}
    ny, nx = size(u0)
    t = zero(T)
    diff_coef = diff_coef
    C = DiffEqBase.dualcache(fill(convert(T, 0.0001), (ny, nx)),
                             Val{chunk_size})
    M = DiffEqBase.dualcache(fill(convert(T, 0.01), (ny, nx)), Val{chunk_size})
    H = DiffEqBase.dualcache(fill(convert(T, 0.988), (ny, nx)), Val{chunk_size})
    J = DiffEqBase.dualcache(fill(convert(T, 0.975), (ny, nx)), Val{chunk_size})
    D = DiffEqBase.dualcache(fill(convert(T, 0.003), (ny, nx)), Val{chunk_size})
    F = DiffEqBase.dualcache(fill(convert(T, 0.994), (ny, nx)), Val{chunk_size})
    XI = DiffEqBase.dualcache(fill(convert(T, 0.0001), (ny, nx)),
                              Val{chunk_size})
    Δu = DiffEqBase.dualcache(zeros(T, ny, nx), Val{chunk_size})
    cache = BeelerReuterCpuCache{T,chunk_size}(t, C, M, H, J, D, F, XI, Δu)

    return BeelerReuterCpu{T,chunk_size}(diff_coef, cache)
end

There's no performance difference. I thought the immutable version might be better... Anyway.

Thanks for the help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants