
Extreme compile times for some functions #9

Open
baggepinnen opened this issue Jan 13, 2022 · 7 comments
@baggepinnen (Contributor)

I've experienced quite extreme compile times for some functions; here's an example:

julia> using DescriptorSystems

julia> G = rss(2,2,4);

julia> @time gnugap(G, G)
 59.315304 seconds (185.38 M allocations: 10.129 GiB, 6.14% gc time, 99.97% compilation time)
(6.598356591423948e-16, 2.55648096966907)

This problem is likely related to #8: type piracy can lead to method invalidations and inference problems, which in turn cause recompilation of many Base methods. To investigate this hypothesis, we can use SnoopCompile.jl:

using Revise
using SnoopCompileCore
using DescriptorSystems;
G = rss(2,2,4);

tinf = @snoopi_deep gnugap(G, G)

using ProfileView, SnoopCompile
ProfileView.view(flamegraph(tinf))

This displays the following graph
[Screenshot: ProfileView flame graph, 2022-01-13]
The colored flames indicate call stacks, and their horizontal extent indicates run time. Notice the large horizontal gap in the flame graph, followed immediately by the stack for Base.typed_hvcat (where I've hovered the mouse to display the method name in the screenshot). This is a clear indication that the type piracy of #8 causes huge compile times, likely due to type-inference problems.
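For readers unfamiliar with the term, here is a minimal sketch of type piracy (the module and method below are hypothetical illustrations, not code from DescriptorSystems):

```julia
# Hypothetical example of type piracy: PiratePkg extends a function it does
# not own (Base.hvcat) for argument types it does not own (Base types).
# Loading such a package can invalidate code that was already compiled
# against the original methods, forcing recompilation downstream.
module PiratePkg

import Base: hvcat

# Pirated method: neither `hvcat` nor `Int` belongs to this package, so
# this silently changes the behavior of `[1 2; 3 4]` for every caller.
hvcat(rows::Tuple{Vararg{Int}}, xs::Int...) = collect(xs)

end
```

This is the same pattern as the `Base.hvcat` overload quoted further down, whose `Union` signature also covers plain `Number` and `UniformScaling` arguments.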

Running

@code_warntype [G G]

indicates that there are type inference problems present when concatenating systems:

julia> @code_warntype [G G]
MethodInstance for hcat(::DescriptorStateSpace{Float64}, ::DescriptorStateSpace{Float64})
  from hcat(SYS1::DescriptorStateSpace, SYS2::DescriptorStateSpace) in DescriptorSystems at /home/fredrikb/.julia/packages/DescriptorSystems/20HLg/src/connections.jl:71
Arguments
  #self#::Core.Const(hcat)
  SYS1::DescriptorStateSpace{Float64}
  SYS2::DescriptorStateSpace{Float64}
Locals
  D::Matrix{Float64}
  C::Matrix{Float64}
  B::Matrix{Float64}
  blockdims::Vector{Int64}
  E::Any
  A::Matrix{Float64}
  Ts::Float64
  T::Type{Float64}
  ny::Any
  @_13::Union{UniformScaling, Matrix{Float64}}
  @_14::Union{UniformScaling, Matrix{Float64}}

Notice the variables E::Any and @_13::Union{UniformScaling, Matrix{Float64}}: the concrete type of E cannot be inferred.
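A minimal sketch of the pattern behind such a Union (the function names here are made up for illustration): a descriptor matrix that is sometimes stored as I infers as a Union, and materializing a concrete matrix removes the instability:

```julia
using LinearAlgebra

# Unstable: E is either a Matrix{Float64} or the UniformScaling `I`, so
# callers see E::Union{Matrix{Float64}, UniformScaling{Bool}} (which may
# widen further to Any downstream).
unstable_E(desc::Bool, n::Int) = desc ? rand(n, n) : I

# Stable: materialize `I` into a concrete identity matrix so the return
# type is always Matrix{Float64}.
stable_E(desc::Bool, n::Int) = desc ? rand(n, n) : Matrix{Float64}(I, n, n)
```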

The other gaps in the flame graph point to the following method

function Base.hvcat(rows::Tuple{Vararg{Int}}, A::Union{DescriptorStateSpace, AbstractVecOrMat{<:Number}, Number, UniformScaling}...)

as well as https://github.com/andreasvarga/DescriptorSystems.jl/blob/main/src/dss.jl#L52

@baggepinnen (Contributor, Author)

Digging a bit deeper, it turns out that we're also suffering from invenia/Intervals.jl#144

julia> using SnoopCompile

julia> invalidations = @snoopr begin
           using DescriptorSystems;
           G = rss(2,2,4);
           gnugap(G, G)
       end;

julia> trees = SnoopCompile.invalidation_trees(invalidations);

julia> @show length(SnoopCompile.uinvalidated(invalidations)) # show total invalidations
length(SnoopCompile.uinvalidated(invalidations)) = 1900
1900

julia> show(trees[end]) # show the most invalidated method
inserting convert(::Type{T}, interval::Intervals.AnchoredInterval{P, T}) where {P, T} in Intervals at /home/fredrikb/.julia/packages/Intervals/ua9cq/src/anchoredinterval.jl:181 invalidated:

There are almost 2000 invalidations from running the small code snippet above, which is huge, and about half of them come from invenia/Intervals.jl#144.

@andreasvarga (Owner)

Yes, I am aware of this problem. You are probably right that it is related to #8. You may recall that when I tried to alleviate the issue, I ran into some bugs in the implementation of hcat/vcat/hvcat, which still manifest in the LTS version 1.6.5:

julia> using LinearAlgebra

julia> [1 2; I]
ERROR: ArgumentError: number of columns of each array must match (got (2, 1))
Stacktrace:
 [1] _typed_vcat(#unused#::Type{Any}, A::Tuple{Matrix{Any}, Matrix{Any}})
   @ Base .\abstractarray.jl:1553
 [2] typed_vcat(::Type{Any}, ::Matrix{Any}, ::Matrix{Any})
   @ Base .\abstractarray.jl:1567
 [3] typed_hvcat(::Type{Any}, ::Tuple{Int64, Int64}, ::Int64, ::Vararg{Any, N} where N)
   @ Base .\abstractarray.jl:1957
 [4] hvcat(::Tuple{Int64, Int64}, ::Int64, ::Vararg{Any, N} where N)
   @ Base .\abstractarray.jl:1932
 [5] top-level scope
   @ REPL[2]:1

julia> versioninfo()
Julia Version 1.6.5
Commit 9058264a69 (2021-12-19 12:30 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake) 

So I stopped investing more time in this direction until this bug is corrected in the LTS version. Fortunately, my implementation can work around this Julia bug:

julia> using DescriptorSystems

julia> [1 2; I]
3×2 Matrix{Int64}:
 1  2
 1  0
 0  1

I hope I will be able to dedicate more time to this issue after finishing the implementation of FaultDetectionTools.jl (probably in the next 2-3 months).

Regarding the invalidation aspect, I don't understand what is going on. I never used the mentioned conversion function directly. It is only a guess, but this function is probably imported via the Polynomials package, which has Intervals.jl among its dependencies. But I have no explanation for these figures.
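For context, a rough sketch of the mechanism (not specific to Intervals.jl): invalidations do not require calling the new method directly. Compiled code that resolved a dispatch over an abstractly-typed value caches that decision; defining a new applicable method afterwards invalidates that cache:

```julia
# Rough sketch of how defining a new method invalidates unrelated compiled
# code. `describe` dispatches on an abstractly-typed value, so its compiled
# code records which `name` methods currently exist (via backedges).
name(x::Real) = "a real number"
describe(v::Vector{Any}) = name(v[1])   # v[1] is inferred as Any

describe(Any[1.0])   # compiles `describe`; inference resolves the `name` call

# If a package now defines a new, more specific method:
name(x::Float64) = "a Float64"
# ...the compiled code for `describe` (and anything that inlined it) is
# invalidated and must be recompiled on next use, even though we never
# called the new method ourselves. A broadly-applicable `convert` method,
# as in Intervals.jl, triggers the same cascade through every cached
# `convert` call over abstract types.
```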

@baggepinnen (Contributor, Author)

Just to clarify, the invalidation problem is not due to this package. I mentioned it on Slack yesterday and got the answer that they might be able to solve it in Intervals.jl, where the problem lies.

@baggepinnen (Contributor, Author)

I tried this again on Julia v1.8-beta1 with DescriptorSystems v1.3, and the timing has improved by 20 seconds, a great step!

julia> @time gnugap(G, G)
 38.420655 seconds (115.38 M allocations: 5.127 GiB, 4.95% gc time, 99.99% compilation time)

but it still takes quite a long time. Apparently, gnugap now does the bulk of the work fairly quickly, but then there is a huge inference gap (the white space in the image below), followed by a call to dss.

[Screenshot: flame graph with a large inference gap followed by a call to dss]

I can't really figure out what the remaining problem is; it appears as if execution reaches the call to ghinfnorm without any problems. Running

using Revise
using SnoopCompileCore
using DescriptorSystems;
G = rss(2,2,4);

tinf = @snoopi_deep ghinfnorm(G);

using ProfileView, SnoopCompile
ProfileView.view(flamegraph(tinf))

leads to a similar-looking plot, with what appears to be a call to == in the file uniformscaling.jl taking a lot of time to infer.
[Screenshot: ProfileView flame graph, 2022-03-09]

@baggepinnen (Contributor, Author)

The call to == is probably

    disc || E == I || rcond(E) >= n*eps(float(real(T1))) || (return Inf, Inf)
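A note on why this particular comparison can be slow to infer (my reading, not confirmed in the thread): when E has a concrete type, the comparison resolves to a single method in LinearAlgebra's uniformscaling.jl, but when E is only inferred as Any or a Union, each concrete pairing must be inferred and compiled separately:

```julia
using LinearAlgebra

# With a concrete E, `E == I` resolves to a single method
# (comparing an AbstractMatrix against a UniformScaling) and compiles once:
E = Matrix{Float64}(I, 3, 3)
E == I            # true: E is the 3×3 identity

# If E can also be a UniformScaling itself (as in the Union seen in the
# @code_warntype output above), a different method is hit, and code that
# handles both branches must infer and compile both:
E = I
E == I            # true
```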

@baggepinnen (Contributor, Author)

baggepinnen commented Jun 22, 2022

It looks like the compilation time has regressed again on Julia v1.8.0-rc1

julia> using DescriptorSystems

julia> G = rss(2,2,4);

julia> @time gnugap(G, G)
103.958712 seconds (161.42 M allocations: 7.936 GiB, 4.80% gc time, 99.99% compilation time)
(6.867873289263948e-16, 1.2311753343767355)

up from 38.42 seconds a few posts up.

The same version of DescriptorSystems (v1.3.2) on Julia v1.7 is a bit better, but the compile time is still very long:

julia> @time gnugap(G, G)
 72.154130 seconds (135.50 M allocations: 6.680 GiB, 3.71% gc time, 99.95% compilation time)
(6.21299328221319e-16, Inf)

EDIT: Something was off with my initial benchmark; it's actually slightly faster on 1.8:

63.557759 seconds (161.45 M allocations: 7.938 GiB, 4.64% gc time, 99.99% compilation time)

still, much worse than it was in #9 (comment)

@andreasvarga (Owner)

I fixed the issues related to possible type instability in

@code_warntype [G G]

In Julia 1.8, there are no red lines anymore in the output. Also [1 2; I] works as expected using only LinearAlgebra calls.

Inspecting the result of ProfileView.view(flamegraph(tinf)) for the gnugap function with the cursor, I see some calls to SparseMatrix.setindex!, which in my opinion should not appear; I have no idea how to get rid of them.

The compilation times are hopefully better, although probably still too long. I should have a look at the precompilation possibilities or at producing sysimages.
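On the precompilation side, one option is to ship a representative workload so this compilation happens once, during package precompilation. A sketch, assuming the SnoopPrecompile.jl package (since renamed PrecompileTools.jl); where exactly the workload belongs in the package is for the author to decide:

```julia
# Hypothetical addition near the end of src/DescriptorSystems.jl:
# SnoopPrecompile executes this representative workload at precompile time
# so that gnugap and friends land in the package's compile cache.
using SnoopPrecompile

@precompile_all_calls begin
    G = rss(2, 2, 4)
    gnugap(G, G)
    ghinfnorm(G)
end
```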
