
Mumps solver updates #39

Open
wants to merge 23 commits into main

Conversation

@mplough-kobold mplough-kobold commented Aug 17, 2023

This PR contains updates that I have made to the Mumps solver.

Mumps changes:

  • Shared Mumps factorizations are now correctly destroyed only when the last matrix copy is destroyed. Previously, a shared factorization was deleted when any matrix copy was destroyed. As a result, solves silently failed.
  • When A is complex, the Mumps solver's transpose now solves (A^T)x = b rather than (A^H)x = b (where T is the transpose and H is the conjugate transpose), and it does so without refactorization. The Mumps solver's transpose operator now behaves the same way as all the other solvers, and we ensure that Mumps(A.T) * x and Mumps(A).T * x commute.
  • Solvers now support the .conjugate() method, allowing the conjugate transpose to be taken via Solver(A).T.conjugate(). The Mumps solver now supports solving the conjugate transpose problem (A^H)x = b without refactorization via Mumps(A).T.conjugate(). Unit tests are included to ensure this case works (see the usage sketch after this list).
  • New Mumps error codes are now included. If an unknown code is returned, that’s handled too.
  • Mumps warnings are now correctly handled as flags.
  • Mumps unit tests are now uncommented and all pass, including the doubly commented singular matrix test.
  • It’s now possible to run Mumps with both general symmetric and symmetric, positive definite matrices. Note that the current version of Mumps (5.6.1) has no specific optimizations for Hermitian matrices.
  • It’s now possible to import the Mumps solver just like other solvers - that is: from pymatsolver import Mumps
  • Standardized supported attribute names so that the same SimPEG simulation solver_opts may be used with both the Pardiso and Mumps solvers.
  • Mumps runs the whole Pardiso test suite (except for threading and refactorization), ensuring that both solvers work the same way.
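A usage sketch of the new surface area (illustrative values, assuming a square complex matrix A; not taken verbatim from the test suite):

    import numpy as np
    import scipy.sparse as sp
    from pymatsolver import Mumps   # now importable like the other solvers

    A = sp.csr_matrix(np.array([[1 + 1j, 0 + 0j], [2 + 0j, 3 - 1j]]))
    b = np.array([1 + 0j, 2 - 1j])

    x = Mumps(A) * b                 # solve A x = b
    xt = Mumps(A).T * b              # solve (A^T) x = b, no refactorization
    xh = Mumps(A).T.conjugate() * b  # solve (A^H) x = b, no refactorization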

Maintenance:

  • pymatsolver unit tests are ported from obsolete nose syntax to pytest
  • Mumps and Pardiso unit tests are skipped when libraries are not available, and the test framework now warns you that the tests have been skipped

The Mumps test_singular test is available now.

The Mumps test_1to5_T test is still failing.
Also correctly handle warnings as flags.
There was an issue where, if you copied a matrix and the copy passed out
of scope, the original matrix could no longer be used.  This issue was
the reason that the test_1to5_T test was failing.
Otherwise you get
    ValueError: dimension mismatch
when the solver doesn't do the reshaping.
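A minimal sketch of the reshaping guard this refers to (the helper name _solve_multiple is illustrative, not the actual pymatsolver internal):

    import numpy as np

    def solve(self, rhs):
        # Track whether the caller passed a 1-D vector or an (n, 1) column
        # so the solution comes back with a matching shape.
        rhs = np.asarray(rhs)
        was_1d = rhs.ndim == 1
        x = self._solve_multiple(rhs.reshape(-1, 1) if was_1d else rhs)
        return x.ravel() if was_1d else x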
It's now possible to use the same solver_opts values in SimPEG for both
the Pardiso and MUMPS solvers.

We now run all the Pardiso test cases for Mumps as well.  Running these
test cases uncovered a Mumps issue where matrix transposes didn't
commute when operating on complex matrices.  If you asked the Mumps
solver:

    Ainv = Mumps(A)
    AinvT_1 = Mumps(A).T
    AinvT_2 = Mumps(A.T)

then the two inverse transposes weren't equal when A is a complex matrix.

For small matrices, you can look at the inverse transposes by solving
this trivial system:

    AinvT_1 * np.identity(A.shape[0]) != AinvT_2 * np.identity(A.shape[0])

This issue occurred because pymatsolver treated the complex case as the
conjugate transpose -- but we weren't asking for the conjugate
transpose; we were explicitly asking for just the transpose.

There was no issue with Mumps itself, just a few extra lines of Fortran
code in the Mumps interface.  Now that those lines are removed, the Mumps
solver and the Pardiso solver operate identically with these transposes.
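A hedged sketch of the regression test this implies (matrix values and the test name are illustrative, not the PR's actual test code):

    import numpy as np
    import scipy.sparse as sp
    from pymatsolver import Mumps

    def test_complex_transpose_commutes():
        # Non-Hermitian complex matrix so that A^T and A^H genuinely differ.
        A = sp.csr_matrix(np.array([[1 + 1j, 2 + 0j], [0 + 0j, 3 - 1j]]))
        I = np.identity(A.shape[0], dtype=complex)
        np.testing.assert_allclose(Mumps(A).T * I, Mumps(A.T) * I)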

codecov bot commented Aug 18, 2023

Codecov Report

Merging #39 (cde8692) into main (b54ee1d) will decrease coverage by 1.49%.
The diff coverage is 61.11%.

@@            Coverage Diff             @@
##             main      #39      +/-   ##
==========================================
- Coverage   89.26%   87.78%   -1.49%     
==========================================
  Files           5        5              
  Lines         298      311      +13     
==========================================
+ Hits          266      273       +7     
- Misses         32       38       +6     
| Files Changed           | Coverage        | Δ         |
|-------------------------|-----------------|-----------|
| pymatsolver/__init__.py | 84.61% <57.14%> | -5.39% ⬇️ |
| pymatsolver/solvers.py  | 87.16% <63.63%> | -2.20% ⬇️ |

    if hasattr(attr, "fset") and attr.fset is not None:
        properties_with_setters.append(a)
    kwargs = {attr: getattr(self, attr) for attr in properties_with_setters}

    newMS = self.__class__(
Member

I think the point before was that mumps could use a factorization of A to solve systems with A.T instead of refactoring the entire matrix, which is what will happen here. (This might not be the case though.)
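For reference, a minimal sketch of the factorization-sharing approach described in this comment (the shallow copy and _transposed flag are illustrative, not pymatsolver's actual internals; MUMPS itself can solve the transposed system from an existing factorization via its ICNTL(9) control):

    import copy

    class SolverBase:
        @property
        def T(self):
            # Share the existing factorization and flip a transpose flag,
            # rather than rebuilding via self.__class__(...) and refactoring.
            transposed = copy.copy(self)  # shallow copy: same factorization
            transposed._transposed = not getattr(self, "_transposed", False)
            return transposed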

Author

Nice catch - and yes, that's exactly the point here. Fixed in c848dd8.

I must have introduced this while debugging why the transpose was resulting in a Hermitian transpose for complex matrices (which is not how the pardiso or the rest of the solvers handle this case).

The following are now equivalent but AinvT1 doesn't need a refactorization:

AinvT1 = Mumps(A).T
AinvT2 = Mumps(A.T)

This behavior matches all the other solvers.

I contemplated adding a .H (Hermitian transpose) operator such that:

Mumps(A).H == Mumps(A.T.conjugate())

However, this presents a floating-point +0 / -0 problem that I'm not thrilled about. Given the matrix

[ 1 + 0j,  0 + 0j ]
[ 0 + 0j,  1 + 1j ]

the inverse is

[ 1 + 0j, 0   +   0j ]
[ 0 + 0j, 0.5 - 0.5j ]

Since Mumps doesn't take the Hermitian transpose itself, we'd need to emulate that behavior by conjugating the RHS, telling Mumps to solve the transpose problem, and then conjugating again.
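Concretely, the emulation rests on the identity (A^H)x = b ⟺ (A^T) conj(x) = conj(b). A sketch (solver_T stands for any object that solves the transpose system, e.g. Mumps(A).T):

    import numpy as np

    def solve_hermitian(solver_T, b):
        # (A^H) x = b  <=>  (A^T) conj(x) = conj(b),
        # so conjugate the RHS, solve the transpose system, conjugate back.
        return np.conjugate(solver_T * np.conjugate(b))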

The inverse of the matrix's conjugate transpose (i.e. Mumps(A.T.conjugate())) is

[ 1 + 0j, 0   +   0j ]
[ 0 + 0j, 0.5 + 0.5j ]

as computed by Mumps, scipy, or pardiso.

However, the inverse as computed with the emulation (i.e. Mumps(A).H) is

[ 1 - 0j, 0   -   0j ]
[ 0 - 0j, 0.5 + 0.5j ]

That's a little iffy for me. I'd rather have the solvers all behave the same way on simple cases.

Member

Honestly, I'm not concerned about the differences between +0 and -0 (a floating-point thing); I don't think we're ever doing anything that divides by 0, which would be the only place that makes a difference.

Author

@mplough-kobold mplough-kobold Aug 21, 2023

That has me convinced.

I went away from the .H property because the scipy.sparse API doesn't have that property and I wanted to present the same interface here. To take the Hermitian transpose of a scipy.sparse matrix A, you run A.T.conjugate(), so I added a conjugate operator here. I also added unit tests to ensure that Mumps(A).T.conjugate() gives the same results as Mumps(A.T.conjugate()).
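A hedged sketch of what such a unit test might look like (matrix, vector, and test name are illustrative, not the PR's actual test code):

    import numpy as np
    import scipy.sparse as sp
    from pymatsolver import Mumps

    def test_conjugate_transpose_matches_explicit():
        A = sp.csr_matrix(np.array([[1 + 0j, 0 + 0j], [2 + 1j, 1 + 1j]]))
        b = np.array([1 + 2j, 3 - 1j])
        x1 = Mumps(A).T.conjugate() * b   # emulated, without refactorization
        x2 = Mumps(A.T.conjugate()) * b   # explicit conjugate transpose
        np.testing.assert_allclose(x1, x2)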

@jcapriot
Member

What steps are you using to link this to mumps?
My ideal pathway for this is to create a compiled cython interface to the solver (similar to pydiso) that is completely separated from pymatsolver, then use that interface here.
The interface could likely be linked to mumps installed through conda-forge. https://github.com/conda-forge/mumps-feedstock

  • The feedstock on Windows only builds the sequential version (not sure if it uses OpenMP though).

There are a few python interfaces already out there,

  • pymumps package interface to the solver
    • only supports double precision
    • only OSX and Linux pre-built libraries on conda-forge (no Windows)
  • mumpy
    • unmaintained
    • not on conda-forge

@mplough-kobold
Author

What steps are you using to link this to mumps?

At the moment I'm compiling MUMPS from source using https://github.com/scivision/mumps, which lets me use the latest version of MUMPS (5.6.1) rather than the last version available on conda-forge (5.2.1). The conda-forge feedstock was last updated a year and 5 months ago.

I am not using the OpenMPI-parallel version of MUMPS, relying instead on OpenMP and BLAS parallelism.

Aside from a migration to setuptools due to the impending removal of distutils from Python 3.12, this PR only contains code changes and omits packaging changes. I'm currently experimenting with packaging in my matt-packaging branch. In that branch I'm currently using the deprecated numpy.distutils to use f2py to build this repository's MumpsInterface Fortran extension.

With MUMPS 5.6.1 built and installed, a pip install . from the matt-packaging branch works. Of course, pip install . totally bombs on that branch if MUMPS is not installed. That result is good enough for me but it's obviously not good enough for general release.

My ideal pathway for this is to create a compiled cython interface to the solver (similar to pydiso) that is completely separated from pymatsolver, then use that interface here. The interface could likely be linked to mumps installed through conda-forge.
...

There are a few python interfaces already out there,

  • pymumps package interface to the solver

    • only supports double precision
    • only OSX and Linux pre-built libraries on conda-forge (no Windows)
  • mumpy

    • unmaintained
    • not on conda-forge

I agree that a Cython interface is the way to go here. A Cython interface like pydiso's will be far easier to support than the current Fortran / f2py interface. Even without binary wheel distributions it won't be necessary to have a Fortran compiler installed. MUMPS's C interface means that writing that interface won't be too terribly hard, and we'll be able to move from distutils to a setuptools extension module. That'll make it very easy to support Python 3.12.

I'd like to avoid completely separating the MUMPS Cython interface from pymatsolver, though.

First, we can still package pymatsolver with a MUMPS Cython extension and have installation succeed even without MUMPS installed. Cython will make it very easy to package the extension with setuptools and the setuptools ext_module optional keyword "specifies that a build failure in the extension should not abort the build process, but simply not install the failing extension."
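A sketch of that packaging approach (the module name, sources, and link libraries are hypothetical placeholders, and the C source is assumed to be pre-generated from Cython):

    from setuptools import setup, Extension

    setup(
        name="pymatsolver",
        ext_modules=[
            Extension(
                "pymatsolver.mumps_interface",         # hypothetical module
                sources=["pymatsolver/mumps_interface.c"],
                libraries=["dmumps", "zmumps"],        # assumed link targets
                optional=True,  # a build failure skips this extension
                                # instead of aborting the install
            ),
        ],
    )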

Second, our current MUMPS interface needs are very small in scope. It would be fairly straightforward to write a small extension that supports only the functionality we need from MUMPS -- just init, a helper to convert to MUMPS format, factor, solve, and destroy.
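A sketch of the Python-facing surface that scope implies (the class and method names are hypothetical; the JOB values refer to the standard MUMPS driver convention):

    import scipy.sparse as sp

    class MumpsContext:
        """Hypothetical minimal wrapper: init, convert, factor, solve, destroy."""

        def __init__(self, A):
            # MUMPS wants 1-based COO coordinates; convert once up front,
            # then call the driver with JOB=-1 (initialize).
            coo = sp.coo_matrix(A)
            self.n = A.shape[0]
            self.irn, self.jcn, self.a = coo.row + 1, coo.col + 1, coo.data

        def factor(self):
            pass  # JOB=4: analysis + numerical factorization

        def solve(self, b):
            pass  # JOB=3: solve using the stored factorization

        def destroy(self):
            pass  # JOB=-2: release MUMPS internal storage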

Third, the existing Python MUMPS interfaces are dormant. The last commit to pymumps was on October 27, 2020 and its last release was on November 5, 2018. The last commit to mumpy was on November 14, 2017 and it has never had a release.

Ultimately I'd like to release a version of pymatsolver with good MUMPS support, and to do so with a minimum amount of work. Lack of MUMPS support makes it impossible to have a performant SimPEG on Apple's M1 hardware. I'd like to close that gap pragmatically -- and quickly if possible.

@jcapriot
Member

For the Apple silicon processors I was experimenting with Apple's sparse solvers in their Accelerate framework. It works well for most of our systems because they are SPD, but they do not support complex matrices. https://developer.apple.com/documentation/accelerate/sparse_solvers/

@mplough-kobold
Author

It's too bad about Apple's solvers. MUMPS does really well on Apple's M1 machines though, and it's also portable to ARM servers like AWS's Graviton processors.

@mplough-kobold
Author

Thanks for the discussion at last night's meeting! I'm going to move this out of draft status. As discussed we'll handle packaging separately.

@mplough-kobold mplough-kobold marked this pull request as ready for review August 24, 2023 14:26
@mplough-kobold mplough-kobold changed the title Mumps solver update (for discussion) Mumps solver updates Aug 24, 2023
@akhmerov

Hi all,

I'm taking over the development of mumpy, a mumps wrapper that you have found. My current goals are:

  • Develop a modern Python wrapper for mumps using cython + meson build (recently adopted by scipy stack).
  • Provide a somewhat complete high level interface to mumps functionality rather than only factor/solve.
  • Prioritize mumps from conda-forge and Linux, as well as distribution via conda-forge.
  • Consider OSX and Windows within the scope, MPI as a stretch goal.

Our work is for now over here: https://gitlab.kwant-project.org/kwant/python-mumps, and it is very much in progress.

Do you have any feedback? Is that project relevant to this package?

On a separate note, I made a proposal to scipy that is similar in spirit to the main point of this package: scipy/scipy#14485.

@akhmerov

akhmerov commented Feb 4, 2024

An update: python-mumps is now released on pypi and conda-forge on all platforms.

@mplough-kobold
Author

@akhmerov - the python-mumps project is definitely relevant to this package. It's not the focus of my attention right now so please don't interpret my delay in responding as a lack of interest in your work.

Why python-mumps is relevant

The SimPEG geophysics simulation package requires a performant sparse solver to complete simulations in a reasonable amount of time. This package provides an abstraction layer and an interface for sparse solvers so that SimPEG doesn't have to understand the specifics of every solver.

The best direct solvers at this point are Intel's MKL Pardiso and MUMPS. MKL Pardiso only runs on x86-64 processors while MUMPS can also run on ARM. Performance is fairly similar. The pymatsolver package brings in MKL Pardiso support via pydiso. It currently has a very small interface to MUMPS; this PR makes that interface good enough for our purposes.

However, this PR has really just served as a conversation starter; the approach is a bit of a dead end. In the future, we'd like pymatsolver to be able to use a separate interface package to talk to MUMPS in much the same manner that it uses pydiso to talk to MKL Pardiso.

Your attention here is exactly what we need - thanks for making us aware of your work.

Some immediate needs

Getting rid of f2py and moving to meson will be really good - moving away from the legacy build tooling will unlock the ability to use Python 3.12.

We'd also like the solver interface to be able to release the GIL when doing computations. SimPEG has some level of Dask support and we have users parallelizing using Dask right now. The inability to release the GIL means that workers time out, making simulation runs less reliable. Some attempts have been made to release the GIL in pydiso but I don't believe that the work is complete.
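By way of illustration, a sketch of why GIL release matters to threaded workers (a pure-Python harness with a stand-in system; whether the solves actually overlap depends on the underlying extension releasing the GIL):

    from concurrent.futures import ThreadPoolExecutor
    import numpy as np
    import scipy.sparse as sp
    from pymatsolver import Mumps

    A = sp.diags([1.0, 2.0, 3.0])   # stand-in; real systems are far larger
    Ainv = Mumps(A)
    rhs = [np.random.rand(3) for _ in range(8)]

    # If the solve releases the GIL, these calls can run concurrently;
    # if not, they serialize, and long solves can starve worker heartbeats.
    with ThreadPoolExecutor(max_workers=4) as pool:
        solutions = list(pool.map(lambda b: Ainv * b, rhs))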

python-mumps gaps

Looking in PyPI, the python-mumps package as of 0.0.1.post1 only offers binaries for linux / x86-64. Is there a plan to offer binaries for linux / ARM? There's not really a need at this point for SimPEG to use MUMPS on x86-64 - if someone is running on that platform then the existing interface to MKL Pardiso will work fine.

@akhmerov

Thanks for the explanation.

Looking in PyPI, the python-mumps package as of 0.0.1.post1 only offers binaries for linux / x86-64. Is there a plan to offer binaries for linux / ARM?

I prioritize conda-forge over PyPI because their build machinery is more standardized. As far as I understand, arm should already be available there.

@jcapriot
Member

Currently, python-mumps is only available for x86-64 systems on conda-forge
https://anaconda.org/conda-forge/python-mumps
If you want, I can help with getting it set up for osx-arm builds as well.

@akhmerov

Wow, that would be awesome!
