Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sebastianwolf/error checking #45

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

sebastianwolf
Copy link
Contributor

Dave and me (Dave was the main brain :) implemented error-checking in FastScape. See mail for additional information.

Error-checking and raising is done using macros defined in the pre-processor-file Error.fpp.
Some compilation information:
We used gnu-compilers and an extra Makefile to compile and run the code. You can have the makefile if it helps you, but it is not included in the pull request. You can find it if you go back some commits in the branch that I PRed.
Gnu needed -cpp and -ffree-line-length-none for the code to compile. The -cpp option enables pre-processing, the -ffree-line-length-none option is necessary to allow for long lines. The macros are considered as one-liner.
The latter can make problems with legacy pgi-compilers. Pre-processor option for pgi is -Mpreprocess, but if the pgi compiler is older than somewhere 2018, there is a maximum linelength of 132 signs. Newer pgi-compilers apparently have 2k allowed signs per line (as far as I could find on the world wide web). If one wants to use these old compilers, some write statements in the Error.fpp file most be shortened. This is not an issue, I think. Not many people will use old pgi. We have it on one of our servers, so I tested and found it there.
I did not check with the intel compilers.

Error codes are defined in a new file src/FastScapeErrorCodes.f90 this needs to be added to the compilation files.

How does it work?
The main change, and I hope you will like it, is that every api function got one more integer argument ‘ierr’. However, only those ctx-functions that had a stop call got the error argument as well, so that the heart of the code only has minor additions. We tried to make minimal essential and non-invasive changes.
Example:
FastScape_Init(ierr) —> got new arg, same as all other api-funcs
calls Init() —> has no arg as error can not be raised in this function.

FastScape_Execute_Step(ierr) —> got new arg, same as all other api-funcs
...
Diffusion (ierr);FSCAPE_CHKERR(ierr) —> got new arg, as solver function tridag can raise error

Once an error is raised, the subroutine is returned early and the error propagated upwards. The macros and error-raising are really powerful, because errors trace back the file, and the line where the error was raised.

List of functions that got new ierr-arg:

  • All api functions
  • in FlowRouting.f90: FlowRouting, FlowRoutingSingleFlowDirection, find_mult_rec
  • in LocalMinima.f90: LocalMinima, mst_kruskal, loc_min_3_indexx
  • in Marine.f90: Marine, SiltSandCouplingDiffusion

I was a bit concerned that you might not like that all api functions got another argument, because this might make existing examples and user-codes non-useable until all functions got an additional argument. Dave convinced me that the way we did it is the best way and helps the project the most. I hope you agree, but of course I’m open for change.

SebWolf and others added 17 commits March 30, 2021 16:26
* There are two debugging files in fastscapelib-fortran/src. The codes are in c and fortran and there is an accompanying Makefile.
* code compiles with gnu
* Error-checking changes a context-variable, instead of a variable that is local to each function.
* Fastscape_run.c contains an error (setup is called before set_nx_ny) to showcase the error-handling
…h indexing. Error 1 gives none. Either remove none, or put index +1.
* Make exception output more compact and readable
* Added addition exception types
* Added ierr to Diffusion and checked that it works.
* Note that if you define ierr=0 inside tridag, the code produces a
segmentation fault. It do not know, why, but it is not necessary
anyways.
* Added error-checking in debug-c-code
* Added errror checking with error.fpp for debug-fortran-code
* Moved setup_not_run errors to api from context.
* Now all stop calls in the code are removed and replaced by
error-checking
end if
FSCAPE_CHKERR(ierr) ! Call FSCAPE_CHKERR() so that all possible exceptions above will be displayed

call SetUp(ierr)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an error. SetUp should not have an argument. Should be SetUp(). Sorry for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed update

@benbovy
Copy link
Member

benbovy commented Apr 23, 2021

Thanks a lot @sebastianwolf and @hpc4geo for working on this!

In general, I like the approach. I'll be able to reuse ierr in the Python bindings to raise properly (and clean-up allocated arrays) instead of having a Python interpreter crash, which is nice!

I let @jeanbraun comment on the change of the signature of the API functions, since I'm not using them much. I don't have strong opinions on this, however as a maintainer I'm still a bit concerned by those breaking changes. Do you think that we could make those ierr arguments optional to ensure a smooth transition?

We're already using -cpp in CMake's build, and I don't think either that incompatibility with old pgi compilers would be an issue for us.

@sebastianwolf could you merge the master branch here or rebase this branch on top of master, please? I have updated the CI configuration in #46 so it will be easier to see if the changes made here break something or not.

@benbovy
Copy link
Member

benbovy commented Apr 26, 2021

Would you mind adding ${FASTSCAPELIB_SRC_DIR}/FastscapeErrorCodes.f90 just below this line

${FASTSCAPELIB_SRC_DIR}/FastScape_ctx.f90
to fix CI builds ?

@sebastianwolf
Copy link
Contributor Author

I added ErrorCodes to the compile list and pushed the option to allow for macros longer than 132 chars. Still there is a compilation error. Seems like the std=f2008 option is relatively strict on syntax. Removing it might solve the problem.

@benbovy
Copy link
Member

benbovy commented Apr 26, 2021

Ah I see, I need to merge #47 and then we need to merge (rebase onto) master here, as the line length flag that you added has no effect (set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} ${dialect}") is commented below).

@benbovy
Copy link
Member

benbovy commented Apr 26, 2021

Mmm so there's two CI errors:

  • Python builds

We need to set -ffree-line-length-none for Python builds too. However, it's set with other flags like -ffree-form and I'm not sure how to set those compile flags in a way that doesn't mess up with some files automatically generated by F2PY (for which it's hard to set flags, as the whole build is managed by scikit-build -- or at least I don't know how).

So we would need a special case for this, or maybe a cleaner solution would be to add continuation lines in the code where this is needed instead of setting -ffree-line-length-none.

  • Fortran example segmentation fault

Update every call to FastScape API functions with an ierr argument solves the issue, but it's quite a breaking change. I'm not sure how many users would be affected and to which extent, which is annoying.

I've tried something like this in FastScape_api subroutines:

  integer, optional, intent(out):: ierr

  if (present(ierr)) ierr = 0

But this doesn't seem to work (ierr is always present even when calling the subroutine from, Fan.f90 or other external files without including the argument). I wonder if that's because FastScape_api subroutines aren't defined in a module so the interface is ignored / not accessible from the outside.

I guess it would be nice and cleaner to put all api functions in a module? It would still be a breaking change (users would need to insert a use statement) but in my opinion this would be much better than having to update all subroutine api calls.

I'm far from being an expert in Fortran, though, so I might be missing other possible solutions.

@benbovy
Copy link
Member

benbovy commented Apr 26, 2021

Quick and dirty fix for Python builds that should work:

if(CMAKE_Fortran_COMPILER_ID MATCHES "Flang|GNU")
    ...
    set(pythonflags "-ffree-line-length-none")
endif()

...

# let F2PY (scikit-build) configure dialect flags for F77 and F90 source
# (some source files are generated by F2PY)
if(SKBUILD)
  set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} ${pythonflags}")
else()
  set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} ${dialect} ${ioflags}")
endif()

@benbovy
Copy link
Member

benbovy commented Apr 26, 2021

Yes I like a strong recommendation like that :-) !

As a quick test, I've wrapped all subroutines in FastScape_api.f90 into a FastScapeAPI module that I've imported in the example programs. Looks like it works well with present(). As a bonus, we get much more meaningful errors when API functions are called without ierr in external programs (i.e., Error: Missing actual argument for argument 'ierr' at at compilation instead of a segmentation fault at runtime). So I think we should really use a module for the API routines.

@sebastianwolf, let me know if you want me to do something. I can submit a PR or commit directly to your branch if that's easier.

@sebastianwolf
Copy link
Contributor Author

Very good. The 'optional' and specific error message sounds like a good idea.

@benbovy you can just directly work on my branch. Not sure you need write permissions on my fork, but I'll give them to you anyways.

@benbovy
Copy link
Member

benbovy commented Apr 26, 2021

In the last commits I added two macros that handle the now optional ierr argument in all public API functions: one to initialize the default value and output a depreciation warning, and the other for calling either return or stop depending on whether ierr has been given (another depreciation notification is shown just before calling stop).

The new requirement in external codes using the fastscape's API is to add use FastScapeAPI (and maybe add -I/path/to/fastscapeapi.mod to find the module, but that shouldn't be needed when building and installing the library using CMake).

Everything looks good in CI, except for the python bindings build on windows (Flang doesn't seem to support the same flags than gfortran for relaxing line length constraints). I haven't run all examples, though.

benbovy added 3 commits April 27, 2021 12:22
see: flang-compiler/flang#342

Flang apparently does not support well long line lengths, and it is
used here (v5.0) for shipping Python bindings with conda on Windows.

I haven't found how to insert Fortran continuation lines in macros
without any compile error (is that possible?), so I tried here to split
output messages into several print statements to hopefully avoid too
long code lines.
It's better to do something more idiomatic on the Python side with the
returned error codes (e.g., using `raise`).

Also, this should hopefully fix too long code lines for Flang/Windows.
@benbovy
Copy link
Member

benbovy commented Apr 27, 2021

All CI is green now, and everything looks good to me. @sebastianwolf @hpc4geo are you happy with the last changes? Do not hesitate to drop comments if you feel that something could be improved.

Is there anything you'd like to add to the TODO list below?

  • some more "manual" testing (inspect example results + run example notebook)
  • update all examples (insert ierr in api functions)
  • update documentation (release notes + mention that all api functions are now defined in the fastscapeapi module)

I suggest that we also wait for Jean's feedback before merging this (he's busy with EGU this week, maybe you are participating too?).

@sebastianwolf
Copy link
Contributor Author

sebastianwolf commented Apr 27, 2021

Thanks a lot, @benbovy . I pulled your changes and did some tests as well.

I have my custom makefile and one f90 and one c code that both run a simple fastscape setup for 500 timesteps. These codes simulate how Fastscape would be called from another program, like a thermo-mechanical code. I pushed a branch to my fork called sebastianwolf/MakeNRunfile_for_ErrorChecking. You can see and get these files there.

Some observations/problems:

  • wrapping the api funcs into a module requires redefining all function names in the c code. For instance fastscape_init_() became __fastscapeapi_MOD_fastscape_init(). I found the nameing-convention using the commandobjdump -t FastScape_api.o . I read that the function names might change with the compiler used. That would be a bit inconvenient. Anyways, is wrapping the subroutines in api into a module necessary and beneficial? If not, I would not do it for the sake of having simple names when calling from c.
  • Running the c-code produces correct results, at least the View() and Debug() functions show proper values. However, the output is not readable/nonsense in paraview. This was initially the same for the fortran code, but using the gfortran flag -fconvert=big-endian solved the problem (as you wrote). However, using this flag for gfortran does not solve the c-code. Hence, to me it looks very likely that calling the vtk function from c does not obey the big-endian conversion, and the output is non-sense. To solve this I propose to move the convert calls back to the functions so that we can remove the compiler flag in gfortran. I tested this and added convert='big_endian' to VTK.f90, lines 64 and 120, and to lines 743, 722, 734 in Fastscape_ctx and then it works from c as well. Should I push the changes? Doing so, we also do not need the compiler flag -fconvert=big-endian

src/Error.fpp Outdated
if (present(ierr)) then; \
ierr = ierr_; \
else; \
print '(A)', "*** Depreciation warning *** "; \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbovy This should read "Deprecation" not "Depreciation"

src/Error.fpp Outdated
ierr = ierr_; \
else; \
print '(A)', "*** Depreciation warning *** "; \
print '(A,A,A)', "Calling ", TRIM((fname)), " without 'ierr' argument (integer) is depreciated! Please update your code!"; \
Copy link

@hpc4geo hpc4geo Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbovy This should read "deprecated" not "depreciated"

@benbovy
Copy link
Member

benbovy commented Apr 27, 2021

Calling Fortran from C

is wrapping the subroutines in api into a module necessary and beneficial?

Yes I think it's really beneficial if we want to expose the subroutines interfaces to 3rd-party codes (I think in general we should). Without those interfaces, it would not be possible to make ierr optional here.

That said, you can control the function name (among other things) exposed to C by using Fortran ISO C Binding, like explained here. You could also set a custom name with bind(C, name='foo').

I'd be open to add C interoperability here, although:

Otherwise, it should still be possible for you to write a small Fortran wrapper file with C-interoperable versions of the subset of API functions that you need to call from your C code.

VTK issues

I see that -fconvert has an effect only when used in the main program, so it probably has no effect when calling the VTK Fortran routines from a C program.

Could you try setting the GFORTRAN_CONVERT_UNIT environment variable as documented here to see if it has an effect?

The problem with setting convert='big_endian' in open() is that it's not part of any Fortran standard (although I think it's supported by all common compilers), so if we use it we can't set -std=f2008 without having a compilation error.

Another possibility would be to refactor the VTK routines and switch from legacy (binary) VTK format to the XML (binary) VTK format (the latter allows setting the endianess in the file header). We could use https://github.com/szaghi/VTKFortran, but that's probably overkill to depend on that library.

@benbovy
Copy link
Member

benbovy commented Apr 28, 2021

One other impact of using a module for the API routines is that those are now moved in the fastscapelib_fortran.fastscapeapi namespace for the Python bindings. I think it's ok to have this breaking change as I recommend using https://github.com/fastscape-lem/fastscape, which is built on top of this library. For those who are still using the low-level bindings here it's easy to create an alias to avoid too much refactoring of existing code:

import fastscapelib_fortran as fs

becomes

import fastscapelib_fortran

fs = fastscapelib_fortran.fastscapeapi

@dmay23
Copy link

dmay23 commented Apr 28, 2021 via email

@benbovy
Copy link
Member

benbovy commented Apr 28, 2021

Wouldn't this be better
fs = fastscapelib_fortran.api
?

Yes it would be much nicer. We could easily create an alias in src_python/fastscapelib_fortran/__init__.py actually. I'm not much concerned by how elegant is the API of the Python bindings here, though. I reuse it internally in higher-level Python packages and I don't recommend using it directly.

Can I ask again what was the rational behind putting all the API functions in a module?

I think it follows much more the modern Fortran idioms and therefore has a lot of benefits when reused in other Fortran codes. Without the module, from my understanding there's no way for the compiler to be aware of the routine interfaces, hence the missing subroutine arg check is just not possible (optional is not recognized) and the compiler cannot throw meaningful errors (instead we get seg faults at runtime).

But yeah, there are certainly trade-offs when we want to reuse the code in other languages.

What we could do is define the module separately (with only an interface block), so that we keep both advantages, i.e., using the module in Fortran codes and just using the plain subroutines in C. We would have to specify the interface of each subroutine twice, which is not ideal, but I'm not sure how we can avoid it if we want to make things good and flexible enough in Fortran, C and Python.

Alternatively, it would perhaps make more sense to create a specific module for C-interoperability, rather than treat reusability in Fortran as a special case. We could have a fastscape_api_c.f90 file with something like:

module fastscapeapi_c

  use iso_c_binding

  interface

    subroutine FastScape_Init(ierr) bind(C, name='c_fastscape_init')
      use iso_c_binding, only: c_int
      implicit none
      integer(c_int), intent(out) :: ierr
    end subroutine FastScape_Init

    ...

  end interface

end module fastscapeapi_c

@benbovy
Copy link
Member

benbovy commented Apr 28, 2021

Hmm so the fastscape_api_c.f90 example code in my previous comment doesn't seem right (it doesn't create any symbol in the compiled library).

Other possible options:

    1. create a module for C interoperability that includes the implementation (just call the corresponding function), e.g.,
module fastscapeapi_c

  use fastscapeapi
  use iso_c_binding

  contains

    subroutine c_fastscape_init(ierr) bind(C)
      use iso_c_binding, only: c_int
      implicit none

      integer(c_int), intent(out) :: ierr

      call FastScape_Init(ierr)

    end subroutine c_fastscape_init

end module fastscapeapi_c

Pros: flexible and robust in all languages
Cons: that's pretty verbose...

    1. use the Fortran 2018 standard, which allows using bind(C) for subroutines with optional arguments.

Pros: we don't need to create a separate module for C-interoperable routines.
Cons: is it acceptable to use Fortran 2018? (from my point of view: yes).

    1. Do not put API routines in a module, and instead create a module that (only) provides the interface (that can be used in other Fortran codes).

Pros: not much breaking changes.
Cons: not very clean. The interface has to be written twice for the most common usage. Also, for relying on iso_c_binding for C-interoperability feels less hacky to me than relying on mangled routine names (among other possible issues), even if those are more predictable for external routines.

@sebastianwolf
Copy link
Contributor Author

Thanks for the explanations, @benbovy. Just a quick update on the VTK problem:

Could you try setting the GFORTRAN_CONVERT_UNIT environment variable as documented here to see if it has an effect?

I tested this and it works. Adding export GFORTRAN_CONVERT_UNIT='big_endian' to the bash environment fixes the issue with VTK when Fastscape is called from c. FYI, using this environment variable makes the compiler option -fconvert also unnecessary for Fortran.

@benbovy
Copy link
Member

benbovy commented Apr 28, 2021

Adding export GFORTRAN_CONVERT_UNIT='big_endian' to the bash environment fixes the issue with VTK when Fastscape is called from c.

Good to know, thanks for testing!

@sebastianwolf
Copy link
Contributor Author

I just wanted to up this thread here again so that we can converge towards a solution :).
I'll start with a thought I had lately:
What I really like about the Fastscape interface is it's simplicity. It's really clean and if you know some basic fortran it is simple to understand the code and get going with it. It's also simple to add functionality to the code - one does not need a lot of time to understand the complex layers surrounding the heart of the code.
Each time we add layers of complexity, we make the interface less readable and more challenging to add to for people who are just starting. I think this simplicity is really great, valuable, and important to keep in mind. Anyways, enough rambling.

If I understand correctly, the largest advantage of putting the api into a module is to make the ierr-arg optional. An optional arg can provide the user with a message to update the code and does not result in a seg fault.
This is useful and helps Benoit a lot.

Wraping the api into a module creates the disadvantage of having mangled routine names that depend on the compiler used.

To avoid mangled names, Benoit proposed three options.
i) is verbose but simple
ii) is less verbose but needs a modern standard
iii) seems messy, and I do not fully understand what you mean by

create a module that (only) provides the interface

If the api goes into a module, which there seems to be good reasons for, then using the iso_c_binding option seems reasonable.
I like the simplest solution that does not depend on having the newest compilers and modern standards. I say that because some of our still fine and used local hpc might not have been updated since 2018. I have no experience with fortran standard 2018, but it would be easier to not have the restriction of a modern standard if it fails with older compilers.

So therefore, if a module for the api is used, the simplest solution might be number i).

However, any solution is fine with me. Any other thoughts?!

@benbovy
Copy link
Member

benbovy commented May 27, 2021

Agreed @sebastianwolf let's move this forward!

Choosing one of the options above is not straightforward. We'll need to make tradeoffs since:

  • We want to keep things simple for users
  • We don't want to break user's code in new releases
  • We want to keep things easily maintainable
  • We want useful features like the error checking here
  • We want to reuse this library in 3rd party codes written in Fortran, C and Python
  • We want compatibility with older compilers

That's a lot of things to consider! :-)

Apart from using bind(C) for subroutines with optional arguments, we don't use any other modern feature from the Fortran 2018 standard, so it would be worth checking (e.g., with a simple program) if this specific feature is supported by the compilers available on your local hpc. I'll also need to check if Flang supports it too (I use it to build the conda package on windows for the Python bindings). If it is supported, then I'd lean toward option ii), which is the best one regarding maintenance IMO because we don't need to write the routine interfaces twice.

Otherwise, I think that option i) may be the cleanest one for exposing the API routines to C.

Any other thoughts are welcome to help us choosing the best option

cc @Djneu who is coupling Fastscape with Aspect I think.

@sebastianwolf
Copy link
Contributor Author

Very good.

we don't use any other modern feature from the Fortran 2018 standard, so it would be worth checking (e.g., with a simple program) if this specific feature is supported by the compilers available on your local hpc.

I'll test this and report back :).

…ng from

c are the same as the fortran functions, only with lowercase letters.
Successfully tested api with legacy pgi and gnu compilers.
@sebastianwolf
Copy link
Contributor Author

Hei guys, I implemented and tested option ii (one interface, optional ierr arg, iso_c binding) and it worked fine also with legacy pgi compilers. The changes are pushed to this branch.

I chose the name of each function in c to be the same as the fortran-function, but in lower-case only. I figured that this is a good option.

I don't know why two of the tests here failed.
test-python / 3.8 (windows-latest) seems related to problems with iso_c_binding,
continuous-integration/appveyor/... problem is related to f2018 standard.

Could you have a look at these, @benbovy ?
Cheers and thanks

@hpc4geo
Copy link

hpc4geo commented Jun 1, 2021

I'm not sure if this might be useful for the project, but I have written a (dirty) Python parser which will automatically generate the C function prototypes (and a C header file) directly from the contents of FastScape_api.f90.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants