Possible to build without internet access aka pre-built dependencies? #128

Open
mathomp4 opened this issue May 2, 2023 · 9 comments · May be fixed by #129

Comments

@mathomp4

mathomp4 commented May 2, 2023

A user I help support with the GEOS model has asked for neural-fortran to be used. The main issue I'm currently having is how I can build it in my "usual" way.

For example, currently I use ESMA-Baselibs to install the "base libraries" used by GEOS. Now, Baselibs is "nice" for clusters in that I can clone it on a node with internet access (and download a few things not on GitHub and not submodule-able) and then do all the other tasks on a compute node where things like make -j10 are allowed (head nodes maaaaybe they'd allow make -j2).

The issue I see here is that the CMake install method looks like it always uses FetchContent, which of course would fail at CMake time on a compute node, since compute nodes can't see the internet and so couldn't get the code.

So, I wondered, do you have ideas on how to handle this? My first thought was that I could add functional-fortran, h5fortran, and json-fortran as submodules, but I think I'd need to do something like this in, say, functional.cmake:

find_package(functional 0.6.1 QUIET)
if (NOT functional_FOUND)

  FetchContent_Declare(functional
    GIT_REPOSITORY https://github.com/wavebitscientific/functional-fortran
    GIT_TAG 0.6.1
    GIT_SHALLOW true
  )

  FetchContent_Populate(functional)
...

Does that look about right? I'm going to try this and see in some test builds but I even wondered if you'd be willing to support such a change.
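
For reference, the find-then-fetch idea above could be completed roughly as follows. This is only a sketch: it assumes functional-fortran installs a CMake package configuration that find_package() can locate, which this thread does not confirm, and it swaps FetchContent_Populate for FetchContent_MakeAvailable, which also adds the fetched project's CMakeLists.txt to the build.

```cmake
include(FetchContent)

# Prefer a copy that is already installed (for example by Baselibs or spack).
# Assumption: the installed package provides a config file for find_package().
find_package(functional 0.6.1 QUIET)

if(functional_FOUND)
  message(STATUS "Using pre-installed functional-fortran")
else()
  # Fall back to downloading; this branch needs network access.
  message(STATUS "functional-fortran not found, fetching from GitHub")
  FetchContent_Declare(functional
    GIT_REPOSITORY https://github.com/wavebitscientific/functional-fortran
    GIT_TAG 0.6.1
    GIT_SHALLOW true
  )
  FetchContent_MakeAvailable(functional)
endif()
```

If the dependencies are vendored as submodules instead, FetchContent also honors a FETCHCONTENT_SOURCE_DIR_FUNCTIONAL cache variable pointing at a local checkout, which skips the download without editing this file.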

@milancurcic
Member

Good question, I don't know. I'm curious to hear if your idea works. If it does, let's please add it to the repo.

@scivision let us know if you have any ideas about how best to handle this situation with CMake.

@mathomp4
Author

mathomp4 commented May 2, 2023

Note: Something like this might be needed anyway. We have hopes to move from Baselibs to spack in the future. In that case, you'd almost have to separate out the builds (I think) to spack-ize this.

@mathomp4
Author

mathomp4 commented May 3, 2023

Good news, I'm close. Bad news, I'm afraid I'm doing something..."bad" with CMake.

So, the issue seems to be some weird interaction with the way I build HDF5 and I guess how find_package() finds it. When I build h5fortran (and, thus, neural-fortran) I have to do some CMake ugliness:

cmake -DCMAKE_INSTALL_PREFIX=$(prefix) -DHDF5_ROOT="$(prefix);$(prefix)/include/hdf5;$(prefix)/include/szlib" -DSERIAL=1 ..

This is because we don't build HDF5 with CMake, but rather with autotools, and we install it oddly (for legacy reasons), so we have to sort of tell CMake where to find things.
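
The same settings could also live in an initial-cache file loaded with cmake -C, which avoids quoting the semicolon-separated list on the command line. A minimal sketch, in which the file name, the BASELIBS_PREFIX variable, and the path are all placeholders rather than anything Baselibs defines:

```cmake
# baselibs-cache.cmake (hypothetical name); use as:
#   cmake -C baselibs-cache.cmake ..

# Placeholder path: point this at the real Baselibs install prefix.
set(BASELIBS_PREFIX "/path/to/baselibs/install" CACHE PATH "Baselibs install prefix")

set(CMAKE_INSTALL_PREFIX "${BASELIBS_PREFIX}" CACHE PATH "Install prefix")

# HDF5_ROOT may hold several prefixes; find_package(HDF5) searches each one.
set(HDF5_ROOT
    "${BASELIBS_PREFIX};${BASELIBS_PREFIX}/include/hdf5;${BASELIBS_PREFIX}/include/szlib"
    CACHE STRING "Search prefixes for HDF5")

# Mirrors -DSERIAL=1 from the command line above.
set(SERIAL "1" CACHE STRING "Serial build switch")
```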

Ugly, but when I put some prints in the cmake/h5fortran.cmake file:

find_package(HDF5 COMPONENTS Fortran REQUIRED)
message(STATUS "HDF5_FOUND: ${HDF5_FOUND}")
message(STATUS "HDF5_LIBRARIES: ${HDF5_LIBRARIES}")

I see:

-- HDF5_FOUND: TRUE
-- HDF5_LIBRARIES: /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.4.0/gfortran/Darwin/lib/libhdf5_fortran.a;/Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.4.0/gfortran/Darwin/lib/libhdf5.a;/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk/usr/lib/libm.tbd;/Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.4.0/gfortran/Darwin/lib/libsz.a;/Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.4.0/gfortran/Darwin/lib/libz.dylib;/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk/usr/lib/libdl.tbd;/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk/usr/lib/libm.tbd

Now the first time I tried make it failed at finding the MPI libraries, so I added a block:

if (HDF5_IS_PARALLEL)
  message(STATUS "HDF5 is parallel")
  target_link_libraries(HDF5::HDF5 INTERFACE MPI::MPI_Fortran)
endif()

and that got me past the first issue. Huzzah. But now when I do make:

[ 63%] Linking Fortran executable ../bin/test_flatten_layer
Undefined symbols for architecture arm64:
  "_H5LT_set_attribute_numerical", referenced from:
      _h5ltset_attribute_c in libhdf5hl_fortran.a(H5LTfc.o)
  "_H5LTfind_dataset", referenced from:
      _h5ltfind_dataset_c in libhdf5hl_fortran.a(H5LTfc.o)
...

To me that looks like it can't find the HL libraries. So, I made some modifications:

find_package(HDF5 COMPONENTS Fortran REQUIRED
  OPTIONAL_COMPONENTS C HL)
if (HDF5_HL_FOUND)
  message(STATUS "HDF5 HL is available")
  target_link_libraries(HDF5::HDF5 INTERFACE hdf5::hdf5_hl hdf5::hdf5_hl_fortran)
endif()
if (HDF5_IS_PARALLEL)
  message(STATUS "HDF5 is parallel")
  target_link_libraries(HDF5::HDF5 INTERFACE MPI::MPI_Fortran)
endif()

And that seemed to work. For reasons I don't understand, I needed to ask for both C and HL because it wouldn't link without them.

But this seems...wrong. I'd have thought HDF5::HDF5 would have everything hdf5-ish. MPI, okay, but the HL libraries seem fine?

Is this because we build HDF5 as static maybe?
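
One possible reading, not confirmed in this thread: CMake's bundled FindHDF5 module documents HDF5::HDF5 as carrying the libraries in HDF5_LIBRARIES, and the HDF5_LIBRARIES printed earlier, with only the Fortran component requested, contains no hdf5_hl libraries. Under that assumption, the sketch below requests HL explicitly and collects everything on a wrapper target owned by the project instead of modifying the imported target. The name hdf5_bundle is hypothetical, and the fragment assumes it sits inside a project that has already enabled its languages.

```cmake
# Ask for the high-level (HL) component explicitly.
find_package(HDF5 REQUIRED COMPONENTS Fortran HL)

# hdf5_bundle is a hypothetical wrapper target owned by this project.
add_library(hdf5_bundle INTERFACE)

# HL libraries go first: with static archives, some linkers resolve symbols
# in command-line order, and the HL libraries depend on the core ones.
foreach(hl_target IN ITEMS hdf5::hdf5_hl_fortran hdf5::hdf5_hl)
  if(TARGET ${hl_target})
    target_link_libraries(hdf5_bundle INTERFACE ${hl_target})
  endif()
endforeach()

target_link_libraries(hdf5_bundle INTERFACE HDF5::HDF5)

# A parallel HDF5 build also needs MPI on the link line.
if(HDF5_IS_PARALLEL)
  find_package(MPI REQUIRED COMPONENTS Fortran)
  target_link_libraries(hdf5_bundle INTERFACE MPI::MPI_Fortran)
endif()
```

Consumers would then link against hdf5_bundle rather than HDF5::HDF5 directly.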

mathomp4 linked a pull request May 3, 2023 that will close this issue
@milancurcic
Member

Great! I don't understand it either. My rule of thumb has been that if the linker asks for dependencies and I don't understand why, just make it happy. Yes, the PR will be very welcome.

@mathomp4
Author

mathomp4 commented May 3, 2023

Okay. I've made a draft PR (see #129) so you can see my changes.

Good news, works for me (at least the neural-fortran ctest is happy). Bad news, I seem to have broken the FetchContent path. When I try that, it dies in the h5fortran CMake:

-- h5fortran not found, fetching from GitHub
-- h5fortran 4.6.3  CMake 3.26.3
-- checking that C and Fortran compilers can link
-- checking that C and Fortran compilers can link - OK
-- Looking for H5_HAVE_FILTER_SZIP
-- Looking for H5_HAVE_FILTER_SZIP - found
-- Looking for H5_HAVE_FILTER_DEFLATE
-- Looking for H5_HAVE_FILTER_DEFLATE - found
-- Looking for H5_HAVE_PARALLEL
-- Looking for H5_HAVE_PARALLEL - found
-- Found MPI_C: /Users/mathomp4/installed/Compiler/gcc-gfortran-12.2.0/openmpi/4.1.5/lib/libmpi.dylib (found version "3.1")
-- Found MPI_Fortran: /Users/mathomp4/installed/Compiler/gcc-gfortran-12.2.0/openmpi/4.1.5/lib/libmpi_usempif08.dylib (found version "3.1")
-- Found MPI: TRUE (found version "3.1") found components: C Fortran
-- Found ZLIB: /Users/mathomp4/installed/MPI/gcc-gfortran-12.2.0/openmpi-4.1.5/Baselibs/7.12.0/Darwin/lib/libz.dylib (found version "1.2.11")
-- Found SZIP: /Users/mathomp4/installed/MPI/gcc-gfortran-12.2.0/openmpi-4.1.5/Baselibs/7.12.0/Darwin/lib/libsz.a
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Looking for H5Pset_fapl_mpio
-- Looking for H5Pset_fapl_mpio - found
-- Performing Test HDF5_C_links
-- Performing Test HDF5_C_links - Success
-- Performing Test HDF5_Fortran_links
-- Performing Test HDF5_Fortran_links - Success
-- Found HDF5: /Users/mathomp4/installed/MPI/gcc-gfortran-12.2.0/openmpi-4.1.5/Baselibs/7.12.0/Darwin/lib/libhdf5_hl.a;/Users/mathomp4/installed/MPI/gcc-gfortran-12.2.0/openmpi-4.1.5/Baselibs/7.12.0/Darwin/lib/libhdf5.a (found suitable version "1.10.9", minimum required is "1.8.7") found components: Fortran HL
-- Found MPI: TRUE (found version "3.1") found components: Fortran
CMake Error at build/_deps/h5fortran-src/CMakeLists.txt:43 (target_link_libraries):
  Cannot specify link libraries for target "HDF5::HDF5" which is not built by
  this project.


-- Configuring incomplete, errors occurred!
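
When chasing an error like this, it can help to print how the target looks from each directory involved. The helper below is a hypothetical debugging aid, not part of neural-fortran or h5fortran; it only uses the standard IMPORTED and IMPORTED_GLOBAL target properties.

```cmake
# report_target is a hypothetical helper for debugging target visibility.
function(report_target name)
  if(NOT TARGET ${name})
    message(STATUS "${name}: not visible from ${CMAKE_CURRENT_SOURCE_DIR}")
    return()
  endif()
  # IMPORTED tells whether the target came from find_package() or similar;
  # IMPORTED_GLOBAL tells whether it was promoted to global visibility.
  get_target_property(is_imported ${name} IMPORTED)
  get_target_property(is_global ${name} IMPORTED_GLOBAL)
  message(STATUS
    "${name}: IMPORTED=${is_imported} IMPORTED_GLOBAL=${is_global} "
    "(checked from ${CMAKE_CURRENT_SOURCE_DIR})")
endfunction()

report_target(HDF5::HDF5)
```

Calling it once in the top-level project and once just before the failing target_link_libraries() line would show whether the target was created in a different directory from the one trying to modify it.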

@mathomp4
Author

mathomp4 commented May 3, 2023

I'm wondering if maybe the reordering I did with h5fortran.cmake was "too much". I'll experiment...

@mathomp4
Author

mathomp4 commented May 3, 2023

Well, I can change that error to another error by moving things around. Dang it.

@milancurcic Can you try out my branch locally? I'm wondering if this is just Baselibs being weird or if I did break things.

@milancurcic
Member

Yes, I'll try it in the afternoon.

@mathomp4
Author

mathomp4 commented May 3, 2023

> Yes, I'll try it in the afternoon.

Thanks.

Some good news. I can get it to work for me in the FetchContent way, but only by changing h5fortran. 😦

The change is this code:

if(hdf5_parallel OR HDF5_HAVE_PARALLEL)
  target_link_libraries(HDF5::HDF5 INTERFACE MPI::MPI_Fortran)
endif()

and I seem to have to do:

if(NOT TARGET HDF5::HDF5)
  if(hdf5_parallel OR HDF5_HAVE_PARALLEL)
    target_link_libraries(HDF5::HDF5 INTERFACE MPI::MPI_Fortran)
  endif()
endif()

So this must be an order-of-operations thing, but I'm danged if I can figure it out. If I don't have the find_package(HDF5) call before the find_package(h5fortran) call in cmake/h5fortran.cmake then it's like the h5fortran find_package(HDF5) wins and it doesn't see the right HL libraries, etc.
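
A possible explanation, offered as an untested guess: an imported target created by find_package() in the parent project is visible from the fetched subdirectory but is not global, so a target_link_libraries() call on it from the subdirectory may be rejected. If that is what is happening, promoting the target to global scope in the parent, before h5fortran is added, might avoid patching h5fortran. The sketch assumes the find_package(HDF5) call stays in cmake/h5fortran.cmake and that the file is included from the top-level directory.

```cmake
# In the parent project, before h5fortran is fetched and added.
find_package(HDF5 REQUIRED COMPONENTS Fortran HL)

if(TARGET HDF5::HDF5)
  get_target_property(hdf5_is_imported HDF5::HDF5 IMPORTED)
  get_target_property(hdf5_is_global HDF5::HDF5 IMPORTED_GLOBAL)
  if(hdf5_is_imported AND NOT hdf5_is_global)
    # Promotion is only allowed from the directory that created the target,
    # and it cannot be undone afterwards.
    set_target_properties(HDF5::HDF5 PROPERTIES IMPORTED_GLOBAL TRUE)
  endif()
endif()
```

Whether h5fortran's own find logic then behaves as intended would still need to be checked against both the pre-installed and the FetchContent paths.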

Grah.
