Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added El Capitan Mac binary identifier to selection statement #120

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

akrinos
Copy link

@akrinos akrinos commented Oct 11, 2017

As of R version 3.4.2, the platform reported by R for machines running El Capitan or Sierra is "mac.binary.el-capitan" rather than the former "mac.binary" for previous (post-mavericks) versions. This resolves the issue of GLMr failing to function normally with SIP disabled in R 3.4.2.

@lawinslow
Copy link
Member

@akrinos Thanks for looking into this. Keeping on top of such breaking changes on the macOS side has been challenging.

  1. Good call on the reported platform. Mind making that more general, so that mac.binary.* is matched? It isn't inconceivable that the platform name changes again with the next version.
  2. Looks like this change may have broken the *nix build. Not 100% sure why, but we need to track that down before merging the PR. (hmm, it may be a change to Travis builds as well, which we can fix in a separate PR too)
  3. I see you moved the *.dylib files to the exec directory. It would be great if we could keep those separate from the executables. It looks like our use of DYLD fallback is now broken with the latest macOS. Apple docs mention that and allude to a potential solution.

Could you try fixing the use of DYLD fallback instead of moving the libraries into the exec dir? I think we might be able to fix this by setting the environment variable and executing the GLM binary in the same call to system2. Something like sending it the GLM path

system2('DYLD_LIBRARY_PATH=/path/to/libs /path/to/glm')

I'd like to make sure there is no better option before mixing all the libs and binaries together.

@akrinos
Copy link
Author

akrinos commented Oct 12, 2017

Latest commit generalizes check for any "Mac" OS. We noted that the *nix build appeared to be broken in current GLMr version a few weeks ago during testing, and error messages relate to shared libgd elements. Perhaps setting up a temporary subdirectory for Mac executable and library files would be a temporary fix, which would be similar to the setup in GLM download. Changing directory structure in this way does not resolve *nix build problems.

Tomorrow I will look into a better solution that does not involve putting dylibs in same directory - DYLD_LIBRARY_FALLBACK_PATH is indeed what SIP interferes with.

@lawinslow
Copy link
Member

Just FYI, the failed travis build isn't from your PR. I fixed the travis build over here #121. I'm going to merge it in and you can pull those changes. Naturally disregard #2 from above, but it would still be great to get DYLD_LIBRARY_FALLBACK_PATH working on >= el capitan.

@cayelan
Copy link

cayelan commented Oct 12, 2017

@lawinslow is there any reason why the executable files can't be in the same directory as the *.dylib files? @akrinos & I have been working on this and it seems like the easiest solution- especially because then you can keep SIP enabled and successfully run GLMr on mac OSX.

@lawinslow
Copy link
Member

@cayelan dumping them all in the exec directory is a hack that doesn't solve the root issue. The exec directory labels everything 755 executable, which isn't necessary (or desirable in some situations) for library files. I would first like to see if we can fix the main issue before resorting to a hacked solution.

@akrinos did you make any progress on feeding in feeding DYLD_LIBRARY_FALLBACK_PATH in with the system call itself?

@cayelan
Copy link

cayelan commented Oct 13, 2017

@lawinslow ok. My understanding is that setting the DYLD_LIBRARY_PATH to another directory is what is triggering SIP and causing GLMr failure. @akrinos is working on this full-time but hasn't been able to find a solution yet that doesn't involve keeping the library and executable files together. Long-term, finding a solution where OS X users don't have to disable SIP to run GLMr would be really helpful.

@lawinslow
Copy link
Member

lawinslow commented Oct 13, 2017

@cayelan I am not against ultimately doing what it takes to get this running on macOS. But before we go down this route, I want to make sure we have explored our options and really understand how these issues manifest on macOS and what our options are. I would like to hear from @akrinos what has been tried and what the results are. It sounds like @akrinos must have explored a few options if they are working full time. In the end, having full control of the library source is useful and potentially important in the future, depending on the functionality we later deem necessary. For example, one commonly requested feature is to support multiple versions of GLM dispatchable from the same package. The ability to control library source directories would be important if this is the goal.

@akrinos could you speak to what you have tried?

@akrinos
Copy link
Author

akrinos commented Oct 13, 2017

@lawinslow I have tried setting DYLD_SYSTEM_PATH within the call to the GLM executable - as far as I can tell from my research and experimentation there is no way to make this work with SIP enabled with the type of library that we have. Instead, I started experimenting with command line methods. The protection system will undo these changes prior to the executable call. My latest commits are a solution to the problem, though they are awaiting final testing on a couple of Sierra machines to ensure that everything about the implementation is working properly. It is currently somewhat inefficient, as it updates the references using OSx utility install_name_tool for every library and the executable itself, taking into account the two ways these may be named on these systems. However, I am modifying the code to only perform this action in the case that the GLM run fails, thus the costly operations would only occur once when a Sierra user has just downloaded or updated the package. For people who are unaffected by the SIP issues, no slowdown should be experienced, in theory.

@cayelan Feel free to test the current version on my forked repository; let me know if anything unexpected arises and I will complete the troubleshooting process for this workaround tomorrow.

@lawinslow
Copy link
Member

Thanks @akrinos. This update is super helpful. And thank you for your detailed approach. Just thinking out loud here. How long does the install_name_tool take to execute? It could be done on package attach too, which would result in it being called just once per session. That way we wouldn't need to have it run only on failure in the run_glm method.

Just a thought.

Also, I don't have a great way to test this right now, so I'll let you confirm when you're done with the edits.

@akrinos
Copy link
Author

akrinos commented Oct 15, 2017

@lawinslow I have only tested the changes on a pretty capable desktop, so I am not sure how long execution would require on the average machine, but typically on the desktop the full process of the preliminary run certainly takes less than one minute, probably less than thirty seconds. If this were done on package attach, we would need to check whether the machine was a Mac - it should work just fine on any Mac, though I should test to make sure. But the calls only need to be made once, unless the user downloads a new version of the package - the changes to the paths of the library files are permanent on the user's particular machine.

Hopefully by early next week @cayelan and/or others will have been able to check the edits on their machines. The one current problem with the existing setup is that the first time the user tries run_glm() after download, I can't seem to suppress the library error the when run_glm() tries to call the GLM executable, so users just have to wait to see that eventually the error is resolved and GLM runs as expected. I don't know if there is a way to suppress the initial error output (I haven't found one in my limited search for one - silent = TRUE doesn't work), but if we can't change that aspect, perhaps a line could just be printed subsequently noting the error and potential handling.

@lawinslow
Copy link
Member

@cayelan Have you been able to test the latest changes by @akrinos? If they are working well, we can get these merged in.

@akrinos
Copy link
Author

akrinos commented Oct 24, 2017

@lawinslow The changes have been tested on at least two Sierra machines; the only stumbling block is the error message that users receive (that does not affect execution) the first time running GLM after package download. As far as I can tell, additional issues are not present, and none of the changes should have any effect on non-Mac users. I am happy to assist with troubleshooting if it turns out that I am missing any potential hazards. @cayelan, feel free to add.

@jordansread
Copy link
Member

on El Cap I get

glmtools::run_example_sim()
sim_folder argument missing, using temp directory:  /var/folders/b1/j14cjj994hxbp82qv23tj27h0028qb/T//RtmpRZzmr5 

driver data file copied to  /var/folders/b1/j14cjj994hxbp82qv23tj27h0028qb/T//RtmpRZzmr5/nldas_driver.csv 
writing nml file to  /var/folders/b1/j14cjj994hxbp82qv23tj27h0028qb/T//RtmpRZzmr5/glm2.nml 
dyld: Library not loaded: @executable_path/libifcore.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/GLMr/exec/macglm
  Reason: image not found
Reading config from glm2.nml
Could not open "glm2.nml"
Error opening namelist file glm2.nml
       ------------------------------------------------
       |  General Lake Model (GLM)   Version 2.2.0rc5    |
       ------------------------------------------------
simulation complete. 
*.nc output located in  /var/folders/b1/j14cjj994hxbp82qv23tj27h0028qb/T//RtmpRZzmr5/output.nc 
[1] "/var/folders/b1/j14cjj994hxbp82qv23tj27h0028qb/T//RtmpRZzmr5"

and fails to create .nc file

@akrinos
Copy link
Author

akrinos commented Oct 24, 2017

@jread-usgs This is the output I would expect with the exception of the NC file generation. I'm not familiar with the example simulation, but you mean that the simulation files are copied to the given temp directory, but that when you navigate there, the NC file is not present? How does this result differ from the output when the command from the present GLMr version is used?

@jordansread
Copy link
Member

My impression is that the model is not producing a .nc file.

@akrinos
Copy link
Author

akrinos commented Nov 5, 2017

@jread-usgs Is there any more information you can provide about this issue? It's hard to know what's wrong when I am unable to reproduce it. Have you tried running a GLM folder from your machine?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants