Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use StableTasks to pin the thread of the renderloop on MacOS #4779

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

asinghvi17
Copy link
Member

@asinghvi17 asinghvi17 commented Feb 11, 2025

Description

In any Mac system, the window handler only allows the main thread of an application to handle threads. For Julia, I assume that's thread id 1.

Every time I try to use GLMakie on Mac, on Julia versions where async tasks can migrate threads, I get a segfault:

julia> fig5 = Figure(size = (4000, 4000))
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow should only be instantiated on the main thread!'
*** First throw call stack:
(
)
libc++abi: terminating due to uncaught exception of type NSException

[80423] signal 6: Abort trap: 6
in expression starting at none:0
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 7437143 (Pool: 7436608; Big: 535); GC: 12
zsh: abort      julia

This PR uses StableTasks.jl to pin the GLMakie renderloop task to thread id 1, which seems to fix the issue on my system. I've currently locked the change behind a Sys.isapple(), which should mean no changes for Windows and Linux users. But this needs to be tested.

A consequence of using StableTasks is that the fields of the renderer have to change. I can revert that if desired by unwrapping the StableTask.

Fixes #4688

No relation to the other NS* errors that were quite mysterious.

Type of change

Delete options that do not apply:

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • Added an entry in CHANGELOG.md (for new features and breaking changes)
  • Added or changed relevant sections in the documentation
  • Added unit tests for new algorithms, conversion methods, etc.
  • Added reference image tests for new plotting functions, recipes, visual options, etc.

In any Mac system, the window handler only allows the main thread of an application to handle threads.  For Julia, I assume that's thread id 1.

Every time I try to use GLMakie on Mac, on Julia versions where async tasks can migrate threads, I get a segfault:

```
julia> fig5 = Figure(size = (4000, 4000))
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow should only be instantiated on the main thread!'
*** First throw call stack:
(
)
libc++abi: terminating due to uncaught exception of type NSException

[80423] signal 6: Abort trap: 6
in expression starting at none:0
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 7437143 (Pool: 7436608; Big: 535); GC: 12
zsh: abort      julia
```

This PR uses StableTasks.jl to pin the GLMakie renderloop task to thread id 1, which seems to fix the issue on my system.  I've currently locked the change behind a `Sys.isapple()`, which should mean no changes for Windows and Linux users.  But this needs to be tested.

A consequence of using StableTasks is that the fields of the renderer have to change.  I can revert that if desired by unwrapping the StableTask.
@asinghvi17 asinghvi17 self-assigned this Feb 11, 2025
@asinghvi17 asinghvi17 added bug GLMakie This relates to GLMakie.jl, the OpenGL backend for Makie. macOS For issues involving macOS and the M-series processor labels Feb 11, 2025
@SimonDanisch
Copy link
Member

Wait, since when can @async migrate threads?

It is strongly encouraged to favor Threads.@Spawn over @async always even when no parallelism is required
│ especially in publicly distributed libraries. This is because a use of @async disables the migration of
│ the parent task across worker threads in the current implementation of Julia.

@asinghvi17
Copy link
Member Author

asinghvi17 commented Feb 11, 2025

I have no clue - but when I start my Julia process with current master and 1 thread, there are no issues. If I start it with 2 or more threads, then I get this.

It may be that the parent process can't migrate but the spawned process still can...but not 100% sure how that works.

Edit: apparently the spawned process cannot migrate threads, but it's still not advisable to use async - cf. this slack message

@asinghvi17
Copy link
Member Author

asinghvi17 commented Feb 11, 2025

There's also a persistent issue here where closing the window causes a deadlock somehow (after my changes), might be because thread 1 is also where all IO happens:

julia> ┌ Warning: error closing screen
│   exception =
│    deadlock detected: cannot wait on current task
│    Stacktrace:
│      [1] error(s::String)
│        @ Base ./error.jl:35
│      [2] wait(t::Task)
│        @ Base ./task.jl:367
│      [3] wait(t::StableTasks.StableTask{Any})
│        @ StableTasks.Internals ~/.julia/packages/StableTasks/3CrzR/src/internals.jl:15
│      [4] wait(x::GLMakie.Screen{GLFW.Window})
│        @ GLMakie ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:545
│      [5] #stop_renderloop!#83
│        @ ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:947 [inlined]
│      [6] stop_renderloop!
│        @ ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:939 [inlined]
│      [7] close(screen::GLMakie.Screen{GLFW.Window}; reuse::Bool)
│        @ GLMakie ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:753
│      [8] close
│        @ ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:739 [inlined]
│      [9] renderloop(screen::GLMakie.Screen{GLFW.Window})
│        @ GLMakie ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:1077
│     [10] (::GLMakie.var"#79#81"{GLMakie.Screen{GLFW.Window}})()
│        @ GLMakie ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:920
│     [11] (::GLMakie.var"#80#82"{StableTasks.AtomicRef{Any}, GLMakie.var"#79#81"{GLMakie.Screen{GLFW.Window}}})()
│        @ GLMakie ~/.julia/packages/StableTasks/3CrzR/src/internals.jl:107
└ @ GLMakie ~/.julia/dev/makie/Makie.jl/GLMakie/src/screen.jl:1079

@asinghvi17 asinghvi17 marked this pull request as draft February 11, 2025 18:18
@MakieBot
Copy link
Collaborator

Benchmark Results

SHA: 4ca3a38800f7e0920ca529f2438b73af46e1f1e7

Warning

These results are subject to substantial noise because GitHub's CI runs on shared machines that are not ideally suited for benchmarking.

GLMakie
CairoMakie
WGLMakie

@SimonDanisch
Copy link
Member

Since I don't have a Mac I can't debug this... Could you print the threadid from where it happens?
Want to make sure it doesn't hide another issue

@asinghvi17
Copy link
Member Author

Will do, Jameson also sent some info on what could be a better solution (to do what e.g. Gtk.jl does), will try to implement that this week.

Have to give a talk in an hour but will try to print the threadid tomorrow!

@ffreyer
Copy link
Collaborator

ffreyer commented Feb 11, 2025

Could this just be Threads.@spawnat 1 renderloop() or does that not keep the task on the master thread?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug GLMakie This relates to GLMakie.jl, the OpenGL backend for Makie. macOS For issues involving macOS and the M-series processor
Projects
Status: Work in progress
Development

Successfully merging this pull request may close these issues.

MacOS/GLMakie: NSException when displaying GLMakie figure
4 participants