Apple Metal Support
Apple computers can have several types of GPUs:
- Integrated GPUs: built into the CPU. For example, Apple Silicon chips (M1 and M2) have built-in GPUs. This architecture has the advantage that CPU and GPU can share memory efficiently.
- Discrete GPUs connected by Thunderbolt. These are called 'external GPUs' (eGPUs). They are typically NVIDIA or AMD. According to Apple, only Intel Macs support eGPUs.
- Discrete GPUs connected by PCI. There are signs that Apple is phasing this out as well.
macOS lets applications access GPUs via two APIs:
- OpenCL. This is currently included with macOS, but Apple has announced that it no longer supports OpenCL and will at some point stop including it. It may remain available from elsewhere.
- Metal. This is Apple's replacement for OpenCL and OpenGL. See also Wikipedia.
There's support for accessing NVIDIA GPUs via CUDA, but it looks like this is being phased out. I don't think AMD's CAL API has ever been supported.
BOINC uses struct COPROC to describe either
- a single GPU (or coprocessor)
- a set of (nearly) identical GPUs.
There are derived classes COPROC_NVIDIA, COPROC_ATI, and COPROC_INTEL that contain additional vendor-specific info (like CUDA capabilities, CAL version #, etc.).
The class COPROCS describes all the processing resources on the host. It has an array of COPROC objects; the first element represents the CPU, which may be usable via OpenCL. It has separate vendor-specific objects for NVIDIA, ATI, and Intel.
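To make the relationships concrete, here is a rough sketch of how these types fit together. It is not the real BOINC code: the field names (count, have_opencl, cuda_version, cal_version), the add() and n_rsc() helpers, and the type strings are simplified assumptions chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for the BOINC types; fields are illustrative assumptions.
struct COPROC {
    std::string type;          // e.g. "CPU", "NVIDIA", "ATI", "intel_gpu"
    int count = 1;             // how many (nearly) identical instances this stands for
    bool have_opencl = false;  // usable via OpenCL?
};

// Derived classes add vendor-specific details.
struct COPROC_NVIDIA : COPROC { int cuda_version = 0; };
struct COPROC_ATI    : COPROC { int cal_version = 0; };
struct COPROC_INTEL  : COPROC {};

struct COPROCS {
    std::vector<COPROC> coprocs;  // element 0 represents the CPU
    COPROC_NVIDIA nvidia;
    COPROC_ATI ati;
    COPROC_INTEL intel_gpu;

    COPROCS() {
        COPROC cpu;
        cpu.type = "CPU";
        coprocs.push_back(cpu);
    }

    // Append the generic part of a coprocessor description to the array.
    // Vendor-specific fields stay in the nvidia/ati/intel_gpu members.
    void add(const COPROC& c) {
        assert(c.count >= 1);
        coprocs.push_back(c);
    }

    int n_rsc() const { return static_cast<int>(coprocs.size()); }
};
```

In this sketch a host with two similar NVIDIA cards would end up with two array entries: the CPU at index 0 and one NVIDIA entry with count set to 2.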
The client enumerates GPUs using three APIs:
- CAL (AMD)
- CUDA (NVIDIA)
- OpenCL
The top-level function, COPROCS::detect_gpus(), first calls functions to enumerate NVIDIA and then ATI GPUs. Each of these populates a global vector of objects: nvidia_gpus and ati_gpus.
The COPROCS::get_opencl() function loops over platforms (i.e. vendors). For each vendor it enumerates CPUs, then GPUs and accelerators. In the latter case it tries to match the GPU up with an entry in the NVIDIA and ATI vectors. In the Intel case it adds a record to a global vector intel_gpus.
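The matching step might look roughly like the sketch below. Everything here is hypothetical: GpuRecord, OpenClDevice, GpuLists, and record_opencl_gpu() are made-up names, and matching by device name is an assumption made for simplicity; the real client may use different criteria. OpenCL devices that match nothing are simply ignored in this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a vendor-specific GPU record.
struct GpuRecord {
    std::string name;
    bool have_opencl = false;
};

// Hypothetical summary of one device reported by OpenCL.
struct OpenClDevice {
    std::string vendor;  // "NVIDIA", "ATI", or "Intel"
    std::string name;
};

struct GpuLists {
    std::vector<GpuRecord> nvidia_gpus, ati_gpus, intel_gpus;
};

// Index of the first record with this name not yet matched to OpenCL, or -1.
int find_match(const std::vector<GpuRecord>& gpus, const std::string& name) {
    for (std::size_t i = 0; i < gpus.size(); ++i) {
        if (!gpus[i].have_opencl && gpus[i].name == name) return static_cast<int>(i);
    }
    return -1;
}

void record_opencl_gpu(GpuLists& lists, const OpenClDevice& dev) {
    assert(!dev.name.empty());
    if (dev.vendor == "Intel") {
        // Intel GPUs are only seen via OpenCL, so add a new record.
        GpuRecord r;
        r.name = dev.name;
        r.have_opencl = true;
        lists.intel_gpus.push_back(r);
        return;
    }
    std::vector<GpuRecord>* v = nullptr;
    if (dev.vendor == "NVIDIA") v = &lists.nvidia_gpus;
    else if (dev.vendor == "ATI") v = &lists.ati_gpus;
    if (!v) return;  // unknown vendor: ignored here
    int i = find_match(*v, dev.name);
    if (i >= 0) (*v)[i].have_opencl = true;  // attach OpenCL info to the existing entry
}
```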
Next, COPROCS::correlate_gpus() reduces each vector (e.g. nvidia_gpus) to a single COPROC object. It calls vendor-specific functions, e.g. COPROC_NVIDIA::correlate(). Each of these identifies the most powerful instance (on the basis of memory, FLOPS, etc.) and then identifies the instances that are close to it in all these dimensions. The result is a single COPROC object with a count field indicating the number of instances.
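As a rough illustration of the correlate step, the sketch below picks the most powerful instance and counts the ones close to it. The GpuInstance and Summary types, the two dimensions compared (FLOPS and memory), and the 10% tolerance are all assumptions for this example; the actual vendor-specific correlate() functions use their own criteria.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-device record.
struct GpuInstance {
    double peak_flops;
    double mem_bytes;
};

// Result: the best instance plus how many instances are close to it.
struct Summary {
    GpuInstance best;
    int count;
};

// True if x is within a fraction `tol` of best in every dimension.
bool close_to(const GpuInstance& x, const GpuInstance& best, double tol) {
    return x.peak_flops >= best.peak_flops * (1.0 - tol)
        && x.mem_bytes >= best.mem_bytes * (1.0 - tol);
}

Summary correlate(const std::vector<GpuInstance>& gpus, double tol = 0.1) {
    assert(!gpus.empty());
    // Most powerful: highest FLOPS, ties broken by memory.
    std::size_t best = 0;
    for (std::size_t i = 1; i < gpus.size(); ++i) {
        const GpuInstance& g = gpus[i];
        const GpuInstance& b = gpus[best];
        if (g.peak_flops > b.peak_flops
            || (g.peak_flops == b.peak_flops && g.mem_bytes > b.mem_bytes)) {
            best = i;
        }
    }
    // Count the instances (including the best one) that are close to it.
    int count = 0;
    for (const GpuInstance& g : gpus) {
        if (close_to(g, gpus[best], tol)) ++count;
    }
    return Summary{gpus[best], count};
}
```

With three GPUs where two are nearly identical and one is much weaker, this yields a summary with count 2; the weak GPU is left out.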
(to be completed)
To simplify things, let's have these goals:
- Be able to use integrated GPUs (Apple Silicon and Intel) from either Metal or OpenCL.
- Use a single name ('Apple_M') for all Apple Silicon GPUs.
For now, let's NOT have the goal of accessing discrete GPUs (NVIDIA, ATI) via Metal. This would add a lot of complexity because of the need to correlate multiple GPU instances. For now, these GPUs can be used via OpenCL (and possibly CUDA in the case of NVIDIA). By the time Apple removes OpenCL, discrete GPUs will probably have little total power compared to Apple Silicon GPUs.
So here's what we need to do:
- Add Metal-specific info (versions, capabilities) to COPROC.
- Add a new class COPROC_APPLE.
- Pick a name for Apple Silicon GPUs, e.g. Apple_M.
- Call Metal to enumerate GPUs. Ignore all except Apple Silicon and Intel.
- If OpenCL detects an Apple Silicon or Intel GPU, and it was detected via Metal, copy the OpenCL info to the COPROC_APPLE or COPROC_INTEL object.
- Adopt a convention for Metal plan class names, and add logic at various places in the server code.
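The client-side steps in this list could be sketched as follows. This is only a possible shape, not a design decision: the MetalDevice struct, the Vendor enum, the fields of COPROC_APPLE, and the three helper functions are all hypothetical, and the actual call into Metal to enumerate devices is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Vendor { AppleSilicon, Intel, Other };

// Hypothetical summary of one device as reported by Metal.
struct MetalDevice {
    std::string name;
    Vendor vendor = Vendor::Other;
};

// Hypothetical shape for the proposed COPROC_APPLE class.
struct COPROC_APPLE {
    std::string type = "Apple_M";  // one name for all Apple Silicon GPUs
    int count = 0;                 // instances found via Metal
    bool have_metal = false;
    bool have_opencl = false;
    std::string opencl_version;    // filled in from OpenCL, if available
};

// Keep only Apple Silicon and Intel GPUs; ignore everything else.
std::vector<MetalDevice> filter_metal_devices(const std::vector<MetalDevice>& all) {
    std::vector<MetalDevice> kept;
    for (const MetalDevice& d : all) {
        if (d.vendor == Vendor::AppleSilicon || d.vendor == Vendor::Intel) {
            kept.push_back(d);
        }
    }
    return kept;
}

// Fold the Apple Silicon devices into a single COPROC_APPLE.
COPROC_APPLE summarize_apple(const std::vector<MetalDevice>& devs) {
    COPROC_APPLE gpu;
    for (const MetalDevice& d : devs) {
        if (d.vendor == Vendor::AppleSilicon) {
            ++gpu.count;
            gpu.have_metal = true;
        }
    }
    return gpu;
}

// Copy OpenCL info only if the GPU was already detected via Metal.
bool merge_opencl(COPROC_APPLE& gpu, const std::string& opencl_version) {
    assert(!opencl_version.empty());
    if (!gpu.have_metal) return false;
    gpu.have_opencl = true;
    gpu.opencl_version = opencl_version;
    return true;
}
```

The same merge logic would apply to COPROC_INTEL for Intel integrated GPUs; it is omitted here to keep the sketch short.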
If there are multiple Apple Silicon chips, what does Metal return? How would we distinguish between them? How could the client tell an app to use a particular chip?