This document describes how to get neural network models into the Core ML format, either via automatic conversion or by building them
from scratch programmatically. We also discuss various utilities available to edit the mlmodel
such as quantization, making the input shape
flexible, changing the input/output names, types, inspecting mlmodels, printing a text description of the model etc.
What are the layers supported by Core ML? For the latest list along with all the parameterizations, check out the neuralnetwork.proto file, which is a protobuf description of a neural network model. Since its a big file, its easier to navigate either by starting from the top level proto message or by directly looking at the layer types. An auto-generated documentation, built from the comments in the proto file, can be found here.
Please make sure that you have installed the latest coremltools
, tfcoreml
(if using tensorflow converter) and onnx-coreml
(if using ONNX converter) python packages.
pip install --upgrade coremltools
pip install --upgrade tfcoreml
pip install --upgrade onnx-coreml
Jupyter notebook examples for converters
- Keras.io converter (TF 1.x backend)
- TensorFlow conversion
- ONNX converter (PyTorch conversion)
- Building an mlmodel using the Builder API
- Model quantization
- Model prediction
- Model inspection and editing
Models created via the Keras.io API, with Tensorflow 1.x backend,
and saved in the .h5
format can be converted to Core ML.
The coremltools Keras converter supports Keras versions 2.2+. (Versions below 2.2, up to 1.2.2 are supported, however they are no longer maintained, i.e. no bug fixes will be made for versions below 2.2)
# convert by providing path to a .h5 file
mlmodel = coremltools.converters.keras.convert('keras_model.h5')
mlmodel.save('coreml_model.mlmodel')
# convert by providing a Keras model object
from keras.models import load_model
keras_model = load_model("keras_model.h5")
mlmodel = coremltools.converters.keras.convert(keras_model)
The convert function can take several additional arguments, such as:
input_names
,output_names
: to provide custom names for inputs and outputsimage_input_names
: to get an mlmodel such that its input is of type imageis_bgr
,red_bias
,green_bias
,blue_bias
,image_scale
: to provide parameters for image pre-processing when the input is of type image (i.e. ifimage_input_names
is being used).class_labels
,predicted_feature_name
: to produce an mlmodel of type neural network classifiermodel_precision
: to produce a quantized model. Equivalently, the mlmodel can be quantized post conversion as well, see the section on Model quantizationrespect_trainable
: to produce an updatable mlmodel. See examples hereadd_custom_layers
,custom_conversion_functions
: to add a custom layer in the generated mlmodel. This is useful when Keras has a layer (native or lambda) that Core ML does not support. For a description of CoreML custom layers see this nice overview.
For a complete list of arguments that can be passed to the convert method, see here or at the function signature in code
-
conversion of models defined via the
tf.keras
API is not supported via the coremltools keras converter. However, whentf.keras
is used in TensorFlow 2.x and the model exported to.h5
format, it can be converted via the TensorFlow converter. -
models with Keras lambda layers: use
custom_conversion_functions
, so that Keras lambda layers can be mapped to Core ML custom layers -
What if the converter errors out due to an unsupported layer, or an unsupported parameter in a layer? The coremltools Keras converter targets the Core ML specification version 3, the one released during the macOS 10.14, iOS 12 release cycle. Majority of the native layers of the
keras.io
API can be mapped to the iOS 12 Core ML layers. However if a conversion error due to an unsupported layer comes up, the recommended route is one of the following:- Upgrade to TensorFlow 2.x, and then use the newer
tf.keras
API and convert to Core ML via the TensorFlow converter - With TensorFlow 1.x, save the Keras model as a frozen graph def file in
.pb
format, instead of.h5
. Then use the TensorFlow converter.
Example:
- Upgrade to TensorFlow 2.x, and then use the newer
from keras.models import Sequential
from keras.layers import Dense, ReLU
h5_path = '/tmp/keras_model.h5'
pb_path = '/tmp/keras_model.pb'
mlmodel_path = '/tmp/keras_model.mlmodel'
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
model.add(ReLU())
model.save(h5_path)
input_tensor_shapes, output_node_names = _save_h5_as_frozen_pb(h5_path, pb_path) # defined below
# convert the .pb file to .mlmodel via tfcoreml converter
import tfcoreml
mlmodel = tfcoreml.convert(
tf_model_path = pb_path,
mlmodel_path = mlmodel_path,
output_feature_names = output_node_names,
input_name_shape_dict = input_tensor_shapes,
minimum_ios_deployment_target='13')
Function to convert .h5 to .pb in Tensorflow 1.x:
def _save_h5_as_frozen_pb(h5_path, frozen_model_path, has_variables=True):
from keras.models import load_model
from keras import backend as K
import tensorflow as tf
import shutil, tempfile, os
from tensorflow.python.tools.freeze_graph import freeze_graph
K.set_learning_phase(0)
model = load_model(h5_path)
model_dir = tempfile.mkdtemp()
graph_def_file = os.path.join(model_dir, 'tf_graph.pb')
checkpoint_file = os.path.join(model_dir, 'tf_model.ckpt')
output_node_names = []
if isinstance(model.output, list):
for idx in range(len(model.output)):
output_node_names.append(model.output[idx].name[:-2])
else:
output_node_names.append(model.output.name[:-2])
tf_graph = K.get_session().graph
tf.reset_default_graph()
if has_variables:
with tf_graph.as_default() as g:
saver = tf.train.Saver()
with tf.Session(graph=tf_graph) as sess:
sess.run(tf.global_variables_initializer())
# save graph definition somewhere
tf.train.write_graph(sess.graph, model_dir, graph_def_file, as_text=False)
# save the weights
if has_variables:
saver.save(sess, checkpoint_file)
K.clear_session()
# freeze the graph
if has_variables:
freeze_graph(input_graph=graph_def_file,
input_saver="",
input_binary=True,
input_checkpoint=checkpoint_file,
output_node_names=",".join(output_node_names),
restore_op_name="save/restore_all",
filename_tensor_name="save/Const:0",
output_graph=frozen_model_path,
clear_devices=True,
initializer_nodes="")
if os.path.exists(model_dir):
shutil.rmtree(model_dir)
input_tensor_shapes = {}
if isinstance(model.input, list):
for idx in range(len(model.input)):
input_shape = [i for i in model.input_shape[idx]]
for i, d in enumerate(input_shape):
if d is None:
input_shape[i] = 1
input_tensor_shapes[model.input[idx].name[:-2]] = input_shape
else:
input_shape = [i for i in model.input_shape]
for i, d in enumerate(input_shape):
if d is None:
input_shape[i] = 1
input_tensor_shapes[model.input.name[:-2]] = input_shape
return input_tensor_shapes, output_node_names
Note: an alternative route that can be used in all of the cases above
is to first convert the Keras model to the .onnx
format and then use the ONNX converter described below.
The ONNX converter has been updated to target all Core ML specification versions from 1 to 4 (iOS 11 to iOS 13).
TensorFlow models can be converted to Core ML by using the tfcoreml
converter
(link to the GitHub repo), which
depends on the coremltools package.
pip install --upgrade tfcoreml
To convert models trained/saved via TensorFlow 1, first export them into the frozen graph def format, which is a protobuf file
format with .pb
as the extension. Frozen .pb
files can be obtained by using TensorFlow's
tensorflow.python.tools.freeze_graph
utility.
This Jupyter notebook shows how to freeze a graph to produce a .pb
file.
There are several other Jupyter notebook examples for conversion here.
import tfcoreml
tfcoreml.convert(tf_model_path='my_model.pb',
mlmodel_path='my_model.mlmodel',
output_feature_names=['softmax:0'], # name of the output tensor (appended by ":0")
input_name_shape_dict={'input:0': [1, 227, 227, 3]}, # map from input tensor name (placeholder op in the graph) to shape
minimum_ios_deployment_target='12')
# if the above invocation fails with an error, then update the
# minimum_ios_deployment_target to invoke the newer converter path:
tfcoreml.convert(tf_model_path='my_model.pb',
mlmodel_path='my_model.mlmodel',
output_feature_names=['softmax'], # name of the output op
input_name_shape_dict={'input': [1, 227, 227, 3]}, # map from the placeholder op in the graph to shape (can have -1s)
minimum_ios_deployment_target='13')
The argument minimum_ios_deployment_target
controls the set of Core ML layers that are used by the converter.
When its value is set to '12'
, only the set of layers that were shipped in Core ML during the iOS 12, macOS 14 release cycle are used.
It is recommended to first use this setting, since if successful, it produces a Core ML model that can be deployed to iOS 12 and higher.
In case, it results in an error due to an unsupported op or parameter, then the target should be set to '13'
, so that the
converter can utilize all the layers (including control flow, recurrent layers etc) that were shipped in Core ML in iOS 13.
There are 3 ways to export an inference graph in TensorFlow 2:
- Use the
tf.keras
APIs (Sequential, Functional, or Subclassing) and export to the.h5
HDF5 file orSavedModel
directory format. - Use TensorFlow's low-level APIs (along with
tf.keras
) and export to aSavedModel
directory format. - Use TensorFlow's low-level APIs and export as
concrete functions
format.
For all 3 cases tfcoreml
's convert()
function can be used to convert your model into Core ML model format. The argument minimum_ios_deployment_target
must be set to '13'
.
Converting a tf.keras
HDF5 model:
from tensorflow.keras.applications import ResNet50
import tfcoreml
keras_model = ResNet50(weights=None, input_shape=(224, 224, 3))
keras_model.save('./model.h5')
# print input shape
print(keras_model.input_shape)
# get input, output node names for the TF graph from the Keras model
input_name = keras_model.inputs[0].name.split(':')[0]
keras_output_node_name = keras_model.outputs[0].name.split(':')[0]
graph_output_node_name = keras_output_node_name.split('/')[-1]
model = tfcoreml.convert('./model.h5',
input_name_shape_dict={input_name: (1, 224, 224, 3)},
output_feature_names=[graph_output_node_name],
minimum_ios_deployment_target='13')
model.save('./model.mlmodel')
import tensorflow as tf
import tfcoreml
keras_model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(10, activation='softmax')
])
keras_model.save('/tmp/keras_model.h5')
# print input shape
print(keras_model.input_shape)
# get input, output node names for the TF graph from the Keras model
input_name = keras_model.inputs[0].name.split(':')[0]
keras_output_node_name = keras_model.outputs[0].name.split(':')[0]
graph_output_node_name = keras_output_node_name.split('/')[-1]
model = tfcoreml.convert(tf_model_path='/tmp/keras_model.h5',
input_name_shape_dict={input_name: (1, 28, 28)},
output_feature_names=[graph_output_node_name],
minimum_ios_deployment_target='13')
model.save('/tmp/keras_model.mlmodel')
Converting a SavedModel:
from tensorflow.keras.applications import MobileNet
import tfcoreml
keras_model = MobileNet(weights=None, input_shape=(224, 224, 3))
keras_model.save('./savedmodel', save_format='tf')
# tf.saved_model.save(keras_model, './savedmodel')
model = tfcoreml.convert('./savedmodel',
mlmodel_path='./model.mlmodel',
input_name_shape_dict={'input_1': (1, 224, 224, 3)},
output_feature_names=['Identity'],
minimum_ios_deployment_target='13')
See notebooks in here
or the unit test cases for more examples, on how to save to .h5
or SavedModel
or concrete functions
.
Note: When the value of minimum_ios_deployment_target
is set to '13'
, tfcoreml
directly calls coremltools to convert the TensorFlow models, as can be seen here.
The conversion code for minimum_ios_deployment_target
less than or equal to '12'
is entirely present in the tfcoreml
GitHub repo, whereas for minimum_ios_deployment_target
is equal to '13'
(or greater than in the future) the code is entirely present in the coremltools GitHub repo.
- Although majority of Core ML 3 (iOS 13, macOs 15) layers have been updated in the converter, there might be a few missing layers, or cases not handled.
Please file a GitHub issue if you encounter a bug while using the argument
minimum_ios_deployment_target='13'
. - The
tf.keras
conversion is only supported when TensorFlow 2.x is used. - TensorFlow 2.x model conversion is not supported with Python 2.
- Currently there are issues while exporting
tf.keras
graphs, that contain recurrent layers, to the.h5
format
PyTorch and MXNet models can be first exported to the ONNX format and then converted to Core ML via the
onnx-coreml
(https://github.com/onnx/onnx-coreml) converter.
pip install --upgrade onnx-coreml
from onnx_coreml import convert
ml_model = convert(model='my_model.onnx',
minimum_ios_deployment_target='12') # or minimum_ios_deployment_target = '13'
ml_model.save('my_model.mlmodel')
The argument minimum_ios_deployment_target
controls the set of Core ML layers that are used by the converter.
When its value is set to '12'
, only the set of layers that were shipped in Core ML during the iOS 12, macOS 14 release cycle are used.
It is recommended to first use this setting, since if successful, it produces a Core ML model that can be deployed to iOS 12 and higher.
In case, it results in an error due to an unsupported op or parameter, then the target should be set to '13'
, so that the
converter can utilize all the layers (including control flow, recurrent layers etc) that were shipped in Core ML in iOS 13.
Additional converter arguments are explained here
See additional examples here
Converting PyTorch model to CoreML model is a two step process:
- Convert PyTorch model to ONNX model
- PyTorch model can be converted into ONNX model using
torch.onnx.export
- Reference: https://pytorch.org/docs/stable/onnx.html#id2
- Tools required: PyTorch
- Convert ONNX model to CoreML
- Take the
.onnx
model and pass it to the functiononnx_coreml.convert()
- Tools required: onnx-coreml
PyTorch to ONNX conversion:
- Create a pyTorch model
import torch import torch.nn as nn import torch.nn.functional as F # Step 0 - (a) Define ML Model class small_model(nn.Module): def __init__(self): super(small_model, self).__init__() self.fc1 = nn.Linear(768, 256) self.fc2 = nn.Linear(256, 10) def forward(self, x): y = F.relu(self.fc1(x)) y = F.softmax(self.fc2(y)) return y
- Load model
# Step 0 - (b) Create model or Load from dist model = small_model() dummy_input = torch.randn(768)
- Convert From PyTorch to ONNX
# Step 1 - PyTorch to ONNX model torch.onnx.export(model, dummy_input, './small_model.onnx')
ONNX to CoreML:
# Step 2 - ONNX to CoreML model
from onnx_coreml import convert
mlmodel = convert(model='./small_model.onnx', minimum_ios_deployment_target='13')
# rename inputs/outputs
import coremltools
spec = mlmodel.get_spec()
coremltools.utils.rename_feature(spec, current_name='input.1', new_name='input_tensor')
coremltools.utils.rename_feature(spec, current_name='12', new_name='network_output')
mlmodel = coremltools.models.MLModel(spec)
# Save converted CoreML model
mlmodel.save('small_model.mlmodel')
Step 1 can be replaced by respective framework to ONNX converter.
- If the onnx opset version is greater than 9 and there are issues during conversion, please try exporting to onnx opset version 9 and then converting to Core ML
- onnx models with weight quantization and control flow layers (loop, branch) will give a conversion error since support for those has not been added yet to the converter
Code snippet of building a toy 2 layer Core ML model.
Here is an example of building the GPT2 model from scratch using the Core ML builder API, with pre-trained weights.
coremltools
provides utilities for performing post-training quantization for the weight parameters,
to reduce the size of the .mlmodel
file. By default the converters produce mlmodel that have weights
in FP32 precision. These can be quantized to either FP16 or to 8 bits, 7 bits, up to all the way to 1 bit.
The lower the number of bits, more the chances of degrading the model accuracy. The loss in accuracy varies with
the model.
Here is a code snippet on using the quantization utilities.
Neural network models can take as inputs two datatypes: either multi-arrays or image types. When using coremltools to call predict on a model, in the case of multi-arrays, a numpy array must be fed. In the case of an image, a PIL image python object should be used.
Multi-array prediction:
import coremltools
import numpy as np
model = coremltools.models.MLModel('path/to/the/saved/model.mlmodel')
# print input description to get input shape
print(model.description.input)
input_shape = (...) # insert correct shape of the input
# call predict
output_dict = model.predict({'input_name': np.random.rand(*input_shape)}, useCPUOnly=True)
Image prediction:
import coremltools
import numpy as np
import PIL.Image
model = coremltools.models.MLModel('path/to/the/saved/model.mlmodel')
Height = 20 # use the correct input image height
Width = 60 # use the correct input image width
# Scenario 1: load an image from disk
def load_image(path, resize_to=None):
# resize_to: (Width, Height)
img = PIL.Image.open(path)
if resize_to is not None:
img = img.resize(resize_to, PIL.Image.ANTIALIAS)
img_np = np.array(img).astype(np.float32)
return img_np, img
# load the image and resize using PIL utilities
_, img = load_image('/path/to/image.jpg', resize_to=(Width, Height))
out_dict = model.predict({'image': img})
# Scenario 2: load an image from a numpy array
shape = (Height, Width, 3) # height x width x RGB
data = np.zeros(shape, dtype=np.uint8)
# manipulate numpy data
pil_img = PIL.Image.fromarray(data)
out_dict = model.predict({'image': pil_img})
Code snippet for loading mlmodel and converting it to spec and vice versa.
Netron is a nice tool to visualize Core ML neural network models.
To print a text description of the model:
import coremltools
nn_mlmodel = coremltools.models.MLModel('path/to/the/model.mlmodel')
# To print a succinct description of the neural network
spec = nn_mlmodel.get_spec()
from coremltools.models.neural_network.printer import print_network_spec
print_network_spec(spec, style='coding')
# or
print_network_spec(spec)
To print information about the pre-processing parameters of the model (only applicable if the input is of type image)
import coremltools
def _get_nn_sepc(spec):
if spec.WhichOneof('Type') == 'neuralNetworkClassifier':
nn_spec = spec.neuralNetworkClassifier
if spec.WhichOneof('Type') == 'neuralNetwork':
nn_spec = spec.neuralNetwork
elif spec.WhichOneof('Type') == 'neuralNetworkRegressor':
nn_spec = spec.neuralNetworkRegressor
else:
raise ValueError('Spec does not have a neural network')
spec = coremltools.models.utils.load_spec('path/to/the/saved/model.mlmodel')
# Get neural network portion of the spec
nn_spec = _get_nn_sepc()
# print pre-processing parameter
print(nn_spec.preprocessing)
There are several utilities
to mark inputs with flexible
shapes.
import coremltools
model = coremltools.models.MLModel('path/to/the/saved/model.mlmodel')
spec = model.get_spec()
# lets say the name of the input feature is "input" that we want to rename to "input_tensor"
coremltools.utils.rename_feature(spec, current_name='input', new_name='input_tensor')
model = coremltools.models.MLModel(spec)
model.save('path/to/the/saved/model.mlmodel')
Here is the list of supported datatypes. For instance, change the datatype from 'double' to 'float32':
import coremltools
from coremltools.proto import FeatureTypes_pb2 as ft
model = coremltools.models.MLModel('path/to/the/saved/model.mlmodel')
spec = model.get_spec()
def _set_type_as_float32(feature):
if feature.type.HasField('multiArrayType'):
feature.type.multiArrayType.dataType = ft.ArrayFeatureType.FLOAT32
# iterate over the inputs
for input_ in spec.description.input:
_set_type_as_float32(input_)
# iterate over the outputs
for output_ in spec.description.output:
_set_type_as_float32(output_)
model = coremltools.models.MLModel(spec)
model.save('path/to/the/saved/model.mlmodel')
Sometimes we want to print out weights of a particular layer for debugging purposes.
Following is an example showing how we can utilize the protobuf
APIs to access any
attributes including the weight parameters. This code snippet uses the model we created in
the this
example.
import coremltools
import numpy as np
model = coremltools.models.MLModel('conv_prelu.mlmodel')
spec = model.get_spec()
print(spec)
layer = spec.neuralNetwork.layers[0]
weight_params = layer.convolution.weights
print('Weights of {} layer: {}.'.format(layer.WhichOneof('layer'), layer.name))
print(np.reshape(np.asarray(weight_params.floatValue), (1, 1, 3, 3)))