This section shows how to use the Python APIs. Refer to the Python API reference for more details and to pplnn.py for usage examples.
For brevity, all code snippets assume that the following two lines are present:
from pyppl import nn as pplnn
from pyppl import common as pplcommon
In PPLNN, an Engine is a collection of op implementations running on a specified device such as a CPU or an NVIDIA GPU. For example, we can use the built-in x86.EngineFactory:
x86_options = pplnn.x86.EngineOptions()
x86_engine = pplnn.x86.EngineFactory.Create(x86_options)
to create an engine running on x86-compatible CPUs, or use
cuda_options = pplnn.cuda.EngineOptions()
cuda_engine = pplnn.cuda.EngineFactory.Create(cuda_options)
to create an engine running on NVIDIA GPUs.
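Both snippets above follow the same options-then-factory pattern, so engine creation can be wrapped in one helper. The following is a minimal sketch rather than part of the pyppl API: create_engine is a hypothetical name, the pplnn module is passed in as an argument so the helper itself imports nothing, and it assumes that EngineFactory.Create returns a falsy value when the engine cannot be created.

```python
def create_engine(pplnn, device):
    """Create a PPLNN engine for `device` ("x86" or "cuda").

    `pplnn` is the module imported with `from pyppl import nn as pplnn`.
    Hypothetical helper for illustration; not part of pyppl.
    """
    if device not in ("x86", "cuda"):
        raise ValueError("unsupported device: " + device)
    # pplnn.x86 and pplnn.cuda both provide EngineOptions and EngineFactory.
    backend = getattr(pplnn, device, None)
    if backend is None:
        raise RuntimeError("pplnn." + device + " is not available")
    options = backend.EngineOptions()
    engine = backend.EngineFactory.Create(options)
    # Assumption: Create() returns a falsy value on failure.
    if not engine:
        raise RuntimeError("failed to create " + device + " engine")
    return engine
```

With the two import lines above in place, create_engine(pplnn, "x86") would stand in for the first snippet and create_engine(pplnn, "cuda") for the second.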
Use
runtime_builder = pplnn.onnx.RuntimeBuilderFactory.Create()
to create an onnx.RuntimeBuilder, which is used for creating Runtime instances.
onnx_model_file = "/path/to/onnx_model_file"
status = runtime_builder.LoadModelFromFile(onnx_model_file)
loads an ONNX model from the specified file.
# register the engine(s) that the runtime will use
resources = pplnn.onnx.RuntimeBuilderResources()
resources.engines = [x86_engine] # or = [cuda_engine]
status = runtime_builder.SetResources(resources)
PPLNN also supports multiple engines running in the same model. For example:
resources.engines = [x86_engine, cuda_engine]
status = runtime_builder.SetResources(resources)
The model is automatically partitioned into several parts, and the ops in each part are assigned to one of these engines.
status = runtime_builder.Preprocess()
performs the preparation needed before Runtime instances can be created.
runtime = runtime_builder.CreateRuntime()
creates a Runtime instance.
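The builder steps above (load the model, set the resources, preprocess, create the runtime) can be chained with a status check after each one. This is a minimal sketch under stated assumptions, not the library's own helper: build_runtime is a hypothetical name, the pplnn and pplcommon modules are passed in as arguments, RuntimeBuilderResources is assumed to live in pplnn.onnx, and the Create and CreateRuntime calls are assumed to return a falsy value on failure.

```python
def build_runtime(pplnn, pplcommon, model_file, engines):
    """Load an ONNX model and return a Runtime that uses `engines`.

    Hypothetical helper for illustration; not part of pyppl.
    """
    def check(status, step):
        # Stop at the first builder step that does not report success.
        if status != pplcommon.RC_SUCCESS:
            raise RuntimeError(step + " failed: " +
                               pplcommon.GetRetCodeStr(status))

    builder = pplnn.onnx.RuntimeBuilderFactory.Create()
    # Assumption: Create() returns a falsy value on failure.
    if not builder:
        raise RuntimeError("failed to create RuntimeBuilder")

    check(builder.LoadModelFromFile(model_file), "LoadModelFromFile")

    # Assumption: RuntimeBuilderResources is exposed under pplnn.onnx.
    resources = pplnn.onnx.RuntimeBuilderResources()
    resources.engines = list(engines)
    check(builder.SetResources(resources), "SetResources")

    check(builder.Preprocess(), "Preprocess")

    runtime = builder.CreateRuntime()
    # Assumption: CreateRuntime() returns a falsy value on failure.
    if not runtime:
        raise RuntimeError("failed to create Runtime")
    return runtime
```

A call such as build_runtime(pplnn, pplcommon, "/path/to/onnx_model_file", [x86_engine]) would then replace the individual snippets. Raising an exception instead of returning the status keeps the calling code linear, at the cost of hiding the raw return code.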
We can get graph inputs using the following functions of Runtime:
input_count = runtime.GetInputCount()
tensor = runtime.GetInputTensor(idx)
and fill in the input data (random data is used in this snippet):
# Requires "import logging", "import sys" and "import numpy as np".
# GenerateRandomDims() is a helper, not part of the pyppl API (see pplnn.py).
for i in range(runtime.GetInputCount()):
    tensor = runtime.GetInputTensor(i)
    dims = GenerateRandomDims(tensor.GetShape())
    in_data = np.random.uniform(-1.0, 1.0, dims)
    status = tensor.ConvertFromHost(in_data)
    if status != pplcommon.RC_SUCCESS:
        logging.error("copy data to tensor[" + tensor.GetName() + "] failed: " +
                      pplcommon.GetRetCodeStr(status))
        sys.exit(-1)
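The same loop works for real data if the part that produces the array is factored out. The sketch below is one possible arrangement, not part of pyppl: fill_inputs and make_data are hypothetical names, and make_data is any callable that takes an input tensor and returns the host array to copy into it. Only calls already shown in this section are used.

```python
def fill_inputs(runtime, pplcommon, make_data):
    """Fill every graph input with the array returned by make_data(tensor).

    Hypothetical helper for illustration; not part of pyppl.
    """
    for i in range(runtime.GetInputCount()):
        tensor = runtime.GetInputTensor(i)
        # Copy the caller-supplied host data into the input tensor.
        status = tensor.ConvertFromHost(make_data(tensor))
        if status != pplcommon.RC_SUCCESS:
            raise RuntimeError("copy data to tensor[" + tensor.GetName() +
                               "] failed: " + pplcommon.GetRetCodeStr(status))
```

For example, fill_inputs(runtime, pplcommon, lambda t: feeds[t.GetName()]) would feed each input from feeds, a hypothetical dict that maps input names to NumPy arrays.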
# evaluate the model with the inputs filled in above
ret_code = runtime.Run()
for i in range(runtime.GetOutputCount()):
    tensor = runtime.GetOutputTensor(i)
    shape = tensor.GetShape()
    tensor_data = tensor.ConvertToHost()
    # wrap the host buffer in a NumPy array without copying
    out_data = np.array(tensor_data, copy=False)
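Running the model and reading back the results can be combined so that a failed run is never mistaken for valid output. The following is a minimal sketch, not part of pyppl: run_and_collect is a hypothetical name, the return value of Run() is assumed to be comparable with pplcommon.RC_SUCCESS, and ConvertToHost() is assumed to return a falsy value on failure.

```python
def run_and_collect(runtime, pplcommon):
    """Run the model once and return a dict of {output name: host data}.

    Hypothetical helper for illustration; not part of pyppl.
    """
    status = runtime.Run()
    # Assumption: Run() returns a code comparable with RC_SUCCESS.
    if status != pplcommon.RC_SUCCESS:
        raise RuntimeError("Run() failed: " + pplcommon.GetRetCodeStr(status))

    outputs = {}
    for i in range(runtime.GetOutputCount()):
        tensor = runtime.GetOutputTensor(i)
        data = tensor.ConvertToHost()
        # Assumption: ConvertToHost() returns a falsy value on failure.
        if not data:
            raise RuntimeError("copy data from tensor[" + tensor.GetName() +
                               "] failed")
        outputs[tensor.GetName()] = data
    return outputs
```

Each value in the returned dict can then be wrapped with np.array(data, copy=False), as in the loop above.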