Hi,
I wrote the code below and expected GPU usage to go up when I run it, but the only thing that happens is that CPU usage rises.
By the way, the output result is correct.
Where is the problem?

#include <iostream>
#include <string>
#include <onnxruntime_cxx_api.h>
#include <vector>
#include <array>

using namespace std;

int main()
{
    constexpr int frameChannels = 1 * 184;
    constexpr int frameHeight = 128;
    constexpr int frameWidth = 128;
    const std::string engineCachePath = "C:/tmp/";
    const wchar_t* modelPath = L"metal.onnx";
    // ... session creation and inference code omitted from the snippet
}

Replies: 2 comments · 5 replies
-
Can anyone help me? @carzh
0 replies
-
You may notice that the inputs and outputs are on CPU: that means each inference needs to copy the inputs from CPU to GPU and the outputs from GPU back to CPU, so some CPU activity is expected. If you want to verify whether TensorRT is actually being used, you can enable profiling (https://onnxruntime.ai/docs/performance/tune-performance/profiling-tools.html#in-code-performance-profiling) and use Nsight Systems.
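For reference, a minimal sketch of the in-code profiling approach. It reuses the `metal.onnx` path and `C:/tmp/` engine cache directory from the question; the profile prefix and option values are otherwise illustrative, not a definitive setup:

```cpp
#include <onnxruntime_cxx_api.h>

int main()
{
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "trt-check");
    Ort::SessionOptions options;

    // Write a JSON trace (files prefixed "ort_profile") recording which
    // execution provider ran each node.
    options.EnableProfiling(L"ort_profile");

    // Register TensorRT; nodes it cannot handle fall back to CUDA/CPU.
    OrtTensorRTProviderOptions trt_options{};
    trt_options.device_id = 0;
    trt_options.trt_engine_cache_enable = 1;
    trt_options.trt_engine_cache_path = "C:/tmp/";  // cache path from the question
    options.AppendExecutionProvider_TensorRT(trt_options);

    Ort::Session session(env, L"metal.onnx", options);
    // ... run inference as usual; the profile is flushed when the session
    // is destroyed (or when EndProfilingAllocated is called).
    return 0;
}
```

In the resulting trace, nodes that ran on TensorRT should be attributed to the TensorRT execution provider; anything attributed to the CPU provider fell back to the host.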
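As for the copies themselves, ONNX Runtime's `IoBinding` API can keep inputs and outputs in device memory so `Run()` does not stage data through host buffers on every call. A sketch under stated assumptions: a CUDA-capable provider is registered on `session`, `d_input` already points to device memory, and the tensor names `"input"`/`"output"` and the 1×184×128×128 shape are placeholders for the model's real ones:

```cpp
#include <onnxruntime_cxx_api.h>
#include <array>
#include <cstdint>

// Run inference with input and output bound to GPU memory.
void run_on_gpu(Ort::Session& session, float* d_input)
{
    // Shape guessed from the question's constants: (1*184) x 128 x 128.
    std::array<int64_t, 4> shape{1, 184, 128, 128};
    const size_t count = 1 * 184 * 128 * 128;

    Ort::MemoryInfo cuda_info("Cuda", OrtDeviceAllocator, 0, OrtMemTypeDefault);

    // Wrap the existing device buffer; no host copy is made.
    Ort::Value input = Ort::Value::CreateTensor<float>(
        cuda_info, d_input, count, shape.data(), shape.size());

    Ort::IoBinding binding(session);
    binding.BindInput("input", input);        // placeholder input name
    binding.BindOutput("output", cuda_info);  // output allocated on the GPU
    session.Run(Ort::RunOptions{}, binding);
}
```

With both sides bound to CUDA memory, the per-inference host-side copy work described above goes away, and CPU usage during inference should drop accordingly.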
5 replies