Query regarding support of Executorch for ARM Ethos-U65 backend #9356
Comments
@zingo @Erik-Lundell @digantdesai can any of you answer? |
Hi @vikasbalaga, thanks for your interest in Executorch and Ethos-U 🥇 Ethos-U65 is supported with Executorch as well, but we haven't given it much love yet, for a couple of reasons: 1) there's no FVP to test it on, and 2) the AOT flow is very similar to Ethos-U55. Ethos-U65 is supported conceptually; it just needs some plumbing, for example in the list you mention (model conversion script) and in ArmCompileSpecBuilder. There are some more places as well. If you are happy to give it a go, we can support you. Just push a PR and tag us (@digantdesai @freddan80 @per @zingo @oscarandersson8218). We'll give Ethos-U65 more attention in the medium term. The runtime flow is slightly different. Ethos-U65 sits on an "ML island" (Cortex-M + Ethos-U subsystem, embedded) as part of a larger system (Cortex-A, rich OS). That means the Executorch runtime should run on the ML island, and your application calling into the Executorch runtime needs to communicate somehow with the Cortex-A system. That means of communication could build on e.g. ethos-u-linux-driver-stack. Some (not too big) modifications will probably be needed for Executorch workloads. Hope this helps 👍 |
@freddan80 and others, thanks for your quick response.
Yes, in my case the system is (Cortex-A55, OS) plus a (Cortex-M33 + Ethos-U65) ML island. I also have a hardware setup available, so I don't need an FVP.
Yes, I am interested in trying it. I found the following places that require modification:
So, could you help me find the other modifications that are required? Also (I think this is a naive question), will this Executorch implementation work for my Cortex-M33 CPU? |
I'd start with those and debug from there. The important thing is that the arguments in the call to vela look right (we can help check that).
The Note that you'd want to use a 'vela.ini' file that fits your system config, and provide that to |
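As a rough illustration of what "the call to vela looks right" could mean for an Ethos-U65 system, the sketch below assembles a Vela command line. The options `--accelerator-config`, `--config`, `--system-config`, `--memory-mode` and `--output-dir` are Vela command-line options; the helper name `build_vela_args`, the file names, and the section names `My_Sys_Config` and `My_Mem_Mode` are placeholders invented for this example and would have to match the sections defined in your own vela.ini. This is not the code Executorch uses internally to invoke Vela.

```python
from typing import List

# Accelerator configurations Vela accepts for Ethos-U65 (MACs per cycle).
U65_ACCELERATOR_CONFIGS = ("ethos-u65-256", "ethos-u65-512")


def build_vela_args(
    network: str,
    accelerator_config: str,
    config_file: str,
    system_config: str,
    memory_mode: str,
    output_dir: str = "output",
) -> List[str]:
    """Assemble a Vela command line for an Ethos-U65 target (hypothetical helper)."""
    if accelerator_config not in U65_ACCELERATOR_CONFIGS:
        raise ValueError(f"not an Ethos-U65 configuration: {accelerator_config}")
    return [
        "vela",
        network,
        f"--accelerator-config={accelerator_config}",
        # The .ini file that defines the system config and memory mode below.
        f"--config={config_file}",
        # Both names must exist as sections in the .ini file (placeholders here).
        f"--system-config={system_config}",
        f"--memory-mode={memory_mode}",
        f"--output-dir={output_dir}",
    ]


# Example: print the command so it can be compared with what the AOT flow passes.
print(" ".join(build_vela_args(
    "model.tflite", "ethos-u65-256", "vela.ini", "My_Sys_Config", "My_Mem_Mode")))
```

Printing the assembled arguments makes it easy to compare them with what the AOT flow actually passes, which is the part worth double-checking when moving from Ethos-U55 to Ethos-U65.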
I have modified arm_aot_compile.py and I think I am able to generate a *.pte model for the Ethos-U65 backend. I picked the configuration details based on my hardware type. (I have forked the repo and committed my changes in a private branch for your reference.)
I tried modifying build_executorch_runner.sh for my system (Cortex-M33 CPU and Ethos-U65 NPU), but here I am observing cmake errors
I tried to debug it, but I couldn't understand how to update the "NPU timing adapters" as per Ethos U65 requirements. Also, it looks like we need to specify a |
Would it be possible to share the changes? I assigned this to @AdrianLundell. He'll help you. |
This points to the private branch I created by forking the repo. Thanks! |
Hi, nice work so far! The examples/arm/executor_runner code and the related CMake scripts used to build it should be viewed as an example to get you started when building your own application. The build_executorch_runner.sh script and all flags containing target-specific info in the runtime flow are there to make this example convenient to run and to help our testing, rather than being an official API. For example, the timing adapters and the related macros TARGET_BOARD, SYSTEM_CONFIG and MEMORY_MODE which you mention are only relevant for the simulators, so to answer your question, you can ignore those completely. The relevant parts of this CMake script are the linking of the libraries and the conversion of the .pte to a header file; with that done, you can approach this as writing any other application for U65. The simulator is of course very useful when developing, so if you have not done so, I would suggest starting by testing your model and executor_runner on U55 using the Corstone-300 target, and moving to U65 when you have that working. |
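To make the ".pte to header file" step more concrete, here is a minimal sketch of such a conversion. It is not the script shipped in the repository: the function name `pte_to_header`, the array name `model_pte`, and the 16-byte alignment attribute are assumptions made for illustration, and a real firmware build may also need to place the array in a specific linker section.

```python
def pte_to_header(pte_bytes: bytes, array_name: str = "model_pte",
                  bytes_per_line: int = 12) -> str:
    """Render the raw bytes of a .pte file as a C array definition (sketch)."""
    if not pte_bytes:
        raise ValueError("empty .pte content")
    lines = []
    for offset in range(0, len(pte_bytes), bytes_per_line):
        chunk = pte_bytes[offset:offset + bytes_per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        "#include <stddef.h>\n\n"
        f"/* {len(pte_bytes)} bytes; the 16-byte alignment is an assumption. */\n"
        f"__attribute__((aligned(16))) const unsigned char {array_name}[] = {{\n"
        f"{body}\n"
        "};\n"
        f"const size_t {array_name}_len = sizeof({array_name});\n"
    )


# Example with three dummy bytes standing in for a real .pte file.
print(pte_to_header(b"\x00\x01\xff"))
```

In a real build the bytes would come from the generated file, for example `Path("model.pte").read_bytes()` with `pathlib`, and the resulting text would be written to a header that the application includes.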
I have tried performing inference on the Corstone-300 FVP by following the steps mentioned here. With this I am able to perform inference on the FVP (I tried a simple model with an "ADD" operation).
However, I have observed that the |
Nope, that is correct. The example just allocates a 60MB buffer so we can test and use large models out of the box, as the FVP can use quite a lot of memory. See
You can either just change the code or set ET_ARM_BAREMETAL_METHOD_ALLOCATOR_POOL_SIZE from cmake as a workaround. We hope/plan to improve the handling of this area so that the buffer does not end up in the elf and such, but right now this is how it works. It's just a bad leftover from when we "forked" from the examples/devtools/example_runner/example_runner.cpp :) |
@AdrianLundell, I tried to build my own application by adding CMake scripts, but I hit a roadblock in providing a custom linker script. I spent almost 2 weeks on it without much progress :( So I gave up on that and tried a second approach, where I integrate the Executorch libraries into a "working firmware application" that is available for my board (which comes with its own linker script).
(I have taken the arm_executor_runner as a reference for my application and am using a sample pte model with only an "add" operation.) It looks like there is some operator registry in which all the ops need to be registered, but I am not sure how it works. Also, when I tried the examples, it looks like "add" is mapped to the ethos-u delegate, so when I try to run it without the delegate option, I observed similar errors even on the simulator. Thanks! |
Hi, |
Hi,
And here are the libs that I am linking to my application. (I am linking all *.a generated in
But I am still facing that issue in my application :( I can confirm that the build logs generated by CMake are identical between the simulator and my application |
Sounds like a good approach to me to start with a known working application and build from there! For the operator registration, the executorch runtime has no operator implementations by default, so you need to cross compile them for Cortex-M55 using the This error suggests to me, however, that the network has not lowered properly, since the add operator should be delegated to the Ethos-U rather than run on CPU, but maybe you have not come to that part yet? |
Yes, I wanted to start slow, by first executing directly on my CPU (Cortex-M33) and then delegating to the Ethos-U65. I will try to compare both builds to see what I am missing, but one thing I am not sure about: does the "
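The registration idea can be pictured with a toy model. The sketch below is purely conceptual and is not the Executorch API: the class `KernelRegistry` and its methods are invented for illustration, and the operator name `aten::add.out` is used only as an example key. It shows why a runtime that starts with an empty registry fails at lookup time until a kernel for each non-delegated operator has been registered.

```python
from typing import Callable, Dict, List


class KernelRegistry:
    """Toy stand-in for a runtime operator registry (not the Executorch API)."""

    def __init__(self) -> None:
        # The registry starts empty: no operator implementations by default.
        self._kernels: Dict[str, Callable[..., float]] = {}

    def register(self, name: str, fn: Callable[..., float]) -> None:
        self._kernels[name] = fn

    def missing(self, required: List[str]) -> List[str]:
        """Return the operators a program needs that have no kernel registered."""
        return [name for name in required if name not in self._kernels]

    def run(self, name: str, *args: float) -> float:
        if name not in self._kernels:
            raise LookupError(f"operator not registered: {name}")
        return self._kernels[name](*args)


registry = KernelRegistry()
# A program whose add operator was NOT delegated needs a CPU kernel for it.
print(registry.missing(["aten::add.out"]))
registry.register("aten::add.out", lambda a, b: a + b)
print(registry.missing(["aten::add.out"]))
```

If the add operator had been delegated to the Ethos-U, the program would not list it as a required CPU operator at all, which is why a missing-operator error for add hints at a lowering problem as well.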
I just haven't implemented that part yet ;) |
I see, it could be a problem with the bindings as well, from the examples/arm/CMakeLists.txt:
Are you doing this? |
I tried comparing the .map file of my application with that of the simulator and observed that in my app, the symbols from some of the libs So, I tried linking the libs with So, is there a way to target only specific libs (for the kernels) from the list, so that I can try to fit only those in the available memory? |
Hi,
I have started working with Executorch, and in this section on launching Executorch on ARM Ethos-U, I have observed that only Ethos-U55 and Ethos-U85 are mentioned.
Also, in the model conversion script and setup utilities, the supported targets only contain Ethos-U55 and Ethos-U85 variants.
But I am trying to work with an Ethos-U65 based system, so does that mean Executorch only supports the above-mentioned variants, or does it also support Ethos-U65?
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218