- NnxBuildFlow and CmakeBuildFlow
- Neureka V2 support
- GitHub Action for testing neureka
- add NnxMapping dictionary that maps accelerator name to the accelerator specific classes
- choice of data generation method (ones, incremented, or random)
- N-EUREKA accelerator support: 3x3, 1x1, and 3x3 depthwise convolution kernels
- Support for kernels without normalization and quantization for NE16
- isort check
- publication citation
- support 32bit scale
- cmake support
- const qualifier to `<acc>_dev_t` function arguments
- support for N-EUREKA's dedicated weight memory
- wmem is no longer a test configuration argument but a command line argument
- neureka is now tested with a more recent gcc version
- python requirements are changed into requirements-pip and requirements-conda
- conftest now passes only strings to test.py to improve readability of pytest logs
- NnxMemoryLayout is now NnxWeight and also has a method for source generation
- the `wmem` field in the test configurations is now required
- `ne16_task_init` got split into smaller parts: `ne16_task_init`, `ne16_task_set_op_to_conv`, `ne16_task_set_weight_offset`, `ne16_task_set_bits`, `ne16_task_set_norm_quant`
- strides in `ne16_task_set_strides`, `ne16_task_set_dims`, and `ne16_task_set_ptrs` are now strides between consecutive elements in that dimension
- `ne16_task_queue_size` is now `NE16_TASK_QUEUE_SIZE`
- `ne16_task_set_ptrs` split into `ne16_task_set_ptrs_conv` and `ne16_task_set_ptrs_norm_quant`
- removed `k_in_stride`, `w_in_stride`, `k_out_stride`, and `w_out_stride` from `ne16_nnx_dispatch_stride2x2`
- removed `mode` attribute from `ne16_quant_t` structure
- global shift should have been of type uint8 not int32
- type conversion compiler warning
- New Hardware Processing Engine (HWPE) device in `util/hwpe.h`
- A device structure for ne16 `ne16_dev_t` in `ne16/hal/ne16.h` which extends the hwpe device
- Test app Makefile now has an `ACCELERATOR` variable to specify which accelerator is used
- Library functions no longer start with a generic `nnx_` prefix but with `<accelerator>_nnx_` prefix to allow for usage of multiple kinds of accelerators in the same system
- Decoupled board specific functionality into `ne16/bsp` which also contains constant global structures to the implementations of the `ne16_dev_t` structure
- Moved all task related functions (`nnx_task_set_dims*`) into `ne16/hal/ne16_task.c`
- Tests adjusted for the new interface
- Test data generation moved into source files with extern declarations to check the output from the main
- pyright errors
- formatting errors
- Strided 2x2 mode needed to propagate `padding_bottom` when input height is smaller than 5
- Test requirements were missing the toml dependency and
- Added timeout parameter to conftest.py
- Added stride arguments to `nnx_task_set_dims`, `nnx_task_set_dims_stride2x2`, and `nnx_dispatch_task_stride2x2`
- Initial release of the repository.
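
The entry above about strides being "strides between consecutive elements in that dimension" can be illustrated with a small sketch. This is not the library's code: it assumes a contiguous height-width-channel (HWC) tensor and counts strides in elements, whereas the C functions may expect a different unit. The helper names `consecutive_element_strides` and `flat_offset` and the example shape are hypothetical.

```python
from typing import Sequence, Tuple


def consecutive_element_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """For a contiguous row-major tensor, return per dimension the distance
    (in elements, an assumption of this sketch) between two consecutive
    indices of that dimension."""
    strides = []
    step = 1
    # Walk from the innermost dimension outwards, accumulating the extent.
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def flat_offset(index: Sequence[int], strides: Sequence[int]) -> int:
    """Offset of a multi-dimensional index, given per-dimension strides."""
    return sum(i * s for i, s in zip(index, strides))


# Hypothetical input feature map: height 4, width 5, 8 channels (HWC layout).
strides = consecutive_element_strides((4, 5, 8))
h_stride, w_stride, k_stride = strides
print(h_stride, w_stride, k_stride)  # 40 8 1

# Moving one step along the height dimension advances by exactly h_stride.
print(flat_offset((1, 0, 0), strides) - flat_offset((0, 0, 0), strides))  # 40
```

With this convention each stride describes a single step along its own dimension, so a caller can describe non-contiguous buffers (for example a tile inside a larger feature map) by passing the strides of the enclosing tensor.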