编译版本,默认配置,android-ndk-r27,xcode 15.2,ubuntu-20.04,ubuntu-22.04,ubuntu-24.04,vs2015,vs2017,vs2019,vs2022,emscripten-3.1.28
file | content | arch |
---|---|---|
ncnn-full-source.zip | 包含全部 submodule 代码的完整源码 | |
ncnn-android.zip | android 静态库/动态库 | armeabi-v7a + arm64-v8a + x86 + x86_64 |
ncnn-android-vulkan.zip | android 静态库/动态库,支持 GPU | armeabi-v7a + arm64-v8a + x86 + x86_64 |
ncnn-apple.zip | apple xcframework,ios + ios-simulator + macos + mac-catalyst + watchos + watchos-simulator + tvos + tvos-simulator + visionos + visionos-simulator | arm64 + arm64e + x86_64 |
ncnn-apple-vulkan.zip | apple xcframework,ios + ios-simulator + macos + mac-catalyst + watchos + watchos-simulator + tvos + tvos-simulator + visionos + visionos-simulator,支持 GPU | arm64 + arm64e + x86_64 |
ncnn-ios.zip | ios 静态库 | arm64 |
ncnn-ios-vulkan.zip | ios 静态库,支持 GPU | arm64 |
ncnn-ios-simulator.zip | ios simulator 静态库 | x86_64 + arm64 |
ncnn-ios-simulator-vulkan.zip | ios simulator 静态库,支持 GPU | x86_64 + arm64 |
ncnn-macos.zip | macos 静态库 | x86_64 + arm64 |
ncnn-macos-vulkan.zip | macos 静态库,支持 GPU | x86_64 + arm64 |
ncnn-mac-catalyst.zip | mac catalyst 静态库 | x86_64 + arm64 |
ncnn-mac-catalyst-vulkan.zip | mac catalyst 静态库,支持 GPU | x86_64 + arm64 |
ncnn-watchos.zip | watchos 静态库 | armv7k + arm64_32 |
ncnn-watchos-simulator.zip | watchos simulator 静态库 | x86_64 + arm64 |
ncnn-tvos.zip | tvos 静态库 | x86_64 + arm64 |
ncnn-tvos-vulkan.zip | tvos 静态库,支持 GPU | x86_64 + arm64 |
ncnn-tvos-simulator.zip | tvos simulator 静态库 | x86_64 + arm64 |
ncnn-tvos-simulator-vulkan.zip | tvos simulator 静态库,支持 GPU | x86_64 + arm64 |
ncnn-visionos.zip | visionos 静态库 | arm64 |
ncnn-visionos-vulkan.zip | visionos 静态库,支持 GPU | arm64 |
ncnn-visionos-simulator.zip | visionos simulator 静态库 | x86_64 + arm64 |
ncnn-visionos-simulator-vulkan.zip | visionos simulator 静态库,支持 GPU | x86_64 + arm64 |
ncnn-ubuntu.zip | ubuntu linux 静态库/动态库,支持 GPU,模型转换工具 | x86_64 |
ncnn-windows.zip | windows 静态库/动态库,支持 GPU,模型转换工具 | x86 + x64 + arm + arm64 |
ncnn-webassembly.zip | webassembly 静态库 | wasm32 + simd + threads + simd-threads |
新增RMSNorm层和对应的pnnx转换,单元测试
x86 convolution tiled gemm优化
量化工具支持 rnn/lstm/gru 动态量化
x86 lstm int8 sse2/xop/avx2/avx512/avx512vnni/avxvnni优化
arm rnn/lstm/gru int8 neon/asimdhp/asimddp优化
multiheadattention支持qdim参数与embed_dim不同
multiheadattention支持scale参数
更新pybind11到2.12支持numpy2
添加wasi支持(@quink-black)
添加x86/arm convolution/slice/concat oom单元测试
onnx2ncnn工具添加警告和推荐使用pnnx的信息输出(@lll143653)
修复x86 avx512 vnni指令派发失效的问题
增强x86/arm计算内核在内存不足时的错误返回
仅在windows arm平台使用ruapu指令集探测
windows mingw编译时支持大小核和SMT探测
修复powerpc vsx计算abs可能的错误
修复arm vfpv4条件下可能的fp16s/bf16s同时启用的冲突
修复aarch64架构l2-cache很小时因gemm K分块可能的越界读错误
修复riscv v tanh计算错误(@zhangyang2057)
arm/convolution_3x3_pack1to8_fp16s使用ldr/str替代ld1/st1优化(@quink-black)
修复c_api无参数函数声明(@quink-black)
c_api添加set_vulkan_device接口(@Baiyuetribe)
pyncnn添加从python bytes内存加载模型的接口(@joeyballentine)
为VkAndroidHardwareBufferImageAllocator添加NCNN_PLATFORM_API宏(@Xyzhao1999)
修复mingw64编译时avx崩溃和termux编译错误(@TianZerL)
修复在关闭NCNN_BF16时arm riscv编译错误
修复x86-wsl编译时的无用变量警告(@Tabbleman)
create_gpu_instance()中不进行destroy_gpu_instance()(@Asd-g)
更新ruapu.h(@lazyparser)
修复ndk-r27在cmake阶段的编译错误(@Galasnow)
添加yolov8示例代码(@whyb)
pnnx支持转换dynamo导出的onnx
pnnx默认编译onnx2pnnx支持,支持转换conv/convtranspose/pad/linear/softmax/relu/resize/upsample/avgpool/maxpool/batchnorm/lrn/layernorm/instancenorm/groupnorm/rnn/lstm/gru/prelu/gelu/elu/leakyrelu/relu6/celu/hardshrink/hardsigmoid/hardswish/clip/multiheadattention/reducemin/reducemax/reducemean/reducesum/reduceprod/logsoftmax/logsigmoid/mish/selu/sigmoid/silu/softmin/softplus/softshrink/softsign/tanh/tanhshrink/expand/permute/repeat/reshape/select/slice/cat/ceil/chunk/flatten/floor/maximum/minimum/split/squeeze/stack/transpose/unbind/unsqueeze
pnnx支持转换onnx指定inputshape
pnnx转换onnx遇到动态shape时尝试折叠非动态轴相关的常量
pnnx转换onnx合并简单的shape运算pattern
pnnx清除onnx中无用的cast
pnnx接受bf16的模型转换和输入输出类型
pnnx转换torch.tile/torch.where/torch.logaddexp
pnnx转换无dilation参数的F.maxpool到ncnn
pnnx转换1到2个轴参数的torch.roll到ncnn
pnnx转换有dim参数的torch.max/torch.min时返回tuple并自动删除没有用到的indice输出
pnnx合并onnx sdpa和qdim mha
pnnx识别sdpa的batch轴
pnnx支持torch-2.3和torch-2.4
pnnx不再折叠有就地操作的别名tensor为常量
pnnx转换到的ncnn模型py自动替换long为int
ci添加windows clang
ci添加harmonyos
ci添加mingw(@TianZerL)
ci添加esp32和esp32编译文档(@luxincn)
重构release ci脚本
发布ubuntu 24.04预编译包
发布visionos/visionos-simulator vulkan预编译包
pypi发布python 3.13预编译包
更新pytorch/onnx模型转换文档(@whyb)
添加riscv-gnu-toolchain编译文档(@Tabbleman)
添加harmonyos vulkan编译文档(@cugxchen)
修正vulkan-notes文档的错误(@roachsinai)
更新qcom855plus跑分数据
添加RaspberryPi 5 GPU超频跑分数据(@CharlieYu4994)
添加EPYC7742和V100跑分数据(@sakria9)
添加Snapdragon 888跑分数据(@chainsx)
添加RaspberryPi 5 CPU超频跑分数据(@chainsx)
添加OrangePi 5Plus跑分数据(@inspireMeNow)
添加Snapdragon 765G跑分数据(@inspireMeNow)
添加CVITEK SG2000跑分数据(@inspireMeNow)
添加OrangePi CM4跑分数据(@py1066)
添加Axera AX630C跑分数据(@UOPiceman)
添加Kunpeng 920 7260跑分数据(@violet73)
New Contributors
- @quink-black made their first contribution in #5436
- @Tabbleman made their first contribution in #5444
- @roachsinai made their first contribution in #5472
- @Asd-g made their first contribution in #5437
- @lazyparser made their first contribution in #5499
- @CharlieYu4994 made their first contribution in #5518
- @Xyzhao1999 made their first contribution in #5521
- @sakria9 made their first contribution in #5528
- @inspireMeNow made their first contribution in #5550
- @py1066 made their first contribution in #5551
- @UOPiceman made their first contribution in #5559
- @luxincn made their first contribution in #5567
- @zhangyang2057 made their first contribution in #5584
- @violet73 made their first contribution in #5606
Full Changelog: 2024041...2024082