commit | 5b3d182c340beaaf4a7dc1db61d3fc21d1d31c97 | |
---|---|---|
author | Elliott Hughes <enh@google.com> | Fri Nov 15 23:16:11 2024 +0000 |
committer | Elliott Hughes <enh@google.com> | Fri Nov 15 23:16:11 2024 +0000 |
tree | 3916fe38d5fd6f567e80b2de8d646ea0f2fa5dff | |
parent | c157ba609a0d47d39aa8d38d41ecc82b6d8a7b56 | |
Remove unused -Wno-implicit-function-declaration.

This is an error in C23 anyway.

Change-Id: I1b5c102df21de79ea3c07e737b8e1347f18e4aad
XNNPACK is a highly optimized library of floating-point neural network inference operators for ARM, WebAssembly, and x86 platforms. XNNPACK is not intended for direct use by deep learning practitioners and researchers; instead it provides low-level performance primitives for accelerating high-level machine learning frameworks, such as TensorFlow Lite, TensorFlow.js, PyTorch, and MediaPipe.
XNNPACK implements the following neural network operators:
All operators in XNNPACK support NHWC layout, but additionally allow a custom stride along the Channel dimension. Thus, operators can consume a subset of channels in the input tensor and produce a subset of channels in the output tensor, providing zero-cost Channel Split and Channel Concatenation operations.
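The sketch below illustrates how a channel stride enables this (it is a generic NHWC indexing example, not the XNNPACK API): by pointing an operator at a channel offset within the full tensor while keeping the original channel stride, the operator reads only a contiguous subset of channels in place, with no copy.

```c
#include <stddef.h>
#include <stdio.h>

/* Read element (y, x, c) from an NHWC tensor whose channel stride may be
 * larger than the number of channels the operator actually consumes. */
static float read_nhwc(const float* data,
                       size_t y, size_t x, size_t c,
                       size_t width, size_t channel_stride) {
  return data[(y * width + x) * channel_stride + c];
}

int main(void) {
  enum { H = 2, W = 2, C = 8 };  /* full tensor: 8 channels per pixel */
  float tensor[H * W * C];
  for (size_t i = 0; i < H * W * C; i++) tensor[i] = (float) i;

  /* Zero-cost "Channel Split": view channels 4..7 only, without copying.
   * Writing into a tensor with an enlarged output channel stride gives
   * zero-cost Channel Concatenation in the same way. */
  const float* upper_half = tensor + 4;  /* channel offset               */
  const size_t channels = 4;             /* channels this "op" consumes  */
  const size_t channel_stride = C;       /* stride of the full tensor    */

  for (size_t y = 0; y < H; y++) {
    for (size_t x = 0; x < W; x++) {
      for (size_t c = 0; c < channels; c++) {
        printf("%g ", read_nhwc(upper_half, y, x, c, W, channel_stride));
      }
      printf("\n");
    }
  }
  return 0;
}
```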
The table below presents single-threaded performance of the XNNPACK library on three generations of MobileNet models and three generations of Pixel phones.
Model | Pixel, ms | Pixel 2, ms | Pixel 3a, ms |
---|---|---|---|
FP32 MobileNet v1 1.0X | 82 | 86 | 88 |
FP32 MobileNet v2 1.0X | 49 | 53 | 55 |
FP32 MobileNet v3 Large | 39 | 42 | 44 |
FP32 MobileNet v3 Small | 12 | 14 | 14 |
The following table presents multi-threaded (using as many threads as there are big cores) performance of the XNNPACK library on three generations of MobileNet models and three generations of Pixel phones.
Model | Pixel, ms | Pixel 2, ms | Pixel 3a, ms |
---|---|---|---|
FP32 MobileNet v1 1.0X | 43 | 27 | 46 |
FP32 MobileNet v2 1.0X | 26 | 18 | 28 |
FP32 MobileNet v3 Large | 22 | 16 | 24 |
FP32 MobileNet v3 Small | 7 | 6 | 8 |
Benchmarked on March 27, 2020 with `end2end_bench --benchmark_min_time=5` on an Android/ARM64 build with Android NDK r21 (`bazel build -c opt --config android_arm64 :end2end_bench`) and neural network models with randomized weights and inputs.
The table below presents multi-threaded performance of the XNNPACK library on three generations of MobileNet models and several generations of Raspberry Pi boards.
Model | RPi Zero W (BCM2835), ms | RPi 2 (BCM2836), ms | RPi 3+ (BCM2837B0), ms | RPi 4 (BCM2711), ms | RPi 4 (BCM2711, ARM64), ms |
---|---|---|---|---|---|
FP32 MobileNet v1 1.0X | 3919 | 302 | 114 | 72 | 77 |
FP32 MobileNet v2 1.0X | 1987 | 191 | 79 | 41 | 46 |
FP32 MobileNet v3 Large | 1658 | 161 | 67 | 38 | 40 |
FP32 MobileNet v3 Small | 474 | 50 | 22 | 13 | 15 |
INT8 MobileNet v1 1.0X | 2589 | 128 | 46 | 29 | 24 |
INT8 MobileNet v2 1.0X | 1495 | 82 | 30 | 20 | 17 |
Benchmarked on Feb 8, 2022 with `end2end-bench --benchmark_min_time=5` on a Raspbian Buster build with CMake (`./scripts/build-local.sh`) and neural network models with randomized weights and inputs. INT8 inference was evaluated with a per-channel quantization schema.
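For context on the INT8 numbers above, the following is a minimal sketch of per-channel (per-output-channel) symmetric INT8 weight quantization, the general technique meant by a "per-channel quantization schema"; the function name and weight layout are assumptions for illustration, not XNNPACK internals.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Quantize weights laid out as [channels][elements]: each output channel
 * gets its own scale = max(|w|) / 127, and weights are rounded to the
 * nearest representable int8 value in [-127, 127]. */
static void quantize_per_channel(const float* weights,
                                 size_t channels, size_t elements,
                                 int8_t* quantized, float* scales) {
  for (size_t c = 0; c < channels; c++) {
    const float* w = weights + c * elements;
    float max_abs = 0.0f;
    for (size_t i = 0; i < elements; i++) {
      const float a = fabsf(w[i]);
      if (a > max_abs) max_abs = a;
    }
    const float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    scales[c] = scale;
    for (size_t i = 0; i < elements; i++) {
      float v = roundf(w[i] / scale);
      if (v < -127.0f) v = -127.0f;
      if (v > 127.0f) v = 127.0f;
      quantized[c * elements + i] = (int8_t) v;
    }
  }
}

int main(void) {
  const float weights[2][4] = {
    { 0.5f, -1.0f, 0.25f, 0.75f },  /* channel 0: scale = 1.0 / 127 */
    { 8.0f,  2.0f, -4.0f, 1.0f },   /* channel 1: scale = 8.0 / 127 */
  };
  int8_t q[2 * 4];
  float scales[2];
  quantize_per_channel(&weights[0][0], 2, 4, q, scales);
  for (size_t c = 0; c < 2; c++) {
    printf("channel %zu scale=%f quantized:", c, scales[c]);
    for (size_t i = 0; i < 4; i++) printf(" %d", q[c * 4 + i]);
    printf("\n");
  }
  return 0;
}
```

Per-channel scales preserve accuracy better than a single per-tensor scale when weight magnitudes vary widely between output channels, which is why the quantized MobileNet results above use this schema.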
XNNPACK is based on the QNNPACK library. However, the codebase has diverged significantly over time, and the XNNPACK API is no longer compatible with QNNPACK.