* [hal][neon] Optimize the v_dotprod_fast intrinsics for aarch64. On Armv8 in AArch64 execution mode, we can skip the sequence v<op>_<ty>(vget_high_<ty>(x), vget_high_<ty>(y)) in favour of v<op>_high_<ty>(x, y) This has better changes for recent compilers to use less data movement operations and better register allocation. See for example: https://godbolt.org/z/bPq7vd * [hal][neon] Fix build failure on armv7. * [hal][neon] Address review comments in PR. PR: https://github.com/opencv/opencv/pull/19486 * [hal][neon] Define macro to check for the AArch64 execution state of Armv8. * [hal][neon] Fix macro definition for AArch64. The fix is needed to prevent warnings when building for Armv7. |
||
|---|---|---|
| .. | ||
| 3rdparty/SoftFloat | ||
| doc | ||
| include/opencv2 | ||
| misc | ||
| perf | ||
| src | ||
| test | ||
| CMakeLists.txt | ||