- redefine 16-bit multiply operator to perform saturating multiply
instead of non-saturating multiply
- implement 8-bit multiply operator to perform saturating multiply
- implement v_mul_wrap() for 8-bit, 16-bit non-saturating multiply
- improve performance of v_mul_hi() for VSX
- update intrin tests with new changes
- replace unv 16-bit multiplication operator with v_mul_wrap due behavior changes
- Several improvements depend on vpisarev review
* initial forward declarations for universal intrinsics
* move emulating SSE intrinsics into separate file
* implement v_mul_expand for 8-bit
* reimplement saturating multiply using v_mul_expand + v_pack
* map v_expand, v_load_expand, v_load_expand_q to sse4.1
* fix overflow avx2::v_pack(uint32)
* implement two universal intrinsics v_expand_low and v_expand_high
|
||
|---|---|---|
| .. | ||
| include/opencv2 | ||
| misc/java | ||
| perf | ||
| src | ||
| test | ||
| CMakeLists.txt | ||