Vitaly Tuzov
3b015dfc7d
Merge pull request #14210 from terfendail:wui_512
AVX512 wide universal intrinsics (#14210)
* Added implementation of 512-bit wide universal intrinsics(WIP)
* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetic
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2- and 3-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2-channel images and implemented it for 4-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks
* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros
* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines
* Added implementation of 512-bit wide universal intrinsics(WIP): Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.
* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build
* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Vitaly Tuzov
99b39aa5bd
Fixed out of bound reading in LINEAR_EXACT resize for 8UC3
2019-03-05 17:21:21 +03:00
Vitaly Tuzov
334c4d62b5
Merge pull request #13781 from terfendail:warp_wintr
Resize reworked using wide universal intrinsics (#13781)
* Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
2019-02-20 14:30:28 +03:00
Alexander Alekhin
2d5ccc7b3e
imgproc(resize): update checks (static analyzers)
2018-12-03 13:13:48 +03:00
maver1
e397434cb6
Merge pull request #12877 from maver1:3.4
* Updated ICV packages and IPP integration
* core(test): minMaxIdx IPP regression test
* core(ipp): workaround minMaxIdx problem
* core(ipp): workaround meanStdDev() CV_32FC3 buffer overrun
* Returned semicolon after CV_INSTRUMENT_REGION_IPP()
2018-10-24 15:02:53 +03:00
Michał Janiszewski
c8e6ce304f
Catch exceptions by const-reference
Exceptions caught by value incur needless cost in C++; most of them can
be caught by const-reference, especially as nearly none are actually
used. This could allow the compiler to generate slightly more efficient code.
2018-10-16 22:43:54 +02:00
take1014
24af70c7e0
resolves 11283
2018-10-12 23:08:25 +09:00
Vitaly Tuzov
9d602f2752
Replaced SSE2 area resize implementation with wide universal intrinsic implementation
2018-10-08 16:27:52 +03:00
Alexander Alekhin
92ec971453
Merge pull request #12526 from terfendail:avx2_resize_fix
2018-09-14 15:57:47 +00:00
Hamdi Sahloul
5d54def264
Add semicolons after CV_INSTRUMENT macros
2018-09-14 06:45:31 +09:00
Vitaly Tuzov
29770e13e8
Fixed bit-exact resize SIMD implementation for AVX2 baseline
2018-09-13 18:20:27 +03:00
Hamdi Sahloul
a39e0daacf
Utilize CV_UNUSED macro
2018-09-07 20:33:52 +09:00
Vitaly Tuzov
f9a5c4d181
Fixed bit-exact resize wide intrinsics implementation for 16U
2018-09-03 20:37:25 +03:00
Vitaly Tuzov
e345cb03d5
Bit-exact resize reworked to use wide intrinsics (#12038)
* Bit-exact resize reworked to use wide intrinsics
* Reworked bit-exact resize row data loading
* Added bit-exact resize row data loaders for SIMD256 and SIMD512
* Fixed type punned pointer dereferencing warning
* Reworked loading of source data for SIMD256 and SIMD512 bit-exact resize
2018-08-31 16:54:05 +03:00
Alexander Alekhin
b09a4a98d4
opencv: Use cv::AutoBuffer<>::data()
2018-07-04 19:11:29 +03:00
gnthibault
b46fef327e
Fixed Assertion error due to Size.area() overflowing
2018-06-08 11:22:36 +02:00
Alexander Alekhin
5d36ee2fe7
imgproc: apply CV_OVERRIDE/CV_FINAL
2018-03-28 17:57:59 +03:00
Maksim Shabunin
8b87c4b96a
Fixed several warnings produced by clang 6 and static analyzers
2018-01-16 15:26:28 +03:00
Alexander Alekhin
8acd05f12a
Merge pull request #10421 from cezheng:patch-1
2018-01-05 09:24:33 +00:00
Vadim Pisarevsky
3f68d6d8a7
Merge pull request #10392 from terfendail:bitexact_fallback
2017-12-22 13:23:55 +00:00
Vitaly Tuzov
5fdb42a7c9
Added fallback to generic linear resize in case bit-exact resize of provided matrix isn't supported
2017-12-22 14:29:50 +03:00
Ce Zheng
602b08d9c7
Update resize inline comments
Reading through the implementation, I feel this comment line is not consistent with the actual code, so this change corrects it.
2017-12-22 16:03:12 +08:00
Vitaly Tuzov
019162486c
Disabled universal intrinsic based implementation for bit-exact resize of 3-channel images
2017-12-22 10:08:30 +03:00
Vitaly Tuzov
1eb2fa9efb
Added universal intrinsics based implementations for CV_8UC2, CV_8UC3, CV_8UC4 bit-exact resizes.
2017-12-20 17:17:10 +03:00
Vitaly Tuzov
51cb56ef2c
Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468)
2017-12-13 15:00:38 +03:00
Maksim Shabunin
184daa155f
Fixed minor issues reported by GCC 7.2
2017-11-03 18:06:39 +03:00
Vitaly Tuzov
e8caa9b5c0
removed unused interpolateLinear
2017-08-31 15:34:27 +03:00
Vitaly Tuzov
b1f46b6d69
Move resize implementation to separate file
2017-08-31 14:36:19 +03:00