opencv

Author	SHA1	Message	Date
Alexander Alekhin	4733a19bab	Merge pull request #16194 from alalek:fix_16192 * imgproc(test): resize(LANCZOS4) reproducer 16192 * imgproc: fix resize LANCZOS4 coefficients generation	2019-12-19 13:20:42 +03:00
Vitaly Tuzov	f5a84f75c4	Fix for CV_8UC2 linear resize vectorization	2019-12-18 21:41:36 +00:00
Paul Murphy	1c4a64f0a1	Merge pull request #16138 from pmur:reg_16137 * imgproc: Prevent 1B overrun of 8C3 SIMD optimization The fourth value read via v_load_q is essentially ignored, but can cause trouble if it happens to cross page boundaries. The final few iterations may attempt to read the most extreme elements of S, which will read 1B beyond the array in most aligment cases. Dynamically compute the stop. This could be hoised from the loop, but will require a more extensive change. Likewise, cleanup the iteration increment statements to make it more obvious they do channel count (3) elements per pass. This should resolve #16137 * imgproc(resize): extra check	2019-12-12 13:00:44 +03:00
Paul Murphy	a011035ed6	Merge pull request #15257 from pmur:resize * resize: HResizeLinear reduce duplicate work There appears to be a 2x unroll of the HResizeLinear against k, however the k value is only incremented by 1 during the unroll. This results in k - 1 duplicate passes when k > 1. Likewise, the final pass may not respect the work done by the vector loop. Start it with the offset returned by the vector op if implemented. Note, no vector ops are implemented today. The performance is most noticable on a linear downscale. A set of performance tests are added to characterize this. The performance improvement is 10-50% depending on the scaling. * imgproc: vectorize HResizeLinear Performance is mostly gated by the gather operations for x inputs. Likewise, provide a 2x unroll against k, this reduces the number of alpha gathers by 1/2 for larger k. While not a 4x improvement, it still performs substantially better under P9 for a 1.4x improvement. P8 baseline is 1.05-1.10x due to reduced VSX instruction set. For float types, this results in a more modest 1.2x improvement. * Update U8 processing for non-bitexact linear resize * core: hal: vsx: improve v_load_expand_q With a little help, we can do this quickly without gprs on all VSX enabled targets. * resize: Fix cn == 3 step per feedback Per feedback, ensure we don't overrun. This was caught via the failure observed in Test_TensorFlow.inception_accuracy.	2019-12-09 14:54:06 +03:00
clunietp	2185bce4b7	Fix 13577	2019-11-18 07:41:34 -05:00
Vitaly Tuzov	3b015dfc7d	Merge pull request #14210 from terfendail:wui_512 AVX512 wide universal intrinsics (#14210) * Added implementation of 512-bit wide universal intrinsics(WIP) * Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP) * Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store * Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store * Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics * Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations * Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons * Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction * Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values * Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float * Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT * Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations * Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images * Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave * Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks * Added implementation of 512-bit wide universal intrinsics(WIP): build fixes * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16 * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16 * Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros * Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part * Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings * Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets * Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left * Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize * Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask() * Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces * Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines * Added implementation of 512-bit wide universal intrinsics(WIP):Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable. * Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask. * Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512() * Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build * Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.	2019-06-03 18:05:35 +03:00
Vitaly Tuzov	99b39aa5bd	Fixed out of bound reading in LINEAR_EXACT resize for 8UC3	2019-03-05 17:21:21 +03:00
Vitaly Tuzov	334c4d62b5	Merge pull request #13781 from terfendail:warp_wintr Resize reworked using wide universal intrinsics (#13781) * Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize * Reworked linear resize using new wide LUT intrinsics * Fix for VSX intrinsics	2019-02-20 14:30:28 +03:00
Alexander Alekhin	2d5ccc7b3e	imgproc(resize): update checks (static analyzers)	2018-12-03 13:13:48 +03:00
maver1	e397434cb6	Merge pull request #12877 from maver1:3.4 * Updated ICV packages and IPP integration * core(test): minMaxIdx IPP regression test * core(ipp): workaround minMaxIdx problem * core(ipp): workaround meanStdDev() CV_32FC3 buffer overrun * Returned semicolon after CV_INSTRUMENT_REGION_IPP()	2018-10-24 15:02:53 +03:00
Michał Janiszewski	c8e6ce304f	Catch exceptions by const-reference Exceptions caught by value incur needless cost in C++, most of them can be caught by const-reference, especially as nearly none are actually used. This could allow compiler generate a slightly more efficient code.	2018-10-16 22:43:54 +02:00
take1014	24af70c7e0	resolves 11283	2018-10-12 23:08:25 +09:00
Vitaly Tuzov	9d602f2752	Replaced SSE2 area resize implementation with wide universal intrinsic implementation	2018-10-08 16:27:52 +03:00
Alexander Alekhin	92ec971453	Merge pull request #12526 from terfendail:avx2_resize_fix	2018-09-14 15:57:47 +00:00
Hamdi Sahloul	5d54def264	Add semicolons after `CV_INSTRUMENT` macros	2018-09-14 06:45:31 +09:00
Vitaly Tuzov	29770e13e8	Fixed bit-exact resize SIMD implementation for AVX2 baseline	2018-09-13 18:20:27 +03:00
Hamdi Sahloul	a39e0daacf	Utilize CV_UNUSED macro	2018-09-07 20:33:52 +09:00
Vitaly Tuzov	f9a5c4d181	Fixed bit-exact resize wide intrinsics implementation for 16U	2018-09-03 20:37:25 +03:00
Vitaly Tuzov	e345cb03d5	Bit-exact resize reworked to use wide intrinsics (#12038 ) * Bit-exact resize reworked to use wide intrinsics * Reworked bit-exact resize row data loading * Added bit-exact resize row data loaders for SIMD256 and SIMD512 * Fixed type punned pointer dereferencing warning * Reworked loading of source data for SIMD256 and SIMD512 bit-exact resize	2018-08-31 16:54:05 +03:00
Alexander Alekhin	b09a4a98d4	opencv: Use cv::AutoBuffer<>::data()	2018-07-04 19:11:29 +03:00
gnthibault	b46fef327e	Fixed Assertin error due to Size.area() overflowing	2018-06-08 11:22:36 +02:00
Alexander Alekhin	5d36ee2fe7	imgproc: apply CV_OVERRIDE/CV_FINAL	2018-03-28 17:57:59 +03:00
Maksim Shabunin	8b87c4b96a	Fixed several warnings produced by clang 6 and static analyzers	2018-01-16 15:26:28 +03:00
Alexander Alekhin	8acd05f12a	Merge pull request #10421 from cezheng:patch-1	2018-01-05 09:24:33 +00:00
Vadim Pisarevsky	3f68d6d8a7	Merge pull request #10392 from terfendail:bitexact_fallback	2017-12-22 13:23:55 +00:00
Vitaly Tuzov	5fdb42a7c9	Added fallback to generic linear resize in case bit-exact resize of provided matrix isn't supported	2017-12-22 14:29:50 +03:00
Ce Zheng	602b08d9c7	Update resize inline comments Reading through the implementation, I feel this line of comment is not consistent with the actually code, so this is for correcting it.	2017-12-22 16:03:12 +08:00
Vitaly Tuzov	019162486c	Disabled universal intrinsic based implementation for bit-exact resize of 3-channel images	2017-12-22 10:08:30 +03:00
Vitaly Tuzov	1eb2fa9efb	Added universal intrinsics based implementations for CV_8UC2, CV_8UC3, CV_8UC4 bit-exact resizes.	2017-12-20 17:17:10 +03:00
Vitaly Tuzov	51cb56ef2c	Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468 )	2017-12-13 15:00:38 +03:00
Maksim Shabunin	184daa155f	Fixed minor issues reported by GCC 7.2	2017-11-03 18:06:39 +03:00
Vitaly Tuzov	e8caa9b5c0	removed unused interpolateLinear	2017-08-31 15:34:27 +03:00
Vitaly Tuzov	b1f46b6d69	Move resize implementation to separate file	2017-08-31 14:36:19 +03:00

33 Commits