opencv

Author	SHA1	Message	Date
rogday	9cd5a0a1e6	Merge pull request #21884 from rogday:cuda_cleanup Fix CUDA compilation issues and adjust thresholds. * Fix CUDA compilation issues and adjust thresholds. * add conformance tests to denylist	2022-04-19 16:40:25 +00:00
zihaomu	e36948cfbc	add ONNX OP sign, shrink and reciprocal	2022-04-07 15:32:12 +08:00
Alexander Alekhin	76fb3652fc	dnn(ocl): fix fp16 kernel compilation	2021-12-29 19:58:25 +00:00
Smirnov Egor	71a22e45b0	add celu, hardsigmoid, selu, thresholdedrelu layers	2021-12-18 03:19:54 +03:00
Smirnov Egor	1bd382c1d0	Add acos, acosh, asin, asinh, atan, atanh, cos, cosh, erf, hardswish, sin, sinh, softplus, softsign, tan layers	2021-12-17 18:19:40 +03:00
Alexander Alekhin	8b4fa2605e	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2021-12-03 12:32:49 +00:00
Smirnov Egor	0e2a3686c0	add alpha parameter to ELU layer	2021-11-30 12:20:35 +03:00
Alexander Alekhin	394e640909	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2021-11-13 15:11:30 +00:00
Alexander Alekhin	8041ab8a61	Merge pull request #21025 from alalek:issue_21004 * dnn(ocl4dnn): fix LRN layer accuracy problems - FP16 intermediate computation is not accurate and may provide NaN values * dnn(test): update tolerance for FP16	2021-11-12 01:54:07 +03:00
Smirnov Egor	1feb3838b5	add Ceil, Floor, Log, Round, Sqrt, Not, Equal, Less, Greater	2021-10-15 16:02:46 +03:00
Alexander Alekhin	8c2dd5fb9a	dnn(ocl4dnn): cleanup dead code, improve logging	2021-10-08 00:39:40 +00:00
Alexander Alekhin	f977d10a19	dnn(ocl): fix conv DWCONV workgroup	2021-10-01 18:52:07 +00:00
Alexander Alekhin	846317ef37	dnn(ocl): fix conv BASIC workgroup	2021-09-29 14:55:46 +00:00
Alexander Alekhin	35e824c287	dnn(ocl): fix out of bound access in GEMM-like kernels - dropped usage of CreateSubBuffer() - buffers lifetime management issue - fixed elementwise offset - avoid out of bounds read access	2021-09-06 18:17:21 +00:00
Alexander Alekhin	5578ad5e14	dnn(ocl): fix automatic globalsize adjusting - if kernel code doesn't support that	2021-09-06 03:11:29 +00:00
Alexander Alekhin	0a43b23275	Merge pull request #20651 from alalek:issue_18361	2021-09-04 18:22:12 +00:00
Alexander Alekhin	5b2c016834	dnn(ocl): avoid out of buffer access in copyWeightsSwizzled	2021-09-04 15:45:59 +00:00
Alexander Alekhin	407adc7061	dnn(ocl): fix buffer offsets in IDLF kernel - drop CreateSubBuffer - fix FUSED_CONV_ELTWISE mode	2021-09-04 15:28:35 +00:00
SamFC10	96947c30c0	Added exp layer backport of commit: `6111935835` partial backport of commit: `dd5976162b`	2021-02-28 19:59:40 +00:00
Alexander Alekhin	7631056b8a	Merge pull request #19114 from alalek:issue_18937	2020-12-15 20:47:05 +00:00
Alexander Alekhin	c240355cc6	dnn(ocl): avoid mess FP16/FP32 in convolution layer	2020-12-15 08:51:24 +00:00
Alexander Alekhin	4b3d2c8834	dnn(ocl): fix gemm kernels with beta=0 - dst is not initialized, may include NaN values - 0*NaN produces NaN	2020-12-15 00:58:43 +00:00
Alexander Alekhin	c08f29c803	dnn(opencl): fix convolution kernel w/o bias with activation	2020-09-27 23:42:30 +00:00
Tomoaki Teshima	74c8ccb45b	fix build error of kernel on Mali	2020-09-23 21:38:12 +09:00
Tomoaki Teshima	f77c2d700f	add explicit cast for half	2020-09-18 21:04:24 +09:00
Alexander Alekhin	1c8ee3f957	Merge pull request #17885 from alalek:dnn_ocl_slice_update DNN: OpenCL/slice update * dnn(ocl/slice): make slice kernel VTune friendly - more unique names - inline code of copy functions * dnn(ocl/slice): prefer to spawn more work groups - even in case with 1D copy - perf improvement up to 2x of kernel time (due to changed configuration 128x1x1 => 128x32x1) * dnn(ocl/slice): cache kernel exec info	2020-08-03 14:13:34 +00:00
Alexander Alekhin	81e027eef7	dnn: fix OpenCL implementation of Slice layer	2020-07-16 04:33:52 +00:00
thebhatman	8a18d132fc	Port Swish and Mish layers	2019-12-01 11:55:39 +03:00
Alexander Alekhin	24790e4061	Merge pull request #14899 from alalek:dnn_fix_bnll_layer * dnn: fix BNLLLayer implementation details: https://github.com/BVLC/caffe/blame/1.0/src/caffe/layers/bnll_layer.cpp#L17 * dnn: enable OCV/OpenCL BNLL layer	2019-06-26 23:04:26 +03:00
Dmitry Kurtaev	dfdc91f8c9	dnn: fix MVN layer (issue 14683)	2019-06-14 18:38:05 +03:00
Alexander Alekhin	eab6744ac7	dnn(ocl): use compile-time LOCAL_SIZE parameter instead of get_local_size(0) and dynamic local memory allocation	2019-02-05 15:51:16 +03:00
Alexander Alekhin	eec468fa13	dnn(ocl4dnn): calculate activation expression once - to avoid multiple conditional calls via sub_group() functions	2018-10-02 21:23:41 +00:00
Alexander Alekhin	0f031b6680	dnn(ocl4dnn): drop weights_buf - avoid memory access violation during "prefetch" stage	2018-09-30 20:35:41 +00:00
Dmitry Kurtaev	24ab751547	Merge pull request #12565 from dkurt:dnn_non_intel_gpu * Remove isIntel check from deep learning layers * Remove fp16->fp32 fallbacks where it's not necessary * Fix Kernel::run to prevent localsize > globalsize	2018-09-26 16:27:00 +03:00
Lubov Batanina	43f889ae1f	Merge pull request #12519 from l-bat:l-bat/onnx_parser Support asymmetric padding in pooling layer (#12519) * Add Inception_V1 support in ONNX * Add asymmetric padding in OpenCL and Inference engine * Refactoring	2018-09-17 20:26:17 +03:00
Dmitry Kurtaev	09fa758725	Replace Darknet's Reorg to permute layer	2018-09-12 18:13:39 +03:00
Marat K	38f8fc6c82	Merge pull request #12249 from kopytjuk:feature/region-layer-batch-mode Feature/region layer batch mode (#12249) * Add batch mode for Darknet networks. Swap variables in test_darknet. Adapt reorg layer to batch mode. Adapt region layer. Add OpenCL implementation. Remove trailing whitespace. Bugifx reorg opencl implementation. Fix bug in OpenCL reorg. Fix modulo bug. Fix bug. Reorg openCL. Restore reorg layer opencl code. OpenCl fix. Work on openCL reorg. Remove whitespace. Fix openCL region layer implementation. Fix bug. Fix softmax region opencl bug. Fix opencl bug. Fix openCL bug. Update aff_trans.cpp When the fullAffine parameter is set to false, the estimateRigidTransform function maybe return empty, then the _localAffineEstimate function will be called, but the bug in it will result in incorrect results. core(libva): support YV12 too Added to CPU path only. OpenCL code path still expects NV12 only (according to Intel OpenCL extension) cmake: allow to specify own libva paths via CMake: - `-DVA_LIBRARIES=/opt/intel/mediasdk/lib64/libva.so.2\;/opt/intel/mediasdk/lib64/libva-drm.so.2` android: NDK17 support tested with NDK 17b (17.1.4828580) Enable more deep learning tests using Intel's Inference Engine backend ts: don't pass NULL for std::string() constructor openvino: use 2018R3 defines experimental version++ OpenCV version++ OpenCV 3.4.3 OpenCV version '-openvino' openvino: use 2018R3 defines Fixed windows build with InferenceEngine dnn: fix variance setting bug for PriorBoxLayer - The size of second channel should be size[2] of output tensor, - The Scalar should be {variance[0], variance[0], variance[0], variance[0]} for _variance.size() == 1 case. Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com> Fix lifetime of networks which are loaded from Model Optimizer IRs Adds a small note describing BUILD_opencv_world (#12332) * Added a mall note describing BUILD_opencv_world cmake option to the Installation in Windows tutorial. 
* Made slight changes in BUILD_opencv_world documentation. * Update windows_install.markdown improved grammar Update opengl_interop.cpp resolves #12307 java: fix LIST_GET macro fix typo Added option to fail on missing testdata Fixed that object_detection.py does not work in python3. cleanup: IPP Async (IPP_A) except header file with conversion routines (will be removed in OpenCV 4.0) imgcodecs: add null pointer check Include preprocessing nodes to object detection TensorFlow networks (#12211) * Include preprocessing nodes to object detection TensorFlow networks * Enable more fusion * faster_rcnn_resnet50_coco_2018_01_28 test countNonZero function reworked to use wide universal intrinsics instead of SSE2 intrinsics resolve #5788 imgcodecs(webp): multiple fixes - don't reallocate passed 'img' (test fixed - must use IMREAD_UNCHANGED / IMREAD_ANYCOLOR) - avoid memory DDOS - avoid reading of whole file during header processing - avoid data access after allocated buffer during header processing (missing checks) - use WebPFree() to free allocated buffers (libwebp >= 0.5.0) - drop unused & undefined `.close()` method - added checks for channels >= 5 in encoder ml: fix adjusting K in KNearest (#12358) dnn(perf): fix and merge Convolution tests - OpenCL tests didn't run any OpenCL kernels - use real configuration from existed models (the first 100 cases) - batch size = 1 dnn(test): use dnnBackendsAndTargets() param generator Bit-exact resize reworked to use wide intrinsics (#12038) * Bit-exact resize reworked to use wide intrinsics * Reworked bit-exact resize row data loading * Added bit-exact resize row data loaders for SIMD256 and SIMD512 * Fixed type punned pointer dereferencing warning * Reworked loading of source data for SIMD256 and SIMD512 bit-exact resize Bit-exact GaussianBlur reworked to use wide intrinsics (#12073) * Bit-exact GaussianBlur reworked to use wide intrinsics * Added v_mul_hi universal intrinsic * Removed custom SSE2 branch from bit-exact GaussianBlur 
* Removed loop unrolling for gaussianBlur horizontal smoothing doc: fix English gramma in tutorial out-of-focus-deblur filter (#12214) * doc: fix English gramma in tutorial out-of-focus-deblur filter * Update out_of_focus_deblur_filter.markdown slightly modified one sentence doc: add new tutorial motion deblur filter (#12215) * doc: add new tutorial motion deblur filter * Update motion_deblur_filter.markdown a few minor changes Replace Slice layer to Crop in Faster-RCNN networks from Caffe js: use generated list of OpenCV headers - replaces hand-written list imgcodecs(webp): use safe cast to size_t on Win32 * Put Version status back to -dev. follow the common codestyle Exclude some target engines. Refactor formulas. Refactor code. * Remove unused variable. * Remove inference engine check for yolov2. * Alter darknet batch tests to test with two different images. * Add yolov3 second image GT. * Fix bug. * Fix bug. * Add second test. * Remove comment. * Add NMS on network level. * Add helper files to dev. * syntax fix. * Fix OD sample. Fix sample dnn object detection. Fix NMS boxes bug. remove trailing whitespace. Remove debug function. Change thresholds for opencl tests. * Adapt score diff and iou diff. * Alter iouDiffs. * Add debug messages. * Adapt iouDiff. * Fix tests	2018-09-12 13:29:43 +03:00
Alexander Alekhin	b597c87bed	dnn(ocl): avoid memory access violation	2018-07-27 15:35:11 +03:00
Dmitry Kurtaev	faa6c4e1e1	Faster-RCNN and RFCN models on CPU using Intel's Inference Engine backend. Enable Torch layers tests with Intel's Inference Engine backend.	2018-07-25 19:04:55 +03:00
Alexander Alekhin	78d07e841d	Merge pull request #11959 from pengli:3.4	2018-07-17 11:20:02 +00:00
Li Peng	f0cadaa6e3	enable concat layer fuse for OCL target Signed-off-by: Li Peng <peng.li@intel.com>	2018-07-17 12:46:16 +08:00
Dmitry Kurtaev	dcc1beb1f8	Clip kernel for OpenCL PriorBox layer	2018-07-13 14:49:13 +03:00
Li Peng	4c5a86828a	Fix gemmlike convolution input reading use vload3 for half3 or float3 input vector reading, also check read position to see if it exceed input width Signed-off-by: Li Peng <peng.li@intel.com>	2018-07-11 15:25:21 +08:00
Li Peng	145eae321e	pooling ocl kernel optimization set global size with real output size, also optimize max pooling index computation if necessary. Signed-off-by: Li Peng <peng.li@intel.com>	2018-06-29 15:22:49 +08:00
Li, Peng	ab8022f74e	update convolution opencl kernels in dnn module (#11762 ) * optimize ocl kernel enqueue in fc layer Signed-off-by: Li Peng <peng.li@intel.com> * use CV_LOG_INFO in convolution auto tuning Signed-off-by: Li Peng <peng.li@intel.com> * update convolution IDLF kernel extend parameter tuning range, also cleanup ocl kernel implementation Signed-off-by: Li Peng <peng.li@intel.com> * update in-memory convolution cache config fp16 and fp32 cache config are stored separately Signed-off-by: Li Peng <peng.li@intel.com>	2018-06-25 17:06:18 +03:00
Tomoaki Teshima	2e9e71ab9e	make ocl4dnn available to run on other platform than Intel GPU	2018-05-29 19:18:10 +09:00
Li Peng	ba5e8befa9	fp16 ocl support for more layers Signed-off-by: Li Peng <peng.li@intel.com>	2018-05-16 22:45:04 +08:00
Li Peng	3dd916882a	fp16 ocl support for googlenet Signed-off-by: Li Peng <peng.li@intel.com>	2018-05-16 22:45:02 +08:00
Wu Zhiwen	ef937dd676	ocl4dnn: Fix SAME padding mode for convolve Signed-off-by: Wu, Zhiwen <zhiwen.wu@intel.com> Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-28 21:02:41 +08:00
Li, Peng	5caf6244a3	Merge pull request #10922 from pengli:dnn * ave pooling ocl fix support the padded area control in ave pooling Signed-off-by: Li Peng <peng.li@intel.com> * warning fix: uninitialized field	2018-02-22 21:01:12 +03:00

86 Commits