opencv

Files

Paul E. Murphy 33fb253a66 core: vectorize dotProd_32s

Use 4x FMA chains to sum on SIMD 128 FP64 targets. On
x86 this showed about 1.4x improvement.

For PPC, do a full multiply (32x32->64b), convert to DP,
then accumulate. This may be slightly less precise for
some inputs, but it is about 1.5x faster than the FMA
approach above, which is itself about 1.5x faster than
the existing code, for a ~2.5x overall speedup.

2019-08-20 15:28:36 -05:00
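
The two strategies described in the commit message can be sketched in portable scalar C++. This is only an illustration under stated assumptions, not the code from the commit: the function names `dot_fma4` and `dot_mul64` are hypothetical, `std::fma` stands in for the SIMD FMA instructions, and a plain 64-bit integer multiply stands in for the PPC vector multiply.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the "4x FMA chains" idea: four independent
// accumulators so that consecutive FMAs do not wait on each other.
double dot_fma4(const std::int32_t* a, const std::int32_t* b, std::size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        // Each int32 converts to double exactly; fma rounds once per step.
        s0 = std::fma(double(a[i]),     double(b[i]),     s0);
        s1 = std::fma(double(a[i + 1]), double(b[i + 1]), s1);
        s2 = std::fma(double(a[i + 2]), double(b[i + 2]), s2);
        s3 = std::fma(double(a[i + 3]), double(b[i + 3]), s3);
    }
    double sum = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i)  // tail elements
        sum = std::fma(double(a[i]), double(b[i]), sum);
    return sum;
}

// Hypothetical sketch of the PPC-style variant: exact 32x32->64b integer
// multiply, convert the product to double, then accumulate. The conversion
// rounds products wider than 53 bits, and the add rounds again, which is
// why this can be slightly less precise than the FMA variant.
double dot_mul64(const std::int32_t* a, const std::int32_t* b, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::int64_t p = std::int64_t(a[i]) * std::int64_t(b[i]);
        sum += double(p);
    }
    return sum;
}
```

For small inputs both functions return the same value; they can only differ when individual products exceed 53 significant bits, which matches the precision caveat in the commit message.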

3rdparty/SoftFloat

…

doc

docs: intro formatting update, minor cleanup

2018-11-04 02:36:24 +00:00

include/opencv2

core: vectorize dotProd_32s

2019-08-20 15:28:36 -05:00

misc

Merge pull request #14440 from alalek:async_array

2019-06-08 20:57:15 +00:00