Commit 56091b75, authored by Szabolcs Nagy

Add new log implementation

Optimized log using a carefully generated lookup table with 1/c and log(c)
values for small intervals around 1.  Each log(c) is very near a double
precision value, so it has about 62 bits of precision.  The algorithm is
log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is
approximated by a polynomial of x/c - 1.  Near 1 a single polynomial of
x - 1 is used.

There is a separate code path for when the fma instruction is not
available for computing x/c - 1 precisely; in that case the table size is
doubled.

With the default configuration settings the worst case error is 0.519 ULP
(and 0.520 without FMA), the read-only global data size is 2192 bytes
(4240 without FMA).  The error in non-nearest rounding modes is less than
1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
log latency: 1.98x
log throughput: 2.92x
parent e08890a7