- Jul 22, 2019
-
-
Szabolcs Nagy authored
To allow subprojects other than math, the build system and directory layout is changed: all math related code, tools and tests are under the math directory now, new subprojects should be similarly self- contained. The top level Makefile design is still kept, but the subproject build directories provide their own Dir.mk with the build rules for the subproject. The user interface of config.mk is kept for now, in the future subproject specific flags and make variables may be added for finer grained control.
-
- Jul 18, 2019
-
-
Szabolcs Nagy authored
math/single contained code for systems without double precision fpu and rem_pio2 is not used currently and likely will be designed differently when double precision trigonometric functions are added.
-
Szabolcs Nagy authored
Check ULP error by random sampling and comparing against a higher precision implmenetation. This is similar to the randomized tests in the mathtest code, but it runs on the target only and is much faster to allow exhaustive ULP checks for single precision functions. It also supports non-nearest rounding modes. The ULP error is reported in an unconventional way: instead of the difference between the observed rounded result and accurate result, the minimum error is reported that makes the accurate result round to the observed result. This is more useful for comparing errors across different rounding modes. In nearest-rounding mode usually 0.5 has to be added to the reported error to get the conventional ULP error. The code optionally depends on mpfr. On targets where double has the same format as long double, mpfr is required for testing double precision functions. By default there is no dependency on mpfr on the target, to use mpfr add -DUSE_MPFR to the CFLAGS and -lmpfr to LDLIBS. ucheck is a new make target for running ulp error checks. Typical usage and output of the new ulp tool: $ build/bin/ulp -e .001 exp 1.0 2.0 12345 exp(0x1.79ef3658a63c9p+0) got 0x1.181caa32757a7p+2 want 0x1.181caa32757a6p+2 +0.499708 ulp err 0.000291756 exp(0x1.9c8a65340f80cp+0) got 0x1.40a8032e5f576p+2 want 0x1.40a8032e5f575p+2 +0.498903 ulp err 0.00109668 FAIL exp in [0x1p+0;0x1p+1] round n errlim 0.001 maxerr 0.00109668 +0.5 cnt 12345 cnt1 6 0.0486027% cnt2 0 0% cntfail 1 0.00810045% Floating-point exceptions are not guaranteed to be reported accurately and can be turned off by -f. The implementation is generic over the argument types which complicates the code, but at least the difficult inner loop logic is not repeated many times this way. To add support for a new function foo, the fun array needs to be updated with an entry for foo. (This usually requires the functions foo, fool and mpfr_foo to be defined.)
-
Szabolcs Nagy authored
Scripts are known to work with sollya 6.0.
-
Szabolcs Nagy authored
-
- Jul 01, 2019
-
-
Simon Tatham authored
When range-reducing a value of the maximum double precision exponent, it was possible to read one word past the end of the const data array. The error introduced was negligible, but of course overrunning an array is a bug regardless.
-
- Dec 10, 2018
-
-
Szabolcs Nagy authored
This fix is slightly more correct than the previous one (which introduced a 1ulp error when rounding toward zero near the overflow limit) and has slightly smaller code size.
-
- Dec 07, 2018
-
-
Szabolcs Nagy authored
The current code aims to support FLT_EVAL_METHOD!=0 targets (such as i386 with x87 fpu or m68k) assuming appropriate narrowing eval functions are defined for them. But the narrowing eval functions were not used consistently: the return statement may not guarantee narrowing (e.g. that was the C99 behaviour which got changed in C11 annex F) so we should use the narrowing eval_as_ functions at return statements too. Results should be correct if narrowing only happens at eval_as_ calls. On most targets this change has no effect because eval_as_ is a noop. Most math implementations that care about excess precision already compile in a mode that narrows at returns so this change is not necessary for them, just better documents the assumptions.
-
Szabolcs Nagy authored
The threshold value at which powf overflows depends on the rounding mode and the current check did not take this into account. So when the result was rounded away from zero it could become infinity without setting errno to ERANGE. Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set, which is fine in nearest rounding mode, but powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode and the previous implementation did not set errno in these cases. This special case is fixed without affecting the common code path by checking the rounding mode and using appropriate threshold value. Arithmetics is used to check the rounding mode to avoid introducing a stack frame when calling fegetround. Unfortunately the current test system does not support rounding modes. (Found by comparing against musl powf behaviour on the libc-test test suite, unfortunately that test system does not support errno.)
-
- Nov 22, 2018
-
-
Szabolcs Nagy authored
-
- Sep 18, 2018
-
-
Szabolcs Nagy authored
checkint in pow is not supposed to be used with 0, inf or nan inputs.
-
- Sep 05, 2018
-
-
Szabolcs Nagy authored
Add comments with enough detail so the log lookup tables can be recreated.
-
- Aug 17, 2018
-
-
Simon Tatham authored
This program was originally developed for a much earlier version of Julia, and there have been a lot of changes to Julia semantics since then. But 1.0 is intended to be stable, so updating once to work with that should mean that no further updates are needed for a long time. Changes relevant to this script include new API calls for making arrays, evaluating Julia source-code expressions from strings, and parsing strings as numbers; scope and semantics changes requiring extra escaping in the @debug macro and some explicit 'global' to allow for-loops to update variables outside themselves; a syntax change for tuple types; and replacing 'beginswith' on strings with 'startswith'. There are a couple of remaining inconveniences with this version of the code. Firstly, the standard Julia 1.0 interpreter will consume a "--" option terminator on its command line even if it appears after the script name, so commands of the form 'julia remez.jl -- -1 1 ...' that worked in Julia 0.2 will no longer work. Adding an extra "--" before the script name works around this ('julia -- remez.jl -- ...') because the first "--" is seen by the Julia interpreter and the second goes to the script. But unfortunately that "--" can't be put in the #! line as well as the 'env', because of #! command line semantics. So users may have to work around this by explicitly invoking the interpreter. Secondly, Julia 1.0 has moved some mathematical functions (e.g. erf and gamma) out of its core library into the SpecialFunctions package, so any pre-existing command lines that used those functions will now need to qualify them with a package name, and be run on a Julia installation which has that package installed. Because Julia 0.4.x is still common (Ubuntu 16.04 and 18.04 both provide 0.4.5), I've included some backwards-compatibility code so that the script still runs on that version as well.
-
Wilco Dijkstra authored
If math.h doesn't set FP_FAST_FMA correctly, ensure HAVE_FAST_FMA is set on AArch64.
-
- Aug 08, 2018
-
-
Wilco Dijkstra authored
Improve comments. Use TOINT_INTRINSICS rather than HAVE_FAST_ROUND.
-
- Jul 30, 2018
-
-
Szabolcs Nagy authored
These are no longer maintained and only kept for WANT_SINGLEPREC build, which is useful for microcontrollers with single precision fpu only. Removed all tests that are not testing code in the default build.
-
Szabolcs Nagy authored
Don't use target flags when building host tools. The example config.mk.dist is updated accordingly.
-
Szabolcs Nagy authored
rtest is the only binary that is built for the host and has additional dependencies: mpfr and mpc. To avoid build issues only build it if the randomized tests are run. New build target is introduced for randomized testing, so make check works without rtest. Tests are no longer copied to the build directory and the runtest.sh tool is removed as it is not very useful.
-
Szabolcs Nagy authored
If lot of benchmarks are run and the output is not a tty then the results only showed up after the stdio buffer was full. With fflush if the output is redirected and mathbench is killed the results before the kill are still available.
-
- Jul 10, 2018
-
-
Szabolcs Nagy authored
PREFER_FLOAT_COMPARISON setting was not correct as it could raise spurious exceptions. Fixing it is easy: just use ISLESS(x, y) instead of abstop12(x) < abstop12(y) with appropriate non-signaling definition for ISLESS. However it seems this setting is not very useful (there is only minor performance difference on various architectures), so remove this option for now.
-
Szabolcs Nagy authored
There was a typo and the arguments were not explained clearly.
-
- Jul 04, 2018
-
-
Wilco Dijkstra authored
Use const sincos_t for clarity instead of making the typedef const. Use __inv_pi4 and __sincosf_table to avoid namespace issues with static linking.
-
Szabolcs Nagy authored
The roundtoint and converttoint internal functions are only called with small values, so 32 bit result is enough for converttoint and it is a signed int conversion so the natural return type is int32_t. The original idea was to help the compiler keeping the result in uint64_t, then it's clear that no sign extension is needed and there is no accidental undefined or implementation defined signed int arithmetics. But it turns out gcc does a good job with inlining so changing the type has no overhead and the semantics of the conversion is less surprising this way. Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top bits were never usable and the existing code ensures that only the bottom 32 bits of the conversion result are used.
-
Szabolcs Nagy authored
Rewrote some documentation text and fixed a GNU style issue based on feedback from Joseph Myers on libc-alpha mailing list.
-
- Jul 03, 2018
-
-
Szabolcs Nagy authored
The !HAVE_FAST_FMA code path split r = z/c - 1 into r = rhi + rlo such that when z = 1-tiny and c = 1 then rlo and rhi could have much larger magnitude than r which later caused large rounding errors. So do a nearest rounding instead of truncation at the split.
-
- Jun 29, 2018
-
-
Szabolcs Nagy authored
Explain the semantics of internal functions.
-
Szabolcs Nagy authored
Whitespace changes only.
-
- Jun 27, 2018
-
-
Szabolcs Nagy authored
Extracted 16000 samples from several exp call traces. First half is quite generic, second half has large number of zeros.
-
- Jun 25, 2018
-
-
Szabolcs Nagy authored
The pow error bound was miscalculated, it is slightly below 0.54 ULP when using fma and slightly above it without fma. If 2^-400 < |pow(x,y)| < 2^400 then the error is less than 0.52 ULP with fma.
-
- Jun 22, 2018
-
-
Szabolcs Nagy authored
-
Szabolcs Nagy authored
The log part of pow got rewritten to use a slightly different algorithm. This improves precision and throughput while keeps the same table size. Near 1 cases are no longer special cased, there is a slight performance regression in that case. And when the fma instruction is not available this algorithm is expected to have slightly worse performance. Worst-case error improved from 0.67 ULP to 0.57 ULP. On Cortex-A72 i see thruput near 1: 7% worse latency near 1: 2% worse thruput general: 8% better latency general: 2% better
-
- Jun 20, 2018
-
-
Szabolcs Nagy authored
uint64_t works too, but the correct type is uint32_t.
-
Wilco Dijkstra authored
Add trace for sinf/cosf/sincosf with easy, hard and medium cases extracted from a trace of 1.4 million calls.
-
Wilco Dijkstra authored
Improve support for large traces by reading all of the trace and splitting it into smaller parts. Also remove the B arrays which are not required.
-
Wilco Dijkstra authored
Use int32_t rather than int since we require it to be 32 bits here.
-
Szabolcs Nagy authored
The sign needs to be fixed even in nearest rounding mode.
-
- Jun 15, 2018
-
-
Szabolcs Nagy authored
The last multiplication in exp and exp2 could underflow when it was not contracted into an fma. Changed the thresholds so the problematic cases end up in the specialcase code path (which handles underflow correctly). The initial check now only looks at the exponent bits which has slightly better performance on aarch64. The overflow threshold can be tight for exp2, but was let loose in exp so the specialcase handling got updated accordingly. Added comments about this issue and the assumptions exp_inline is making in pow.
-
Szabolcs Nagy authored
The portable function prototype was wrong.
-
- Jun 13, 2018
-
-
Szabolcs Nagy authored
Previously only uniform random and linear (equidistant) input generators were supported. Now with "-g trace" numbers are read from stdin or with "-f file" they are read from file, so previously recorded call traces can be benchmarked. Only the first 8000 numbers are used and if there are less then the numbers are repeated to fill up the benchmark array.
-
Szabolcs Nagy authored
Add optimization barriers to prevent constant folding and code moving transformations. Use opt_barrier_ and force_eval_ consistently to work around fenv semantics breaking optimizations.
-