Improve tgamma accuracy (bug 18613).
In non-default rounding modes, tgamma can be slightly less accurate
than permitted by glibc's accuracy goals.

Part of the problem is error accumulation, addressed in this patch by
setting round-to-nearest for internal computations.  However, there
was also a bug in the code dealing with computing pow (x + n, x + n)
where x + n is not exactly representable, providing another source of
error even in round-to-nearest mode; it was necessary to address both
bugs to get errors for all testcases within glibc's accuracy goals.
Given this second fix, accuracy in round-to-nearest mode is also
improved (hence regeneration of ulps for tgamma should be from scratch
- truncate libm-test-ulps or at least remove existing tgamma entries -
so that the expected ulps can be reduced).

Some additional complications also arose.  Certain tgamma tests should
strictly, according to IEEE semantics, overflow or not depending on
the rounding mode; this is beyond the scope of glibc's accuracy goals
for any function without exactly-determined results, but
gen-auto-libm-tests doesn't handle being lax there as it does for
underflow.  (libm-test.inc is likewise not lax about whether a result
very close to the overflow threshold is infinity or a finite value
close to overflow; that doesn't cause problems in this case, though
I've seen it cause problems with random test generation for some
functions.)  Thus, spurious-overflow markings,
with a comment, are added to auto-libm-test-in (no bug in Bugzilla
because the issue is with the testsuite, not a user-visible bug in
glibc).  And on x86, after the patch I saw ERANGE issues as previously
reported by Carlos (see my commentary in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which
needed addressing by ensuring excess range and precision were
eliminated at various points if FLT_EVAL_METHOD != 0.
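
The FLT_EVAL_METHOD point can be sketched as follows. Where FLT_EVAL_METHOD is nonzero (x87-based x86 is the usual example), a double expression may be held with extra range and precision, so an overflowing product can stay finite until it is stored. The patch declares its result variable volatile in that configuration. The function `square_as_double` and its `overflowed` parameter below are hypothetical names used only for illustration.

```c
#include <float.h>

/* Hypothetical example: square X and report whether the result, once
   narrowed to double, overflowed.  */
double
square_as_double (double x, int *overflowed)
{
#if FLT_EVAL_METHOD != 0
  /* Storing through a volatile object forces the value to be rounded
     to double's range and precision.  */
  volatile
#endif
  double ret;

  ret = x * x;
  /* Only an infinite result compares greater than DBL_MAX.  */
  *overflowed = (ret > DBL_MAX);
  return ret;
}
```

Without the volatile store, a test such as `ret > DBL_MAX` could be evaluated on a wider intermediate value and miss the overflow that the caller later observes.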

I also noticed and fixed a cosmetic issue where 1.0f was used in long
double functions and should have been 1.0L.

This completes the move of all functions to testing in all rounding
modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to
remove the workaround for some functions not using ALL_RM_TEST.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18613]
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
	X_ADJ not X when adjusting exponent.
	(__ieee754_gamma_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammaf_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
	log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
	of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in
	round-to-nearest then adjust overflowing and underflowing results
	as needed.  Use 1.0L not 1.0f as numerator of division.
	* math/libm-test.inc (tgamma_test_data): Remove one test.  Moved
	to auto-libm-test-in.
	(tgamma_test): Use ALL_RM_TEST.
	* math/auto-libm-test-in: Add one test of tgamma.  Mark some other
	tests of tgamma with spurious-overflow.
	* math/auto-libm-test-out: Regenerated.
	* math/gen-libm-have-vector-test.sh: Do not check for START.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
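
The "take log of X_ADJ not X" entries can be motivated by a first-order expansion. This is a reconstruction from the code in the diff, not a derivation given in the commit. It assumes that x_adj is the rounded value of x + n and x_eps is the rounding error, so that the exact shifted argument is $X = x_{\mathrm{adj}} + x_{\mathrm{eps}}$, while pow and exp are evaluated at x_adj.

```latex
\begin{align*}
g(X) &= X \ln X - X, \qquad g'(X) = \ln X,\\
X^{X} e^{-X} &= e^{g(x_{\mathrm{adj}} + x_{\mathrm{eps}})}
  \approx e^{g(x_{\mathrm{adj}})}\, e^{x_{\mathrm{eps}} \ln x_{\mathrm{adj}}}
  = x_{\mathrm{adj}}^{\,x_{\mathrm{adj}}}\, e^{-x_{\mathrm{adj}}}\,
    e^{x_{\mathrm{eps}} \ln x_{\mathrm{adj}}}.
\end{align*}
```

On this reading the exponent correction is x_eps * log (x_adj): the expansion point is x_adj, the value actually passed to pow and exp, not the original x.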
Joseph Myers committed Jun 29, 2015
1 parent 4aa10d0 commit e02920b
Showing 13 changed files with 540 additions and 249 deletions.
36 changes: 36 additions & 0 deletions ChangeLog
@@ -1,5 +1,41 @@
 2015-06-29 Joseph Myers <joseph@codesourcery.com>
 
+[BZ #18613]
+* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
+X_ADJ not X when adjusting exponent.
+(__ieee754_gamma_r): Do intermediate computations in
+round-to-nearest then adjust overflowing and underflowing results
+as needed.
+* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
+of X_ADJ not X when adjusting exponent.
+(__ieee754_gammaf_r): Do intermediate computations in
+round-to-nearest then adjust overflowing and underflowing results
+as needed.
+* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
+log of X_ADJ not X when adjusting exponent.
+(__ieee754_gammal_r): Do intermediate computations in
+round-to-nearest then adjust overflowing and underflowing results
+as needed. Use 1.0L not 1.0f as numerator of division.
+* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
+log of X_ADJ not X when adjusting exponent.
+(__ieee754_gammal_r): Do intermediate computations in
+round-to-nearest then adjust overflowing and underflowing results
+as needed. Use 1.0L not 1.0f as numerator of division.
+* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
+of X_ADJ not X when adjusting exponent.
+(__ieee754_gammal_r): Do intermediate computations in
+round-to-nearest then adjust overflowing and underflowing results
+as needed. Use 1.0L not 1.0f as numerator of division.
+* math/libm-test.inc (tgamma_test_data): Remove one test. Moved
+to auto-libm-test-in.
+(tgamma_test): Use ALL_RM_TEST.
+* math/auto-libm-test-in: Add one test of tgamma. Mark some other
+tests of tgamma with spurious-overflow.
+* math/auto-libm-test-out: Regenerated.
+* math/gen-libm-have-vector-test.sh: Do not check for START.
+* sysdeps/i386/fpu/libm-test-ulps: Update.
+* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
+
 [BZ #18612]
 * sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): For small
 arguments, just return 0.5 times the argument, with underflow
2 changes: 1 addition & 1 deletion NEWS
@@ -25,7 +25,7 @@ Version 2.22
 18497, 18498, 18502, 18507, 18512, 18513, 18519, 18520, 18522, 18527,
 18528, 18529, 18530, 18532, 18533, 18534, 18536, 18539, 18540, 18542,
 18544, 18545, 18546, 18547, 18549, 18553, 18558, 18569, 18583, 18585,
-18586, 18593, 18594, 18602, 18612.
+18586, 18593, 18594, 18602, 18612, 18613.
 
 * Cache information can be queried via sysconf() function on s390 e.g. with
 _SC_LEVEL1_ICACHE_SIZE as argument.
10 changes: 7 additions & 3 deletions math/auto-libm-test-in
@@ -2682,19 +2682,22 @@ tgamma 0x1p-113
 tgamma -0x1p-113
 tgamma 0x1p-127
 tgamma -0x1p-127
-tgamma 0x1p-128
+# IEEE semantics mean overflow very close to the threshold depends on
+# the rounding mode; gen-auto-libm-tests does not reflect that glibc
+# does not try to achieve this.
+tgamma 0x1p-128 spurious-overflow:flt-32
 tgamma -0x1p-128
 tgamma 0x1p-149
 tgamma -0x1p-149
 tgamma 0x1p-1023
 tgamma -0x1p-1023
-tgamma 0x1p-1024
+tgamma 0x1p-1024 spurious-overflow:dbl-64 spurious-overflow:ldbl-128ibm
 tgamma -0x1p-1024
 tgamma 0x1p-1074
 tgamma -0x1p-1074
 tgamma 0x1p-16383
 tgamma -0x1p-16383
-tgamma 0x1p-16384
+tgamma 0x1p-16384 spurious-overflow:ldbl-96-intel spurious-overflow:ldbl-96-m68k spurious-overflow:ldbl-128
 tgamma -0x1p-16384
 tgamma 0x1p-16445
 tgamma -0x1p-16445
@@ -3075,6 +3078,7 @@ tgamma 0x6.db8c603359a971081bc4a2e9dfdp+8
 tgamma 0x6.db8c603359a971081bc4a2e9dfd4p+8
 tgamma 1e3
 tgamma -100000.5
+tgamma max
 
 tgamma -0x3.06644cp+0
 tgamma -0x6.fe4636e0c5064p+0
263 changes: 166 additions & 97 deletions math/auto-libm-test-out

Large diffs are not rendered by default.

8 changes: 0 additions & 8 deletions math/gen-libm-have-vector-test.sh
@@ -50,11 +50,3 @@ for func in $(cat libm-test.inc | grep ALL_RM_TEST | grep RUN_TEST_LOOP_fFF_11 |
   print_defs ${func}f "_fFF"
   print_defs ${func}l "_fFF"
 done
-
-# When all functions will use ALL_RM_TEST instead of using START directly,
-# this code can be removed.
-for func in $(grep 'START.*;$' libm-test.inc | sed -r "s/.*\(//; s/,.*//"); do
-  print_defs ${func}
-  print_defs ${func}f
-  print_defs ${func}l
-done
5 changes: 1 addition & 4 deletions math/libm-test.inc
@@ -9484,7 +9484,6 @@ tanh_test (void)
 static const struct test_f_f_data tgamma_test_data[] =
   {
     TEST_f_f (tgamma, plus_infty, plus_infty),
-    TEST_f_f (tgamma, max_value, plus_infty, OVERFLOW_EXCEPTION|ERRNO_ERANGE),
     TEST_f_f (tgamma, 0, plus_infty, DIVIDE_BY_ZERO_EXCEPTION|ERRNO_ERANGE),
     TEST_f_f (tgamma, minus_zero, minus_infty, DIVIDE_BY_ZERO_EXCEPTION|ERRNO_ERANGE),
     /* tgamma (x) == qNaN plus invalid exception for integer x <= 0. */
@@ -9499,9 +9498,7 @@ static const struct test_f_f_data tgamma_test_data[] =
 static void
 tgamma_test (void)
 {
-  START (tgamma,, 0);
-  RUN_TEST_LOOP_f_f (tgamma, tgamma_test_data, );
-  END;
+  ALL_RM_TEST (tgamma, 0, tgamma_test_data, RUN_TEST_LOOP_f_f, END);
 }
 
 
36 changes: 30 additions & 6 deletions sysdeps/i386/fpu/libm-test-ulps
@@ -1932,12 +1932,36 @@ ildouble: 5
 ldouble: 4
 
 Function: "tgamma":
-double: 6
-float: 4
-idouble: 6
-ifloat: 4
-ildouble: 6
-ldouble: 6
+double: 2
+float: 3
+idouble: 2
+ifloat: 3
+ildouble: 3
+ldouble: 3
+
+Function: "tgamma_downward":
+double: 2
+float: 3
+idouble: 2
+ifloat: 3
+ildouble: 3
+ldouble: 3
+
+Function: "tgamma_towardzero":
+double: 3
+float: 3
+idouble: 3
+ifloat: 3
+ildouble: 3
+ldouble: 3
+
+Function: "tgamma_upward":
+double: 3
+float: 3
+idouble: 3
+ifloat: 3
+ildouble: 3
+ldouble: 3
 
 Function: "y0":
 double: 1
87 changes: 62 additions & 25 deletions sysdeps/ieee754/dbl-64/e_gamma_r.c
@@ -104,7 +104,7 @@ gamma_positive (double x, int *exp2_adj)
                     * __ieee754_exp (-x_adj)
                     * __ieee754_sqrt (2 * M_PI / x_adj)
                     / prod);
-      exp_adj += x_eps * __ieee754_log (x);
+      exp_adj += x_eps * __ieee754_log (x_adj);
       double bsum = gamma_coeff[NCOEFF - 1];
       double x_adj2 = x_adj * x_adj;
       for (size_t i = 1; i <= NCOEFF - 1; i++)
@@ -119,6 +119,10 @@ __ieee754_gamma_r (double x, int *signgamp)
 {
   int32_t hx;
   u_int32_t lx;
+#if FLT_EVAL_METHOD != 0
+  volatile
+#endif
+  double ret;
 
   EXTRACT_WORDS (hx, lx, x);
 
@@ -153,36 +157,69 @@ __ieee754_gamma_r (double x, int *signgamp)
     {
       /* Overflow. */
       *signgamp = 0;
-      return DBL_MAX * DBL_MAX;
+      ret = DBL_MAX * DBL_MAX;
+      return ret;
     }
-  else if (x > 0.0)
+  else
     {
-      *signgamp = 0;
-      int exp2_adj;
-      double ret = gamma_positive (x, &exp2_adj);
-      return __scalbn (ret, exp2_adj);
+      SET_RESTORE_ROUND (FE_TONEAREST);
+      if (x > 0.0)
+        {
+          *signgamp = 0;
+          int exp2_adj;
+          double tret = gamma_positive (x, &exp2_adj);
+          ret = __scalbn (tret, exp2_adj);
+        }
+      else if (x >= -DBL_EPSILON / 4.0)
+        {
+          *signgamp = 0;
+          ret = 1.0 / x;
+        }
+      else
+        {
+          double tx = __trunc (x);
+          *signgamp = (tx == 2.0 * __trunc (tx / 2.0)) ? -1 : 1;
+          if (x <= -184.0)
+            /* Underflow. */
+            ret = DBL_MIN * DBL_MIN;
+          else
+            {
+              double frac = tx - x;
+              if (frac > 0.5)
+                frac = 1.0 - frac;
+              double sinpix = (frac <= 0.25
+                               ? __sin (M_PI * frac)
+                               : __cos (M_PI * (0.5 - frac)));
+              int exp2_adj;
+              double tret = M_PI / (-x * sinpix
+                                    * gamma_positive (-x, &exp2_adj));
+              ret = __scalbn (tret, -exp2_adj);
+            }
+        }
     }
-  else if (x >= -DBL_EPSILON / 4.0)
+  if (isinf (ret) && x != 0)
     {
-      *signgamp = 0;
-      return 1.0 / x;
+      if (*signgamp < 0)
+        {
+          ret = -__copysign (DBL_MAX, ret) * DBL_MAX;
+          ret = -ret;
+        }
+      else
+        ret = __copysign (DBL_MAX, ret) * DBL_MAX;
+      return ret;
     }
-  else
+  else if (ret == 0)
     {
-      double tx = __trunc (x);
-      *signgamp = (tx == 2.0 * __trunc (tx / 2.0)) ? -1 : 1;
-      if (x <= -184.0)
-        /* Underflow. */
-        return DBL_MIN * DBL_MIN;
-      double frac = tx - x;
-      if (frac > 0.5)
-        frac = 1.0 - frac;
-      double sinpix = (frac <= 0.25
-                       ? __sin (M_PI * frac)
-                       : __cos (M_PI * (0.5 - frac)));
-      int exp2_adj;
-      double ret = M_PI / (-x * sinpix * gamma_positive (-x, &exp2_adj));
-      return __scalbn (ret, -exp2_adj);
+      if (*signgamp < 0)
+        {
+          ret = -__copysign (DBL_MIN, ret) * DBL_MIN;
+          ret = -ret;
+        }
+      else
+        ret = __copysign (DBL_MIN, ret) * DBL_MIN;
+      return ret;
     }
+  else
+    return ret;
 }
 strong_alias (__ieee754_gamma_r, __gamma_r_finite)
88 changes: 62 additions & 26 deletions sysdeps/ieee754/flt-32/e_gammaf_r.c
@@ -97,7 +97,7 @@ gammaf_positive (float x, int *exp2_adj)
                    * __ieee754_expf (-x_adj)
                    * __ieee754_sqrtf (2 * (float) M_PI / x_adj)
                    / prod);
-      exp_adj += x_eps * __ieee754_logf (x);
+      exp_adj += x_eps * __ieee754_logf (x_adj);
       float bsum = gamma_coeff[NCOEFF - 1];
       float x_adj2 = x_adj * x_adj;
       for (size_t i = 1; i <= NCOEFF - 1; i++)
@@ -111,6 +111,10 @@ float
 __ieee754_gammaf_r (float x, int *signgamp)
 {
   int32_t hx;
+#if FLT_EVAL_METHOD != 0
+  volatile
+#endif
+  float ret;
 
   GET_FLOAT_WORD (hx, x);
 
@@ -145,37 +149,69 @@ __ieee754_gammaf_r (float x, int *signgamp)
     {
       /* Overflow. */
       *signgamp = 0;
-      return FLT_MAX * FLT_MAX;
+      ret = FLT_MAX * FLT_MAX;
+      return ret;
     }
-  else if (x > 0.0f)
+  else
     {
-      *signgamp = 0;
-      int exp2_adj;
-      float ret = gammaf_positive (x, &exp2_adj);
-      return __scalbnf (ret, exp2_adj);
+      SET_RESTORE_ROUNDF (FE_TONEAREST);
+      if (x > 0.0f)
+        {
+          *signgamp = 0;
+          int exp2_adj;
+          float tret = gammaf_positive (x, &exp2_adj);
+          ret = __scalbnf (tret, exp2_adj);
+        }
+      else if (x >= -FLT_EPSILON / 4.0f)
+        {
+          *signgamp = 0;
+          ret = 1.0f / x;
+        }
+      else
+        {
+          float tx = __truncf (x);
+          *signgamp = (tx == 2.0f * __truncf (tx / 2.0f)) ? -1 : 1;
+          if (x <= -42.0f)
+            /* Underflow. */
+            ret = FLT_MIN * FLT_MIN;
+          else
+            {
+              float frac = tx - x;
+              if (frac > 0.5f)
+                frac = 1.0f - frac;
+              float sinpix = (frac <= 0.25f
+                              ? __sinf ((float) M_PI * frac)
+                              : __cosf ((float) M_PI * (0.5f - frac)));
+              int exp2_adj;
+              float tret = (float) M_PI / (-x * sinpix
+                                           * gammaf_positive (-x, &exp2_adj));
+              ret = __scalbnf (tret, -exp2_adj);
+            }
+        }
     }
-  else if (x >= -FLT_EPSILON / 4.0f)
+  if (isinf (ret) && x != 0)
     {
-      *signgamp = 0;
-      return 1.0f / x;
+      if (*signgamp < 0)
+        {
+          ret = -__copysignf (FLT_MAX, ret) * FLT_MAX;
+          ret = -ret;
+        }
+      else
+        ret = __copysignf (FLT_MAX, ret) * FLT_MAX;
+      return ret;
     }
-  else
+  else if (ret == 0)
     {
-      float tx = __truncf (x);
-      *signgamp = (tx == 2.0f * __truncf (tx / 2.0f)) ? -1 : 1;
-      if (x <= -42.0f)
-        /* Underflow. */
-        return FLT_MIN * FLT_MIN;
-      float frac = tx - x;
-      if (frac > 0.5f)
-        frac = 1.0f - frac;
-      float sinpix = (frac <= 0.25f
-                      ? __sinf ((float) M_PI * frac)
-                      : __cosf ((float) M_PI * (0.5f - frac)));
-      int exp2_adj;
-      float ret = (float) M_PI / (-x * sinpix
-                                  * gammaf_positive (-x, &exp2_adj));
-      return __scalbnf (ret, -exp2_adj);
+      if (*signgamp < 0)
+        {
+          ret = -__copysignf (FLT_MIN, ret) * FLT_MIN;
+          ret = -ret;
+        }
+      else
+        ret = __copysignf (FLT_MIN, ret) * FLT_MIN;
+      return ret;
     }
+  else
+    return ret;
 }
 strong_alias (__ieee754_gammaf_r, __gammaf_r_finite)