Improve tgamma accuracy (bug 18613).

In non-default rounding modes, tgamma can be slightly less accurate than permitted by glibc's accuracy goals.

Part of the problem is error accumulation, addressed in this patch by setting round-to-nearest for internal computations. However, there was also a bug in the code dealing with computing pow (x + n, x + n) where x + n is not exactly representable, providing another source of error even in round-to-nearest mode; it was necessary to address both bugs to get errors for all testcases within glibc's accuracy goals. Given this second fix, accuracy in round-to-nearest mode is also improved (hence regeneration of ulps for tgamma should be from scratch - truncate libm-test-ulps or at least remove existing tgamma entries - so that the expected ulps can be reduced).

Some additional complications also arose. Certain tgamma tests should strictly, according to IEEE semantics, overflow or not depending on the rounding mode; this is beyond the scope of glibc's accuracy goals for any function without exactly-determined results, but gen-auto-libm-tests doesn't handle being lax there as it does for underflow. (libm-test.inc also doesn't handle being lax about whether the result in cases very close to the overflow threshold is infinity or a finite value close to overflow, but that doesn't cause problems in this case though I've seen it cause problems with random test generation for some functions.) Thus, spurious-overflow markings, with a comment, are added to auto-libm-test-in (no bug in Bugzilla because the issue is with the testsuite, not a user-visible bug in glibc). And on x86, after the patch I saw ERANGE issues as previously reported by Carlos (see my commentary in <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which needed addressing by ensuring excess range and precision were eliminated at various points if FLT_EVAL_METHOD != 0.

I also noticed and fixed a cosmetic issue where 1.0f was used in long double functions and should have been 1.0L.

This completes the move of all functions to testing in all rounding modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to remove the workaround for some functions not using ALL_RM_TEST.

Tested for x86_64, x86, mips64 and powerpc.

	[BZ #18613]
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of X_ADJ not X when adjusting exponent.
	(__ieee754_gamma_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammaf_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.  Use 1.0L not 1.0f as numerator of division.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent.
	(__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.  Use 1.0L not 1.0f as numerator of division.
	* math/libm-test.inc (tgamma_test_data): Remove one test.  Moved to auto-libm-test-in.
	(tgamma_test): Use ALL_RM_TEST.
	* math/auto-libm-test-in: Add one test of tgamma.  Mark some other tests of tgamma with spurious-overflow.
	* math/auto-libm-test-out: Regenerated.
	* math/gen-libm-have-vector-test.sh: Do not check for START.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
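
The "log of X_ADJ not X" change can be motivated with a short first-order expansion. This is only a sketch under an assumption the hunks here do not show directly: that x_adj is the rounded value of x + n and x_eps is the remainder, so the exact argument is z = x_adj + x_eps. It uses just the leading Stirling factor z^z e^{-z}.

```latex
\begin{align*}
z &= x_{\mathrm{adj}} + x_{\mathrm{eps}}, \qquad |x_{\mathrm{eps}}| \ll x_{\mathrm{adj}},\\
\log z &= \log x_{\mathrm{adj}} + \frac{x_{\mathrm{eps}}}{x_{\mathrm{adj}}}
          + O\!\left(\frac{x_{\mathrm{eps}}^{2}}{x_{\mathrm{adj}}^{2}}\right),\\
z \log z - z &= x_{\mathrm{adj}} \log x_{\mathrm{adj}} - x_{\mathrm{adj}}
          + x_{\mathrm{eps}} \log x_{\mathrm{adj}}
          + O\!\left(\frac{x_{\mathrm{eps}}^{2}}{x_{\mathrm{adj}}}\right),\\
z^{z} e^{-z} &\approx x_{\mathrm{adj}}^{x_{\mathrm{adj}}}\, e^{-x_{\mathrm{adj}}}
          \cdot \exp\!\left(x_{\mathrm{eps}} \log x_{\mathrm{adj}}\right).
\end{align*}
```

Under that assumption the exponent correction is x_eps * log (x_adj). Using log (x) instead shifts the exponent by x_eps * log (x_adj / x), which need not be negligible when n is large relative to x.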
git-mirror · Jun 29, 2015 · e02920b
1 parent 4aa10d0
commit e02920b
Showing 13 changed files with 540 additions and 249 deletions.
diff --git a/ChangeLog b/ChangeLog
@@ -1,5 +1,41 @@
 2015-06-29  Joseph Myers  <joseph@codesourcery.com>
 
+	[BZ #18613]
+	* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of
+	X_ADJ not X when adjusting exponent.
+	(__ieee754_gamma_r): Do intermediate computations in
+	round-to-nearest then adjust overflowing and underflowing results
+	as needed.
+	* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log
+	of X_ADJ not X when adjusting exponent.
+	(__ieee754_gammaf_r): Do intermediate computations in
+	round-to-nearest then adjust overflowing and underflowing results
+	as needed.
+	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take
+	log of X_ADJ not X when adjusting exponent.
+	(__ieee754_gammal_r): Do intermediate computations in
+	round-to-nearest then adjust overflowing and underflowing results
+	as needed.  Use 1.0L not 1.0f as numerator of division.
+	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take
+	log of X_ADJ not X when adjusting exponent.
+	(__ieee754_gammal_r): Do intermediate computations in
+	round-to-nearest then adjust overflowing and underflowing results
+	as needed.  Use 1.0L not 1.0f as numerator of division.
+	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log
+	of X_ADJ not X when adjusting exponent.
+	(__ieee754_gammal_r): Do intermediate computations in
+	round-to-nearest then adjust overflowing and underflowing results
+	as needed.  Use 1.0L not 1.0f as numerator of division.
+	* math/libm-test.inc (tgamma_test_data): Remove one test.  Moved
+	to auto-libm-test-in.
+	(tgamma_test): Use ALL_RM_TEST.
+	* math/auto-libm-test-in: Add one test of tgamma.  Mark some other
+	tests of tgamma with spurious-overflow.
+	* math/auto-libm-test-out: Regenerated.
+	* math/gen-libm-have-vector-test.sh: Do not check for START.
+	* sysdeps/i386/fpu/libm-test-ulps: Update.
+	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
+
 	[BZ #18612]
 	* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): For small
 	arguments, just return 0.5 times the argument, with underflow

diff --git a/NEWS b/NEWS
@@ -25,7 +25,7 @@ Version 2.22
   18497, 18498, 18502, 18507, 18512, 18513, 18519, 18520, 18522, 18527,
   18528, 18529, 18530, 18532, 18533, 18534, 18536, 18539, 18540, 18542,
   18544, 18545, 18546, 18547, 18549, 18553, 18558, 18569, 18583, 18585,
-  18586, 18593, 18594, 18602, 18612.
+  18586, 18593, 18594, 18602, 18612, 18613.
 
 * Cache information can be queried via sysconf() function on s390 e.g. with
   _SC_LEVEL1_ICACHE_SIZE as argument.

diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in
@@ -2682,19 +2682,22 @@ tgamma 0x1p-113
 tgamma -0x1p-113
 tgamma 0x1p-127
 tgamma -0x1p-127
-tgamma 0x1p-128
+# IEEE semantics mean overflow very close to the threshold depends on
+# the rounding mode; gen-auto-libm-tests does not reflect that glibc
+# does not try to achieve this.
+tgamma 0x1p-128 spurious-overflow:flt-32
 tgamma -0x1p-128
 tgamma 0x1p-149
 tgamma -0x1p-149
 tgamma 0x1p-1023
 tgamma -0x1p-1023
-tgamma 0x1p-1024
+tgamma 0x1p-1024 spurious-overflow:dbl-64 spurious-overflow:ldbl-128ibm
 tgamma -0x1p-1024
 tgamma 0x1p-1074
 tgamma -0x1p-1074
 tgamma 0x1p-16383
 tgamma -0x1p-16383
-tgamma 0x1p-16384
+tgamma 0x1p-16384 spurious-overflow:ldbl-96-intel spurious-overflow:ldbl-96-m68k spurious-overflow:ldbl-128
 tgamma -0x1p-16384
 tgamma 0x1p-16445
 tgamma -0x1p-16445
@@ -3075,6 +3078,7 @@ tgamma 0x6.db8c603359a971081bc4a2e9dfdp+8
 tgamma 0x6.db8c603359a971081bc4a2e9dfd4p+8
 tgamma 1e3
 tgamma -100000.5
+tgamma max
 
 tgamma -0x3.06644cp+0
 tgamma -0x6.fe4636e0c5064p+0

diff --git a/math/auto-libm-test-out b/math/auto-libm-test-out
diff --git a/math/gen-libm-have-vector-test.sh b/math/gen-libm-have-vector-test.sh
@@ -50,11 +50,3 @@ for func in $(cat libm-test.inc | grep ALL_RM_TEST | grep RUN_TEST_LOOP_fFF_11 |
   print_defs ${func}f "_fFF"
   print_defs ${func}l "_fFF"
 done
-
-# When all functions will use ALL_RM_TEST instead of using START directly,
-# this code can be removed.
-for func in $(grep 'START.*;$' libm-test.inc | sed -r "s/.*\(//; s/,.*//"); do
-  print_defs ${func}
-  print_defs ${func}f
-  print_defs ${func}l
-done
diff --git a/math/libm-test.inc b/math/libm-test.inc
@@ -9484,7 +9484,6 @@ tanh_test (void)
 static const struct test_f_f_data tgamma_test_data[] =
   {
     TEST_f_f (tgamma, plus_infty, plus_infty),
-    TEST_f_f (tgamma, max_value, plus_infty, OVERFLOW_EXCEPTION|ERRNO_ERANGE),
     TEST_f_f (tgamma, 0, plus_infty, DIVIDE_BY_ZERO_EXCEPTION|ERRNO_ERANGE),
     TEST_f_f (tgamma, minus_zero, minus_infty, DIVIDE_BY_ZERO_EXCEPTION|ERRNO_ERANGE),
     /* tgamma (x) == qNaN plus invalid exception for integer x <= 0.  */
@@ -9499,9 +9498,7 @@ static const struct test_f_f_data tgamma_test_data[] =
 static void
 tgamma_test (void)
 {
-  START (tgamma,, 0);
-  RUN_TEST_LOOP_f_f (tgamma, tgamma_test_data, );
-  END;
+  ALL_RM_TEST (tgamma, 0, tgamma_test_data, RUN_TEST_LOOP_f_f, END);
 }
 
 

diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
@@ -1932,12 +1932,36 @@ ildouble: 5
 ldouble: 4
 
 Function: "tgamma":
-double: 6
-float: 4
-idouble: 6
-ifloat: 4
-ildouble: 6
-ldouble: 6
+double: 2
+float: 3
+idouble: 2
+ifloat: 3
+ildouble: 3
+ldouble: 3
+
+Function: "tgamma_downward":
+double: 2
+float: 3
+idouble: 2
+ifloat: 3
+ildouble: 3
+ldouble: 3
+
+Function: "tgamma_towardzero":
+double: 3
+float: 3
+idouble: 3
+ifloat: 3
+ildouble: 3
+ldouble: 3
+
+Function: "tgamma_upward":
+double: 3
+float: 3
+idouble: 3
+ifloat: 3
+ildouble: 3
+ldouble: 3
 
 Function: "y0":
 double: 1

diff --git a/sysdeps/ieee754/dbl-64/e_gamma_r.c b/sysdeps/ieee754/dbl-64/e_gamma_r.c
@@ -104,7 +104,7 @@ gamma_positive (double x, int *exp2_adj)
 		    * __ieee754_exp (-x_adj)
 		    * __ieee754_sqrt (2 * M_PI / x_adj)
 		    / prod);
-      exp_adj += x_eps * __ieee754_log (x);
+      exp_adj += x_eps * __ieee754_log (x_adj);
       double bsum = gamma_coeff[NCOEFF - 1];
       double x_adj2 = x_adj * x_adj;
       for (size_t i = 1; i <= NCOEFF - 1; i++)
@@ -119,6 +119,10 @@ __ieee754_gamma_r (double x, int *signgamp)
 {
   int32_t hx;
   u_int32_t lx;
+#if FLT_EVAL_METHOD != 0
+  volatile
+#endif
+  double ret;
 
   EXTRACT_WORDS (hx, lx, x);
 
@@ -153,36 +157,69 @@ __ieee754_gamma_r (double x, int *signgamp)
     {
       /* Overflow.  */
       *signgamp = 0;
-      return DBL_MAX * DBL_MAX;
+      ret = DBL_MAX * DBL_MAX;
+      return ret;
     }
-  else if (x > 0.0)
+  else
     {
-      *signgamp = 0;
-      int exp2_adj;
-      double ret = gamma_positive (x, &exp2_adj);
-      return __scalbn (ret, exp2_adj);
+      SET_RESTORE_ROUND (FE_TONEAREST);
+      if (x > 0.0)
+	{
+	  *signgamp = 0;
+	  int exp2_adj;
+	  double tret = gamma_positive (x, &exp2_adj);
+	  ret = __scalbn (tret, exp2_adj);
+	}
+      else if (x >= -DBL_EPSILON / 4.0)
+	{
+	  *signgamp = 0;
+	  ret = 1.0 / x;
+	}
+      else
+	{
+	  double tx = __trunc (x);
+	  *signgamp = (tx == 2.0 * __trunc (tx / 2.0)) ? -1 : 1;
+	  if (x <= -184.0)
+	    /* Underflow.  */
+	    ret = DBL_MIN * DBL_MIN;
+	  else
+	    {
+	      double frac = tx - x;
+	      if (frac > 0.5)
+		frac = 1.0 - frac;
+	      double sinpix = (frac <= 0.25
+			       ? __sin (M_PI * frac)
+			       : __cos (M_PI * (0.5 - frac)));
+	      int exp2_adj;
+	      double tret = M_PI / (-x * sinpix
+				    * gamma_positive (-x, &exp2_adj));
+	      ret = __scalbn (tret, -exp2_adj);
+	    }
+	}
     }
-  else if (x >= -DBL_EPSILON / 4.0)
+  if (isinf (ret) && x != 0)
     {
-      *signgamp = 0;
-      return 1.0 / x;
+      if (*signgamp < 0)
+	{
+	  ret = -__copysign (DBL_MAX, ret) * DBL_MAX;
+	  ret = -ret;
+	}
+      else
+	ret = __copysign (DBL_MAX, ret) * DBL_MAX;
+      return ret;
     }
-  else
+  else if (ret == 0)
     {
-      double tx = __trunc (x);
-      *signgamp = (tx == 2.0 * __trunc (tx / 2.0)) ? -1 : 1;
-      if (x <= -184.0)
-	/* Underflow.  */
-	return DBL_MIN * DBL_MIN;
-      double frac = tx - x;
-      if (frac > 0.5)
-	frac = 1.0 - frac;
-      double sinpix = (frac <= 0.25
-		       ? __sin (M_PI * frac)
-		       : __cos (M_PI * (0.5 - frac)));
-      int exp2_adj;
-      double ret = M_PI / (-x * sinpix * gamma_positive (-x, &exp2_adj));
-      return __scalbn (ret, -exp2_adj);
+      if (*signgamp < 0)
+	{
+	  ret = -__copysign (DBL_MIN, ret) * DBL_MIN;
+	  ret = -ret;
+	}
+      else
+	ret = __copysign (DBL_MIN, ret) * DBL_MIN;
+      return ret;
     }
+  else
+    return ret;
 }
 strong_alias (__ieee754_gamma_r, __gamma_r_finite)
diff --git a/sysdeps/ieee754/flt-32/e_gammaf_r.c b/sysdeps/ieee754/flt-32/e_gammaf_r.c
@@ -97,7 +97,7 @@ gammaf_positive (float x, int *exp2_adj)
 		   * __ieee754_expf (-x_adj)
 		   * __ieee754_sqrtf (2 * (float) M_PI / x_adj)
 		   / prod);
-      exp_adj += x_eps * __ieee754_logf (x);
+      exp_adj += x_eps * __ieee754_logf (x_adj);
       float bsum = gamma_coeff[NCOEFF - 1];
       float x_adj2 = x_adj * x_adj;
       for (size_t i = 1; i <= NCOEFF - 1; i++)
@@ -111,6 +111,10 @@ float
 __ieee754_gammaf_r (float x, int *signgamp)
 {
   int32_t hx;
+#if FLT_EVAL_METHOD != 0
+  volatile
+#endif
+  float ret;
 
   GET_FLOAT_WORD (hx, x);
 
@@ -145,37 +149,69 @@ __ieee754_gammaf_r (float x, int *signgamp)
     {
       /* Overflow.  */
       *signgamp = 0;
-      return FLT_MAX * FLT_MAX;
+      ret = FLT_MAX * FLT_MAX;
+      return ret;
     }
-  else if (x > 0.0f)
+  else
     {
-      *signgamp = 0;
-      int exp2_adj;
-      float ret = gammaf_positive (x, &exp2_adj);
-      return __scalbnf (ret, exp2_adj);
+      SET_RESTORE_ROUNDF (FE_TONEAREST);
+      if (x > 0.0f)
+	{
+	  *signgamp = 0;
+	  int exp2_adj;
+	  float tret = gammaf_positive (x, &exp2_adj);
+	  ret = __scalbnf (tret, exp2_adj);
+	}
+      else if (x >= -FLT_EPSILON / 4.0f)
+	{
+	  *signgamp = 0;
+	  ret = 1.0f / x;
+	}
+      else
+	{
+	  float tx = __truncf (x);
+	  *signgamp = (tx == 2.0f * __truncf (tx / 2.0f)) ? -1 : 1;
+	  if (x <= -42.0f)
+	    /* Underflow.  */
+	    ret = FLT_MIN * FLT_MIN;
+	  else
+	    {
+	      float frac = tx - x;
+	      if (frac > 0.5f)
+		frac = 1.0f - frac;
+	      float sinpix = (frac <= 0.25f
+			      ? __sinf ((float) M_PI * frac)
+			      : __cosf ((float) M_PI * (0.5f - frac)));
+	      int exp2_adj;
+	      float tret = (float) M_PI / (-x * sinpix
+					   * gammaf_positive (-x, &exp2_adj));
+	      ret = __scalbnf (tret, -exp2_adj);
+	    }
+	}
     }
-  else if (x >= -FLT_EPSILON / 4.0f)
+  if (isinf (ret) && x != 0)
     {
-      *signgamp = 0;
-      return 1.0f / x;
+      if (*signgamp < 0)
+	{
+	  ret = -__copysignf (FLT_MAX, ret) * FLT_MAX;
+	  ret = -ret;
+	}
+      else
+	ret = __copysignf (FLT_MAX, ret) * FLT_MAX;
+      return ret;
     }
-  else
+  else if (ret == 0)
     {
-      float tx = __truncf (x);
-      *signgamp = (tx == 2.0f * __truncf (tx / 2.0f)) ? -1 : 1;
-      if (x <= -42.0f)
-	/* Underflow.  */
-	return FLT_MIN * FLT_MIN;
-      float frac = tx - x;
-      if (frac > 0.5f)
-	frac = 1.0f - frac;
-      float sinpix = (frac <= 0.25f
-		      ? __sinf ((float) M_PI * frac)
-		      : __cosf ((float) M_PI * (0.5f - frac)));
-      int exp2_adj;
-      float ret = (float) M_PI / (-x * sinpix
-				  * gammaf_positive (-x, &exp2_adj));
-      return __scalbnf (ret, -exp2_adj);
+      if (*signgamp < 0)
+	{
+	  ret = -__copysignf (FLT_MIN, ret) * FLT_MIN;
+	  ret = -ret;
+	}
+      else
+	ret = __copysignf (FLT_MIN, ret) * FLT_MIN;
+      return ret;
     }
+  else
+    return ret;
 }
 strong_alias (__ieee754_gammaf_r, __gammaf_r_finite)
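
The `volatile` declarations guarded by `#if FLT_EVAL_METHOD != 0` in both files serve to drop excess range and precision: storing to a volatile object forces the value into its declared type instead of leaving it in a wider register. A minimal sketch of the same idea, with a hypothetical helper name narrow_to_double that does not appear in the patch:

```c
#include <float.h>

/* Hypothetical helper: return VALUE with any excess range and
   precision removed, so it really is a double.  */
double
narrow_to_double (double value)
{
#if FLT_EVAL_METHOD != 0
  /* Expressions may be evaluated in a wider format; a volatile store
     forces rounding to double, including overflow to infinity.  */
  volatile double stored = value;
  return stored;
#else
  /* double expressions are already evaluated as double.  */
  return value;
#endif
}
```

On a target where intermediate results are kept in a wider format, an expression such as DBL_MAX * DBL_MAX may stay finite until it is narrowed, which is presumably why the patch assigns to `ret` before returning rather than returning the product directly.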