Subject: Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?
Group: Gcc
From: Andrew Haley
Date: 10 May 2008


Andrew Haley wrote:
> Paolo Bonzini wrote:
>>> I'd like to implement something similar for MaverickCrunch, using the
>>> integer 32-bit MAC functions, but there is no reciprocal estimate
>>> function on the MaverickCrunch. I guess a lookup table could be
>>> implemented, but how many entries will need to be generated, and how
>>> accurate will it have to be to be IEEE754 compliant (in the swdiv routine)?
>> I think sh does something like that. It is quite a mess, as it has half
>> a dozen ways to implement division.
>>
>> The idea is to use integer arithmetic to compute the right exponent, and
>> the lookup table to estimate the mantissa. I used something like this
>> for square root:
>>
>> 1) shift the entire FP number by 1 to the right (logical right shift)
>> 2) sum 0x20000000 so that the exponent is still offset by 64
>> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
>> 32-bit table
>> 4) sum the value (as a 32-bit integer!) with the content of the table
>> 5) perform 2 Newton-Raphson iterations as necessary
>
> To avoid the lookup table, calculate
>
> x = (a/2) + (8^(1/4) - 1)^2
>
> which gives relative errors less than 0.036 over the range 1/2 <= a <= 2
> at a cost of one shift and one addition. The errors after 1, 2, 3, and 4
> iterations of Heron's rule are 0.64E-3, 0.204E-6, 0.211E-13, and 0.222E-27.
>
> So, this requires one more iteration but avoids the use of a table and the
> corresponding memory hit.
>
> Source: Computer Approximations, Hart et al.

Sorry, a context switch: this is for sqrt, not division. Brain fade.

Andrew.
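
For illustration, the table-free square root quoted above might be sketched in C roughly as follows. This is only a sketch under stated assumptions, not code from the thread or from GCC: the function names (seed_constant, sqrt_seed, heron_step, sqrt_heron) are hypothetical, and the frexp/ldexp range reduction that brings an arbitrary positive input into the interval 1/2 <= a < 2 is an assumption added here so the seed formula can be applied.

```c
#include <assert.h>
#include <math.h>

/* Constant term of the linear seed, (8^(1/4) - 1)^2, roughly 0.4648. */
static double seed_constant(void)
{
    double t = pow(8.0, 0.25) - 1.0;
    return t * t;
}

/* Linear seed x = a/2 + c, meant for 1/2 <= a <= 2. */
static double sqrt_seed(double a)
{
    return 0.5 * a + seed_constant();
}

/* One step of Heron's rule: x <- (x + a/x) / 2. */
static double heron_step(double a, double x)
{
    return 0.5 * (x + a / x);
}

/* Hypothetical driver (an assumption, not from the message): reduce a
 * positive finite input to m in [1/2, 2) with an even exponent, seed,
 * iterate, then scale the result back. */
static double sqrt_heron(double a, int iterations)
{
    int e;
    double m, x;
    int i;

    assert(a > 0.0);
    m = frexp(a, &e);          /* a = m * 2^e with 0.5 <= m < 1 */
    if (e % 2 != 0) {          /* make the exponent even */
        m *= 2.0;              /* now 1 <= m < 2 */
        e -= 1;
    }
    x = sqrt_seed(m);
    for (i = 0; i < iterations; i++)
        x = heron_step(m, x);
    return ldexp(x, e / 2);    /* sqrt(a) = sqrt(m) * 2^(e/2) */
}
```

Calling sqrt_heron(a, 4) corresponds to the four-iteration case quoted above. Note that Heron's rule as written performs one division per step, so this sketch only illustrates the seed and the convergence, not a division-free routine.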

