https://eclipse.umbc.edu/robucci/cmpeRSD/Lectures/Lecture19__Implementation_of_Elementary_Functions/
Lecture 19 – Implementations of Elementary Functions
References
- Koren, Israel. Computer arithmetic algorithms, 2nd ed. AK Peters/CRC Press, 2001.
Implementation of Functions
- Computation of elementary functions admits several approaches, including lookup tables (LUTs), polynomial approximations, iterative methods, and combinations thereof.
LUT
- LUT/ROM: store the result for every possible input; unfortunately, exhaustive storage can be too large
- 32-bit input implies 4 billion entries!!
Interpolation (+LUT)
- Interpolation may be linear, cubic spline, etc.
- Store a finite set of points in a table, e.g. (x_k, f(x_k))
- Estimate f(x) in the interval x_k <= x < x_{k+1} using the x_k and f(x_k) stored in a LUT
- Linear Interpolation: f(x) ≈ f(x_k) + (x − x_k) · (f(x_{k+1}) − f(x_k)) / (x_{k+1} − x_k)
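As a concrete sketch (the function sin, the 16-entry table size, and the [0, π/2] range are illustrative choices, not from the lecture), a LUT plus linear interpolation might look like:

```python
import math

N = 16                                # number of uniform intervals
X_MAX = math.pi / 2
STEP = X_MAX / N
lut = [math.sin(k * STEP) for k in range(N + 1)]   # N+1 stored points

def sin_interp(x):
    """Linear interpolation: f(xk) + (x - xk) * slope of interval k."""
    k = min(int(x / STEP), N - 1)     # index of interval containing x
    xk = k * STEP
    slope = (lut[k + 1] - lut[k]) / STEP
    return lut[k] + (x - xk) * slope

# Worst-case error over a dense sweep of the input range
err = max(abs(sin_interp(i * X_MAX / 1000) - math.sin(i * X_MAX / 1000))
          for i in range(1001))
print(err)
```

Even this small table keeps the worst-case error near 1e-3; doubling N roughly quarters the error, since linear-interpolation error scales with the interval width squared.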
MAPPING AND INTERPOLATION
- need to design a mapping (scale and offset) for the encoding:
  digital encoding = (x − offset) / scale
- Choice of LUT Indexing for Linear Interpolation
- May use uniformly spaced points to create equal-sized intervals, or non-uniform spacing
- Irregular spacing requires a search for the relevant entries, while uniform intervals (x_{k+1} − x_k = Δ) allow more direct referencing: the index is an offset related to x, k = floor((x − x_0)/Δ)
- Not all non-uniform spacing is irregular, for instance indexes of logarithmically spaced samples (10 per decade) can also be directly computed
- A useful approach for uniform interval indexes is to use an interval size such that the first few bits of x are used directly as the index into the LUT
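For example, with a hypothetical 16-bit fractional encoding and 16 intervals, the index and in-interval offset fall directly out of the bit layout:

```python
# Hypothetical fixed-point layout: x is a 16-bit unsigned fraction in [0, 1).
# With 2^4 = 16 uniform intervals, the top 4 bits of x ARE the LUT index,
# and the remaining 12 bits are the offset within the interval.
FRAC_BITS = 16
INDEX_BITS = 4

def split_index(x_fixed):
    index = x_fixed >> (FRAC_BITS - INDEX_BITS)               # top 4 bits
    offset = x_fixed & ((1 << (FRAC_BITS - INDEX_BITS)) - 1)  # low 12 bits
    return index, offset

# 0.75 in Q0.16 is 0xC000: index 0b1100 = 12, offset 0
print(split_index(0xC000))   # (12, 0)
```

No multiply or divide is needed to locate the interval, which is the point of choosing the interval size as a power of two.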
Polynomial Approximation
- Polynomials
- Ratio of Polynomials
- Piecewise Approximation
TAYLOR SERIES EXPANSION
- Polynomial approximation in the vicinity of a point
- Ex: e^x = 1 + x + x²/2! + x³/3! + …
- requires a large number of terms for accuracy when x is far from the expansion point, though few terms are needed when x is close to 0
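A small sketch illustrating this trade-off (the term count and sample points are arbitrary choices):

```python
import math

def exp_taylor(x, n_terms):
    """Partial sum of e^x = 1 + x + x^2/2! + ... expanded around 0."""
    total, term = 0.0, 1.0
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)   # next term: x^n/n! -> x^(n+1)/(n+1)!
    return total

# Near the expansion point, few terms suffice; far away, many are needed.
print(abs(exp_taylor(0.1, 4) - math.exp(0.1)))  # tiny error with 4 terms
print(abs(exp_taylor(5.0, 4) - math.exp(5.0)))  # large error with 4 terms
```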
APPROXIMATION USING CHEBYSHEV POLYNOMIALS
Use Chebyshev polynomial basis functions to approximate a function over an interval
Better than approximating by truncating a high-order polynomial, i.e. just omitting its higher-order terms
- e.g. better than approximating e^x = 1 + x + x²/2! + x³/3! + …
by its first few terms
Example approx. using "Chebyshev polynomials of the 1st kind":
Available Basis: T_0(x) = 1, T_1(x) = x, T_2(x) = 2x² − 1, T_3(x) = 4x³ − 3x, …
(https://en.wikipedia.org/wiki/Chebyshev_polynomials)
Compute weights according to c_n = (k/π) ∫ from −1 to 1 of f(x) T_n(x)/√(1 − x²) dx, where f is the polynomial or function to approximate, and k is 1 for n=0 and 2 otherwise
Create the polynomial approximation f(x) ≈ Σ_n c_n T_n(x) with the computed weights
Example:
Taylor Series: (truncated expansion of the function to approximate)
Chebyshev Polynomial Weights: c_0, c_1, c_2, c_3, c_4, c_5
additional note for future reference: if evaluating over a range other than −1 to 1, then use an input mapping function to shift the center of the range to 0 and scale the range
- Ex: if interested in f(x) with x in range 2 to 5, instead approximate g(u) = f(3.5 + 1.5u) over −1 ≤ u ≤ 1, where u = (x − 3.5)/1.5
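A pure-Python sketch of the whole recipe (the quadrature node count and the choice of e^x on [−1, 1] as the target are illustrative assumptions):

```python
import math

def cheb_weights(f, n_max, n_quad=64):
    """Weights c_n = (k/pi) * integral of f(x) T_n(x)/sqrt(1-x^2) over [-1,1],
    k = 1 for n=0 and 2 otherwise, computed by Gauss-Chebyshev quadrature."""
    thetas = [(2*j + 1) * math.pi / (2 * n_quad) for j in range(n_quad)]
    c = []
    for n in range(n_max + 1):
        k = 1 if n == 0 else 2
        s = sum(f(math.cos(t)) * math.cos(n * t) for t in thetas)
        c.append(k * s / n_quad)
    return c

def cheb_eval(c, x):
    """Evaluate sum of c_n T_n(x) via the recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    t_prev, t_cur, total = 1.0, x, c[0]
    if len(c) > 1:
        total += c[1] * x
    for n in range(2, len(c)):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
        total += c[n] * t_cur
    return total

# Degree-5 Chebyshev approximation of e^x on [-1, 1]
c = cheb_weights(math.exp, 5)
err = max(abs(cheb_eval(c, -1 + i/500) - math.exp(-1 + i/500))
          for i in range(1001))
print(err)
```

The worst-case error stays below 1e-4 across the whole interval, roughly an order of magnitude better than a 5th-degree Taylor truncation evaluated at the interval edges.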
RATIONAL POLYNOMIAL APPROX.
- Approx. using a Ratio of Polynomials: f(x) ≈ P(x)/Q(x)
- Ex: 2^x
- let x = i + f, where i is an integer and f is a fractional value
- computing 2^i is trivial (a shift)
- 2^f ≈ P(f)/Q(f), where the coefficients of P and Q are known constants
10 multiplications, 10 additions, 1 division
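A hedged sketch of the idea (the [2/2] Padé approximant of e^u stands in for the lecture's unspecified constants):

```python
import math

def exp2_rational(x):
    """2^x = 2^i * 2^f: split into integer and fraction, then approximate
    2^f = e^(f ln 2) with the [2/2] Pade rational approximant of e^u,
    e^u ~ (u^2 + 6u + 12) / (u^2 - 6u + 12)  (an illustrative choice,
    not the constants from the lecture)."""
    i = math.floor(x)
    f = x - i                        # f in [0, 1)
    u = f * math.log(2.0)
    num = (u + 6.0) * u + 12.0       # Horner form of the numerator
    den = (u - 6.0) * u + 12.0       # Horner form of the denominator
    return (2.0 ** i) * (num / den)  # 2^i is a trivial shift in hardware

print(abs(exp2_rational(7.3) - 2 ** 7.3))
```

A low-degree ratio often matches the accuracy of a much higher-degree plain polynomial at the cost of one division.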
PIECEWISE APPROXIMATION
- Piecewise Polynomial Approximation
- Break function into multiple intervals
- Create Polynomial Approximation for each interval (e.g. Taylor series expansion around middle of interval)
- Store polynomial approximation for each interval in a lookup table (LUT)
- At run time:
- decide interval to use,
- use LUT to recall parameters for that interval
- compute approximation
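The three run-time steps can be sketched as follows (e^x on [0, 1), 8 intervals, and degree-1 Taylor pieces about each midpoint are illustrative choices):

```python
import math

# Build the LUT: per interval, store the degree-1 Taylor coefficients
# (center, f(center), f'(center)) of e^x about the interval midpoint.
N = 8
table = []
for k in range(N):
    mid = (k + 0.5) / N
    table.append((mid, math.exp(mid), math.exp(mid)))  # f = f' for e^x

def exp_piecewise(x):
    k = int(x * N)                 # 1. decide which interval to use
    c, f0, f1 = table[k]           # 2. recall that interval's parameters
    return f0 + f1 * (x - c)       # 3. compute the approximation

err = max(abs(exp_piecewise(i / 1000) - math.exp(i / 1000))
          for i in range(1000))
print(err)
```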
Partial Results LUT
Consider Y = a^X, where X is a 32-bit UNSIGNED fractional argument, X = 0.x_1 x_2 … x_32 = Σ_{i=1..32} x_i 2^-i
Y can be iteratively computed as Y ← Y · (a^(2^-i))^(x_i), for i = 1 to 32,
with Y initialized to 1
One choice for implementing the iteration is to use a lookup table storing the 32 precomputed entries a^(2^-i). This requires a lookup and a multiply per iteration.
But, there is a way to construct the iterations and LUT to make the multiplication very cheap (a shift and an add)...
Interestingly, it involves a lookup table for ln(1 + s_i 2^-i) values instead of a^(2^-i) values, with a pair of coupled iteration equations
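The straightforward LUT-and-multiply version might look like this (base a = 2 is an arbitrary choice for the demo):

```python
# Partial-results LUT: precompute a^(2^-i) for each bit position, then
# one lookup and one full multiply per set bit of the fraction X.
A = 2.0                      # hypothetical base a
BITS = 32
lut = [A ** (2.0 ** -(i + 1)) for i in range(BITS)]  # a^(2^-1) .. a^(2^-32)

def pow_frac(x_bits):
    """Compute a^X for a 32-bit unsigned fraction X = x_bits / 2^32."""
    y = 1.0
    for i in range(BITS):
        if (x_bits >> (BITS - 1 - i)) & 1:   # bit x_{i+1} of the fraction
            y *= lut[i]                       # Y = Y * a^(2^-(i+1))
    return y

x = 0.71875                  # = 0.10111 in binary
print(abs(pow_frac(int(x * 2**32)) - A ** x))   # agrees closely
```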
Additive and Multiplicative Normalization
- Additive and Multiplicative Normalization are two approaches to drive a variable towards a target value
- These algorithms record the decisions/values chosen along the way, which are then used to construct the answer
- Earlier we saw Multiplicative Normalization in division through multiplication, moving the denominator towards a target value of 1
- Additive Normalization, used in the next approach, moves a variable to a target value using add/sub operations.
Exponentiation via Additive Normalization
- Iterative approximation for e^X: Y_{i+1} = Y_i · b_i
- b_i in each iteration will be restricted to a set that makes the update inexpensive
- i will be a parameter of the restriction, which makes the possible manipulation of Y unique to each iteration
- s_i is a selector allowing choice within the allowable set in each iteration, but the decision rule is also based on the X iteration
- the selector in each iteration is chosen to make the successive-approximation update move towards the desired X
- avoid the expense of the run-time computation of ln(b_i) by precomputing all possible values of ln(1 + s_i 2^-i) that might be used, and storing them in a LUT for use during run:
Restriction on b_i to simplify the multiplication: b_i = 1 + s_i 2^-i, with s_i ∈ {−1, 0, 1}
Therefore the update Y_{i+1} = Y_i (1 + s_i 2^-i) = Y_i + s_i Y_i 2^-i requires only shift and add/sub operations
Remember, for a given input X, the particular b_i in a given iteration is selected by s_i, and is so selected according to the desire to make Σ_i ln(b_i) = Σ_i ln(1 + s_i 2^-i)
converge to X: then Y = Π_i b_i = exp(Σ_i ln b_i) → e^X
RANGE OF REPRESENTABLE VALUES
- The range of representable values can be found by assuming the selection argument s_i is the same in every iteration: X_max = Σ_{i=1..m} ln(1 + 2^-i) and X_min = Σ_{i=1..m} ln(1 − 2^-i)
It also depends on the number of bits or iterations m. For just 10 iterations we can say roughly −1.24 ≤ X ≤ 0.87, though typically even more bits in the argument are desired
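These bounds can be sanity-checked numerically (assuming the iteration index i runs from 1 to m, consistent with the fractional arguments used below):

```python
import math

# Range of representable X when s_i is +1 (or -1) in every iteration
m = 10
x_max = sum(math.log(1 + 2.0 ** -i) for i in range(1, m + 1))
x_min = sum(math.log(1 - 2.0 ** -i) for i in range(1, m + 1))
print(x_min, x_max)   # roughly -1.24 and 0.87
```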
ITERATIVE CONVERGENCE TO X
- If we restrict X to be POSITIVE fractional values, we can use a binary selection rule, s_i ∈ {0, 1}, to subtract or keep (instead of add, sub, keep)
- My preferred bookkeeping for additive normalization is to iteratively move a residual value towards 0:
- Initialize R = X, Y = 1
- Step i:
  - if R ≥ ln(1 + 2^-i)
    - R = R − ln(1 + 2^-i), Y = Y + Y · 2^-i
  - else
    - leave R and Y unchanged
- When R reaches ≈ 0, Y ≈ e^X
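A floating-point sketch of this bookkeeping (32 iterations is an arbitrary choice; real hardware would use fixed-point R and Y):

```python
import math

# Exponentiation by additive normalization for a positive fractional X:
# drive the residual R toward 0 by subtracting precomputed ln(1 + 2^-i)
# terms, applying the matching cheap update to Y at each step.
ITERS = 32
LN_LUT = [math.log(1 + 2.0 ** -i) for i in range(1, ITERS + 1)]

def exp_additive_norm(x):
    r, y = x, 1.0
    for i in range(1, ITERS + 1):
        ln_b = LN_LUT[i - 1]
        if r >= ln_b:             # selector s_i = 1: take this factor
            r -= ln_b             # R = R - ln(1 + 2^-i)
            y += y * 2.0 ** -i    # Y = Y + Y*2^-i (shift and add in hardware)
        # else s_i = 0: keep R and Y unchanged
    return y

print(abs(exp_additive_norm(0.5) - math.exp(0.5)))   # small residual error
```

Only the comparison, the table lookup, and shift/add operations appear in the loop; no run-time multiply or logarithm is needed.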
SIMPLIFYING THE SELECTION RULE
- Sometimes the iterative steps can be simplified by analyzing the numerical possibilities in the "binary representation" of the hardware domain.
- Examine the subtractive term: ln(1 + 2^-i) = 2^-i − 2^-2i/2 + 2^-3i/3 − …
- Only the first term affects bit -i
- Therefore, in each iteration we can choose to cancel the -i bit in the remainder R, using only selection values 0,1
- This simplifies the selection rule to just checking the bit at -i and requires one iteration per bit of the input argument (R=X).
ALLOWING LARGER THAN FRACTIONAL ARGUMENTS
Let x be represented by x = (I + f) · ln 2, where
- I is an integer
- f is a fractional value with range 0 ≤ f < 1, and
- S is a scale of the fixed-point encoding of x
- Find I and f by scaling x down by ln 2 and splitting the result at the binary point (I.f)
Therefore we can compute e^x = 2^I · e^(f ln 2) by augmenting the previous method with a shift:
set R = f ln 2; this is a fraction (at most ln 2 ≈ 0.694, within the convergence range)
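Putting the pieces together, a sketch of e^x for arguments outside the fractional range (again in floating point for readability):

```python
import math

# x / ln 2 = I.f, so e^x = 2^I * e^(f ln 2), where R = f ln 2 < 0.694
# stays inside the convergence range of the additive-normalization loop.
ITERS = 32
LN_LUT = [math.log(1 + 2.0 ** -i) for i in range(1, ITERS + 1)]

def exp_any(x):
    q = x / math.log(2.0)
    i_part = math.floor(q)           # I: integer part
    f = q - i_part                   # f: fractional part, 0 <= f < 1
    r, y = f * math.log(2.0), 1.0    # R = f ln 2, a fraction
    for i in range(1, ITERS + 1):
        if r >= LN_LUT[i - 1]:
            r -= LN_LUT[i - 1]       # subtract the LUT entry
            y += y * 2.0 ** -i       # matching shift-and-add update to Y
    return y * 2.0 ** i_part         # final shift by I

print(abs(exp_any(3.7) - math.exp(3.7)) / math.exp(3.7))  # small relative error
```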
CORDIC
- A popular method for iterative computation of trigonometric functions is called CORDIC (COordinate Rotation Digital Computer)
- reduces trig functions to a series of only shift and add/sub operations, with one multiplicative gain correction at the end
- next lecture ...