# Numerical JS
These are some rough notes on writing numerical analysis tools in JavaScript. We begin with a discussion of floating point comparison à la Knuth, then examine some root-finding algorithms.
I am using JSDoc-style comments throughout.
# Floating Point Comparison

Knuth's TAOCP, vol. 2, gives an algorithm for floating point comparison (well, he uses \approx for equality, and notes when inequalities hold; we generalize this into one function `floatCompare()`). We need some helper functions, namely something like Java's `java.lang.Math/scalb` function, where `scalb(d, n)` computes `d * 2^n`.

This is something we should unit test. We may examine how OpenJDK implements their `java.lang.Math` version:
/**
 * Returns a floating-point power of two in the normal range.
 */
static double powerOfTwoD(int n) {
    assert(n >= DoubleConsts.MIN_EXPONENT && n <= DoubleConsts.MAX_EXPONENT);
    return Double.longBitsToDouble((((long)n + (long)DoubleConsts.EXP_BIAS) <<
                                    (DoubleConsts.SIGNIFICAND_WIDTH-1))
                                   & DoubleConsts.EXP_BIT_MASK);
}
// Constants used in scalb
static double twoToTheDoubleScaleUp = powerOfTwoD(512);
static double twoToTheDoubleScaleDown = powerOfTwoD(-512);

public static double scalb(double d, int scaleFactor) {
    // magnitude of a power of two so large that scaling a finite
    // nonzero value by it would be guaranteed to over or
    // underflow; due to rounding, scaling down takes an
    // additional power of two which is reflected here
    final int MAX_SCALE = DoubleConsts.MAX_EXPONENT + -DoubleConsts.MIN_EXPONENT +
                          DoubleConsts.SIGNIFICAND_WIDTH + 1;
    int exp_adjust = 0;
    int scale_increment = 0;
    double exp_delta = Double.NaN;

    // Make sure scaling factor is in a reasonable range
    if(scaleFactor < 0) {
        scaleFactor = Math.max(scaleFactor, -MAX_SCALE);
        scale_increment = -512;
        exp_delta = twoToTheDoubleScaleDown;
    }
    else {
        scaleFactor = Math.min(scaleFactor, MAX_SCALE);
        scale_increment = 512;
        exp_delta = twoToTheDoubleScaleUp;
    }

    // Calculate (scaleFactor % +/-512), 512 = 2^9, using
    // technique from "Hacker's Delight" section 10-2.
    int t = (scaleFactor >> 9-1) >>> 32 - 9;
    exp_adjust = ((scaleFactor + t) & (512 -1)) - t;

    d *= powerOfTwoD(exp_adjust);
    scaleFactor -= exp_adjust;

    while(scaleFactor != 0) {
        d *= exp_delta;
        scaleFactor -= scale_increment;
    }
    return d;
}
That's a lot of code; if we were to translate this into JavaScript, we should unit test it (to make sure we're doing it right). Conveniently, the people at OpenJDK have unit tests in `test.java.lang.Math.IeeeRecommendedTests/testDoubleScalb` --- recall JavaScript numbers are all double precision, so we should use the double precision tests.
In JavaScript, the function looks like:
var DoubleConsts = {};
/**
* Specifies the smallest exponent for a double, -1022, in hexadecimal.
*/
DoubleConsts.MIN_EXPONENT = -0x3FE;
/**
* Specifies the largest exponent for a double, 1023, in hexadecimal.
*/
DoubleConsts.MAX_EXPONENT = 0x3FF;
/**
* Specifies the bias for the exponent, 1023, in hexadecimal.
*/
DoubleConsts.EXP_BIAS = 0x3FF;
/**
* Specifies the bits in the Mantissa, 53 for a Double, in hexadecimal.
*/
DoubleConsts.SIGNIFICAND_WIDTH = 0x35;
/**
 * Specifies bits where the exponent "lives" in a Double. Kept as a
 * string because JavaScript's bitwise operators truncate to 32 bits;
 * it goes unused in the port below.
 * @todo make this endian-independent.
 */
DoubleConsts.EXP_BIT_MASK = "0x7ff0000000000000";
function fromNumberParts(sign, mantissa, exponent) {
    // Already implemented in my handwritten notes; the sketch below is
    // one possible version, assuming sign is +1 or -1, mantissa lies in
    // [1, 2), and the host is little-endian (see the endianness TODO below).
    var float = new Float64Array(1),
        bytes = new Uint8Array(float.buffer);
    float[0] = mantissa;  // significand bits set, biased exponent = 0x3FF
    var biased = exponent + DoubleConsts.EXP_BIAS;
    bytes[7] = (sign < 0 ? 0x80 : 0x00) | (biased >> 4);
    bytes[6] = (bytes[6] & 0x0f) | ((biased & 0xf) << 4);
    return float[0];
}
function powerOfTwoD(n) {
    // plain assert() doesn't exist in browsers or Node, so use console.assert
    console.assert(n >= DoubleConsts.MIN_EXPONENT && n <= DoubleConsts.MAX_EXPONENT);
    return fromNumberParts(1, 1, n);
}
const twoToTheDoubleScaleUp = powerOfTwoD(512);
const twoToTheDoubleScaleDown = powerOfTwoD(-512);
function scalb(d, scaleFactor) {
    var MAX_SCALE = DoubleConsts.MAX_EXPONENT
        - DoubleConsts.MIN_EXPONENT
        + DoubleConsts.SIGNIFICAND_WIDTH + 1;
    var exp_adjust = 0;
    var scale_increment = 0;
    var exp_delta = NaN;  // mirrors Java's Double.NaN; a hex string would be wrong here

    // Make sure scaling factor is in a reasonable range
    if (scaleFactor < 0) {
        scaleFactor = Math.max(scaleFactor, -MAX_SCALE);
        scale_increment = -512;
        exp_delta = twoToTheDoubleScaleDown;
    } else {
        scaleFactor = Math.min(scaleFactor, MAX_SCALE);
        scale_increment = 512;
        exp_delta = twoToTheDoubleScaleUp;
    }

    // Calculate (scaleFactor % +/-512), 512 = 2^9, using
    // technique from "Hacker's Delight" section 10-2.
    var t = (scaleFactor >> 9-1) >>> 32 - 9;
    exp_adjust = ((scaleFactor + t) & (512 - 1)) - t;

    d *= powerOfTwoD(exp_adjust);
    scaleFactor -= exp_adjust;

    while (scaleFactor != 0) {
        d *= exp_delta;
        scaleFactor -= scale_increment;
    }
    return d;
}
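As a quick sanity check, here is a minimal, framework-free sketch; these particular cases are mine, not taken from the `testDoubleScalb` suite, and they lean on the fact that `scalb(d, n)` should agree with `d * 2^n` whenever the result is representable:

// A few hand-picked sanity checks for scalb.
console.assert(scalb(1.0, 0) === 1.0);
console.assert(scalb(1.0, 3) === 8.0);
console.assert(scalb(3.0, 4) === 48.0);
console.assert(scalb(1.5, -2) === 0.375);
// scaling all the way down to the smallest subnormal exercises the loop
console.assert(scalb(1.0, -1074) === Number.MIN_VALUE);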
TODO: Unlike Java, JavaScript has to be mindful of Big Endian and Little Endian numbers. The implementation given will work for, say, x86-64 processors (read: most of them), but not for old-timey SPARC or other exotic processors.
TODO: It may be instructive to compare the OpenJDK version with Apache's util.fastmath/scalb() implementation.
## Back to Floating Point Comparison

We then have three arguments to our floating point comparison: `x` and `y` (the floating point numbers we're comparing), and an optional `eps` specifying the tolerance with which we say "`x` is approximately `y`". If `eps` is not given, we will use the square root of machine epsilon.
function floatCompare(x, y, eps) {
    if (!eps) eps = sqrtEpsilon;
    // scale the tolerance by the exponent of the larger input
    var exponent = getExponent(Math.abs(x) > Math.abs(y) ? x : y);
    var delta = scalb(eps, exponent);
    var diff = x - y;
    if (diff > delta) return 1;
    if (diff < -delta) return -1;
    return 0;
}
Needless to say, we get a quick and useful function for floating-point equality based on this:
function floatEquals(x, y, eps) {
    return floatCompare(x, y, eps) === 0;
}
We will use this when checking if some numbers are "equal" (in some sense).
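For instance (my own example), the classic `0.1 + 0.2` headache: strict equality rejects it, but `floatEquals` accepts it, since the difference is far below the scaled tolerance:

console.log(0.1 + 0.2 === 0.3);            // false: the sum is 0.30000000000000004
console.log(floatEquals(0.1 + 0.2, 0.3));  // true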
# Root Finding Algorithms

Aside from Horner's method, this is probably one of the first things taught in numerical analysis. Suppose we have some complicated function like `f(x) = Math.pow(1 - Math.cos(x), Math.sin(x))`. How can we find some `x` where `f(x) === 0`? That is to say, how can we find a root of `f(x)`?
The simplest method is to first find some point `a` where `f(a) < 0`, and some point `b` where `f(b) > 0`; then, provided `f` is continuous, the intermediate value theorem tells us there must be a point `x` such that `a < x < b` and `f(x) === 0`. How do we find it?
Simple: we guess. We look at `c = (a + b)/2`. If `f(c) < 0`, then we set `a = c` and try again. If `f(c) > 0`, then we set `b = c` and try again. The only other case is `f(c) === 0`, in which case we can return `c` as the root and we're done.
function sgn(x) {
    if (x < 0) return -1;
    if (x > 0) return +1;
    return 0;
}

function bisection(fn, a, b, tol) {
    var sgnA = sgn(fn(a)),
        sgnB = sgn(fn(b));
    // the endpoints must bracket a root
    if (sgnA === sgnB) return undefined;
    var tolerance = (!tol ? sqrtEpsilon : tol);
    var midpoint, n, width, y;
    width = (b - a);
    for (n = 0; n < DoubleConsts.SIGNIFICAND_WIDTH; n++) {
        midpoint = (a + b)*0.5;
        width *= 0.5;
        y = fn(midpoint);
        if (y === 0.0 || width < tolerance)
            return midpoint;
        else if (sgnA === sgn(y))
            a = midpoint;   // the root lies in [midpoint, b]
        else
            b = midpoint;   // the root lies in [a, midpoint]
    }
    return midpoint;
}
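As a quick usage sketch (my own example): `Math.cos` is positive at 1 and negative at 2, so bisection should recover pi/2:

// cos(1) > 0 and cos(2) < 0, so a root (namely pi/2) lies in [1, 2]
var halfPi = bisection(Math.cos, 1, 2);
console.log(Math.abs(halfPi - Math.PI/2) < 1e-7);  // true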
It is well known that each iteration gives us one more bit of precision, so we should iterate at most 53 times (the number of bits in the significand for a double). This is great, but its convergence is linear, and one bit per iteration is pretty slow... can we do better?
## Newton's Method

Newton's method is quicker, but requires different data. The basic scheme is to consider a sequence of numbers `x[0]`, `x[1]`, etc. If `f(x[n+1]) === 0`, then the difference-quotient approximation to the derivative gives `f'(x[n]) === (f(x[n+1]) - f(x[n]))/(x[n+1] - x[n])`, or equivalently `f'(x[n]) === (0 - f(x[n]))/(x[n+1] - x[n])`. After some arithmetic, we find `x[n+1] = x[n] - f(x[n])/f'(x[n])`.
But look: we have a method for finding `x[n+1]`! We just plug in some value for `x[0]`, and this formula should inductively get us to `x[n+1]`, right?
Most of the time, yes. There are some cases we have to consider carefully before naively using Newton's method. We won't worry about that here. Instead, we'll consider the ingredients for this algorithm: some function `fn`, its derivative `df`, and a starting point `initialGuess`:
function newtonRootMethod(fn, df, initialGuess) {
    var y, dy, x;
    x = initialGuess;
    for (var n = 0; n < 9; n++) {
        dy = df(x);
        y = fn(x);
        // @todo should check that y/dy isn't going to cause problems
        x = x - (y/dy);  // the Newton update x - f(x)/f'(x)
    }
    return x;
}
The convergence for this method is quadratic, i.e., the number of bits of precision doubles with each iteration.
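For instance (a hypothetical example of mine), computing the square root of 2 as the root of `x*x - 2`:

var root2 = newtonRootMethod(
    function (x) { return x*x - 2; },  // fn
    function (x) { return 2*x; },      // its derivative df
    1.0);                              // initialGuess
console.log(root2);  // 1.4142135623730951, i.e., Math.sqrt(2) (possibly off in the last ulp)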
But the problem is: this method may not even converge. For example:

- The function `f(x) = 1 - x*x` would experience problems if `initialGuess = 0`, since the derivative vanishes there and we'd divide by zero.
- The cube root `Math.pow(x, 1/3)` doesn't have a derivative defined at zero (worse, its Newton iterates satisfy `x[n+1] = -2*x[n]` and diverge).
- The function `x*x*x - 2*x + 2` has Newton's method enter a cycle and never settle down (see the sketch after this list).
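To see that last failure concretely, here is a small illustration of mine, tracing the iterates by hand:

// Newton's method for x^3 - 2x + 2 starting at 0 cycles forever:
// f(0) = 2 and f'(0) = -2 send 0 to 1; f(1) = 1 and f'(1) = 1 send 1 back to 0.
var f  = function (x) { return x*x*x - 2*x + 2; };
var df = function (x) { return 3*x*x - 2; };
var x = 0;
for (var n = 0; n < 6; n++) {
    console.log(x);       // prints 0, 1, 0, 1, 0, 1
    x = x - f(x)/df(x);
}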
And so on. It'd be nice if we could have a method converge faster than linear ("one bit at a time"), but avoid the problems of Newton's method.
## Secant Method

There's one way to avoid computing the derivative analytically before finding the root, and that's to compute it numerically. Recall the Taylor expansion for a function, `f(x+h) = f(x) + h*f'(x) + O(h**2)`; we can approximate the derivative as `(f(x+h) - f(x))/h`. Let `h = x[n+1] - x[n]`, and we're in business!
This is the so-called "secant method". We can write its code down immediately:
var secantMethod = function secantMethod(fn, a, b) {
    var tmp, x, xPrime, y, yPrime;
    x = a;
    if (b !== undefined)  // a bare "if (b)" would wrongly reject b === 0
        xPrime = b;
    else
        xPrime = a + sqrtEpsilon;
    y = fn(x);
    yPrime = fn(xPrime);
    for (var n = 0; n < 13; n++) {
        if (floatEquals(y, 0.0))
            return x;
        // avoid roundoff errors with this approach
        tmp = (x*yPrime - xPrime*y)/(yPrime - y);
        y = yPrime;
        x = xPrime;
        xPrime = tmp;
        yPrime = fn(xPrime);
    }
    return xPrime;
};
Observe that this is just Newton's method, if we used the finite difference approximation to the derivative instead of the honest derivative.
One subtlety worth noting is that we end up with a rational expression, which I have expanded out in `tmp = (...);`. The reason is that we avoid problems with really small numbers this way (e.g., when dealing with `sin(x)` at `x = 1`, we get the right answer with this approach, but not the naive way).
The order of convergence is `phi = (1 + sqrt(5))/2` (approximately 1.6), which is better than bisection but worse than Newton.
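As a usage example (mine, mirroring the bisection sketch above), the same pi/2 computation:

// cos changes sign on [1, 2]; the secant method homes in on pi/2
console.log(secantMethod(Math.cos, 1, 2));  // approximately 1.5707963267948966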
# Helper Functions

We have a helper function to get the number parts, specifically the bits for the mantissa, the exponent, and the sign bit. It turns out getting the exponent by itself occurs frequently, so I made that its own helper function.
/**
 * Taken from @link {http://stackoverflow.com/a/17156580/1296973}
 *
 * @todo Make this function endian-independent.
 *
 * @param {number} x, some double precision number
 * @returns {Object} a JSON object consisting of its mantissa, sign, and exponent
 */
function getNumberParts(x) {
    var float = new Float64Array(1),
        bytes = new Uint8Array(float.buffer);  // byte-level view of the double
    float[0] = x;
    var sign = bytes[7] >> 7,
        exponent = ((bytes[7] & 0x7f) << 4 | bytes[6] >> 4) - 0x3ff;
    // overwrite the sign and exponent bits so float[0] becomes the bare mantissa
    bytes[7] = 0x3f;
    bytes[6] |= 0xf0;
    return {
        sign: sign,
        exponent: exponent,
        mantissa: float[0]
    };
}
function getExponent(x) {
    return getNumberParts(x).exponent;
}
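A couple of quick examples of these helpers in action (my own checks, assuming a little-endian host):

console.log(getNumberParts(-1.5));  // { sign: 1, exponent: 0, mantissa: 1.5 }
console.log(getExponent(8));        // 3, since 8 === 1.0 * 2^3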
I needed machine epsilon, and the square root of machine epsilon, frequently, so I opted to define them as constants. I later found out that older versions of Node.js do not have machine epsilon defined as a static field on `Number`, so I had to write my own function to compute machine epsilon if I can't use the `Number.EPSILON` field.
function computeEpsilon() {
    if (Number.EPSILON) return Number.EPSILON;
    var k = 1.0;
    while (1.0 !== 1.0 + k) {
        k = k*0.5;
    }
    // the loop exits one halving too far (1.0 + k rounded down to 1.0),
    // so scale back up to get the classic 2^-52
    return 2*k;
}
/** Machine epsilon, either given as Number.EPSILON or computed from scratch
 * @constant
 * @type {number}
 * @default
 */
const machineEpsilon = computeEpsilon();
/** The square root of machine epsilon
 * @constant
 * @type {number}
 * @default
 */
const sqrtEpsilon = Math.sqrt(machineEpsilon);
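On any IEEE 754 double-precision engine, both paths should agree on the classic value:

console.log(machineEpsilon === Math.pow(2, -52));  // true
console.log(sqrtEpsilon);                          // roughly 1.4901161193847656e-8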