An Error-Compensating Digital Differential Analyzer (EC-DDA)


Bresenham's line drawing algorithm is the de-facto standard for efficient line drawing on rasterized output devices such as computer screens or plotters. The algorithm offers a number of benefits: it is based solely on simple integer operations, which are fast and cheap to implement; it minimizes the error that arises due to discretization; and it is order-invariant. In contrast, a naive digital differential analyzer (DDA) requires comparatively complex floating-point arithmetic, accumulates rounding errors due to the finite representation of rational numbers whose binary expansions may be infinite, and is not order-invariant. However, on a modern desktop processor floating-point instructions are issued to the pipeline just as fast as integer operations, and a DDA requires fewer instructions than Bresenham's algorithm.

An idea I publicly described in 2003 and implemented shortly afterwards allows one to upper-bound the cumulative numeric error that may arise in the DDA and, based on that bound, to adapt the differential such that the result is an "error-compensating" DDA (EC-DDA) with the property that all pixels along the line attain correct rounding provided that their coordinate values are smaller than some predetermined maximum (e.g., 2^21 for IEEE 64-bit floating-point numbers). The new algorithm has been shown to be significantly faster than traditional Bresenham line drawing on modern desktop processors.


A naive way of drawing a line from point (x1, y1) to point (x2, y2) is based on computing the slope (also called the differential) between the two endpoints and then evaluating the line equation at each position along one coordinate axis. For example, one might compute the slope as

slope := dy/dx

where dx = x2 - x1 and dy = y2 - y1, and then evaluate the y-coordinate for each point with a given x-coordinate between x1 and x2 as

y := y1 + (x - x1)*slope.

Assuming that the two endpoints' coordinate values are integers, one can consider x to range over all integers between x1 and x2. Then each y will be some rational number, which must be discretized, that is, rounded to some integer, for display on the output device. Round-to-nearest (denoted simply as round) minimizes the error introduced by discretization, and we may adopt the convention that half-way values (e.g., 0.5, 1.5, etc.) are always rounded up to the next-bigger integer. Now one can draw the line using some pixel routine as

for x := x1 to x2
    y := y1 + round((x - x1)*slope)
    pixel(x, y)
end for

The algorithm above assumes that x2 > x1 and |slope| ≤ 1; adapting it to the remaining cases is a simple matter and omitted here for clarity.

What is more important is that the simple algorithm above has two notable problems that make it impractical. From a performance point of view, the algorithm is rather slow: each iteration of the loop carries out two floating-point instructions, one multiplication and one rounding operation, both of which are comparatively expensive. Floating-point instructions are often not available at all on resource-constrained devices such as digital signal processors, where they would add considerably to cost and energy consumption, and they have traditionally been much slower than integer operations even on general-purpose processors.

The second and more concerning problem of the algorithm is that it is incorrect: intermediate pixels are not guaranteed to be assigned their "best" coordinate values on the output device, that is, those values that minimize the discretization error. This is because the slope value, a rational number, can only be represented with finite precision as a floating-point value, and thus a minimal error in the slope value may result in a wrong rounding towards one or the other integer. Moreover, the appearance of the line may depend on whether it is drawn from one endpoint to the other or in the reverse order.

As an example, consider drawing a line from point (0, 0) to point (10, 1). The actual slope value is 0.1; suppose that, due to limited precision, the stored slope is the slightly smaller value 0.099999904632568359375. Evaluating y at x = 5 gives 0.499999523162841796875 as an intermediate result, which gets rounded to 0 rather than 1. Reversing the endpoints and thus drawing the line from (10, 1) to (0, 0) yields an intermediate value of y at x = 5 of 0.500000476837158203125, which gets rounded to 1. One might adopt a convention of drawing lines only in a certain direction (e.g., only with positive slope values); however, a consistent rounding result for half-way values cannot be guaranteed that way.

The issue of rounding errors could be reduced by substituting the slope variable in the algorithm with its definition and carrying out the multiplication before the division. In detail, y could be computed as

y := y1 + round((x - x1)*(y2 - y1)/(x2 - x1)).

Unfortunately, this not only increases the number of floating-point instructions in the loop to three but also introduces a particularly slow operation, a floating-point division. A different solution is thus indicated.

Bresenham's Line Drawing Algorithm

One approach for avoiding the issue of rounding errors is to replace floating-point operations by integer operations. Looking at the simple algorithm above one can observe that the floating-point multiplication with the slope value can be easily avoided by replacing it with an iterative addition. That is, the algorithm can be rewritten as

y := y1
for x := x1 to x2
    pixel(x, round(y))
    y := y + slope
end for

Such an algorithm is called a digital differential analyzer (DDA). While this algorithm is still conceptually simple, the problem of rounding errors is aggravated: the error by which the slope value is affected accumulates with each iteration of the loop. On the other hand, the algorithm is faster than the previous one because it replaces the floating-point multiplication with a simpler addition.

One may now study the instances at which round(y) is incremented by 1 compared to its value in the previous iteration of the loop. (We may assume a positive slope < 1 without loss of generality; y is then monotonically increasing.) These instances occur precisely when y has crossed the half-way point between two integer values, that is, y - round(y) < 0 in the current iteration of the loop whereas y - round(y) ≥ 0 in the previous iteration. Let e := y - round(y); then e can be tracked by initializing it with 0, accumulating the slope value in each iteration of the loop and decrementing it by 1 whenever e ≥ 0.5. This variable represents the error that is introduced by discretizing y to round(y).

Using e, one may rewrite the algorithm as

y := y1
e := 0
for x := x1 to x2
    pixel(x, y)
    e := e + slope
    if e ≥ 0.5 then
        e := e - 1
        y := y + 1
    end if
end for

Now, remember that slope is defined as dy/dx. One can avoid the division by scaling e, and the constants it is compared against, by dx (dx is positive, so the comparison is unaffected), and so the code becomes

y := y1
e := 0
for x := x1 to x2
    pixel(x, y)
    e := e + dy
    if e ≥ 0.5*dx then
        e := e - dx
        y := y + 1
    end if
end for

Finally, one can multiply the inequality by 2 to obtain the following result:

y := y1
e := 0
for x := x1 to x2
    pixel(x, y)
    e := e + dy
    if 2*e ≥ dx then
        e := e - dx
        y := y + 1
    end if
end for

This algorithm is truly notable because it not only requires no floating-point operations at all (and thus avoids the issue of rounding errors), it also does not rely on any integer multiplications or divisions (the multiplication by two can be carried out via a trivial bit-shift operation). It is thus an ideal solution for resource-constrained devices, and it has consequently become the de-facto standard among line-drawing algorithms (or a basis for more complex algorithms). The algorithm was first presented in 1965 by Jack Bresenham and has since become known as Bresenham's line drawing algorithm.
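Put into C, the integer-only pseudocode above might read as follows (a sketch for the first octant only; production code would handle the remaining octants, and a pixel routine would take the place of the recording array):

```c
/* Integer-only Bresenham-style line drawing for the first octant
   (x2 > x1, 0 <= y2 - y1 <= x2 - x1). ys[i] receives the y for
   x = x1 + i; a real implementation would set pixels instead. */
static void draw_line_bresenham(int x1, int y1, int x2, int y2, int *ys)
{
    int dx = x2 - x1;
    int dy = y2 - y1;
    int e = 0;
    int y = y1;
    for (int x = x1; x <= x2; x++) {
        ys[x - x1] = y;
        e += dy;
        if (2 * e >= dx) {  /* the doubling is a single bit shift */
            e -= dx;
            y++;
        }
    }
}
```

For the line from (0, 0) to (10, 1) this yields y = 0 for x ≤ 4 and y = 1 from x = 5 onward, matching the "halves round up" convention regardless of floating-point behavior.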

Even though this algorithm is efficient in terms of both computing resources and processing speed, it is clearly more elaborate than the simple digital differential analyzer. While integer operations are fast, modern CPU architectures are aggressively optimized for floating-point operations and deeply pipelined, with multiple arithmetic units, so that an instruction can begin executing before previous instructions have completed. It is therefore tempting to ask whether the simple DDA can be adapted to somehow compensate for the discretization error, yielding an algorithm that is both correct and faster than Bresenham's. To this end it is essential to understand the cause of rounding errors in the DDA algorithm.

Understanding Floating-Point Errors

Computers store numbers in a binary (base 2) representation within some finite amount of memory. This works well for integer values, but when it comes to encoding real numbers it turns out that the set of numbers with a finite representation in a given base is a vanishingly small subset of the reals. Not only can no base exactly represent an irrational number; among the rationals, only those whose expansion in the given base terminates can be stored exactly, and they too are the exception. For example, consider the rational 1/10, which is 0.1 in base-10 representation but yields the following infinite string of digits in binary representation:

0.1 (base 10) = 0.0001100110011... (base 2)
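The repeating pattern can be generated mechanically: doubling the fraction shifts the next binary digit in front of the point. A small sketch (the helper name is illustrative):

```c
/* Emit the first n binary digits of numerator/denominator (a value in
   [0, 1)) after the binary point, by repeated doubling. */
static void binary_digits(int numerator, int denominator, int n, char *out)
{
    for (int i = 0; i < n; i++) {
        numerator *= 2;                 /* shift left by one binary place */
        if (numerator >= denominator) { /* the shifted-out digit is 1 */
            out[i] = '1';
            numerator -= denominator;
        } else {
            out[i] = '0';
        }
    }
    out[n] = '\0';
}
```

Calling binary_digits(1, 10, 13, buf) produces "0001100110011", the prefix shown above; the remainder cycles with period four, so the expansion never terminates.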

Because of these limitations it is important to choose some encoding strategy that minimizes the discretization error.

In a simple approach, reals are encoded by reserving a fixed number of digits past the decimal point within the available memory. For example, various decimal numbers can be represented (in base 10) as

0.23, -15.72, 1260.00, -0.01, ...

Such an encoding is called a fixed-point representation. While easy to implement, it is unsatisfactory. Small numbers cannot be represented accurately because the number of digits after the decimal point is fixed. For example, both 14/1000 and 6/1000 are encoded as 0.01 (with rounding), even though the former value is more than twice as large as the latter – and all digits to the left of the decimal point are 0. Even smaller values (those below 0.005) can only be approximated as 0. Large numbers, on the other hand, waste space: the mass of Earth is roughly 5,972,190,000,000,000,000,000,000 kilograms, but what is the point of storing all those zeros when one is confident only about the first few digits?
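The loss described above is easy to reproduce. A base-10 fixed-point format with two digits after the decimal point can be modeled as an integer count of hundredths (the helper below is purely illustrative):

```c
/* Model a base-10 fixed-point format with two digits after the
   decimal point: store a non-negative value as a rounded count of
   hundredths. */
static long to_hundredths(double x)
{
    return (long)(x * 100.0 + 0.5);  /* round-to-nearest for x >= 0 */
}
```

Both 0.014 and 0.006 collapse to one hundredth (0.01), and 0.004 collapses to zero, exactly as in the discussion above.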

Usually, what one is interested in is a certain number of significant digits at the beginning of a real number. Scientists and engineers long ago learned to cope with this by using scientific notation, in which a number is expressed as the product of a mantissa and some power of ten. The mantissa is a signed number with an absolute value greater than or equal to one and less than ten, which makes the representation unique. So, for instance, the mass of Earth can be sufficiently represented as 5.9722×10^24 kilograms.

An encoding of scientific notation on computers generally uses a sign bit, a fixed number of bits (referred to as the precision) for an unsigned mantissa and a smaller number of bits for the exponent, which denotes a power of 2. Such a format is called a floating-point representation. IEEE 754 is an important (and pleasantly well-designed) standard for floating-point representations, driven in large part by Intel. The standard specifies encodings of real numbers at various bit lengths, as well as the encoding of very small (so-called "subnormal") numbers, infinities and special not-a-number values (NaNs), and it further specifies the details of floating-point operations, rounding rules, exception handling and format conversions.

Among the most popular encodings the standard specifies are a 32-bit and a 64-bit representation, widely referred to as single- and double-precision formats. The general representation of real numbers in these formats is of the following form:

(-1)^s × 1.m × 2^e

Here, s denotes a sign bit, m the bits of the mantissa after the binary point (the leading bit of the mantissa is an implicit 1), and e the exponent. In the single-precision encoding, m takes 23 bits and e takes 8 bits, while the double-precision format reserves 52 bits for m and 11 bits for e.
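The three fields can be inspected directly by reinterpreting a float's 32 bits (a sketch assuming the common layout: bit 31 is the sign, bits 30–23 the biased exponent, bits 22–0 the mantissa):

```c
#include <stdint.h>
#include <string.h>

/* Decompose a single-precision float into sign, biased exponent and
   mantissa bits. The exponent is stored with a bias of 127. */
static void decompose_float(float f, uint32_t *sign,
                            uint32_t *exponent, uint32_t *mantissa)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing UB */
    *sign     = bits >> 31;
    *exponent = (bits >> 23) & 0xFF;  /* biased: e + 127 */
    *mantissa = bits & 0x7FFFFF;      /* 23 bits after the binary point */
}
```

For example, -6.0 = -1.5×2^2 decomposes into sign 1, biased exponent 129 and mantissa bits 0x400000 (the fraction 0.5).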

Floating-point representation is a flexible solution for storing real numbers within limited space, but it does not avoid the issue of rounding errors. In IEEE 754, the error depends on the precision of the chosen format, the exponent and the selected rounding mode. It can be expressed in terms of the absolute error, which is defined as

e_abs(x) = |x - x'|

where x' denotes the floating-point representation of some real number x. For a given precision p and some exponent e, the absolute error can be upper-bounded by 2^(-p+e) for round-to-nearest and by 2^(-(p-1)+e) for the floor or ceiling rounding operations. Can these upper bounds be used to calculate the worst-case discretization error that may arise in the DDA algorithm?

As an example, consider drawing a line from point (0, 0) to point (1000, 100). The differential is 0.1, whose IEEE 64-bit floating-point representation is affected by a discretization error of roughly 2.2×10^-17, which is smaller than the upper bound 2^-55 ≈ 2.8×10^-17. However, when this differential is added up 1000 times (with round-to-nearest) the error accumulates to 1.4×10^-12 and thus becomes larger than the initial error by a factor of more than 1000. This is because fewer and fewer bits remain available after the binary point as the values are summed up. In this case the result falls slightly short of 100; for other summation counts the accumulated error may fall on either side of the true value, since its direction depends on the rounding decisions made after each addition – a seemingly nondeterministic behavior.
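The accumulation is easy to observe. On any implementation with IEEE 754 double precision, summing the double nearest to 0.1 a thousand times falls measurably short of 100:

```c
/* Sum the double nearest to 0.1 n times. Each addition rounds to
   nearest, and the rounding errors accumulate. */
static double sum_tenths(int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += 0.1;
    return s;
}
```

With n = 1000 the result differs from 100 by on the order of 10^-12, many orders of magnitude more than the representation error of 0.1 itself.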

A trick can help to upper-bound the accumulation error that arises in the DDA...

The EC-DDA Algorithm


Sub-Pixel Precision


Performance Evaluation