Thursday, December 13, 2007

RENDERMAN VS. REAL TIME

RenderMan is a continually evolving product, but it's never been one that could be considered real time. A good reason for this is the flexibility needed by the artists. In 1989, there was no hardware that could render a complex ray-traced scene in real time. You can think of ray tracing as tracing a ray from the eye through each pixel back into the scene, through transparent and translucent objects, until the pixel's color is fully determined. This works nicely for scenes that have lots of complex behavior, such as scenes with reflecting surfaces, water, shadows, glass, smoke, etc. And while that is the direction 3D consumer graphics is heading, we're not quite there yet.

In the consumer 3D graphics arena, we're dealing with objects that reflect light, and we can deal with objects that are transparent to some degree. But there's no inherent support for the effects of other objects in a scene. There's support for the concept of "lights," but a "light" in OpenGL or Direct3D is just a source of light, not an object itself. An object only "knows" about the lights in a scene, not whether it's occluded from those lights by anything else in the scene.

WHAT YOU'LL LEARN FROM THIS BOOK

This book is intended for someone familiar with 3D graphics who wants a better understanding of just how some math can turn a bunch of data into a photorealistic scene. This book not only covers the basics of setting up and using vertex and pixel shaders, it also discusses in depth the very equations (and simplifications) that are used in the default graphics pipeline. You'll learn what the basic equations are for lighting and shading, what the standards are, and most important, how to toss them out and program your own.

The emphasis is on giving you the knowledge to understand the equations that are used in lighting and shading, what the alternatives are, and how you can write your own. The equations are discussed in detail, and a shader editing tool from ATI called RenderMonkey is included. RenderMonkey is a powerful tool with an easy-to-use interface with which you can dynamically edit vertex and pixel shaders and their parameters in real time and see the changes upon a variety of objects. Not only will you understand the differences between the standard Blinn-Phong lighting equations and others like Torrance-Sparrow and Oren-Nayar, but you'll learn how to make real-time shaders that implement these models. Anisotropic lighting, cartoon shading, Fresnel effects: it's all covered in this book.

Preliminary Math

CONVENTIONS AND NOTATION

First, let's agree to some notation since in later parts of the book I'll be getting into some pretty heavy notation. Generally, any single-valued quantity is written in a normal, unbolded font, whereas any multivalued quantity like a vector or a matrix is written in bold font. It gets a little confusing since one of the things a shader is generally trying to compute is color and, in current graphics hardware, that means a three- or four-element value representing the rgb or rgba values that the graphics hardware uses to represent a color. Traditionally, this color value is treated as a vector value, but it's manipulated as individual elements (Table 2.1).


Table 2.1: OPERAND NOTATION

Scalars (e.g., a, b): A single floating point number. Represented by a lowercase italicized letter.

Vectors (e.g., a, cs, v): A three- or four-element array of floating point numbers representing a direction. Represented by a lowercase bold letter. A subscript indicates an individual element.

Unit vectors (e.g., â): A vector with a little hat character (a "circumflex") represents a normalized or unit vector. A unit vector is just a vector that has a length of one.

Points (e.g., p, x): A three- or four-element array of floating point numbers representing a position. Represented by a lowercase bold letter. A subscript indicates an individual element.

Matrices (e.g., M, T): A three-by-three or four-by-four array of floating point numbers representing a transformation. Represented by an uppercase bold letter.

In addition to notation for the various data types we will be manipulating, we'll also need notation for the various types of mathematical operations we'll be performing on the data (Table 2.2).


Table 2.2: OPERATOR NOTATION

Addition (a + b): Addition between similar operand types.

Subtraction (c − d): Subtraction between similar operand types.

Multiplication (Ma, pv): There is no explicit operator typically used in this text for multiplication between operands that can be multiplied together. The absence of an operator indicates implied multiplication.

Division (1/2, a/b): Division is represented by either a stroke or a dividing line.

Dot product (a • b): Also called the inner product. Represented by the • symbol.

Cross product (b × a): Represented by the × symbol.

Absolute value or magnitude (|a|): The absolute value is the positive value of a scalar quantity. For a vector, it's the length of the vector. A scalar divided by its absolute value is ±1; a vector divided by its magnitude is a unit vector.

Piecewise multiplication (id ⊗ cs): Element-by-element multiplication. Used in color operations, where the vector just represents a convenient notation for an array of scalars that are operated on simultaneously but independently.

Piecewise addition (id ⊕ cs): Element-by-element addition. Used in color operations, where the vector just represents a convenient notation for an array of scalars that are operated on simultaneously but independently.

You'll also see text in a paragraph in a monospaced font. This is used to indicate particular aspects of shader programming. For example, you might need to check the capabilities of the device you're currently rendering by checking the D3DCAPS structure, or we might want to render a triangle list with a call to DrawIndexedPrimitive().

Finally, you'll see colored blocks of code, either C or C++ or shader code:

// shader file
vs.1.0
//transform vertices by view/projection matrix

m4x4 oPos, v0, c0
mov r0, c5 // load color5
mov r1, c6 // load color6
dp3 r2, r0, r1 // combine using a dot product
mov oD0, r2 // output color

VERTICES

When you say "vertex," most people think that you're speaking of a location in 3-space—that is, an x, y, z triplet. In this book, that's called a point. A vertex in 3D graphics typically means all the properties that are used to describe that particular vertex—that is, all the information needed to describe a vertex so that it can be rendered. Of course, the most obvious one is its location—its "point." However, if I give you a vertex description consisting of just a position and tell you to draw it, what do you draw? It's obvious that you need more information in order to draw a vertex, be it the vertex color, material properties, texture coordinates, whatever. So in 3D graphics, a vertex is the vertex position plus whatever other information is required to render that vertex. I should note that vertices themselves aren't rendered; they are typically grouped in threes and rendered as triangles—a triangle having enough information to describe a surface area.

This "other information" could be anything about that point that can be used to describe or calculate its final color value. In the simplest case, it's just the color information for that point (e.g., the point is red). Or the point may be part of a textured surface, in which case it might have texture coordinates instead of a color, in which case the color of the point is looked up from the texture. A vertex that's part of an illuminated surface (a "lit" surface in the vernacular) could have a set of material properties that are unique color values for diffuse, ambient, and specular properties plus the point's normal value. Thus when we talk about a vertex, we're not simply talking about the vertex's location, but also that we've specified enough information to draw it according to the situation we want to render it in, be that a textured surface, a lit surface, etc. In some parts of this text, I'll talk about a vertex and mean just the position part of the vertex; other times, I might be focused on the resultant color information, but the important thing is that a vertex is specified by more than just a point.
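As a concrete illustration, a vertex in a rendering engine often ends up as a simple structure holding the point plus its "other information." This is only a sketch; the field names and the particular attribute set are mine, not from any specific API:

```cpp
#include <cstdint>

// A hypothetical vertex layout: the position (the "point") plus the
// extra attributes needed to actually render it. The field names and
// the attribute set are illustrative, not from any particular API.
struct Vertex {
    float x, y, z;        // position
    float nx, ny, nz;     // surface normal, needed for lighting
    std::uint32_t color;  // packed rgba diffuse color
    float u, v;           // texture coordinates
};
```

A textured, lit vertex like this one carries everything needed to compute its final color; an unlit, untextured vertex might carry only the position and a color.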

POINTS

A point is a location in space, typically described by an x, y, z location—in other words, a location coordinate. It has no other properties. It can be translated to a different location or transformed by application of a transformation matrix. Points (or coordinates) in 3D graphics are usually represented by homogeneous coordinates, which represent the point as a location in 4D space through the addition of the w coordinate. For our purposes, this is just a semantic nicety that is taken care of by the graphics API, so we usually don't have to worry about it. Points are represented by the same structure we use for vectors, and many of the same mathematical operators can be performed on each, but be aware that though they look the same, they are very different.

VECTORS

In the mathematical sense, a vector is an array of numbers. We're going to use them in the 3D graphics shorthand, where vector means a direction vector. In this section, I want to cover some basic mathematical properties of vectors. One of the reasons that vectors and points are important is that in order to understand how to program shaders, you'll need to understand the math that's required to program some of the basics of shader programming. In order to do that, you need to have a basic understanding of vector and point math and to understand how these types of values are used in 3D graphics.

A vector is written as a column matrix, with elements x, y, and z.

    v = | x |
        | y |
        | z |
Although this is fine for performing element-by-element analyses of what's going on, it's a bit tedious to read and write, so we'll usually just represent a vector by a bold lowercase letter.

Since both points and vectors are represented by a three- or four-element array of floating point numbers, we'll occasionally use the same math for them—for example, multiplying a point or a vector by a transformation matrix will transform either one. The math is identical, but it's good to keep in mind that though they might look the same, and can usually be treated the same, they aren't strictly interchangeable.

Vector Magnitude and Unit Vectors

Not only do direction vectors contain a direction, but they also encode a length or magnitude, which, depending upon what your vector represents, you can think of as the force of the vector in that direction. To compute the length of a vector, you take the sum of the squares of the vector's elements, and then compute the square root of the sum. Thus, the length or magnitude, |a|, of our vector, a, would be computed as

    |a| = sqrt(ax² + ay² + az²)
Another way of writing vector v is to break it into the magnitude (which is a scalar) and then the normalized or unit vector. A unit vector is a vector in which the magnitude of the vector is one. We'll write unit vectors in this book as a vector with a hat on it, for example, v̂. A unit vector is computed (or normalized) from an unnormalized one by dividing each element of the vector by the vector's magnitude.

    v̂ = v / |v|
In most cases in 3D graphics, we just want a unit vector since we're usually interested in direction and not magnitude. Many of the lighting equations are simplified by using normal vectors.
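Here's a minimal sketch of these two operations in C++ (the Vec3 type and the function names are my own, not from any particular math library):

```cpp
#include <cmath>

// Illustrative vector type; not from any particular math library.
struct Vec3 { float x, y, z; };

// |a| = sqrt(ax^2 + ay^2 + az^2)
float magnitude(const Vec3& a) {
    return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
}

// Divide each element by the magnitude to get a unit vector.
Vec3 normalize(const Vec3& a) {
    float len = magnitude(a);
    return Vec3{ a.x / len, a.y / len, a.z / len };
}
```

Note that normalizing a zero-length vector divides by zero, so production code would guard against that case.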

Dot Product

The dot product is one of the more common operations you'll be using in shaders (hence the reason it's part of the shader language), so it's important to understand what it can tell us. I won't go into the derivation of the equations[2], but it's very useful to know how the dot product can be used to derive information about the relationship between two vectors. Given two vectors, we can examine what the dot product tells us about the relationship between them. Note that although the vectors in the illustrations are shown having a shared origin, this really isn't a requirement since a vector is a direction that's unrelated to any origin; it's just easier to visualize the angle between the vectors, θ, this way. To calculate a dot product, first we need two vectors, a and b.



For any two vectors, the dot product is the product of the magnitudes of the vectors and the cosine of the angle between them.

    a • b = |a| |b| cos θ
Thus the dot product (also called the inner product or the scalar product) gives us information about the angle between the two vectors and can be computed as the sum of the products of the elements of the vectors. Note that the result from the dot product is a scalar value. Even more interesting, if â and b̂ are unit vectors (which is typically the way we try to set things up for 3D graphics calculations), then the magnitudes of the vectors are one (and can be factored out), and the dot product gives us the cosine of the angle between the vectors directly.

    â • b̂ = cos θ

You can also use trigonometry to show that the dot product can be computed as the sum of the products of the individual elements of the vectors.

    a • b = axbx + ayby + azbz
The preceding equation also illustrates that the dot product is order independent, or commutative: a • b = b • a. Getting the angle between two vectors allows you to calculate some very important information.



Thus you can glean information about how vectors relate to one another. For example,

  • If the dot product is zero, then the vectors are perpendicular (the cosine of 90° is zero).

  • If vector b is the view direction, and vector a is the surface normal for a triangle, and the dot product is greater than zero (the angle between them is less than 90°), then you know the triangle is facing away from you, so you could cull it or apply your back face material to it.

  • Another example would be if vector b were again the view direction, with the view origin at point o, and you have an object at point c. You can easily create a direction vector from the view origin to the object (assuming they are in the same coordinate system) by a = c − o. Thus if a • b < 0, then the object is behind the viewpoint and you can cull it.

  • The dot product is used repeatedly in lighting calculations to compute the intensity of a light shining on a surface.
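A quick sketch of the dot product and the angle test in C++ (again, the Vec3 type and function names are illustrative, not from any particular library):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// a . b = ax*bx + ay*by + az*bz
float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// For unit vectors, the dot product is the cosine of the angle
// between them, so acos recovers the angle itself (in radians).
float angleBetweenUnit(const Vec3& a, const Vec3& b) {
    return std::acos(dot(a, b));
}
```

In a shader, you'd rarely call acos; the cosine itself is usually what the lighting equations want.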

Cross Product

The dot product is useful because it tells us about the relationship between two vectors, but it doesn't tell us anything about how those vectors are oriented in 3-space. If we have two nonidentical vectors, they define a plane in 3-space, and it's sometimes useful to know something about that plane. This is where the cross product comes in. Unlike the dot product, which gives a scalar value as a result, the cross product gives a vector as its result—hence it's also called the vector product.

The cross product of two vectors results in a third vector.

    c = a × b
And this third vector c has the property that it's perpendicular to both a and b.



This means that the dot products of vector c with the original two vectors are zero.

    a • c = 0        b • c = 0
When I first learned to use the cross product, I was taught the determinant notation, which looks like this.

            | i   j   k  |
    a × b = | ax  ay  az |
            | bx  by  bz |

This gives you another way of calculating the cross product by multiplying it out.

    a × b = (aybz − azby, azbx − axbz, axby − aybx)
The cross product is also antisymmetric, meaning that the direction of the result is order dependent. If you reverse the order of the cross product, you'll get two vectors that are equal in magnitude but point in opposite directions.

    a × b = −(b × a)

This is another way of stating that the cross product follows the right-hand rule. For a × b, we'd say that we're taking "a cross b." Thus if you take the fingers on your right hand and curl them from the first vector to the second, sweeping them from a toward b with your wrist at the junction, your thumb will point in the c direction.
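The component expansion translates directly into code. Here's a sketch, with a dot product alongside to verify the perpendicularity property (types and names are illustrative):

```cpp
struct Vec3 { float x, y, z; };

// a x b, straight from the component expansion
Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

Crossing the x axis into the y axis gives the z axis; reversing the order flips the result, exactly as the right-hand rule predicts.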



Or in math notation


The right-hand rule also means that you can generate the third vector given the other two.


If you take the length of the cross product of two vectors, you get the area of the parallelogram formed by those two vectors. Since we're usually interested in the area of the triangle, we can just take half that value.
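The parallelogram/triangle area relationship can be sketched as follows (illustrative types and names again):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

float magnitude(const Vec3& a) {
    return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
}

// |a x b| is the parallelogram area; the triangle is half of that.
float triangleArea(const Vec3& a, const Vec3& b) {
    return 0.5f * magnitude(cross(a, b));
}
```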

Thus if we have two vectors from point o on a triangle, we can compute the area of the parallelogram from the magnitude of the cross product of the vectors.


The area of the parallelogram is |b × a| = 2. (The area of the triangle formed by oab is thus 1.) It's also possible to calculate the area of the parallelogram by using the lengths of the vectors and the sine of the angle between them to get the following relationship:

    |a × b| = |a| |b| sin θ
which, if you square it and use the law of cosines, gives you what's called Lagrange's identity, which, you'll note, now relates a cross product term to a dot product term.

    |a × b|² = |a|²|b|² − (a • b)²
Other useful formulas are the vector triple product

    a × (b × c) = b(a • c) − c(a • b)

and the scalar triple product, v,

    v = a • (b × c)

which gives you the volume of the parallelepiped defined by vectors a, b, and c. If v = 0, then at least two of the three vectors lie in the same plane (assuming that none are zero). Note that the order is unimportant (up to sign), since the volume will remain the same no matter how you perform the multiplications.
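The scalar triple product is equally simple to code up; a sketch with illustrative types and names:

```cpp
struct Vec3 { float x, y, z; };

// v = a . (b x c): the signed volume of the parallelepiped that the
// three vectors define; zero means the vectors are coplanar.
float scalarTriple(const Vec3& a, const Vec3& b, const Vec3& c) {
    Vec3 bxc{ b.y * c.z - b.z * c.y,
              b.z * c.x - b.x * c.z,
              b.x * c.y - b.y * c.x };
    return a.x * bxc.x + a.y * bxc.y + a.z * bxc.z;
}
```

The three unit axes span a unit cube (volume one), and any three coplanar vectors give zero.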


For relating vectors using cross products, you have Jacobi's identity.

    a × (b × c) + b × (c × a) + c × (a × b) = 0

[2]For a good explanation of the basics of the dot product, see [FARIN 1998], p. 27, or [HILL 1990] p. 152.


CREATING NORMALS OUT OF GEOMETRY

One practical application is to use the cross product to create normals for your objects if they don't have them. Suppose you have a triangle with three vertices: a, b, and c. You can create two direction vectors by calculating b − a and c − a, and then calculate the normal vector, n, by taking the cross product of the two vectors.



So, first we create two direction vectors, v1 and v2, from the three points.

    v1 = b − a
    v2 = c − a
Then we create the normal from the cross product of these two vectors. Note that if we cross from ac to ab, we'll get the normal pointing out of the top of the triangle; if we did it the other way, we'd get the normal pointing out of the bottom of the triangle.


Finally, you'll probably want to normalize the normal vector.
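Putting the whole recipe together, here's a sketch of computing a normalized face normal from three points. The winding order assumed here is one possible convention; flip the cross product order if your normal comes out pointing the wrong way:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z };
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

// Face normal for triangle (a, b, c): build the two edge vectors
// from a, cross them, and normalize the result.
Vec3 faceNormal(const Vec3& a, const Vec3& b, const Vec3& c) {
    Vec3 n = cross(sub(b, a), sub(c, a));
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return Vec3{ n.x / len, n.y / len, n.z / len };
}
```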

Vertex Normals vs. Face Normals

When creating normal vectors from geometry, you frequently don't want the normal that we've calculated here, which is called a face normal (since it's the normal of the triangle's surface, or face). Instead, you want normals for the individual vertices. This is a bit more work: you'd start with face normals, then find all the faces in which a vertex is used, then average all the normals from those faces, possibly with some sort of weighting function thrown in. Once you get an averaged normal, you store that as the vertex's normal.



This is fine for models with smoothly varying surfaces, but for models with sharp angles, there's more work to do. Typically, there's some sort of crease angle cutoff: if a vertex is shared between faces where the angle between the faces is too large, such as at a crease or a point (imagine the vertex at the corner of a cube), then the vertex needs to be assigned to one set of faces, or it needs to be duplicated into two or more vertices that share the same position but take their averaged normals from different sets of faces on either side of the crease.
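A sketch of the simple (unweighted) averaging step, leaving crease detection aside; the types and names are illustrative:

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Average a vertex's adjacent face normals into a single vertex
// normal. This version is unweighted; a production mesh tool might
// weight by face area or by the angle each face subtends at the vertex.
Vec3 averageNormals(const std::vector<Vec3>& faceNormals) {
    Vec3 sum{ 0.0f, 0.0f, 0.0f };
    for (const Vec3& n : faceNormals) {
        sum.x += n.x; sum.y += n.y; sum.z += n.z;
    }
    // Renormalize the summed normal.
    float len = std::sqrt(sum.x * sum.x + sum.y * sum.y + sum.z * sum.z);
    return Vec3{ sum.x / len, sum.y / len, sum.z / len };
}
```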

Another creative example of using a cross product is given by [VERTH 2001]. If you are an object traveling on a reasonably level xy plane in direction d1 and you want to turn to direction d2, then you can examine the z value of the cross product d1 × d2: if it's positive, you'll turn left; if it's negative, you'll turn right.
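That trick reduces to one line of code; a sketch using 2D headings (illustrative names):

```cpp
// 2D heading on the ground plane; illustrative type.
struct Vec2 { float x, y; };

// The z component of d1 x d2, treating the headings as 3D vectors
// with z = 0. Positive means turn left; negative means turn right.
float turnDirection(const Vec2& d1, const Vec2& d2) {
    return d1.x * d2.y - d1.y * d2.x;
}
```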

Though you might not be using the cross product in a shader, you'll typically use it to calculate other vectors that are required at some point.

MATHEMATICS OF COLOR IN COMPUTER GRAPHICS

Now you might think it strange that I've added a section on the mathematics of color, but it's important to understand how color is represented in computer graphics so that you can manipulate it effectively. A color is usually represented in the graphics pipeline by a three-element vector representing the intensities of the red, green, and blue components, or for a more complex object, by a four-element vector containing an additional value called the alpha component that represents the opacity of the color. Thus we can talk about rgb or rgba colors and mean a color that's made up of either three or four elements. There are many different ways of representing the intensity of a particular color element, but shaders use floating point values in the range [0,1].

I should also point out that when dealing with colors, particularly with some of the subtleties that we'll be getting into with shaders, you should understand the gamut of the target device. This is the nasty edge where our beautiful clean mathematics meets the real world. The gamut of a device is simply the physical range of colors the device can display.

Typically, a high-quality display has a better gamut than a cheap one. A good printer has a gamut that's significantly different from a monitor's. One of the issues that I had to deal with in generating the images for this book was getting the printed materials to look like the displayed images generated by the shaders. If you're interested in getting some color images for printing, you'll have to do some manipulation on the color values to make the printed image look like the one your program generated on the screen. You should also be aware that there are color spaces other than the RGB color space shaders use; HSV (hue, saturation, and value) is one example.


The CIE diagrams are the traditional way of displaying perceived color space, which, you should note, is very different from the linear color space used by today's graphics hardware. The colored area is the gamut of the human eye. The gamuts of printers and monitors are subsets of this gamut.



Multiplying Color Values

Since shaders allow you to do your own color calculations, you need to be aware of how to treat colors. The calculation of the color of a particular pixel depends, for example, on the surface's material properties that you've programmed in, the color of the ambient light, the color of any light shining on the surface (perhaps modified by the angle of the light to the surface), the angle of the surface to the viewpoint, the color of any fog or other scattering material that's between the surface and the viewpoint, etc. No matter how you are calculating the color of the pixel, it all comes down to color calculations, at least on current hardware, on rgb or rgba vectors where the individual color elements are limited to the [0,1] range. Operations on colors are done piecewise: that is, even though we represent colors as rgb vectors, they aren't really vectors in the mathematical sense. Vector multiplication is different from the operation we perform to multiply colors. We'll use the symbol ⊗ to indicate such piecewise multiplication.

Colors are multiplied to describe the interaction between a surface and a light source. The colors of each are multiplied together to estimate the reflected light color: the color that this particular light reflects off this surface. The problem with the standard rgb model is just that we're simulating the entire visible spectrum by three colors with a limited range.

Let's start with a simple example of using reflected colors. In the section on lighting, we'll discover how to calculate the intensity of a light source, but for now, just assume that we've calculated the intensity of a light, and it's a value called id. This intensity of our light is represented by, say, a nice lime green color. Thus


Let's say we shine this light on a nice magenta surface given by cs.


So, to calculate the color contribution of this surface from this particular light, we perform a piecewise multiplication of the color values.


You should note that since the surface has no green component, no matter what value we used for the light color, there would never be any green component in the resulting calculation. Thus a pure green light would provide no contribution to the intensity of a surface if that surface contained a zero value for its green intensity. Thus it's possible to illuminate a surface with a bright light and get little or no illumination from that light. You should also note that using anything other than a full-bright white light [1,1,1] will involve multiplication by values less than one, which means that using a single light source will only illuminate a surface to a maximum intensity of its color value, never more. This same problem also happens when a texture is modulated by a surface color. The color of the surface will be multiplied by the colors in the texture. If the surface color is anything other than full white, the texture will become darker. Multiple texture passes can make a surface very dark very quickly.
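A sketch of piecewise color multiplication, using [0.3, 1.0, 0.3] and [1.0, 0.0, 1.0] as stand-in values for the lime green light and magenta surface (the exact values used in the figures may differ):

```cpp
struct Color3 { float r, g, b; };

// Piecewise (element-by-element) multiplication of a light color by
// a surface color: the i_d (x) c_s operation from the text.
Color3 modulate(const Color3& light, const Color3& surface) {
    return Color3{ light.r * surface.r,
                   light.g * surface.g,
                   light.b * surface.b };
}
```

With these stand-in values, the result is [0.3, 0.0, 0.3]: the surface's zero green component wipes out the light's strong green, exactly as described above.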



Given that using a colored light in a scene makes the scene darker, how do you make the scene brighter? There are a few ways of doing this. Although color multiplication will never result in a brighter color, this is offset a bit by the fact that we end up summing all the light contributions together, which, as we'll see in the next section, brings its own problems. But if you're just interested in increasing the brightness of one particular light or texture, one way is to use the API to artificially brighten the source (this is typically done with texture preprocessing), or you can artificially brighten the source, be it a light or a texture, by adjusting the values after you modulate them.

Dealing with Saturated Colors

On the other hand, what if we have too much contribution to a color? While the colors of lights are modulated by the color of the surface, each light source that illuminates the surface is added to the final color. All these colors are summed up to calculate the final color. Let's look at such a problem. We'll start with summing the reflected colors off a surface from two lights. The first light is an orange color and has rgb values [1.0,0.49,0.0], and the second light is a nice light green with rgb values [0.0,1.0,0.49].

Summing these two colors yields [1.0, 1.49, 0.49].



So, what can be done when color values exceed the range that the hardware can display? It turns out that there are three common approaches [HALL 1990]. Clamping the color values is implemented in hardware, so for shaders, it's the default; it just means that we clamp any values outside the [0,1] range. Unfortunately, this results in a shift in the color. The second most common approach is to scale the colors by the largest component. This maintains the color but reduces the overall intensity of the color. The third is to try to maintain the intensity of the color by shifting (or clipping) the color toward pure bright white, reducing the components that are too bright while increasing the other components.



As you can see, we get three very different results. In terms of perceived color, the scaled is probably the closest though it's darker than the actual color values. If we weren't interested in the color but more in terms of saturation, then the clipped color is closer. Finally, the clamped value is what we get by default, and as you can see, the green component is biased down so that we lose a good sense of the "greenness" of the color we were trying to create.

Clamping Color Values

Now it's perfectly fine to end up with an oversaturated color and pass this result along to the graphics engine. What happens in the pipeline is an implicit clamping of the color values. Any value that's greater than one is clamped to one, and any value less than zero is clamped to zero. So this has the benefit of requiring no effort on the part of the shader writer. Though this may make the rendering engine happy, it probably isn't what you want. Intuitively, you'd think that shining orange and green lights on a white surface would yield a strong green result. But letting the hardware clamp eradicates any predominant effect from the green light.

Clamping is fast, but it tends to lose fidelity in the scene, particularly in areas where you would want and expect subtle changes as the light intensities interact, but end up with those interactions getting eradicated because the differences are all getting clamped out by the graphics hardware.

Scaling Color Values by Intensity

Instead of clamping, you might want to scale the color by dividing by the largest color value, thus scaling the rgb values into the [0,1] range. In the previous example, the final color values were [1.0,1.49,0.49], meaning our largest color value was the green, at 1.49. Using this approach, we divide each element by 1.49, yielding a scaled color of [0.671,1.0,0.329]. Thus the value greater than one is scaled to one, while the other values are scaled by the same amount. This maintains the hue and saturation but loses the intensity. This might not be acceptable because the contrast with other colors is lost, since contrast perception is nonlinear and we're applying a linear scaling. By looking at the three results, you can see there's a large difference between the resulting colors.
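Here's a sketch of the default clamping behavior alongside the scaling approach, using the [1.0, 1.49, 0.49] example from above (illustrative types and names):

```cpp
#include <algorithm>

struct Color3 { float r, g, b; };

// Default hardware behavior: clamp each element to [0,1].
Color3 clampColor(const Color3& c) {
    auto clamp01 = [](float v) { return std::min(std::max(v, 0.0f), 1.0f); };
    return Color3{ clamp01(c.r), clamp01(c.g), clamp01(c.b) };
}

// Alternative: scale by the largest element, which keeps the hue
// and saturation but reduces the overall intensity.
Color3 scaleColor(const Color3& c) {
    float maxc = std::max(c.r, std::max(c.g, c.b));
    if (maxc <= 1.0f) return c; // already displayable
    return Color3{ c.r / maxc, c.g / maxc, c.b / maxc };
}
```

Clamping [1.0, 1.49, 0.49] yields [1.0, 1.0, 0.49], losing the dominance of the green; scaling yields roughly [0.671, 1.0, 0.329], keeping the hue but darkening everything.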

Shifting Color Values to Maintain Saturation

One problem with clamping or scaling colors is that the result gets darker or shifts in hue. An alternative is to maintain saturation by shifting color values. This technique is called clipping, and it's a bit more complicated than color scaling or clamping. The idea is to create a gray-scale vector that runs along the black-white axis of the color cube and has the same brightness as the original color, and then to draw a ray at right angles to this vector that intersects (i.e., clips) the original color's vector. You need to check to make sure that the gray-scale vector is itself within the [0,1] range and then check the sign of the ray elements to see if the color elements need to be increased or decreased. As you are probably wondering, this can result in adding in a color value that wasn't in the original color, but this is a direct result of wanting to make sure that the overall brightness is the same as the original color's. And, of course, everything goes to hell in a handbasket if you've got overly bright colors, which leave you with decisions about how to nudge the gray-scale vector into the [0,1] range, since that means you can't achieve the input color's saturation value. Then we're back to clamping or scaling again.

ColorSpace Tool

The ColorSpace tool is a handy tool that you can use to interactively add two colors together to see the effects of the various strategies for handling oversaturated colors. You simply use the sliders to select the rgb color values for each color. The four displays show the composite, unmodified values of the resulting color (with no color square) and the clamped, clipped, and scaled color rgb values, each with a color square illustrating those color values.


Negative Colors and Darklights

You may be wondering, if I can have color values greater than the range in intermediate calculations, can I have negative values? Yes, you can! They are called "darklights" after their description in an article [GLASSNER 1992] in Graphics Gems III. Since this is all just math until we pass it back to the graphics hardware, we can pretty much do anything we want, which is pretty much the idea behind programmable shaders! Darklights are nothing more than lights in which one or more of the color values are negative. Thus instead of contributing to the overall lighting in a scene, you can specify a light that diminishes the overall lighting in a scene. Darklights are used to eliminate bright areas when you're happy with all the lighting in your scene except for an overly bright area. Darklights can also be used to affect a scene if you want to filter out a specific rgb color. If you wanted to get a night vision effect, you could use a darklight with negative red and blue values, for example, which would just leave the green channel.
