SHADERS AND THE EXISTING GRAPHICS PIPELINE
There are currently (as of DirectX 9) two flavors of shaders: vertex and pixel shaders. Vertex shaders operate on vertices, or in more precise terms, the output is assumed to be a vertex in homogeneous clip space coordinates. A vertex shader (usually) produces one vertex as its output, (usually) from a set of vertices provided as input. A pixel shader produces the color of the pixel as its sole output, (usually) from color and/or texture coordinates provided as inputs.
I've added the (usually)'s since this is the way that the shaders are intended to be used when replacing the functionality of the fixed-function pipeline (FFP). The point is that when you're using either a vertex shader or a pixel shader (but not both), you're limited by the way that you can provide data to the other part. However, when you're using both a vertex and a pixel shader, the vertex shader output is still interpolated by whatever shading method is set as a render state (flat or Gouraud), and those values are provided as input for the pixel shader. There's nothing wrong with using this pipeline as a conduit for your own data. You could generate some intermediate values in the vertex shader (along with a valid vertex position), and those values will make their way to the pixel shader. The only thing the pixel shader has to do is specify the final color for the pixel. We'll go over this in depth when we talk about using shaders with the FFP.
Vertex and pixel shaders fit into the current hardware pipeline, which has been designed so that the existing graphics pipeline can be replaced by the new functionality without changing the overall flow through the pipeline. Pixel and vertex shaders can be independently implemented; there is no requirement that they both be present. In fact, the first generation of hardware sometimes only implemented pixel shaders; vertex shaders were available only as part of the software driver, a very highly optimized part, but still one that doesn't run on the hardware. Another addition to the pipeline is the higher order primitive section. This is going to become an ever-increasing part of the graphics pipeline in the near future since a higher order primitive means less data sent over the graphics memory bus. Why send over a few thousand perturbed vertices when you can just send over a few dozen perturbed control points and have the hardware generate the vertices for you?
VERTEX SHADERS: TECHNICAL OVERVIEW
Vertex shaders replace the TnL (transform and lighting) part of the Direct3D and OpenGL pipeline. This pipeline is still available and is now referred to (since DirectX 8) as the fixed-function pipeline, since it's not programmable the way shaders are. Since vertex shaders are intended to replace the FFP TnL part of the pipeline, vertex shaders must produce essentially the same output as the FFP from the same input. Now, among other things, the pipeline depends upon the current rendering state; thus it's possible to perform calculations on things like fog or textures. But if fog or texturing isn't turned on, the values that are generated will never be used. It is important to note that a vertex shader's only job is to take a set of input vertices and generate an output set of vertices. There is a one-to-one correspondence between an input vertex and an output vertex. A vertex shader cannot change the number of vertices since it's called once for each vertex in a primitive.
The input to a vertex shader is one or more vertex streams plus any other state information (like current textures). The output is a new vertex position in clip coordinates (discussed in the next section) and any other information that the shader provides, like vertex color, texture coordinates, fog values, and the like.
PIXEL SHADERS: TECHNICAL OVERVIEW
Whereas vertex shaders replace the FFP in the traditional rendering pipeline, pixel shaders replace the pixel-blending section of the multitexture section of the pipeline. To understand how pixel shaders operate, you should be familiar with the dualistic nature of the texture pipeline. Traditionally, two paths are running simultaneously in the hardware—the color pipe (also called the vector pipe) and the alpha pipe (also called the scalar pipe). These two pathways handle the color and alpha operations of the texture processing unit. Frequently, you will have set up a mode so that the color operations are performed with one set of parameters, whereas the alpha operations are performed with a different set. At the end, the results are combined into a resulting rgba value.
Although the traditional texture pipeline allows you to specify a cascade of operations on sequential textures, pixel shaders allow you to specify these with more generality. However, the dual nature of the pipeline is still there. In fact, you'll probably be spending some time fine-tuning your pixel shaders so that you can get color and alpha operations to "pair," that is, to run simultaneously in the hardware.
Another dualistic aspect of pixel shaders is that they have two separate types of operations: arithmetic and texturing. Arithmetic operations are used to generate or modify color or texture coordinate values. Texture operations either fetch texture coordinates or sample a texture using some texture coordinates. No matter the type of operation (coloring, texturing, or a blend), the output of the pixel shader is a single rgba value.
VERTEX SHADERS, PIXEL SHADERS, AND THE FIXED FUNCTION PIPELINE
So you might be wondering, how does all this stuff fit together? Well, there are two cases to consider. If you're using a shader in conjunction with the FFP, then you'll have to consider what the FFP operations are. The FFP vertex operations are going to provide your pixel shader with two color registers: one diffuse, one specular. The FFP pixel operations are going to expect two color registers as input: one diffuse, one specular. (Note that using specular is a render state.) It gets interesting when you write your own vertex and pixel shader and ignore the FFP altogether.
Your vertex shader needs to provide a valid vertex position, so you'll need to perform the transformation step and provide a valid vertex position in the oPos register. Other than that, you've got the fog, point size, texture coordinates, and of course, the two color registers. (Note that the FFP pixel operations expect only one set of texture coordinates and the two color values.) Since your pixel shader operates on these values, you are free to stick any value into them you want (within the limits of the pixel shader precision). It's simply a way of passing data directly to the pixel shader from the vertex shader. However, you should be aware that texture coordinates will always be perspective correct interpolated from the vertex positions. The fog value will always be interpolated as well. The two color values will be interpolated only if the shading mode is Gouraud. The color interpolation in this case will also be perspective correct. Setting the shading mode to flat and placing data in the color registers is the preferred method of getting values unchanged to the pixel shaders.
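On the application side, the shading mode is just a render state. A minimal sketch, assuming the usual m_pd3dDevice device pointer:
// flat shading passes the vertex shader's color-register values to the
// pixel shader without interpolation -- handy when they're really just data
m_pd3dDevice->SetRenderState( D3DRS_SHADEMODE, D3DSHADE_FLAT );
// switch back to Gouraud when you do want the values interpolated
m_pd3dDevice->SetRenderState( D3DRS_SHADEMODE, D3DSHADE_GOURAUD );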
Specular color is added by the pixel shader. There is no rendering state for specular when using pixel shaders. Fog, however, is still part of the FFP, and the fog blend is performed after the pixel shader executes.
VERTEX SHADERS
To quote the Microsoft DirectX 8.0 documentation,
Vertex processing performed by vertex shaders encompasses only operations applied to single vertices. The output of the vertex processing step is defined as individual vertices, each of which consists of a clip-space position (x, y, z, and w) plus color, texture coordinate, fog intensity, and point size information. The projection and mapping of these positions to the viewport, the assembling of multiple vertices into primitives, and the clipping of primitives is done by a subsequent processing stage and is not under the control of the vertex shader.
What does this mean? Well, it means that whatever your input from the vertex streams (because you will have specified these values), the output from a vertex shader for a single pass will be
- One single vertex in clip-space coordinates
- Optional color values (specular and diffuse)
- Optional texture coordinates
- Optional fog intensity
- Optional point sizing
So, at the very least your minimal vertex shader needs to take the object's vertex positions and transform them into clip-space coordinates. The optional parts are determined by the rendering state. Since the object has to be rendered with some properties, you'll have texture coordinates and/or color specifications as output. But the constant and absolute requirement for every vertex shader is that you provide vertex positions in clip-space coordinates. Let's start with that.
Our first vertex shader will do just that, transform the untransformed input vertex to clip space. There are two assumptions in the following code. The first is that the input vertex position shows up in shader register v0. The actual register number depends on the vertex shader input declaration, which tells the shader the format of the input stream. The second is that we've stored the world-view-projection matrix in the vertex shader constants. Given those assumptions, we can write the following shader:
// v0 -- position
// c0-3 -- world/view/proj matrix
// the minimal vertex shader
//transform to clip space
dp4 oPos.x, v0, c0
dp4 oPos.y, v0, c1
dp4 oPos.z, v0, c2
dp4 oPos.w, v0, c3
This shader uses four of the dot product shader instructions to perform a matrix multiply using the rows of the world-view-projection matrix sequentially to compute the transformed x, y, z, and w values. Each line computes a value and stores the scalar result into the elements of the output position register. Note that this is usually how the first few lines of your shader will look. It's possible that you might not need to perform the matrix multiply (e.g., if you transform the vertex before the shader is run). In any case, the minimal valid vertex shader must set all four elements of the oPos register.
There are some tricky issues with performing transformations, so let's review what this section of the shader has to do and what pitfalls there are. Along the way, we'll discuss some DirectX C++ code to perform the setup the shader requires.
Transformations and Projections
Typically, you'll start out with your object in what are called "object" or "model" coordinates—that is, these are the vertex positions that the model was originally read in with. Most of the projects I've worked on have created them with the object centered about the origin. Static objects might get an additional step where their vertices are transformed to some location in world space and then never touched again—creating a group of trees from a single set of original tree vertex coordinates, for example. Each new tree would take the values for the original tree's position and then use a unique transformation matrix to move the tree to its unique position.
So, for every vertex in our tree example, we need to transform it from its local model space into the global world space coordinate system. That's done pretty simply with a matrix multiplication: for every point, we multiply its model-space position by the tree's world transformation matrix to get its position in world space.
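In D3DX terms, that per-point multiply is a one-liner. Here's a sketch, with m_matWorld standing in for this particular tree's world matrix:
// transform one model-space position into world space
D3DXVECTOR3 vModel( 1.0f, 0.0f, 2.0f );   // position as stored in the model
D3DXVECTOR3 vWorld;
D3DXVec3TransformCoord( &vWorld, &vModel, &m_matWorld );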
So, this set of vertices in world coordinates is what we assume we are starting with. What we need to do is get from world coordinates to clip coordinates.
The Trip from World to Clip Coordinates
The trip from world space to clip space is conceptually done in three separate steps. The first step is to actually get the model into the global, world coordinate system. This is done by multiplying the object's vertices by the world transformation matrix. This is the global coordinate system that makes all objects share the same coordinate system. If you are used to OpenGL, this is called the model transformation.
The second step is to get the object's vertices into the view coordinate system. This is done by multiplying by the view transformation. The result of this step is to place the object's vertices in the same space as the viewer, with the viewpoint at the origin and the gaze direction down the z axis. (DirectX uses the positive z axis, whereas OpenGL uses the negative.) Once the vertices have been transformed, they are said to be in eye space or camera space.
It should be noted that a typical optimization step is to concatenate the world and view matrices into a single matrix. OpenGL doesn't have a separate world matrix at all; it makes you premultiply the viewing parameters into its modelview matrix. The same effect can be had in DirectX by leaving the world matrix as the identity and using just the view matrix as the equivalent of OpenGL's modelview. Remember, the object is not only to get the results we want but also to do it in as few steps as possible.
Now you might remember that there were zNear and zFar values and a field-of-view parameter used in the setup of the viewing parameters. Well, here's where they get used. Those values determine the view frustum, the truncated pyramid (for a perspective projection) inside which things get rendered. What actually gets calculated from those values is the projection matrix. This matrix transforms the view frustum into a unit cube; an object's coordinates are then said to be in NDC (normalized device coordinates) or, more practically, clip space. For a perspective projection, this has the effect of making objects farther away from the viewpoint (i.e., the origin in view coordinates) look smaller, which is exactly the effect you want. The part of this transformation that produces the most problems is that it is not linear in the z direction: depending upon how wide the field of view is, the actual resolution of objects in the z direction gets lower the closer you get to the zFar value. In other words, most of the resolution of the depth value (the z value) of your objects in clip space is concentrated in the first half of the viewing frustum. Practically, this means that if you set your zFar/zNear ratio too high, you'll lose resolution in the back part of the viewing volume, and the rendering engine will start rendering pixels for different objects that overlap, sometimes switching on a frame-by-frame basis, which can lead to sparkling or z-fighting.
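For reference, here's a sketch of how those viewing parameters are typically turned into the projection matrix with D3DX. The m_matProj member matches the code later in this section; aspectRatio is assumed to be the viewport's width divided by its height.
// build a left-handed perspective projection from fov, aspect, zNear, zFar
D3DXMatrixPerspectiveFovLH(
    &m_matProj,
    D3DX_PI / 4.0f,   // vertical field of view (45 degrees)
    aspectRatio,      // viewport width / height
    1.0f,             // zNear
    1000.0f );        // zFar -- keep the zFar/zNear ratio modest to preserve depth resolution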
The output of the projection transformation is that everything now sits in relation to a canonical view volume. In DirectX this volume runs from −1 to 1 in x and y and from 0 to 1 in z. Everything inside this volume will get rendered; everything outside it will get clipped. The nice thing about not writing your own rendering engine is that you don't have to decide what to do about those objects that cross the boundary. The rendering engine has to actually create new vertices where the object crosses the boundary and render only up to those locations. (These vertices are created from the interpolated values provided by the FFP or the vertex shader; that is, the vertex shader isn't run for these intermediate vertices.) This means that it also has to correctly interpolate vertex colors, normals, texture coordinates, etc., a job best left up to the rendering engine.
To summarize: We have three different matrix transformations to get from model space to clip space. Since, for a single object, you usually don't change the world, view, or projection matrices, we can concatenate these and get a single matrix that will take us from model space directly to clip space.
We recalculate this matrix every time one of these original matrices changes—generally, every frame for most applications where the viewpoint can move around—and pass this to the vertex shader in some of the constant vertex shader registers.
Now let's look at some actual code to generate this matrix. In the generic case, you will have a world, view, and projection matrix, though if you're used to OpenGL, you will have a concatenated world-view matrix (called the modelview matrix in OpenGL). Before you load the concatenated world-view-projection matrix (or WVP matrix), you'll have to take its transpose. This step is necessary because the easiest way to transform a vertex inside a shader is to use the dot product instruction to do the multiplication. In order to get the correct order for the transformation multiplication, each vertex has to be multiplied by a column of the transformation matrix. Since the dot product operates on a single register vector, we need to transpose the matrix to swap its rows and columns in order to get the correct ordering for the dot product multiplication.
We do this by creating a temporary matrix that contains the WVP matrix, taking its transpose, and then passing that to the SetVertexShaderConstant() function for DirectX 8, or the SetVertexShaderConstantF() function for DirectX 9.
// DirectX 8 !
D3DXMATRIX trans;
// create a temporary matrix holding WVP, then
// transpose it and store the result
D3DXMATRIX matWVP = m_matWorld * m_matView * m_matProj;
D3DXMatrixTranspose( &trans, &matWVP );
// Take the address of the matrix (which is 4
// rows of 4 floats in memory). Place it starting at
// constant register c0 for a total of 4 registers.
m_pd3dDevice->SetVertexShaderConstant(
    0,       // what constant register # to start at
    &trans,  // address of the value(s)
    4 );     // # of 4-element values to load
Once that is done, we're almost ready to run our first vertex shader. There are still two items we have to set up—the vertex input to the shader and the output color. Remember that there are usually two things that the vertex shader has to output—transformed vertex positions and some kind of output for the vertex—be it a color, a texture coordinate, or some combination of things. The simplest is just setting the vertex to a flat color, and we can do that by passing in a color in a constant register, which is what the next lines of code do.
// DirectX 8!
// set up a color
float teal[4] = { 0.0f, 1.0f, 0.7f, 0.0f }; // rgba ordering
// specify constant register c12
m_pd3dDevice->SetVertexShaderConstant(
    12,   // which constant register to set
    teal, // the array of values
    1 );  // # of 4-element values
Finally, you need to specify where the input vertex stream will appear. This is done with the SetStreamSource() function, which binds a vertex buffer to a stream, together with the vertex shader declaration, which selects which vertex register(s) the stream data shows up in. There's a lot more to setting up a stream, but the part we're currently interested in is just knowing where the raw vertex (and later normal and texture coordinate) information will show up in our shader. For the following examples, we'll assume that we've set up vertex register 0 to be associated with the vertex position from the stream. Most of the vertex shader code you'll see will have the expected constant declarations as comments at the top of the shader.
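As a sketch of what that setup might look like in DirectX 8 (m_pVB is a hypothetical vertex buffer holding a packed position and normal per vertex; the declaration array is what gets passed to CreateVertexShader()):
// DirectX 8 -- map stream 0 into the shader's input registers
DWORD dwDecl[] =
{
    D3DVSD_STREAM( 0 ),
    D3DVSD_REG( 0, D3DVSDT_FLOAT3 ),   // position -> v0
    D3DVSD_REG( 3, D3DVSDT_FLOAT3 ),   // normal   -> v3 (used later)
    D3DVSD_END()
};
// bind the vertex buffer to stream 0; the stride is the size of one vertex
m_pd3dDevice->SetStreamSource( 0, m_pVB, 6 * sizeof( float ) );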
So with the vertex input in v0, the WVP matrix in c0 through c3, and the output color in c12, our first self-contained vertex shader looks like this.
// v0 -- position
// c0-3 -- world/view/proj matrix
// c12 -- the color value
// a minimal vertex shader
// transform to clip space
dp4 oPos.x, v0, c0
dp4 oPos.y, v0, c1
dp4 oPos.z, v0, c2
dp4 oPos.w, v0, c3
// write out color
mov oD0, c12
Transforming Normal Vectors
In order to perform lighting calculations, you need the normal of the vertex or the surface. When the vertex is transformed, it's an obvious thing to understand that the normals (which I always visualize as these little vectors sticking out of the point) need to be transformed as well; after all, if the vertex rotates, the normal must rotate as well! And generally you'll see applications and textbooks using the same transformation matrix on normals as well as vertices, and in most cases, this is OK. However, this is true only if the matrix is orthogonal, that is, made up of translations and rotations but no scaling transformations. Let's take a shape, transform it, and see what happens.
Now suppose we apply a general transformation matrix, one that includes scaling, to this shape and to its normals as well.
Although the shape may be what we desired, you can clearly see that the normals no longer represent what they are supposed to—they are no longer perpendicular to the surface and are no longer of unit length. You could recalculate the normals, but since we just applied a transformation matrix to our vertices, it seems reasonable that we should be able to perform a similar operation to our normals that correctly orients them with the surface while preserving their unit length.
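To see the problem concretely, take an edge running along the direction (1, 1, 0) with unit normal (0.707, −0.707, 0), and scale the model by 2 along x only. The edge direction becomes (2, 1, 0), but transforming the normal by the same matrix gives (1.414, −0.707, 0), and the dot product of the two is about 2.1, so they are no longer perpendicular. Transforming the normal instead by the inverse transpose of the scale matrix, which scales x by 0.5, gives (0.354, −0.707, 0); its dot product with the new edge direction is 0.354 · 2 − 0.707 · 1 ≈ 0, so perpendicularity is restored, and all that's left is to renormalize.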
If you're interested in the math, you can look it up [TURKOWSKI 1990]. But basically, it comes down to the following observations. When you transform an object, you'll be using one of these types of transformations.
- Orthogonal transformation (rotations and translations): This tends to be the most common case since most objects aren't scaled. In this case, the normals can be transformed by the same matrix used for the vertices. Without any scaling in it, the transpose of a matrix is the same as its inverse, and the transpose is easier to calculate, so in this case you'd generally use the transpose as a faster-to-calculate replacement for the inverse.
- Isotropic transformation (uniform scaling): In this case, the normals need to be transformed by the inverse scaling factor. If you scale your objects only at load time, then an optimization would be to scale the normals once after that initial scaling.
- Affine transformation (any other you'll probably create): In this case, you'll need to transform the normals by the transpose of the inverse of the transformation matrix. Since you'll be calculating the inverse of the transformation matrix, this is just the additional step of taking the transpose of that matrix.
In fact, you can get away with computing just the transpose of the adjoint of the upper 3 × 3 matrix of the transformation matrix [RTR 2002].
So, in summary,
- If the world/model transformations consist of only rotations and translations, then you can use the same matrix to transform the normals.
- If there are uniform scalings in the world/model matrix, then the normals will have to be renormalized after the transformation.
- If there are nonuniform scalings, then the normals will have to be transformed by the transpose of the inverse of the matrix used to transform the geometry.
If you know that your WVP matrix is orthogonal, then you can use that matrix on the normal, and you don't have to renormalize the normal.
// a vertex shader for orthogonal transformation matrices
// v0 -- position
// v3 -- normal
// c0-3 -- world/view/proj matrix
// transform vertex to clip space
dp4 oPos.x, v0, c0
dp4 oPos.y, v0, c1
dp4 oPos.z, v0, c2
dp4 oPos.w, v0, c3
// transform normal using same matrix
dp3 r0.x, v3, c0
dp3 r0.y, v3, c1
dp3 r0.z, v3, c2
On the other hand, if you have any other kind of matrix, you'll have to provide the inverse transpose of the world matrix in a set of constant registers in addition to the WVP matrix. After you transform the normal vector, you'll have to renormalize it.
// a vertex shader for non-orthogonal
// transformation matrices
// v0 -- position
// v3 -- normal
// c0-3 -- world/view/proj matrix
// c5-8 -- inverse/transpose world matrix
// transform vertex to clip space
dp4 oPos.x, v0, c0
dp4 oPos.y, v0, c1
dp4 oPos.z, v0, c2
dp4 oPos.w, v0, c3
// transform normal
dp3 r0.x, v3, c5
dp3 r0.y, v3, c6
dp3 r0.z, v3, c7
// renormalize normal
dp3 r0.w, r0, r0
rsq r0.w, r0.w
mul r0, r0, r0.w
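On the application side, filling c5 through c8 might look like the following sketch (assuming the same m_pd3dDevice and m_matWorld members as the earlier listings). Because the dp3 instructions dot the normal with successive constant registers, we load the transpose of the matrix we want to apply, just as we did for the WVP matrix; since the transpose of the inverse transpose is simply the inverse, the inverted world matrix is loaded as-is.
// DirectX 8 -- provide the inverse transpose of the world matrix in c5-c8
D3DXMATRIX invWorld;
D3DXMatrixInverse( &invWorld, NULL, &m_matWorld );
m_pd3dDevice->SetVertexShaderConstant(
    5,          // start at constant register c5
    &invWorld,  // 4 rows of 4 floats
    4 );        // load c5 through c8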
There are a series of macro instructions (such as m4x4) that will expand into a series of dot product calls. These macros are there to make it easy for you to perform the matrix transformation into clip space. Do not make the mistake of using the same register for source and destination. If you do, the macro will happily expand into a series of dot product calls and modify the source register element by element for each dot product rather than preserving the original register.
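For example, the four dp4 instructions from the minimal shader collapse into a single macro (just make sure the destination isn't also one of the sources):
// equivalent to the four dp4 instructions shown earlier
m4x4 oPos, v0, c0   // reads c0 through c3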
Vertex Shader Registers and Variables
Shader registers are constructed as a vector of four IEEE 32-bit floating point numbers.
While hardware manufacturers are free to implement their hardware as they see fit, there are some minimums that they have to meet. Since vertex shader output is going to be passed back into the pipeline, you can expect that the precision will match that of the input registers, namely, close to the IEEE 32-bit float specification, with the exception that some of the math error-propagation rules (NaN, INF, etc.) are simplified. On those output registers that are clamped to a specific range, the clamping does not occur until the shader is finished. Note that you'll get very familiar behavior from vertex shader math, which can lull you into a sense of security when you start dealing with the more limited math precision of pixel shaders, so be careful!
PIXEL SHADERS
A pixel shader takes color, texture coordinate(s), and selected texture(s) as input and produces a single color rgba value as its output. You can ignore any texture states that are set. You can create your own texture coordinates out of thin air. You can even ignore any of the inputs provided and set the pixel color directly if you like. In other words, you have near total control over the final pixel color that shows up. The one render state that will change your pixel color is fog. The fog blend is performed after the pixel shader has run.
Inside a pixel shader, you can look up texture coordinates, modify them, blend them, etc. A pixel shader has two color registers as input, some constant registers, and texture coordinates and textures set prior to the execution of the shader through the render states (Figure 4.8).
Figure 4.8: Pixel shaders take color inputs and texture coordinates to generate a single output color value.
Using pixel shaders, you are free to interpret the data however you like. Since you are pretty much limited to sampling textures and blending them with colors, pixel shaders are generally smaller than vertex shaders. The variety of commands, however, is fairly large, since there are commands that are subtle variations of one another.
In addition to the version and constant declaration instructions, which are similar to the vertex shader instructions, pixel shader instructions have texture addressing instructions and arithmetic instructions.
Arithmetic instructions include the common mathematical instructions that you'd expect. These instructions are used to perform operations on the color or texture address information.
The texture instructions operate on a texture or on texture coordinates that have been bound to a texture stage. You assign a texture to one of the texture stages that the device currently supports using the SetTexture() function, and you control how that texture is sampled through a call to SetTextureStageState(). The simplest pixel shader we can write that samples the texture assigned to stage 0 would look like this.
// a pixel shader to use the texture of stage 0
ps.1.0
// sample the texture bound to stage 0
// using the texture coordinates from stage 0
// and place the resulting color in t0
tex t0
// now copy the color to the output register
mov r0, t0
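On the application side, the stage 0 setup this shader assumes might look something like the following sketch, with m_pTexture standing in for whatever texture you've created or loaded:
// DirectX 8 -- bind a texture to stage 0 and pick its filtering
m_pd3dDevice->SetTexture( 0, m_pTexture );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_MINFILTER, D3DTEXF_LINEAR );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_MAGFILTER, D3DTEXF_LINEAR );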
There are a large variety of texture addressing and sampling operations that give you a wide variety of options for sampling, blending, and other operations on multiple textures.
Conversely, if you didn't want to sample a texture but were just interested in coloring the pixel using the iterated colors from the vertex shader output, you could ignore any active textures and just use the color input registers. Assuming that we were using either the FFP or our vertex shader to set both the diffuse and specular colors, a pixel shader to add the diffuse and specular colors would look like this.
// a pixel shader to just blend diffuse and specular
ps.1.0
// since the add instruction can only access
// one color register at a time, we need to
// move one value into a temporary register and
// perform the add with that temp register
mov r0, v1
add r0, v0, r0
As you can see, pixel shaders are straightforward to use, though understanding the intricacies of the individual instructions is sometimes a challenge.
Unfortunately, since pixel shaders are so closely tied to the hardware, there's a good deal of unique behavior between shader versions. For example, almost all the texture operations that were available in pixel shaders 1.0 through 1.3 were replaced with fewer but more generic texture operations in pixel shaders 1.4 and 2.0. Unlike vertex shaders (for which there was a good implementation in the software driver), there was no implementation of pixel shaders in software. Thus, since pixel shaders essentially expose the hardware API to the shader writer, the features of the language are directly represented by the features of the hardware. This is getting better with pixel shaders 2.0, which are starting to show more uniformity in their instruction set.
DirectX 8 Pixel Shader Math Precision
In pixel shader versions 2.0 or better (i.e., DirectX 9 compliant), the change was made to make the registers full-precision registers. However, in DirectX 8, that minimum wasn't in place. Registers in pixel shaders before version 2.0 are not full 32-bit floating point values (Figure 4.9). In fact, they are severely limited in their range. The minimum precision is 8 bits, which usually translates to an implementation of a fixed-point number with a sign bit and 7-8 bits for the fraction. Since the complexity of pixel shaders will only grow over time, and hence the ability to do lengthy operations, you can expect that you'll rarely run into precision problems unless you're trying to do something like performing multiple lookup operations into large texture spaces or making many rendering passes. Only on older cards (those manufactured in or before 2001) or inexpensive cards will you find the 8-bit minimum. As manufacturers figure out how to reduce the size of the silicon and increase the complexity, they'll be able to squeeze more precision into the pixel shaders. DirectX 9 compliant cards should have 16 or 32 bits of precision.
As the number of bits increases in the pixel shader registers, so will the overall range. You'll need to examine the D3DCAPS8.MaxPixelShaderValue or D3DCAPS9.PixelShader1xMaxValue capability value in order to see the range that pixel registers are clamped to. In DirectX 6 and DirectX 7, this value was 0, indicating an absolute range of [0,1]. In later versions of DirectX, this value represents the maximum absolute value, so in DirectX 8 or 9 you might see a value of 1, which would indicate a range of [−1,1], or 8, which would indicate a range of [−8,8]. Note that this value typically depends not only on the hardware, but sometimes on the driver version as well!
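A sketch of the DirectX 8 query (the DirectX 9 version reads D3DCAPS9.PixelShader1xMaxValue instead):
// DirectX 8 -- see how far pixel shader register values can range
D3DCAPS8 caps;
m_pd3dDevice->GetDeviceCaps( &caps );
// 1.0 means values clamp to [-1,1], 8.0 means [-8,8], and so on
float maxPixelValue = caps.MaxPixelShaderValue;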
No, No, It's Not a Texture, It's Just Data
One of the largest problems people have with using texture operations is getting over the fact that just because something is using a texture operation, it doesn't have to be texture data. In the early days of 3D graphics, you could compute lighting effects using hardware acceleration only at vertices. Thus if you had a wall consisting of one large quadrangle and you wanted to illuminate it, you had to make sure that the light fell on at least one vertex in order to get some lighting effect. If the light was near the center of the wall, it made no difference, since the light was calculated only at the vertices and then linearly interpolated from there; the lighting at the center of a surface could be no brighter than the lighting computed at the vertices. The brute-force method of correcting this (which is what some tools like RenderMan do) is to tessellate a surface till the individual triangles are smaller than a pixel will be, in effect turning the renderer into a pixel-accurate renderer at the expense of generating a huge number of triangles.
It turns out that there already is a hardware-accelerated method of manipulating pixels: the texture rendering section of the API. For years, people have been doing things like using textures to create pseudolighting effects and even to simulate perturbations in the surface lighting due to a bumpy surface by using texture maps. It took a fair amount of effort to get multiple textures supported in graphics hardware, and when support finally arrived at a fairly consistent level in consumer-level graphics cards, multitexturing effects took off. Not content with waiting for the API folks to get their act together, graphics programmers and researchers thought of different ways to layer multiple textures to get the effects they wanted. It's this tradition that pixel shaders are built upon. In order to get really creative with pixel shaders, you have to forget the idea that a texture is an image. A texture is just a 1D, 2D, or 3D matrix of data. We can use it to hold an image, a bump map, or a lookup table of any sort of data. We just need to associate a vertex with some coordinate data, and out in the pixel shader those coordinates will pop up, ready for us to pluck out our matrix of data.