The Painter's Algorithm

This article is the first of a series aimed to describe and formalize the different methods and tricks used in my demos for the JS1K Spring-2013 challenge. I had much fun participating, and learnt a lot, so I will try to spread my enthousiasm.
The main topics will therefore be 3D procedural generation, rendering, and how to get all this fit in 1024 bytes.

Today, we will deal with « how to easily render a 3D scene in a 2D canvas », presenting one of the simplest method for that: the Painter's Algorithm.

Through a mix of maths and code, we should end with a simple example.

Rendering Process

You have at hand your various 3D meshes, nicely placed in a scene, and now you wonder how to render that into a static picture or a dynamic demo... The global process is relatively simple to express:

Define your camera (position, orientation, field of view,...);
Project the 3D meshes of the scene on its viewfinder 2D plane, and adapt the result to your canvas (or whatever medium you use) size/ratio;
Draw the visible elements on.

Basically, it's only maths, maths, and some more maths. So let's get started.

Open your third eye

First you have to decide what you want to see, and how.

Where will be positioned your camera? Which direction will it be aiming? How far/near can it see? With which ratio? With a realistic perspective projection, an orthographic one or something more exotic? ...

There are many parameters with barbaric names (try to place « frustum » in a conversation...) to take into account. We won't address all of them right now, so for now, let's only deal with:

A perspective projection (the furthest, the smallest on screen);
A camera positioned at $(c_x, c_y, c_z)$ in the cartesian space, with its orientation defined as $({\theta}_{yaw}, {\theta}_{pitch}, {\theta}_{roll})$, the Tait–Bryan angles;
The screen ratio, while preserving the proportions in our scene. We call $W$ the screen's width and $H$ its height.

Flatten the world

The projection step is quite straightforward (as long as you don't try to find the formula by yourself). [1]

For a vertex at the position $(a_x, a_y, a_z)$, its projection would be:

$$\begin{bmatrix} \mathbf{d}_x \\ \mathbf{d}_y \\ \mathbf{d}_z \\ \end{bmatrix}=\begin{bmatrix} 1 & 0 & 0 \\ 0 & {\cos \mathbf{\theta}_{pitch} } & { - \sin \mathbf{\theta}_{pitch} } \\ 0 & { \sin \mathbf{\theta}_{pitch} } & { \cos \mathbf{\theta}_{pitch} } \\ \end{bmatrix}\begin{bmatrix} { \cos \mathbf{\theta}_{roll} } & 0 & { \sin \mathbf{\theta}_{roll} } \\ 0 & 1 & 0 \\ { - \sin \mathbf{\theta}_{roll} } & 0 & { \cos \mathbf{\theta}_{roll} } \\ \end{bmatrix}\begin{bmatrix} { \cos \mathbf{\theta}_{yaw} } & { - \sin \mathbf{\theta}_{yaw} } & 0 \\ { \sin \mathbf{\theta}_{yaw} } & { \cos \mathbf{\theta}_{yaw} } & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}\left( {\begin{bmatrix} \mathbf{a}_x \\ \mathbf{a}_y \\ \mathbf{a}_z \\ \end{bmatrix} - \begin{bmatrix} \mathbf{c}_x \\ \mathbf{c}_y \\ \mathbf{c}_z \\ \end{bmatrix}} \right)$$

We can notice the three rotation matrices, one for each of our Tait–Bryan angles.

This matrix formula looks good, but can be laborious to use if you can't afford to include a library dealing with matrix operations. If we linearize it (thanks Wikipedia for saving me from this burden), we get:

$$\begin{bmatrix} d_x \\ d_y \\ d_z \end{bmatrix}=\begin{bmatrix} \cos {\theta}_{roll}\cdot(\sin {\theta}_{yaw}\cdot(a_y-c_y)+\cos {\theta}_{yaw}\cdot(a_x-c_x))-\sin {\theta}_{roll}\cdot(a_z-c_z) \\ \sin {\theta}_{pitch}\cdot(\cos {\theta}_{roll}\cdot(a_z-c_z)+\sin {\theta}_{roll}\cdot(\sin {\theta}_{yaw}\cdot(a_y-c_y)+\cos {\theta}_{yaw}\cdot(a_x-c_x)))+\cos {\theta}_{pitch}\cdot(\cos {\theta}_{yaw}\cdot(a_y-c_y)-\sin {\theta}_{yaw}\cdot(a_x-c_x)) \\ \cos {\theta}_{pitch}\cdot(\cos {\theta}_{roll}\cdot(a_z-c_z)+\sin {\theta}_{roll}\cdot(\sin {\theta}_{yaw}\cdot(a_y-c_y)+\cos {\theta}_{yaw}\cdot(a_x-c_x)))-\sin {\theta}_{pitch}\cdot(\cos {\theta}_{yaw}\cdot(a_y-c_y)-\sin {\theta}_{yaw}\cdot(a_x-c_x)) \end{bmatrix}$$

Nothing we can't implement here. Moreover, you can observe some recurrences in it, which means it is possible to refactore it to save some operations and bytes:

$$\begin{array}{lcl} temp_1 &= &\cos {\theta}_{yaw}\cdot(a_y-c_y)-\sin {\theta}_{yaw}\cdot(a_x-c_x) \\ temp_2 &= &\cos {\theta}_{roll}\cdot(a_z-c_z)+\sin {\theta}_{roll}\cdot(\sin {\theta}_{yaw}\cdot(a_y-c_y)+\cos {\theta}_{yaw}\cdot(a_x-c_x)) \\ \end{array}$$ $$ \begin{bmatrix} d_x \\ d_y \\ d_z \end{bmatrix}=\begin{bmatrix} \cos {\theta}_{roll}\cdot(\sin {\theta}_{yaw}\cdot(a_y-c_y)+\cos {\theta}_{yaw}\cdot(a_x-c_x))-\sin {\theta}_{roll}\cdot(a_z-c_z) \\ \sin {\theta}_{pitch}\cdot temp_2+\cos {\theta}_{pitch}\cdot temp_1 \\ \cos {\theta}_{pitch}\cdot temp_2-\sin {\theta}_{pitch}\cdot temp_1 \end{bmatrix}$$

We now can compute the positions of 3D elements with respect to the coordinates system defined by our camera (with $d_x$ and $d_y$ the projection position on the viewfinder and $d_z$ the perspective/depth seen from the camera). But it is not over yet: our final medium is the screen/canvas we want to display our scene on, not the camera's viewfinder. We thus have to apply a last transform to get the pixels positions $(b_x, b_y)$ on it:

$$\begin{bmatrix} b_x \\ b_y \end{bmatrix}=\begin{bmatrix} \frac{d_x}{d_z}\cdot min(H,W)+\frac{W}{2}\\ \frac{d_y}{d_z}\cdot min(H,W)+\frac{H}{2} \end{bmatrix}$$

The division by $d_z$ is the key for the perspective effect (the furthest from the camera, the bigger $d_z$, and thus the smallest the element).
Through the multiplication of $d_z$ and $d_y$ by $min(H,W)$, we apply the chosen ratio while preserving the proportions (try to use $max(H,W)$ or to use $W$ with $d_x$ and $H$ with $d_y$ to see how it influences the result).
Finally, the addition of half the screen's dimensions centers the elements.

So here is our final projection function, to be applied to every vertex of our scene:

/**
 * ===Project===
 * Computes the perspective projection for each vertex of an array of 3D faces.
 * Parameters:
 *	- sceneFaces 	(Array>):	Array of faces, defined by their vertices
 *	- camPosition 	(Vector3): 				Position of the camera
 *	- camOrientation(Vector3): 				Orientation of the camera - Tait-Bryan angles [yaw, pitch, roll]
 *	- canvasWidth 	(Int): 					Width of the canvas to project on
 *	- canvasHeight 	(Int): 					Height of the canvas to project on
 * Return: Array of projected faces (Array>)
 */
function Project(sceneFaces, camPosition, camOrientation, canvasWidth, canvasHeight) {
	var cosYaw = Math.cos(camOrientation[0]),
		sinYaw = Math.sin(camOrientation[0]),
		cosPitch = Math.cos(camOrientation[1]),
		sinPitch = Math.sin(camOrientation[1]),
		cosRoll = Math.cos(camOrientation[2]),
		sinRoll = Math.sin(camOrientation[2]),
		minDim = (canvasWidth < canvasHeight) ? canvasWidth : canvasHeight; // To avoid repeating the callings.

	var projectedFaces = [];
	var nbVertices;
	
	for (id in sceneFaces) {
		projectedFaces[id] = [];
		
		for (j = 0; j < (nbVertices = sceneFaces[id].length) ; j++) { // For each vertex of the face:
			// 3D Projection:
			var temp1 = cosYaw*(sceneFaces[id][j][1] - camPosition[1]) - sinYaw*(sceneFaces[id][j][0] - camPosition[0]),
				temp2 = cosRoll*(sceneFaces[id][j][2] - camPosition[2]) + sinRoll*(sinYaw*(sceneFaces[id][j][1] - camPosition[1]) + cosYaw*(sceneFaces[id][j][0] - camPosition[0])),
				
				dX = cosRoll*(sinYaw*(sceneFaces[id][j][1] - camPosition[1]) + cosYaw*(sceneFaces[id][j][0] - camPosition[0])) - sinRoll*(sceneFaces[id][j][2] - camPosition[2]),
				dY = sinPitch*temp2 + cosPitch*temp1,
				dZ = cosPitch*temp2 - sinPitch*temp1;
			
			// Transform to project on the screen:
			projectedFaces[id].push([dX / dZ * minDim + canvasWidth/2,
									dY / dZ * minDim + canvasHeight/2,
									dZ]); 
		}
	}
	
	return projectedFaces;
}

There is one important thing I haven't addressed yet : the clipping step. When you look at something with your own eyes or a camera, you don't see the whole scene, e.g. you can't see behind you. Your field of view, your frustum, is limited to a specific cone or pyramid (truncated, if you take into account the near plane and the far plane. Some cameras can't see objects too close or too far: the planes represent those limits).

In the above method, we project every element, without disgarding the ones out of view. We need to get rid of them, to clip them. In my examples, to do that, I simply rely on a property of the HTML canvas: its limits, and its pliability when it comes to draw out of them:
With our transforms, pixels of elements out of the frustum will have their position $(b_x, b_y)$ out of the canvas field, and when I blindly ask to draw them, the canvas API doesn't complain: they will just not appear.

It is far from an efficient solution, and easy improvements can be done (for instance by checking by yourself the values of $b_x$ and $b_y$ before drawing).
The best solution would be to clip the scene before the projection, in the 3D space, but here things get a bit more tricky.

Finally, if you want to truncate your field of view to a certain depth range (delimited by the near and far planes), you can simply discard the faces with a $d_z$ out of it.

Painter's Stroke

Importance of sorting by depth Objects beyond our perepheral vision are not the only on which are hidden.
Some can be totally or partially covered by others. So we don't want to draw our elements higgledy-piggledy, displaying faces which should be hidden under others. We want to preserve the depth order while rendering.

And that's exactly what the Painter's algorithm is for.

The idea is quite simple: before displaying our elements, we just sort them by decreasing depth (so from the furthest from the camera position to the closest).
By drawing them in this order, the front elements will naturally cover/overlay the ones behind. Like most painters would do.

Luckily for us, we've already computed the depth of the vertices, $d_z$, making the implementation of the rendering function really straightforward:

//	ComputeBarycentersDepths() is supposed to be called right after the projection step and
//	before Paint(), to set the Z-bar values of each vertex. I make the supposition that these
//	values can be used in other part of your rendering method (for instance to set the colors),
//	that's why I separated it from Paint().


/**
 * ===ComputeBarycentersDepths===
 * Computes the Z value of the given faces, pushing it into the array defining each face.
 * Parameter:
 *	- faces	(Array):	Array of faces
 */
function ComputeBarycentersDepths(faces) {
	for (id in faces) {
		var barDepth = 0;
		for (j = 0; j < (nbVertices = faces[id].length) ; j++) { // For each vertex of the face.
			barDepth += faces[id][j][2];
		}
		
		barDepth /= nbVertices;
		faces[id].push(barDepth);
	}
}

/**
 * ===Paint===
 * Computes the Z value of the given faces, pushing it into the array defining each face.
 * Parameter:
 *	- projectedFaces	(Array<Object>):	Array of faces, defined by their vertices, their
 *											average Z-value and their color ( face = [ 
 *											[X1, Y1, Z1], ..., [X2, Y2, Z2], Z_bar, color ] )
 *	- context			(CanvasContext):	Context of the output canvas
 */
function Paint(projectedFaces, context) {
	projectedFaces.sort(function(face1, face2) {
		return (face1[face1.length-2]-face2[face2.length-2]); // Sorting by desc Z_bar
	}); 			
	for (id in projectedFaces) {
		// Setting the color:
		context.fillStyle = context.strokeStyle =
			projectedFaces[id][projectedFaces[id].length-1];
		
		// Tracing the 2D shape:
		context.beginPath();
		context.moveTo(projectedFaces[id][0][0], projectedFaces[id][0][1]);
		for (j = 1; j < (nbVertices = projectedFaces[id].length-2) ; j++) {
			context.lineTo(projectedFaces[id][j][0], projectedFaces[id][j][1]);
		}
		context.closePath();
		
		// Drawing it:
		context.fill(); context.stroke();
	}
}

Now you only need to wrap everything in a Render() method such as the one below, and you're done.

/**
 * ===Render===
 * Example - Render a 3D scene in a canvas using the Painter's Algorithm.
 * Parameter:
 *	- sceneFaces 		(Array):	Faces defining the scene to render
 *	- context 			(CanvasContext): 	Context of the output canvas
 *	- camPosition 		(Vector3): 			Position of the camera
 *	- camOrientation	(Vector3): 			Orientation of the camera
 *												- Tait-Bryan angles [yaw, pitch, roll]
 *	- canvasWidth 		(Int):				Width of the canvas to project on
 *	- canvasHeight 		(Int):				Height of the canvas to project on
 *	- colorFunction 	(Function): 		Function which attributes a color to each face,
 *											depending on some given parameters
 *											(define your own signature and edit under)
 */
function Render(sceneFaces,
				context,
				camPosition, camOrientation,
				canvasWidth, canvasHeight,
				colorFunction) {
	// Projection:
	var projectedFaces =
		Project(sceneFaces, camPosition, camOrientation, canvasWidth, canvasHeight);
	
	// Preparing to sort the faces (we will use their barycenter for that - this isn't perfect,
	// and we can have some glitches, but let's keep it simple):
	ComputeBarycentersDepths(projectedFaces);
	// These values can maybe be useful for the color function, that's why we (can) save them.
	
	// Defining the color of each face (pass the parameters you need and do your own business
	// here - this is just an example):
	colorFunction(projectedFaces, sceneFaces);
	
	// Applying the Painter's Algo & Drawing:
	Paint(projectedFaces, context);	
}

Conclusion

That's all. We just implemented a really simple 3D renderer. You can find below a direct example using the exact functions described above:

/**
 * The rendering functions should be added here.
 */

// Shim layer with setTimeout fallback
// Thank to Paul Irish (http://paulirish.com/2011/requestanimationframe-for-smart-animating)
window.requestAnimFrame = (function(){
  return  window.requestAnimationFrame       ||
          window.webkitRequestAnimationFrame ||
          window.mozRequestAnimationFrame    ||
          function( callback ){
            window.setTimeout(callback, 1000 / 60);
          };
})();

// We define a simple 2x2x2 cube, made of quad faces
// (building the mesh procedurally isn't complicated, but not the subject of this article):
var cubeVertices = [[-1,  1,  1], [ 1,  1,  1], [ 1, -1,  1], [-1, -1,  1],
					[-1,  1, -1], [ 1,  1, -1], [ 1, -1, -1], [-1, -1, -1]];
var cubesQuadriFaces = [[cubeVertices[0], cubeVertices[1], cubeVertices[2], cubeVertices[3]],
						[cubeVertices[4], cubeVertices[5], cubeVertices[6], cubeVertices[7]],
						[cubeVertices[1], cubeVertices[5], cubeVertices[6], cubeVertices[2]],
						[cubeVertices[4], cubeVertices[0], cubeVertices[3], cubeVertices[7]],
						[cubeVertices[4], cubeVertices[5], cubeVertices[1], cubeVertices[0]],
						[cubeVertices[3], cubeVertices[2], cubeVertices[6], cubeVertices[7]]]; 

// And we display it:
var canvas = document.getElementsByTagName("canvas")[0];
canvas.width = canvas.height = 200;
var context = canvas.getContext("2d");
var timer = 0;

(function animloop(){
	// We request the next frame first to assure ~60fps:
	requestAnimFrame(animloop);
	
	// We compute the new state: 
	//	- we make the camera turn and roll around the cube while still focusing on its center:
	var camPosition = 	[	10*Math.cos(timer/87)*Math.sin(timer/57),
							10*Math.sin(timer/87)*Math.sin(timer/57),
							10*Math.cos(timer/57)];
	var camOrientation =[	-Math.PI/2+timer%(Math.PI*87*2)/87,
							-timer%(Math.PI*57*2)/57,
							Math.cos(timer/33)/3];
	
	//	- we set the colors over time:
	var colorFunction = function colorFuncT(projectedFaces, originalFaces) {
		var cosLum = Math.cos(timer/37), sinLum = Math.sin(timer/23);
		for (id in projectedFaces) {
			var hue = (timer/5)%360,
				saturation = 75+3*projectedFaces[id][0][2],
				luminosity = 50+(id-projectedFaces.length/2)/projectedFaces.length*20;
			
			projectedFaces[id].push("hsl("+hue+","+saturation+"%,"+luminosity+"%)");	
		}
	}
	
	// We clear the previous frame:
	context.clearRect(0, 0, canvas.width, canvas.height);
	
	// We render the new one:
	Render(cubesQuadriFaces, context, camPosition, camOrientation,
			canvas.width, canvas.height, colorFunction);
	
	timer++;
})();

Example - Cube

You can also check my two 1Kb demo using this algorithm, to see it live: Morphose and Loom.

But don't be fooled: when I say « simple renderer », I especially mean « too simple ». The Painter's algorithm has many drawbacks, such as:

It draws stuff which finally won't be displayed (hidden under). Far from optimal... (unless you have semi-transparent objects in your scene: this flaw becomes a cheap way to automatically manage the transparency);
It is face-wise, ie. it draws face per face, and not pixel per pixel like some other methods. We thus lose in precision (the pixel-wise solutions give you the possibility to compute textures more precisely - pixel per pixel), and we can also face some rendering glitches, for instance in the case of overlapping or piercing polygons (how to order thoses faces?);
As implemented here, we don't know which element is above/under which one. This information could be useful to apply some effects, such as shadows (see for instance the Shadow Z-buffer algorithm [2]).

That being said, this method can still be applied to many cases. It is easy to implement and only needs a 2D drawing library.

Who needs performance?

References

^ Wikipedia: 3D projection - en.wikipedia.org/wiki/3D_projection
^ Chris Bentley: « Two Shadow Rendering Algorithms » - web.cs.wpi.edu/~matt/courses/cs563/talks/shadow/shadow.html

Aldream

Article