Shadow mapping nightmare


Originally, my idea of an elegant method of rendering a scene with shadow maps turned out to be flawed:

ShadowNode shadowNode;
shadowNode.render();
// Normal rendering using the shadow node's content when binding textures
sceneManager->renderScene( shadowNode, ... );

That looked perfect until I stumbled on a problem: shadow maps for directional lights need bounds data from the original scene (namely an AABB enclosing all objects visible from the viewer’s camera, and a second AABB enclosing only those visible objects that are shadow receivers). In both Ogre 1.x & 2.x, this is the good ol’ structure VisibleObjectsBoundsInfo.
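
To make that requirement concrete, here is a minimal sketch of gathering both boxes in the same loop that does the culling. Every name in it (Vec3, Aabb, ObjectSketch, BoundsInfoSketch, cullAndCollectBounds) is a hypothetical stand-in and not Ogre's actual VisibleObjectsBoundsInfo, and the "frustum" is simplified to a box so the example stays short:

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

// Axis-aligned box. Starts out "empty" (inverted) so the first merge just copies.
struct Aabb
{
    Vec3 lo{ std::numeric_limits<float>::max(),
             std::numeric_limits<float>::max(),
             std::numeric_limits<float>::max() };
    Vec3 hi{ std::numeric_limits<float>::lowest(),
             std::numeric_limits<float>::lowest(),
             std::numeric_limits<float>::lowest() };

    Aabb() = default;
    Aabb( Vec3 lo_, Vec3 hi_ ) : lo( lo_ ), hi( hi_ ) {}

    void merge( const Aabb &o )
    {
        lo = { std::min( lo.x, o.lo.x ), std::min( lo.y, o.lo.y ), std::min( lo.z, o.lo.z ) };
        hi = { std::max( hi.x, o.hi.x ), std::max( hi.y, o.hi.y ), std::max( hi.z, o.hi.z ) };
    }

    bool intersects( const Aabb &o ) const
    {
        return lo.x <= o.hi.x && hi.x >= o.lo.x &&
               lo.y <= o.hi.y && hi.y >= o.lo.y &&
               lo.z <= o.hi.z && hi.z >= o.lo.z;
    }
};

struct ObjectSketch
{
    Aabb worldAabb;
    bool receivesShadows;
};

struct BoundsInfoSketch
{
    Aabb aabb;         // every object that passed the cull
    Aabb receiverAabb; // same, restricted to shadow receivers
};

// Culls against a box-shaped view volume (an assumption; a real frustum has
// six planes) and accumulates both boxes in the same pass over the objects.
BoundsInfoSketch cullAndCollectBounds( const std::vector<ObjectSketch> &objects,
                                       const Aabb &viewVolume,
                                       std::vector<const ObjectSketch*> &outVisible )
{
    assert( viewVolume.lo.x <= viewVolume.hi.x && "view volume must not be empty" );

    BoundsInfoSketch bounds;
    outVisible.clear();
    for( const ObjectSketch &obj : objects )
    {
        if( !obj.worldAabb.intersects( viewVolume ) )
            continue; // culled: contributes to neither box
        outVisible.push_back( &obj );
        bounds.aabb.merge( obj.worldAabb );
        if( obj.receivesShadows )
            bounds.receiverAabb.merge( obj.worldAabb );
    }
    return bounds;
}
```

The point is simply that the bounds fall out of the cull loop for free, which is why redoing the cull just to get them hurts.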

Of course, we could cull all objects against the viewer’s frustum & calculate that AABB data independently in the shadow node, and then cull again while doing the normal rendering pass.

Except for the fact that frustum culling is quite an expensive operation.

If we could reuse that information so that we only cull against the viewer’s frustum once, that would be great. Ogre 1.x in fact reuses such data by interrupting right in the middle of the normal rendering sequence. But the execution flow is ugly as hell. At a high level, it looks like this:

void SceneManager::renderScene()
{
	if( shadowmapsEnabled() && !isShadowMapPass )
	{
		if( shadowMapsNotInitialized() )
			initializeShadowMaps();
		isShadowMapPass = true;
		this->renderScene(); // Recursive call!
		isShadowMapPass = false;
	}
	/* * Rest of normal rendering pass * */
}

The real code is much more verbose of course, but you get the point. One of the things I wanted to do with the compositor is to take the responsibility of handling shadow maps away from the SceneManager and move to a more OO approach, because right now the SceneManager is a god object. A compositor node named “CompositorShadowNode” (genius name, I know) would be responsible for handling the shadow map related stuff.

Separating rendering into two phases

Fortunately, I realized that the bounds info is calculated during frustum culling, and that frustum culling depends on almost nothing else (in particular, it doesn’t depend on API state such as render states), so the solution ended up looking like this:

normal->_cullPhase01();
	shadowNode->setupShadowCamera( normal->getVisibleBoundsInfo() );
	shadowNode->_cullPhase01();
	shadowNode->_renderPhase02();
normal->_renderPhase02();

In other words, we render the shadow maps in the middle of a normal scene update, but without hacks, and without the scene manager needing to know anything about shadow map rendering. Cool.

A few implementation caveats

Whether dividing the rendering into two stages or interrupting the render sequence (which is essentially the same thing in practical terms), there’s the problem that we leave the scene manager in an incomplete state.

The bright side of separating into phases is that we can easily spot what is left in an inconsistent state. In our case, that would be the list of culled objects (mVisibleObjects to be exact).
It’s a vector containing all the frustum-culled objects from the normal render, but this vector is going to be cleared and filled with objects culled against the shadow map’s internal camera. Ouch.

As a result, we need to save the state of mVisibleObjects and restore it after we’re done. Luckily, we can just swap the vectors’ internal pointers (std::vector::swap) rather than performing a deep copy. In other words, saving and restoring the state is virtually free.

The code ends up looking more or less like this:

normal->_cullPhase01();
saveCulledObjects( normal->getSceneManager() );
	shadowNode->setupShadowCamera( normal->getVisibleBoundsInfo() );
	shadowNode->_cullPhase01();
	shadowNode->_renderPhase02();
restoreCulledObjects( normal->getSceneManager() );
normal->_renderPhase02();
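
The bodies of saveCulledObjects and restoreCulledObjects aren't shown here, so as a rough sketch of how such a pair might work: the types below (MovableObjectSketch, SceneManagerSketch) and the members mSavedObjects and mHasSavedState are made up for illustration, and the helpers are written as member functions rather than free functions to keep the example self-contained.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; the real Ogre classes are far richer.
struct MovableObjectSketch { int id; };
using VisibleObjectsList = std::vector<MovableObjectSketch*>;

struct SceneManagerSketch
{
    VisibleObjectsList mVisibleObjects; // overwritten by every cull phase
    VisibleObjectsList mSavedObjects;   // parking spot while the shadow node runs
    bool mHasSavedState = false;

    // O(1): std::vector::swap exchanges the internal buffers, no element is copied.
    void saveCulledObjects()
    {
        assert( !mHasSavedState && "save/restore calls must be paired" );
        mVisibleObjects.swap( mSavedObjects );
        mHasSavedState = true;
    }

    // Swaps the normal pass' list back in and discards the shadow pass' list.
    void restoreCulledObjects()
    {
        assert( mHasSavedState && "nothing to restore" );
        mVisibleObjects.swap( mSavedObjects );
        mSavedObjects.clear(); // keeps its capacity for the next frame
        mHasSavedState = false;
    }
};
```

Since swap hands the whole buffer over, the normal pass gets back the very same storage it filled during its cull phase, and the shadow node is free to clear and refill mVisibleObjects in between.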

My render queue is longer!?

The flexibility of the compositor means one can render render queues (RQs) #0 to #5 into the shadow map, but draw just RQ #1 into the main window.

The separation into phases allows us to reuse the visibility bounds information from RQ #1. But we don’t have it for RQs #0, #2, #3, #4 & #5. We need to calculate those bounds independently, while reusing as much as we can.
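
Working out which queues still need their bounds calculated is simple bookkeeping. A small sketch, assuming each pass covers one contiguous, half-open range of RQ IDs (RqRange and findQueuesMissingBounds are hypothetical names, not part of Ogre):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open range of render queue IDs: [first, last)
struct RqRange
{
    std::uint8_t first;
    std::uint8_t last;
};

// Returns the RQ IDs the shadow node needs bounds for but that the
// normal pass' cull phase did not cover, in ascending order.
std::vector<std::uint8_t> findQueuesMissingBounds( RqRange culledByNormalPass,
                                                   RqRange neededByShadowNode )
{
    assert( culledByNormalPass.first <= culledByNormalPass.last );
    assert( neededByShadowNode.first <= neededByShadowNode.last );

    std::vector<std::uint8_t> missing;
    for( std::uint8_t rq = neededByShadowNode.first; rq < neededByShadowNode.last; ++rq )
    {
        const bool alreadyCulled = rq >= culledByNormalPass.first &&
                                   rq < culledByNormalPass.last;
        if( !alreadyCulled )
            missing.push_back( rq ); // bounds must be computed separately
    }
    return missing;
}
```

For the case described here, findQueuesMissingBounds( RqRange{ 1, 2 }, RqRange{ 0, 6 } ) yields 0, 2, 3, 4 and 5, while RQ #1 is reused from the normal pass.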

Of course, the user may later render RQs #0 to #5 in subsequent passes, but by then it’s too late: the shadow node needed that visibility info earlier, so that RQ #1 could be rendered with correct shadows.

Well, that’s all for now. Let’s go back to coding!