You may have seen on Mike Acton’s twitter account that he posted a review code on OgreNode.cpp.
Seems like some community members are waiting for my response, I made a few comments on bounceapp but due to my inexperience with that site I didn’t realize changes had to be saved. So here goes my reply:
OgreNode.cpp was written 13 years ago when OO programming was all the rave (still is?) everyone had a single core, caches didn’t matter and most efficient way to cull the world was to use an Octree or a BSP. The world believed that “if( dirty )” was a magical, no-cost expression that is immediately a performance improvement wherever used to avoid the execution of more than 3 instructions.
13 years later, Moore’s law kicked us in the butt and everyone is multicore. You probably know that story already.
Mike Acton reviewed the 1.9 version. Perhaps it would’ve been more interesting to see a review of the 2.0 file which has been refactored to better fit Data Oriented Design principles (and I’m sure there are things I wrote to criticize). Many of the things he criticizes of 1.9 have been fixed. Nevertheless there are things we can learn. Note that if he weren’t right, then it would be hard to explain why there was a 5x performance increase between 1.9 and 2.0.
We used to trash the data cache bad, very bad. There may be room for further improvements, but I’m now focusing on getting a decent release to package*. And I hope he doesn’t review OgreAnimation.cpp from 1.9; the whole file should be censored from the internet (a replacement is in the works).
*And besides, there are big hotspots waiting to be fixed like skeletal animation and the render queue.
What is fixed in 2.0
- Most of the memory sensitive data is now taken care by a memory manager to ensure they’re friendly to the cache, and to allow several SIMD and multi threading approaches (you couldn’t even update the hierachy or set positions from multiple cores in 1.9!)
- The constructor that accepts a string (the node’s name) has been removed. The name is empty by default (in 1.9 it had to be unique and that caused the name generator crap).
- Update has no branches anymore but we still use the two “inherit orientation/scale” flags for features and flexibility. However this is branchless (though perhaps the compiled C version may contain a few branches; we now use SSE2/NEON when available and by default, to implement branchless selection).
- The “queued for update” BS has been removed. There is no need for this anymore. My WTF meter exploded the first time I took a look at it.
- All the “if( mCachedTransformOutOfDate )” code and variable BS has been removed (except in debug mode for a few assertions). We now update the hierarchy only once per frame (or on demand), and use the cached results (since THAT’s the general case). For convenience, users may use an alternate function that forces the recalculation of the derived transform for that single node (but the user has to know it is needed because it may be out of date; that’s why we placed asserts in debug mode); though this function is discouraged by the docs that it shouldn’t be used frequently on many nodes (the user should refactor your code so that it happens after the hierarchy has been updated, or perform another update on-demand).
- The way children and parents are recalculated has been completely rewritten. Like Mike Acton says, how the transform will actually be computed was hidden because every function can be overridden. This math in the end was equal to all derived classes (including bones), and thus many virtual keywords have been removed, and the main hierarchy update function is now static. Can’t get clearer and more transparent than that.
- The atrocities he mentions of the various getChild and removeChild overloads have been fixed (i.e. string lookup, O(N) iteration when the integer handle is provided, etc) and many redundant overloads have also been removed. We store all children in an plain old array, where it used to be stored in a std::map (omg!).
- We’re in the process of removing all uses of getSquaredViewDepth calls and though it still remains for legacy reasons, it has been wiped from the major hotspots.
- needUpdate has been removed.
What we can learn from 1.9 review and/or still applies
- I agree that mInitialPosition/Orientation/Scale don’t belong to Node at all. They’re used by animated nodes to remember their starting position (i.e. when subclassed by bones, they contain the binding pose). Awful. We’re still working on animation, but these three variables will eventually go away.
- At first I disagreed, but the more I look at setParent, the more I think he’s right about it. Note however, it’s a protected member function and the doc says it’s for internal use. This is the reason it’s not defined what happens when parenting to a node that already has a parent: This behavior is well defined in the public interfaces addChild/removeChild (throw an exception).
- He’s got quite a point in that we provide createChild functionality, but no method to mass-create children. There’s also no method for deferred creation of nodes so that they can be made from multiple threads. I haven’t even thought of that (currently creation, destroying, attaching and detaching of nodes must be done from one thread)
- I didn’t notice how inefficiently we set the initial position and orientation in createChild.
- I agree on the DebugRenderable thing from nodes. It doesn’t belong there and only increases compilation time due to additional dependencies. I think it’s probably broken in 2.0 though I haven’t tried.
- Some of listener’s bad practices stuff was removed, but it’s still around.
All in all, we thank Mike for taking the time to review our code; we welcome these harsh reviews. It’s been useful in identifying quite a bunch of issues.