[OpenWalnut-Dev] Common sources of Module crashes and how to avoid them

Robin Ledig mai02igw at studserv.uni-leipzig.de
Mon Aug 23 20:47:36 CEST 2010


Hey all,

I've spent quite a lot of time on some module errors and have finally 
fixed most of them or at least found out where they originate.
So if you want your modules to run stably, get deleted correctly, and 
sleep when they are not doing anything, the following points could be helpful.

1.
Creating and adding an osg::Node (mostly in the module's create function)
A common error is to add the node with its "safeUpdateCallback" without 
first initializing ALL the variables the callback uses.
The most common case I noticed was adding the node but not 
setting up the uniforms.
Reason for the crash: the node traversal runs in a different thread than our 
module, so it can happen that after adding the node with the 
callback, the node is traversed before everything has been initialized.
The result is random errors.
Note: the node traversal is not limited by the ready state of your module.
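
To make the ordering concrete, here is a minimal stand-alone sketch. It does not use OSG or OpenWalnut at all: Node, Uniform, Scene and createFullyInitializedNode are hypothetical stand-ins I made up for illustration, and traverse() plays the role of the traversal that would normally run on the render thread.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-ins for the scene graph; NOT the OSG or OpenWalnut API.
struct Uniform
{
    float value = 0.0f;
};

struct Node
{
    std::shared_ptr< Uniform > uniform;              // read by the update callback
    std::function< void( Node& ) > updateCallback;   // invoked during traversal
};

class Scene
{
public:
    // Once a node is inserted, traverse() may touch it at any time.
    void insert( std::shared_ptr< Node > node )
    {
        std::lock_guard< std::mutex > lock( m_mutex );
        m_nodes.push_back( node );
    }

    // In a real application this would be called from another thread.
    void traverse()
    {
        std::lock_guard< std::mutex > lock( m_mutex );
        for( auto& node : m_nodes )
        {
            if( node->updateCallback )
            {
                node->updateCallback( *node );
            }
        }
    }

private:
    std::mutex m_mutex;
    std::vector< std::shared_ptr< Node > > m_nodes;
};

// Builds a node whose callback can run safely the moment it is inserted.
std::shared_ptr< Node > createFullyInitializedNode()
{
    auto node = std::make_shared< Node >();

    // 1. Set up everything the callback reads.
    node->uniform = std::make_shared< Uniform >();
    node->uniform->value = 1.0f;

    // 2. Only then attach the callback.
    node->updateCallback = []( Node& n )
    {
        assert( n.uniform && "uniform must exist before the first traversal" );
        n.uniform->value += 1.0f;
    };
    return node;
}
```

The point is that scene.insert( createFullyInitializedNode() ) is the very last step: nothing the callback depends on is assigned after the node has become visible to the traversal.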

2.
In the module main loop always keep the following order:
...
while( !m_shutdownFlag() )
{
    m_moduleStates.wait(); /// I
    if( m_shutdownFlag() ) /// II
    {
        break;
    }
    ... // rest of your loop: handle whatever condition changed
}
WKernel::getRunningKernel()->getGraphicsEngine()->getScene()->remove( m_myNode ); /// III
...


I:
Always wait at the beginning of your loop so you won't waste any 
computation.
Forgetting this turns your loop into a busy wait that 
consumes a complete core.
I know some of you are blessed with up to 16 hardware threads, but I 
only have 2-4, and sometimes I have two instances of OW running.

II:
Leaving the loop directly prevents you from accessing data that might 
already have been freed.
This avoids errors and unneeded computation.

III:
Sometimes forgotten but very important: the module is deleted 
afterwards, and if the node is not removed, its traversal stays active.
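
For anyone who wants to play with the wait / check / cleanup order outside of OpenWalnut, here is a small sketch using only the C++ standard library. SketchModule and all of its members are hypothetical names of mine; the condition variable stands in for the module state condition set, and the m_nodeAttached flag stands in for the node being part of the scene.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical module stand-in; NOT the OpenWalnut WModule API.
class SketchModule
{
public:
    // Signals that some input or property changed.
    void notifyChange()
    {
        {
            std::lock_guard< std::mutex > lock( m_mutex );
            m_changed = true;
        }
        m_condition.notify_one();
    }

    // Signals that the module should leave its main loop.
    void requestShutdown()
    {
        {
            std::lock_guard< std::mutex > lock( m_mutex );
            m_shutdown = true;
        }
        m_condition.notify_one();
    }

    // The module main loop; normally executed on the module's own thread.
    void run()
    {
        m_nodeAttached = true;  // stands in for adding the node to the scene

        std::unique_lock< std::mutex > lock( m_mutex );
        while( !m_shutdown )
        {
            // I: sleep until something changed or shutdown was requested.
            m_condition.wait( lock, [this] { return m_changed || m_shutdown; } );

            // II: leave immediately on shutdown, before touching any data.
            if( m_shutdown )
            {
                break;
            }

            m_changed = false;
            lock.unlock();
            ++m_iterations;  // the actual work would happen here
            lock.lock();
        }
        assert( m_shutdown );

        // III: always detach the node before the module goes away.
        m_nodeAttached = false;
    }

    int iterations() const { return m_iterations; }
    bool nodeAttached() const { return m_nodeAttached; }

private:
    std::mutex m_mutex;
    std::condition_variable m_condition;
    bool m_changed = false;
    bool m_shutdown = false;
    std::atomic< int > m_iterations{ 0 };
    std::atomic< bool > m_nodeAttached{ false };
};
```

If a change and a shutdown request are both pending when the loop wakes up, the shutdown check wins and the pending work is skipped, which is exactly what II is about.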

3.
This error will occur mostly when your module is derived from 
osg::Referenced.
When you do so, you are tempted to use your this pointer as 
user data for your root node.
This does not seem so bad, and it isn't while your module is running.
But if you attempt to delete your module, either from the module browser 
or from another source (program close and cleanup), you might get stuck 
in an endless loop or simply get a segfault.
This is what happens:
passing your this pointer via node->setUserData( this ); gives 
ownership of the module to the node.
Usually your node is a member variable of your module class.
So in your destructor the member variable gets destructed, 
which in turn causes your module to get destructed again, and so on.
To break this cycle, simply add a subclass containing a shared pointer to 
your module (passed in via the constructor).
To avoid access problems, make it a friend of your module and write wrapper 
functions for those functions you need in your node traversal.
This will be your user data.
So what happens now when your node (and with it the user data) is deleted?
The shared pointer's ref count is decreased, but your module still endures 
until its destructor has finished, and so all deletion of modules will 
work now.
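
Here is a rough stand-alone sketch of the proxy idea, without OSG. Please note the assumptions: Module, ModuleProxy, Node, attach() and queryValue() are hypothetical names of mine, and I deliberately swapped the shared pointer for a std::weak_ptr inside the proxy, so that in this plain C++ version the user data can never keep the module alive or trigger its destruction. Whether a shared or a weak pointer is the better fit in a real module depends on how the module itself is owned.

```cpp
#include <cassert>
#include <memory>

class Module;

// Hypothetical proxy that is attached to the node instead of the module itself.
class ModuleProxy
{
public:
    explicit ModuleProxy( std::weak_ptr< Module > module ) : m_module( module ) {}

    // Wrapper for the functionality the traversal callback needs.
    // Returns -1 if the module is already gone.
    int queryValue() const;

private:
    std::weak_ptr< Module > m_module;  // observes the module, does not own it
};

// Stand-in for a scene graph node carrying user data; NOT the OSG API.
struct Node
{
    std::shared_ptr< ModuleProxy > userData;
};

class Module : public std::enable_shared_from_this< Module >
{
public:
    // Creates the node and attaches the proxy as user data.
    // Must be called on a module that is owned by a std::shared_ptr.
    void attach()
    {
        assert( !m_node && "attach() is meant to be called once" );
        m_node = std::make_shared< Node >();
        m_node->userData = std::make_shared< ModuleProxy >(
            std::weak_ptr< Module >( shared_from_this() ) );
    }

    std::shared_ptr< Node > node() const { return m_node; }

private:
    friend class ModuleProxy;  // the proxy may use private members

    int m_value = 42;
    std::shared_ptr< Node > m_node;  // the module owns the node, not vice versa
};

inline int ModuleProxy::queryValue() const
{
    std::shared_ptr< Module > module = m_module.lock();
    return module ? module->m_value : -1;
}
```

Ownership now points in one direction only (module to node to proxy), so destroying the module releases the node without the node ever trying to destroy the module in return.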

I hope this helps to make OpenWalnut a lot more stable.
I'll try to find and remove all those occurrences, but that can't prevent 
new ones in the future.
So maybe someone can update the explanation inside the WMTemplate.

Furthermore, if you find any reproducible crashes in OW, I'd be happy to 
track down the source.

regards,
Robin

