

All times are UTC




 Post subject: Multithreading and concurrent blocking.
PostPosted: Mon Oct 08, 2012 1:13 pm 

Joined: Fri Sep 14, 2012 10:54 pm
Posts: 15
Greetings!

I have a small problem I'd like to hear your opinions about:
My application generates VolumeData from Perlin noise and extracts the meshes depending on where the camera is located. It works fine and (even on my really crappy hardware) fast enough to interact with everything.
The application is multithreaded.

Some information ahead:
I use two TaskQueues. One is processed on the main thread (adding extracted meshes to the scene graph, initiating prefetching, ...), the other on the worker thread, which does time-expensive tasks like extracting meshes and generating new VolumeData.
Since LargeVolume is not even slightly thread-safe, I decided to use mutexes to lock the volume, which works fine so far. Things get slightly slower when I move into a new chunk and shitloads of new tasks are set up, but it still stays interactive.

Now I'm working on making the world really interactive: physics.
I decided to use a voxel-data-based approach, leaving out triangles completely. I tried using MeshDecimator but... hell. It took up to 48s per chunk to simplify the meshes! (64x64x64 chunk with Marching Cubes)

So I simply added a gravity simulation to the camera. It works fine as long as the volumeMutex isn't locked. When it is locked, things get pretty messy and I end up at 1-4 FPS since the threads are blocking each other...

That's why I wonder how to handle this kind of problem.
One idea that came to mind was to have two volumes used in some kind of "double buffering". Tasks are computed as a package until they are all finished, storing the results in the "back buffer". After that we swap the back buffer and front buffer, so the front buffer holds our newly generated data. We then copy the new data into the back buffer so we have an up-to-date basis.
That way we would only write to the back buffer, only read from the front buffer, and pay for an expensive copy occasionally. (How does one copy LargeVolumes anyway? oO)
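A minimal sketch of the double-buffering idea above, using only the C++ standard library. `VoxelBuffer` and `DoubleBufferedVolume` are hypothetical stand-ins (a flat byte array instead of a real LargeVolume), and the sketch assumes a single worker thread is the only caller of `back()` and `publish()`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a volume: a flat array of voxel values.
using VoxelBuffer = std::vector<std::uint8_t>;

class DoubleBufferedVolume {
public:
    explicit DoubleBufferedVolume(std::size_t voxelCount)
        : front_(voxelCount, 0), back_(voxelCount, 0) {}

    // Worker thread only: the back buffer is written without taking any lock.
    VoxelBuffer& back() { return back_; }

    // Reader threads: read one voxel from the front buffer under a short lock.
    std::uint8_t read(std::size_t index) const {
        std::lock_guard<std::mutex> lock(swapMutex_);
        assert(index < front_.size());
        return front_[index];
    }

    // Worker thread only, once a whole package of tasks is finished.
    void publish() {
        {
            // The lock covers only the swap, which just exchanges the
            // vectors' internal pointers and does not copy voxels.
            std::lock_guard<std::mutex> lock(swapMutex_);
            std::swap(front_, back_);
        }
        // The expensive copy runs outside the lock. The worker and the
        // readers both only read front_ here, and only the worker ever
        // swaps it, so nothing is being written to front_ concurrently.
        back_ = front_;
    }

private:
    mutable std::mutex swapMutex_;
    VoxelBuffer front_;
    VoxelBuffer back_;
};
```

Readers are only ever blocked for the duration of the swap, not for the generation or the copy. The price is twice the memory plus one full copy per published package.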


How would you go about that?
I'm grateful for any advice. :)


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Tue Oct 09, 2012 8:09 am 
Developer

Joined: Sun May 04, 2008 6:35 pm
Posts: 1827
Yeah... multithreading is pretty tricky. I don't have any firm answers but here are some thoughts.

You say you are making use of mutexes to lock the volume, but how many do you have? I think one per volume is the only safe approach. The important question then is how often you lock it. For example, if you lock it every time you want to access a voxel and release it afterwards, then you'll have a lot of locks/unlocks going on. So you should lock it once, generate a whole region of data, and then unlock it. Or lock it once, extract a large part of the surface, and then unlock it.
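The difference between per-voxel and per-region locking can be sketched as follows. `GuardedVolume` is a hypothetical stand-in (a flat array plus one mutex), not a PolyVox type, and both functions return the number of lock acquisitions only to make the contrast visible:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a non-thread-safe volume guarded by one mutex.
struct GuardedVolume {
    std::mutex mutex;
    std::vector<std::uint8_t> voxels;
    explicit GuardedVolume(std::size_t count) : voxels(count, 0) {}
};

// Fine-grained: one lock/unlock pair per voxel.
std::size_t fillPerVoxel(GuardedVolume& volume, std::size_t begin,
                         std::size_t end, std::uint8_t value) {
    assert(begin <= end && end <= volume.voxels.size());
    std::size_t lockCount = 0;
    for (std::size_t i = begin; i < end; ++i) {
        std::lock_guard<std::mutex> lock(volume.mutex);
        volume.voxels[i] = value;
        ++lockCount;
    }
    return lockCount;
}

// Coarse-grained: lock once, write the whole region, unlock.
std::size_t fillPerRegion(GuardedVolume& volume, std::size_t begin,
                          std::size_t end, std::uint8_t value) {
    assert(begin <= end && end <= volume.voxels.size());
    std::lock_guard<std::mutex> lock(volume.mutex);
    for (std::size_t i = begin; i < end; ++i) {
        volume.voxels[i] = value;
    }
    return 1;
}
```

The coarse version pays the locking overhead once, but it also keeps every other thread out for the whole region, so the region size is a trade-off between overhead and blocking time.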

KuroSei wrote:
I tried using MeshDecimator but... hell. It took up to 48s per chunk to simplify the meshes! (64x64x64 chunk with Marching Cubes)


It's not very good, and we'll pull it out of PolyVox at some point. We'll probably recommend using an external library instead but we need to do some investigation into that.

KuroSei wrote:
One Idea coming to my mind was to have two Volumes to be used in some kind of "DoubleBuffering".


I don't know how well that would work, but if you want to try it then maybe it could be implemented using the underflow/overflow handlers? Your second volume could have an underflow handler which reads data from your main volume and an overflow handler which writes it back (locking once in both cases). I think this would save you making a complete copy of the whole volume. Well, I'm not really sure, but it seems like an interesting idea...

KuroSei wrote:
How does one copy LargeVolumes anyway?


As I recall you can use the VolumeResampler. It has an explicit test for whether the source and dest are the same size and so skips the interpolation.


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Wed Oct 10, 2012 12:33 am 

Joined: Fri Sep 14, 2012 10:54 pm
Posts: 15
David Williams wrote:
Yeah... multithreading is pretty tricky. I don't have any firm answers but here are some thoughts.

You say you are making use of mutexes to lock the volume, but how many do you have? I think one per volume is the only safe approach. The important question then is how often you lock it. For example, if you lock it every time you want to access a voxel and release it afterwards, then you'll have a lot of locks/unlocks going on. So you should lock it once, generate a whole region of data, and then unlock it. Or lock it once, extract a large part of the surface, and then unlock it.


I have only one volume mutex. I wrote "mutexes" because I also use mutexes to lock my task queues. When I generate a chunk I lock the volume for the whole duration. The problem is that this takes ~0.33 seconds per chunk :P The extraction is considerably faster. But I don't think I can optimise the generation any further. I got it down from ~1.5s to ~0.33s, so I'm kinda happy. ;)

David Williams wrote:
It's not very good, and we'll pull it out of PolyVox at some point. We'll probably recommend using an external library instead but we need to do some investigation into that.

Sorry to hear that. I found the results pretty appealing. The only downside is its runtime.


David Williams wrote:
I don't know how well that would work, but if you want to try it then maybe it could be implemented using the underflow/overflow handlers? Your second volume could have an underflow handler which reads data from your main volume and an overflow handler which writes it back (locking once in both cases). I think this would save you making a complete copy of the whole volume. Well, I'm not really sure, but it seems like an interesting idea...


It could work. But it's not something to test just like that... I'll think about it for sure. Seems neat but somehow tricky.

Reading your idea about the overflow and underflow handlers, one idea struck me:
Is it safe to assume the LargeVolume has not changed as long as neither the overflow nor the underflow callback has been called? To be more precise: I guess it would suffice to say the overflow handler? If I have a boolean that I set to true every time the underflow handler finishes working, wouldn't several concurrent read operations be safe?
Or does adding new data invalidate the content, too? I don't know exactly how LargeVolume handles its content and whether there is some sorting or similar going on...

EDIT: Ah... only as long as neither the overflow nor the underflow handler is called. When new data is stored in the volume, invalidation may of course occur due to the compression going on. My bad. I thought it over and well... duh. :P

David Williams wrote:
As I recall you can use the VolumeResampler. It has an explicit test for whether the source and dest are the same size and so skips the interpolation.


Thanks for the hint. :)


Warm regards,
Kuro


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Wed Oct 10, 2012 8:36 am 

Joined: Wed Jan 26, 2011 3:20 pm
Posts: 203
Location: Germany
Have you thought about a message passing approach?
Changes to the volume are made by the Perlin noise thread and by the game thread.
The noise thread would not actually have a volume, it just generates chunks and sends them to the other threads.
The extraction thread receives new chunks from the noise thread and changes from the game thread.
The game thread receives new chunks from the noise thread.
The extraction thread sends newly extracted meshes to both the graphics thread and the physics thread.

You'd have two copies of the volume (game and extraction) which might be out of sync for short periods of time, but most likely that will not be noticeable.
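A rough sketch of the message-passing setup described above, using only the C++ standard library. `Chunk`, `MessageQueue` and `noiseThread` are hypothetical names, and the chunk contents are placeholder values rather than real noise:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical message type: a finished chunk of voxel data plus its position.
struct Chunk {
    int x, y, z;
    std::vector<std::uint8_t> voxels;
};

// Minimal blocking queue: the only state the threads share.
template <typename T>
class MessageQueue {
public:
    void send(T message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            messages_.push(std::move(message));
        }
        ready_.notify_one();
    }

    // Blocks until a message is available, then hands over ownership.
    T receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !messages_.empty(); });
        T message = std::move(messages_.front());
        messages_.pop();
        return message;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> messages_;
};

// The noise thread owns no volume: it builds chunks locally and sends them on.
void noiseThread(MessageQueue<Chunk>& out, int chunkCount) {
    for (int i = 0; i < chunkCount; ++i) {
        Chunk chunk{i, 0, 0,
                    std::vector<std::uint8_t>(8, static_cast<std::uint8_t>(i))};
        out.send(std::move(chunk));  // the lock is held only for the push
    }
}
```

A consumer such as the extraction or game thread would call `receive()` in its own loop and apply each chunk to its private copy of the volume. With one queue per receiving thread, the volumes themselves would not need a mutex.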


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Wed Oct 10, 2012 12:12 pm 
Developer

Joined: Sun May 04, 2008 6:35 pm
Posts: 1827
ker wrote:
the noise thread would not actually have a volume, it just generates chunks and sends them to the other threads.

I think this might be the most important point. From the description it seems that the chunk generation is much slower than the extraction, so maybe the extraction can even be left on the main thread if the generation is moved off of it.

You probably don't even need a whole other LargeVolume... the generation could create small RawVolumes and the VolumeResampler could copy them into your main LargeVolume when they are ready.
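As a sketch of that split, assuming plain standard-library containers in place of the PolyVox types: `SmallChunk` plays the role of a small RawVolume, a flat `std::vector` plays the role of the main LargeVolume, and `std::copy` stands in for the VolumeResampler copy. All names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical stand-in for a small per-chunk volume and its target position.
struct SmallChunk {
    std::size_t offset;
    std::vector<std::uint8_t> voxels;
};

// Slow part: touches no shared state, so it needs no lock at all.
SmallChunk generateChunk(std::size_t offset, std::size_t size) {
    SmallChunk chunk{offset, std::vector<std::uint8_t>(size)};
    for (std::size_t i = 0; i < size; ++i) {
        // Placeholder for the real (expensive) noise evaluation.
        chunk.voxels[i] = static_cast<std::uint8_t>((offset + i) % 251);
    }
    return chunk;
}

// Fast part, main thread only: copy the finished chunk into the main volume.
void copyIntoMainVolume(const SmallChunk& chunk,
                        std::vector<std::uint8_t>& mainVolume) {
    assert(chunk.offset + chunk.voxels.size() <= mainVolume.size());
    std::copy(chunk.voxels.begin(), chunk.voxels.end(),
              mainVolume.begin() + static_cast<std::ptrdiff_t>(chunk.offset));
}

// Start generation on another thread; the main thread collects it later.
std::future<SmallChunk> generateAsync(std::size_t offset, std::size_t size) {
    return std::async(std::launch::async, generateChunk, offset, size);
}
```

Because only the main thread ever touches the main volume in this arrangement, the volume mutex is no longer held during generation. In a real frame loop the main thread would poll the future (for example with `wait_for` and a zero timeout) rather than block on `get()`.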

Actually you can probably do your surface extraction on this thread (against the RawVolume) as well. So you could generate a RawVolume and run the surface extractor on the separate thread, and then the main thread could copy the mesh to the GPU and also copy the generated volume data from the RawVolume into your LargeVolume. Then everything is back in sync. But maybe in this case you hit problems on the boundaries of the mesh?

Some ideas to think about anyway...


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Wed Oct 10, 2012 4:28 pm 

Joined: Fri Sep 14, 2012 10:54 pm
Posts: 15
David Williams wrote:
ker wrote:
the noise thread would not actually have a volume, it just generates chunks and sends them to the other threads.

I think this might be the most important point. From the description it seems that the chunk generation is much slower than the extraction, so maybe the extraction can even be left on the main thread if the generation is moved off of it.

That's right. The generation is a lot slower than the extraction. (I never expected that, but Perlin noise with 7 octaves is rather slow. It gives the best results for the cheapest investment that way, though... :( )


David Williams wrote:
You probably don't even need a whole other LargeVolume... the generation could create small RawVolumes and the VolumeResampler could copy them into your main LargeVolume when they are ready.

Seems like a neat idea! I would have to create the volumes intelligently when needed, but well: that's not that hard, since I already prefetch, generate and extract that way anyway. :P (I presume it's rather intelligent, since I got such nice speed improvements. From ~5 FPS to 15-30 FPS! With pretty dense meshes which my graphics card can't handle...)

David Williams wrote:
Actually you can probably do your surface extraction on this thread (against the RawVolume) as well. So you could generate a RawVolume and run the surface extractor on the separate thread, and then the main thread could copy the mesh to the GPU and also copy the generated volume data from the RawVolume into your LargeVolume. Then everything is back in sync. But maybe in this case you hit problems on the boundaries of the mesh?

Some ideas to think about anyway...



That won't work. I tried a similar approach which worked okayish, but the extractor generates seams at the chunk edges that way.


I'm going to give the asynchronous generation of volume data a try. That sounds pretty promising. I guess copying takes around 0.05 seconds at most, which would be a huge improvement! :)


I'll let you know when I have some benchmarks. :)


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Thu Oct 18, 2012 3:27 pm 

Joined: Fri Apr 06, 2012 3:11 pm
Posts: 3
Not having seen what your code is actually doing this is all speculative.

What you don't want to do is work on the data under lock. Since you are generating meshes and such, do you really need to lock while this operation is going on? I can understand if there is no data to begin with but after there is data can't you use what's there while you are generating a chunk?

You might want to pre-generate neighboring chunks while you are interacting with the current one. Why wait until you're physically in that chunk? You can easily just add it to a processing queue and let the worker thread get to it when it's able, without holding up the current chunk's job.

So pregenerate the starting volume, then just move on from there.

You should do your best to only lock when you are replacing data sets. For example, if you have an array of triangles and indices, only lock when you need to replace that array, not while you are generating it. Then the lock is negligible. There's nothing wrong with having visual data still loading in while you are actually in the game. Just have an unprocessed volume handler that draws it empty.
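One way to sketch "lock only when replacing the data set", assuming a hypothetical `Mesh` struct and `MeshSlot` holder (neither comes from PolyVox). The mesh is built without any lock, and the mutex is held only for a pointer assignment:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical mesh: flat vertex positions (x, y, z) plus triangle indices.
struct Mesh {
    std::vector<float> vertices;
    std::vector<unsigned> indices;
};

class MeshSlot {
public:
    // Renderer: returns the current mesh, or nullptr if the chunk has not
    // been processed yet (in which case it is simply drawn empty).
    std::shared_ptr<const Mesh> current() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return mesh_;
    }

    // Worker: called with a finished mesh. The lock covers only the
    // pointer assignment, never the mesh generation.
    void replace(std::shared_ptr<const Mesh> finished) {
        std::lock_guard<std::mutex> lock(mutex_);
        mesh_ = std::move(finished);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const Mesh> mesh_;
};

// Placeholder for the expensive extraction step; runs without any lock.
std::shared_ptr<const Mesh> buildMesh(unsigned triangleCount) {
    auto mesh = std::make_shared<Mesh>();
    for (unsigned t = 0; t < triangleCount; ++t) {
        for (unsigned corner = 0; corner < 3; ++corner) {
            mesh->vertices.push_back(static_cast<float>(t));       // x
            mesh->vertices.push_back(static_cast<float>(corner));  // y
            mesh->vertices.push_back(0.0f);                        // z
            mesh->indices.push_back(t * 3 + corner);
        }
    }
    return mesh;
}
```

A renderer that is still drawing the old mesh keeps it alive through its own `shared_ptr`, so the worker can replace the slot at any time without waiting for the frame to finish.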


 Post subject: Re: Multithreading and concurrent blocking.
PostPosted: Sat Oct 20, 2012 4:30 pm 

Joined: Fri Sep 14, 2012 10:54 pm
Posts: 15
RevenantBob wrote:
Not having seen what your code is actually doing this is all speculative.


I tried to give a brief explanation. The source code is pretty complex and everything is wired together, so I can't really show "portions" of code. That makes maintenance etc. harder, but it gave significant performance boosts...


RevenantBob wrote:
What you don't want to do is work on the data under lock. Since you are generating meshes and such, do you really need to lock while this operation is going on? I can understand if there is no data to begin with but after there is data can't you use what's there while you are generating a chunk?

That's what was proposed above. I experimented a bit. It reduces lag a lot but has some flaws (mostly because of my design... I'm tweaking this atm :) )

RevenantBob wrote:
You might want to pre-generate neighboring chunks while you are interacting with the current one. Why wait until you're physically in that chunk? You can easily just add it to a processing queue and let the worker thread get to it when it's able, without holding up the current chunk's job.

I pregenerate everything around me. The problem is the time it takes to reach a new chunk in relation to the time it takes to generate the new chunks.

RevenantBob wrote:
So pregenerate the starting volume, then just move on from there.

I pregenerate 127 chunks around the starting chunk and go on from there.


RevenantBob wrote:
You should do your best to only lock when you are replacing data sets. For example, if you have an array of triangles and indices, only lock when you need to replace that array, not while you are generating it. Then the lock is negligible. There's nothing wrong with having visual data still loading in while you are actually in the game. Just have an unprocessed volume handler that draws it empty.

I'm working on that right now, as stated above.
It's a simple design flaw I made, and it's pretty hard to fix now.


Thanks for the hints. :) I'll grab some of them and try to implement them. Atm I'm seriously thinking about redesigning my generation and interaction from scratch... :(

