All times are UTC




 Post subject: Speed difference between RLE / shared-block PolyVox
Posted: Fri Apr 15, 2011 2:30 pm

Joined: Sun Jan 23, 2011 6:06 am
Posts: 92
Hello,

I have been doing some testing between the version of my code that uses PolyVox from January (shared-block), and the April 05 version (RLE). I am using a fixed volume size with paging disabled.

I have noticed a very big speed difference between the two. I am aware that the RLE version compresses and decompresses blocks. However, even with the uncompressed block count in PolyVox set higher than the number of blocks I actually use, it is still much slower.

Is this purely because block storage changed from std::vector to std::map? Or are there other optimizations that would speed up retrieving uncompressed blocks in PolyVox?

I am looking for any suggestions to speed things up. I don't mind making the changes myself.


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Fri Apr 15, 2011 2:40 pm

Joined: Sun Jan 23, 2011 6:06 am
Posts: 92
Actually it might not be as bad as I thought. I really need some accurate testing to see what the real difference is.

Does anyone know a good application for profiling CPU time / memory use? I have tried using Very Sleepy but it crashes....


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Fri Apr 15, 2011 3:23 pm
Developer

Joined: Sun May 11, 2008 4:29 pm
Posts: 198
Location: UK
For CPU profiling and bottleneck-finding, I've used Callgrind in the past (along with KCachegrind for visualising the logs). I guess they might only work on Linux.

_________________
Matt Williams
Linux/CMake guy


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Fri Apr 15, 2011 3:38 pm

Joined: Sun Jan 23, 2011 6:06 am
Posts: 92
Thanks, I'll look into it. Might have to install Cygwin...


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Fri Apr 15, 2011 8:34 pm
Developer

Joined: Sun May 04, 2008 6:35 pm
Posts: 1827
There's also Google Perftools. I'm planning to try integrating this with the PolyVox examples at some point in the future.

Oh, and to answer your question, I think the main slowdown will indeed be the use of the map. I imagine it would be possible to swap it for a hash map (such as std::unordered_map) quite easily.
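
As a rough illustration of that swap, here is a minimal sketch of a block container keyed by block coordinates and backed by a hash map. It is not PolyVox code: BlockPos, BlockPosHash, Block, BlockMap, findBlock and getBlock are all hypothetical names, and it assumes a compiler that provides std::unordered_map (older toolchains may only offer it as std::tr1::unordered_map).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical block coordinate, measured in blocks rather than voxels.
struct BlockPos {
    std::int32_t x, y, z;
    bool operator==(const BlockPos& o) const {
        return x == o.x && y == o.y && z == o.z;
    }
};

// Simple multiplicative mix of the three coordinates; adequate for a sketch.
struct BlockPosHash {
    std::size_t operator()(const BlockPos& p) const {
        std::size_t h = static_cast<std::uint32_t>(p.x);
        h = (h * 1000003u) ^ static_cast<std::uint32_t>(p.y);
        h = (h * 1000003u) ^ static_cast<std::uint32_t>(p.z);
        return h;
    }
};

// Hypothetical stand-in for an uncompressed block of voxel data.
struct Block {
    std::vector<std::uint8_t> voxels;
};

// Average O(1) lookup, versus O(log n) comparisons for std::map.
using BlockMap = std::unordered_map<BlockPos, Block, BlockPosHash>;

// Returns the block at pos, or nullptr if it is not resident.
inline Block* findBlock(BlockMap& blocks, const BlockPos& pos) {
    auto it = blocks.find(pos);
    return it == blocks.end() ? nullptr : &it->second;
}

// Returns the block at pos; the caller must already know it is resident.
inline Block& getBlock(BlockMap& blocks, const BlockPos& pos) {
    auto it = blocks.find(pos);
    assert(it != blocks.end() && "block must be resident");
    return it->second;
}
```

Whether this actually beats std::map for a given volume depends on the number of blocks and the access pattern, so it is worth profiling both before committing to either.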

Also, as mentioned in another thread, in the future I might bring back the old volume and let it co-exist with the paged volume. Might be good to have more than one volume available.


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Sat Apr 16, 2011 5:16 pm

Joined: Wed Jan 26, 2011 3:20 pm
Posts: 203
Location: Germany
I'm curious about those speed tests...
I believe that for the SurfaceExtractor there should be no noticeable difference, because the VolumeSampler rarely calls getUncompressedBlock.

The Cubic Extractors don't use the VolumeSampler, but most calls to getUncompressedBlock should return immediately because the previous call was usually made with the same coordinates.
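
To make that early return concrete, here is a minimal sketch of a "last block" cache placed in front of a std::map. It only borrows the name getUncompressedBlock from the discussion and is not PolyVox's actual implementation: BlockStore, Block and the two counters are hypothetical, and the sketch assumes a fixed volume with non-negative block coordinates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an uncompressed block of voxel data.
struct Block {
    std::vector<std::uint8_t> voxels;
};

class BlockStore {
public:
    using Key = std::tuple<int, int, int>;

    BlockStore() = default;
    // Copying would leave m_last pointing into the other store's map.
    BlockStore(const BlockStore&) = delete;
    BlockStore& operator=(const BlockStore&) = delete;

    Block& getUncompressedBlock(int bx, int by, int bz) {
        // Assumption of this sketch: fixed volume, non-negative coordinates.
        assert(bx >= 0 && by >= 0 && bz >= 0);
        const Key key(bx, by, bz);

        // Fast path: same block as the previous call, so skip the map.
        if (m_last != nullptr && key == m_lastKey) {
            ++m_cacheHits;
            return *m_last;
        }

        // Slow path: O(log n) map lookup (creates the block if missing).
        ++m_mapLookups;
        Block& block = m_blocks[key];
        m_lastKey = key;
        m_last = &block;  // std::map references stay valid across inserts
        return block;
    }

    std::size_t cacheHits() const { return m_cacheHits; }
    std::size_t mapLookups() const { return m_mapLookups; }

private:
    std::map<Key, Block> m_blocks;
    Key m_lastKey = Key(0, 0, 0);
    Block* m_last = nullptr;
    std::size_t m_cacheHits = 0;
    std::size_t m_mapLookups = 0;
};
```

Comparing cacheHits() with mapLookups() after an extraction run would show how often the fast path is really taken for a given traversal order.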

But yes, profiling should show where the trouble lies.


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Sat Apr 16, 2011 11:35 pm
Developer

Joined: Sun May 04, 2008 6:35 pm
Posts: 1827
Unfortunately the surface extractors aren't particularly smart about the order in which they traverse the voxels. Ideally they would process all the voxels in a given block before moving on to the next block, but actually they just process the whole requested region one slice at a time.

If you have a large region, and within it a block of size 32x32x32, then the surface extractor will enter that block, process 32 voxels, and then leave out the other side. Over the whole region it enters that block 32x32 = 1024 times, and each time it has to find the block in the map. Hmmm... actually that's pretty horrific :-)

Anyway, the fact that the volume is made of blocks is an implementation detail which shouldn't be exposed to the surface extractors. However, the VolumeSampler does have knowledge of this. I think the VolumeSampler could have a 'moveToNextVoxelInRegion()' function which iterates over the voxels in the optimal order, and then the surface extractors should make use of this.

Actually I think this functionality used to exist, but it got lost at some point...
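
The cost of the traversal order can be sketched with a small counting experiment. The code below is not PolyVox code and does not implement moveToNextVoxelInRegion(); SwitchCounter, blockSwitchesSliceOrder and blockSwitchesBlockOrder are hypothetical names. It assumes a cubic region whose side is a multiple of the block side, and it assumes that every change of block costs one lookup (as it would with a simple "last block" cache in front of the map).

```cpp
#include <cassert>
#include <cstddef>

// Tracks how often consecutive voxel visits land in a different block.
struct SwitchCounter {
    int lastBx = -1, lastBy = -1, lastBz = -1;
    std::size_t switches = 0;

    void visit(int x, int y, int z, int blockSide) {
        const int bx = x / blockSide;
        const int by = y / blockSide;
        const int bz = z / blockSide;
        if (bx != lastBx || by != lastBy || bz != lastBz) {
            ++switches;  // a new block would have to be looked up here
            lastBx = bx;
            lastBy = by;
            lastBz = bz;
        }
    }
};

// Visits the region one slice at a time: x fastest, then y, then z.
inline std::size_t blockSwitchesSliceOrder(int regionSide, int blockSide) {
    assert(blockSide > 0 && regionSide > 0 && regionSide % blockSide == 0);
    SwitchCounter counter;
    for (int z = 0; z < regionSide; ++z)
        for (int y = 0; y < regionSide; ++y)
            for (int x = 0; x < regionSide; ++x)
                counter.visit(x, y, z, blockSide);
    return counter.switches;
}

// Visits every voxel of one block before moving on to the next block.
inline std::size_t blockSwitchesBlockOrder(int regionSide, int blockSide) {
    assert(blockSide > 0 && regionSide > 0 && regionSide % blockSide == 0);
    SwitchCounter counter;
    for (int bz = 0; bz < regionSide; bz += blockSide)
        for (int by = 0; by < regionSide; by += blockSide)
            for (int bx = 0; bx < regionSide; bx += blockSide)
                for (int z = bz; z < bz + blockSide; ++z)
                    for (int y = by; y < by + blockSide; ++y)
                        for (int x = bx; x < bx + blockSide; ++x)
                            counter.visit(x, y, z, blockSide);
    return counter.switches;
}
```

For a 64x64x64 region made of 32x32x32 blocks, the slice-order count is 8192 (each of the 8 blocks is entered 1024 times), while the block-order count is 8.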


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Sun Apr 17, 2011 5:49 am

Joined: Sun Jan 23, 2011 6:06 am
Posts: 92
Unfortunately it seems that Google Perftools isn't fully ported to Windows yet, according to the documentation. Some of the tests failed on my computer.

Overall it seems like Linux users are far better off when it comes to profiling: Valgrind, Callgrind, Google Perftools, etc. Most of the Windows packages I have found are commercial, and I don't feel like paying hundreds of dollars.

Maybe I should dust off my Linux machine and get porting.... such a pain :(


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Sun Apr 17, 2011 6:24 am

Joined: Wed Jan 26, 2011 3:20 pm
Posts: 203
Location: Germany
If you have a nice testing setup, it would be great if you posted the sources and commands here.
I'd like to run that on my 2 machines, so we can get a few results on different types of machines.


 Post subject: Re: Speed difference between RLE / shared-block PolyVox
Posted: Sun Apr 17, 2011 9:59 am
Developer

Joined: Sun May 04, 2008 6:35 pm
Posts: 1827
AMD CodeAnalyst is free and, I believe, works on Intel processors as well (though without some instrumentation features).

