Dynasty wrote:
I briefly read through some of the material related to ambient occlusion and cone tracing. I don't really see how implementing AO on the CPU is viable.
Well, it's certainly an open research problem and, as I said before, we didn't manage to get it working well enough for Voxeliens. So I can't really give you an answer. But I do still think it has potential and hope to do some more work in this area in the coming year.
Dynasty wrote:
I've been thinking about this problem a lot and I think that I have an elegant solution for per-face ambient occlusion.
.
.
.
My gut feeling is that this is quite difficult. I think you'll still need to touch a lot of voxel data and (as you note below) it's these memory accesses which can be quite slow.
Dynasty wrote:
Edit 3: Also the current architecture of CubicSurfaceExtracter, etc is not particularly conducive to implementing this algorithm.
I wouldn't really expect to implement AO within the CubicSurfaceExtracter. If you want per-vertex AO I'd run the CubicSurfaceExtracter first and then compute an AO value for each of the vertices, or if you want per-voxel AO I'd compute that before running the CubicSurfaceExtracter. But per-face AO? I'm not exactly sure, but I guess after extraction.
That said, there's another kind of AO which I think does belong in the CubicSurfaceExtractor. You show it in one of the screenshots on the Build 'n' shoot thread. It's where extremely local AO is computed per-vertex from just the immediately surrounding voxels.
This is the image I mean:
http://i.imgur.com/no0YrjN.png
I've been thinking about this for a while so I added a task so that it doesn't get forgotten. It won't happen soon though:
https://bitbucket.org/volumesoffun/poly ... al-ambient
Dynasty wrote:
The algorithm I discussed previously probably has no advantage over the ray casting method. The primary bottleneck in the current algorithm is the large number of reads/writes made to memory. In the current implementation, anytime the raycaster wants to know if a voxel is solid or not it needs to pass an object to a callback, which in turn needs to get the pointer to the SimpleVolume, and then needs to get the specific voxel object, which calls another function which checks if the block is solid, and then even that block may call another function....
Yes, this could probably be made faster. But it's a trade-off between speed and flexibility really, as the raycast is also useful for other tasks such as picking. I believe the callbacks should be inlined (I did do some testing here) and it's probably the memory access which is slow... but really some more profiling is needed.
Dynasty wrote:
To reduce memory requests I suggest that a boolean array be created which marks with a simple true and false whether or not a block is solid. Hopefully parts of this array will be stored in the CPU cache.
You can copy your volume data into a separate volume with just one byte per voxel, but PolyVox won't let you store a single bit per voxel. You'd need your own representation for that. Your new volume could also be downsampled, which would speed things up a lot. When we were doing the experiments in Voxeliens we downsampled by a factor of four in each direction.
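To make the 'own representation' idea a bit more concrete, here is a rough sketch of a bit-packed, downsampled occupancy grid. None of this is PolyVox code: BitOccupancy, buildDownsampled and the isSolid callback are made-up names, and the rule that a coarse cell counts as solid if any voxel inside it is solid is just one assumption you could make.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone grid storing one bit per cell (not a PolyVox class).
class BitOccupancy {
public:
    BitOccupancy(int w, int h, int d)
        : w_(w), h_(h), d_(d),
          bits_((static_cast<std::size_t>(w) * h * d + 63) / 64, 0) {}

    // Mark a cell as solid. The caller must pass in-range coordinates.
    void set(int x, int y, int z) {
        const std::size_t i = index(x, y, z);
        bits_[i >> 6] |= (std::uint64_t{1} << (i & 63));
    }

    // Query a cell. Anything outside the grid is treated as empty.
    bool get(int x, int y, int z) const {
        if (x < 0 || y < 0 || z < 0 || x >= w_ || y >= h_ || z >= d_) return false;
        const std::size_t i = index(x, y, z);
        return ((bits_[i >> 6] >> (i & 63)) & 1u) != 0;
    }

private:
    std::size_t index(int x, int y, int z) const {
        return (static_cast<std::size_t>(z) * h_ + y) * w_ + x;
    }

    int w_, h_, d_;
    std::vector<std::uint64_t> bits_;
};

// Build a grid downsampled by `factor` along each axis from a w*h*d volume.
// `isSolid(x, y, z)` is a placeholder for however you query your own volume.
// Assumption: a coarse cell is solid if ANY voxel inside it is solid.
template <typename IsSolid>
BitOccupancy buildDownsampled(int w, int h, int d, int factor, IsSolid isSolid) {
    BitOccupancy out((w + factor - 1) / factor,
                     (h + factor - 1) / factor,
                     (d + factor - 1) / factor);
    for (int z = 0; z < d; ++z)
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                if (isSolid(x, y, z))
                    out.set(x / factor, y / factor, z / factor);
    return out;
}
```

With a factor of four, a 64x64x64 region shrinks to 16x16x16 cells, which is 4096 bits or 512 bytes, so a whole neighbourhood has a decent chance of staying in cache while the rays are traced.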
Dynasty wrote:
Also the path of all the rays needs to be computed only once.
Yes, it should be possible to do something like this. You can precompute the path for, say, 100 rays and then reuse these same rays for every vertex but with different starting points. You can also trace rays in parallel, so that you start, say, 10,000 rays at different points in the scene (all pointing in the same direction), compute the next step, and then apply that step to all rays. There are some research papers on this - try searching for 'Parallel Ray Traversal'.
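A minimal sketch of the precompute-and-reuse part might look like the following. It is only an illustration: Offset, precomputePath and visibilityAt are hypothetical names, isSolid stands in for whatever solidity lookup you use, and the path is built by sampling in half-voxel steps rather than a proper 3D DDA, so it can occasionally skip a cell that the ray only clips at a corner.

```cpp
#include <cmath>
#include <vector>

// Integer voxel offset relative to a ray's starting voxel.
struct Offset { int x, y, z; };

// Walk from the origin along (dx, dy, dz) in half-voxel steps and record each
// newly entered voxel as an offset. Done once per direction, up front.
std::vector<Offset> precomputePath(float dx, float dy, float dz, float maxDist) {
    std::vector<Offset> path;
    const float len = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (len == 0.0f) return path;  // no direction, no path
    dx /= len; dy /= len; dz /= len;

    Offset last{0, 0, 0};  // the starting voxel itself is not recorded
    for (float t = 0.5f; t <= maxDist; t += 0.5f) {
        const Offset o{static_cast<int>(std::lround(dx * t)),
                       static_cast<int>(std::lround(dy * t)),
                       static_cast<int>(std::lround(dz * t))};
        if (o.x != last.x || o.y != last.y || o.z != last.z) {
            path.push_back(o);
            last = o;
        }
    }
    return path;
}

// Reuse the same precomputed paths from any starting voxel (sx, sy, sz).
// Returns the fraction of rays that reach the end of their path without
// hitting a solid voxel: 1 means fully open, 0 means fully occluded.
template <typename IsSolid>
float visibilityAt(int sx, int sy, int sz,
                   const std::vector<std::vector<Offset>>& paths,
                   IsSolid isSolid) {
    if (paths.empty()) return 1.0f;
    int escaped = 0;
    for (const auto& path : paths) {
        bool hit = false;
        for (const Offset& o : path) {
            if (isSolid(sx + o.x, sy + o.y, sz + o.z)) { hit = true; break; }
        }
        if (!hit) ++escaped;
    }
    return static_cast<float>(escaped) / static_cast<float>(paths.size());
}
```

The inner loop then does nothing but integer adds and solidity lookups, so all of the floating-point stepping is paid for once per direction instead of once per ray per vertex.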