While optimizing my current project (really my first project), a procedural world generator, I noticed something weird. I use the following expression to calculate an array index from 3D coordinates in a grid:
y + z * size + x * size * size
The order of the coordinates comes from how I populate my data array and shouldn't matter here. The point is, if I use the following function instead of the raw expression to make my code look better, the performance of my code drops by quite a bit:
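A minimal sketch of such a wrapper, assuming a hypothetical name `getIndex` and that `size` is passed in as a parameter (it could just as well be a member variable):

```gdscript
# Hypothetical helper for illustration; name and signature are assumptions.
# Maps grid coordinates to a flat array index, with y varying fastest.
func getIndex(x: int, y: int, z: int, size: int) -> int:
    return y + z * size + x * size * size
```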
By quite a bit I mean that the time it takes to generate a chunk of the world goes from around 80 ms to 190 ms! I really don't know if it is the use of a function itself that causes this performance drop or if it is something more specific, but here is some extra information that might be relevant:
I use GDScript for my project.
During Mesh-Generation the index-calculation is required a lot, so maybe a small delay caused by calling a function results in a massive delay later on...
I use threading to generate multiple chunks at once, so one thread works on one chunk. (no idea if threads could actually be a reason)
I use the "Run Project" feature from the editor to test performance. I haven't tested it with an exported build, so maybe it isn't that bad there.
Otherwise I really just swap the expressions for the function and then the performance goes down. Any information or suggestions as to what is or could be happening would be greatly appreciated.
Doesn't seem to make a real difference in performance.
Using C# is on my list at some point, but I wanted to see how far I can go with GDScript. I know, I'm probably wasting a lot of time doing it that way, but that's my problem.
After doing some more extensive research, it appears that the combination of GDScript, repeated function calls and threading is indeed the reason for the performance drop. Whether that can be "fixed" by just using C#, I don't know, but probably.
Other than that, I should now really consider switching to C#, as continuing with GDScript would seemingly require me to use fewer functions for performance's sake.
@hexgrid I haven't really optimised that part yet, and I know there are a few things I could do on that end.
I wanted to focus more on the mesh generator part for now, as I use the lesser-known "Dual Marching Cubes" approach and I wanted to optimize that first.
But here, if you are interested:
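One way to see how much the call itself costs is a small micro-benchmark along these lines. This is only a sketch: `getIndex` is a hypothetical stand-in for the index function, the script is assumed to be attached to any Node in a test scene, and the numbers will vary by machine, Godot version and whether it runs from the editor or an exported build.

```gdscript
extends Node

# Hypothetical stand-in for the index helper discussed above.
func getIndex(x: int, y: int, z: int, size: int) -> int:
    return y + z * size + x * size * size

func _ready() -> void:
    var size: int = 64
    var checksum: int = 0

    # Variant 1: the raw expression written inline.
    var start: int = Time.get_ticks_usec()
    for x: int in size:
        for z: int in size:
            for y: int in size:
                checksum += y + z * size + x * size * size
    var inlineTime: int = Time.get_ticks_usec() - start

    # Variant 2: the same arithmetic behind a function call.
    start = Time.get_ticks_usec()
    for x: int in size:
        for z: int in size:
            for y: int in size:
                checksum += getIndex(x, y, z, size)
    var callTime: int = Time.get_ticks_usec() - start

    # The checksum is printed only so the loops do observable work.
    print("inline: ", inlineTime, " us, function: ", callTime, " us, checksum: ", checksum)
```

Both loops do the same arithmetic, so any difference between the two timings should come mostly from the call overhead.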
for x: int in range(size):
    for z: int in range(size):
        for y: int in range(size):
            var pos := Vector3(x, y, z)
            var value := getDensityAt(pos + position)
            densityData[y + z * size + x * size * size] = value
"getDensityAt" right now returns some 2D noise from FastNoiseLite.
One optimisation I will do is to compute the 2D noise before the "y" loop, as it only depends on "x" and "z", with "y" only being subtracted from it to give it "height".
When I get to the point where the data generation becomes a problem, I'm going to make another topic for that.
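Hoisting the noise lookup out of the "y" loop could look roughly like the sketch below. It assumes `noise` is an already configured FastNoiseLite; the function name `fillDensityData`, the parameter `chunkPosition` and the `height - y` density formula are guesses for illustration, not the actual code.

```gdscript
# Sketch only: sample the 2D noise once per (x, z) column instead of once per voxel.
# `fillDensityData`, `chunkPosition` and the `height - y` formula are assumptions.
func fillDensityData(noise: FastNoiseLite, size: int, chunkPosition: Vector3) -> PackedFloat32Array:
    var densityData := PackedFloat32Array()
    densityData.resize(size * size * size)
    for x: int in range(size):
        for z: int in range(size):
            # One noise lookup per column, hoisted out of the "y" loop.
            var height: float = noise.get_noise_2d(chunkPosition.x + x, chunkPosition.z + z)
            # Index of the first voxel in this column (y = 0).
            var columnStart: int = z * size + x * size * size
            for y: int in range(size):
                densityData[columnStart + y] = height - (chunkPosition.y + y)
    return densityData
```

This cuts the number of noise calls from size^3 to size^2, and the column start index is also computed only once per column.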
Fair enough. The main thing is, I don't think GDScript has any strength reduction, so your loop there is going to recalculate every step of the array offset on every loop iteration, which is three multiplies and two adds.
You'd probably see much better results with:
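var i: int = 0
var pos: Vector3 = position
for z: int in size:
    # Reset y to the start of the chunk for each new z slice.
    pos.y = position.y
    for y: int in size:
        # Reset x to the start of the chunk for each new row.
        pos.x = position.x
        for x: int in size:
            densityData[i] = getDensityAt(pos)
            # The running index replaces the per-iteration offset calculation.
            i += 1
            pos.x += 1.0
        pos.y += 1.0
    pos.z += 1.0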
What's going on:
We loop over z on the outermost and x on the innermost because it means our array index (i) is equal to x + (size * y) + (size^2 * z). If you want the yzx ordering you have in your source, you can get that mapping by looping x, z, y (outermost to innermost). Looping this way with a running index means we're doing an increment by one each loop iteration rather than three multiplies and two adds.
Rather than regenerating pos entirely each loop iteration and then doing a per-iteration vector add, we initialize the vector component-wise at the appropriate loop level, and increment it as part of loop maintenance.
In addition to this, if you can move whatever is in getDensityAt() inline, you'll save whatever the function setup/teardown costs are, multiplied by size^3.
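For the yzx ordering from the original post, the same running-index idea could be sketched like this. The function name `fillDensityYZX`, the parameter `origin` and the body of `getDensityAt` are placeholders, not the real code:

```gdscript
# Placeholder density function; the real one samples FastNoiseLite.
func getDensityAt(pos: Vector3) -> float:
    return -pos.y

# Sketch only: the running index i matches y + z * size + x * size * size
# because y is the innermost loop and x the outermost.
func fillDensityYZX(size: int, origin: Vector3) -> PackedFloat32Array:
    var densityData := PackedFloat32Array()
    densityData.resize(size * size * size)
    var i: int = 0
    var pos: Vector3 = origin
    for x: int in size:
        pos.z = origin.z
        for z: int in size:
            pos.y = origin.y
            for y: int in size:
                densityData[i] = getDensityAt(pos)
                i += 1
                pos.y += 1.0
            pos.z += 1.0
        pos.x += 1.0
    return densityData
```

Since the loops already run x, z, y in the original code, only the index and position bookkeeping changes; the data layout stays the same.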
I should end this here with the message that I rewrote my code in C#, and it went from ~80 ms per chunk to ~2 ms, including using a function for the index calculation, so I think that's a win.