(ADVANCED HELP) chunk generation in a minecraft clone is INCREDIBLY slow

Godot Version

godot 4.4.1

Question

just as the title says, chunk generation in a minecraft clone im working on is incredibly slow, despite all i’ve done to try to make it faster. i know very little about how to do any of this efficiently, and im trying to learn as i go, but there arent many resources out there that i can understand very well (i learn best with very specific overexplained instructions)

i dont really know what to even provide here other than the project itself for people to take a look at, but im not sure how i’d even do that really


little snippets of some of the more important code (not the entire script files):

world script:

func generate_chunks(pos : Vector3i) -> void:
	var chunk : MeshInstance3D = chunk_node.instantiate()
	chunk.position = pos
	chunk.name = "chunk" + str(pos)
	
	add_child(chunk)
	
	chunks[pos] = chunk # "chunks" is a dictionary containing the chunk positions as the keys
	
	gen_block_data(chunk)

func gen_block_data(chunk : MeshInstance3D) -> void:
	for x : int in range(chunk_size.x):
		for y : int in range(chunk_size.y):
			for z : int in range(chunk_size.z):
				var pos : Vector3i = Vector3i(x + roundi(chunk.position.x), y + roundi(chunk.position.y), z + roundi(chunk.position.z))
				blocks.get_or_add(pos, gen_base_blocks(pos))
				# gen_base_blocks returns the block id based on some noise at the block position

chunk script:

func gen_chunk() -> void:
	clear_data()
	
	for x : int in range(get_parent().chunk_size.x): # parent is the world, chunk_size is a vector3
		for y : int in range(get_parent().chunk_size.y):
			for z : int in range(get_parent().chunk_size.z):
				if is_block_covered(Vector3i(x, y, z) + Vector3i(position)) == false and get_parent().blocks[Vector3i(x, y, z) + Vector3i(position)] != 0:
					# is_block_covered checks the six neighbor blocks and returns false (block NOT covered) if
					# any of the six blocks are transparent in some way, and skips air blocks (block id 0)
					visible_blocks.append(Vector3i(x, y, z) + Vector3i(position))
					
	gen_block_mesh()

func gen_block_mesh() -> void:
	for i : Vector3i in visible_blocks:
		if checked_block_ids.has(get_parent().blocks[i]) == false:
			if IDs.block_ids[get_parent().blocks[i]].has("class"):
				checked_block_ids.append(get_parent().blocks[i])
				# adds the id of the block to an array to prevent blocks from getting checked if that id has already been checked

				var new_block : world_block = IDs.block_ids[get_parent().blocks[i]]["class"].new()
				# block ids are stored in an autoload as a dictionary with id being the key and all
				# block parameters being the value (as another dictionary). one of the parameters is
				# the block's script, and i have blocks as separate scripts to more easily control the generation
				new_block.gen_block(visible_blocks, self)

one of the block scripts:

extends world_block # script that just extends Node. idk
class_name block_grass

func gen_vertices(where : Vector3i, offset : Vector3) -> PackedVector3Array:
	var vertex : PackedVector3Array
	if where == Vector3i.LEFT:
		## NEG X
		vertex.append(Vector3(0 + offset.x, 0 + offset.y, 0 + offset.z))
		vertex.append(Vector3(0 + offset.x, 1 + offset.y, 0 + offset.z))
		vertex.append(Vector3(0 + offset.x, 1 + offset.y, 1 + offset.z))
		vertex.append(Vector3(0 + offset.x, 0 + offset.y, 1 + offset.z))
	
	elif where == Vector3i.DOWN:
		## NEG Y
		vertex.append(Vector3(1 + offset.x, 0 + offset.y, 1 + offset.z))
		vertex.append(Vector3(1 + offset.x, 0 + offset.y, 0 + offset.z))
		vertex.append(Vector3(0 + offset.x, 0 + offset.y, 0 + offset.z))
		vertex.append(Vector3(0 + offset.x, 0 + offset.y, 1 + offset.z))
	
	elif where == Vector3i.FORWARD:
		## NEG Z
		vertex.append(Vector3(1 + offset.x, 0 + offset.y, 0 + offset.z))
		vertex.append(Vector3(1 + offset.x, 1 + offset.y, 0 + offset.z))
		vertex.append(Vector3(0 + offset.x, 1 + offset.y, 0 + offset.z))
		vertex.append(Vector3(0 + offset.x, 0 + offset.y, 0 + offset.z))
	
	elif where == Vector3i.RIGHT:
		## POS X
		vertex.append(Vector3(1 + offset.x, 0 + offset.y, 1 + offset.z))
		vertex.append(Vector3(1 + offset.x, 1 + offset.y, 1 + offset.z))
		vertex.append(Vector3(1 + offset.x, 1 + offset.y, 0 + offset.z))
		vertex.append(Vector3(1 + offset.x, 0 + offset.y, 0 + offset.z))
	
	elif where == Vector3i.UP:
		## POS Y
		vertex.append(Vector3(1 + offset.x, 1 + offset.y, 0 + offset.z))
		vertex.append(Vector3(1 + offset.x, 1 + offset.y, 1 + offset.z))
		vertex.append(Vector3(0 + offset.x, 1 + offset.y, 1 + offset.z))
		vertex.append(Vector3(0 + offset.x, 1 + offset.y, 0 + offset.z))
	
	elif where == Vector3i.BACK:
		## POS Z
		vertex.append(Vector3(0 + offset.x, 0 + offset.y, 1 + offset.z))
		vertex.append(Vector3(0 + offset.x, 1 + offset.y, 1 + offset.z))
		vertex.append(Vector3(1 + offset.x, 1 + offset.y, 1 + offset.z))
		vertex.append(Vector3(1 + offset.x, 0 + offset.y, 1 + offset.z))
	return vertex

func gen_indices(face : int) -> PackedInt32Array:
	var index : PackedInt32Array
	
	index.append(00 + face * 4)
	index.append(01 + face * 4)
	index.append(02 + face * 4)
	
	index.append(00 + face * 4)
	index.append(02 + face * 4)
	index.append(03 + face * 4)
	
	return index

func gen_block(blocks : PackedVector3Array, chunk : MeshInstance3D) -> void:
	
	for i : Vector3i in blocks:
		if chunk.get_parent().blocks[i] == 2:
			
			## NEG X
			if chunk.get_block(i + Vector3i.LEFT) == 0 or IDs.block_ids[chunk.get_block(i + Vector3i.LEFT)]["block_type"].has(IDs.block_type.solid) == false:
				chunk.vertices_solid.append_array(gen_vertices(Vector3i.LEFT, i - Vector3i(chunk.position)))
				chunk.indices_solid.append_array(gen_indices(chunk.face_count_solid))
				chunk.face_count_solid += 1
				
				chunk.vertices.append_array(gen_vertices(Vector3i.LEFT, i - Vector3i(chunk.position)))
				chunk.indices.append_array(gen_indices(chunk.face_count))
				chunk.face_count += 1
				
				chunk.gen_uvs(Vector4(16, 16, 0, 32))
			
			## NEG Y
			if chunk.get_block(i + Vector3i.DOWN) == 0 or IDs.block_ids[chunk.get_block(i + Vector3i.DOWN)]["block_type"].has(IDs.block_type.solid) == false:
				chunk.vertices_solid.append_array(gen_vertices(Vector3i.DOWN, i - Vector3i(chunk.position)))
				chunk.indices_solid.append_array(gen_indices(chunk.face_count_solid))
				chunk.face_count_solid += 1
				
				chunk.vertices.append_array(gen_vertices(Vector3i.DOWN, i - Vector3i(chunk.position)))
				chunk.indices.append_array(gen_indices(chunk.face_count))
				chunk.face_count += 1
				
				chunk.gen_uvs(Vector4(0, 16, 0, 16))
			
			## NEG Z
			if chunk.get_block(i + Vector3i.FORWARD) == 0 or IDs.block_ids[chunk.get_block(i + Vector3i.FORWARD)]["block_type"].has(IDs.block_type.solid) == false:
				chunk.vertices_solid.append_array(gen_vertices(Vector3i.FORWARD, i - Vector3i(chunk.position)))
				chunk.indices_solid.append_array(gen_indices(chunk.face_count_solid))
				chunk.face_count_solid += 1
				
				chunk.vertices.append_array(gen_vertices(Vector3i.FORWARD, i - Vector3i(chunk.position)))
				chunk.indices.append_array(gen_indices(chunk.face_count))
				chunk.face_count += 1
				
				chunk.gen_uvs(Vector4(16, 16, 0, 32))
			
			## POS X
			if chunk.get_block(i + Vector3i.RIGHT) == 0 or IDs.block_ids[chunk.get_block(i + Vector3i.RIGHT)]["block_type"].has(IDs.block_type.solid) == false:
				chunk.vertices_solid.append_array(gen_vertices(Vector3i.RIGHT, i - Vector3i(chunk.position)))
				chunk.indices_solid.append_array(gen_indices(chunk.face_count_solid))
				chunk.face_count_solid += 1
				
				chunk.vertices.append_array(gen_vertices(Vector3i.RIGHT, i - Vector3i(chunk.position)))
				chunk.indices.append_array(gen_indices(chunk.face_count))
				chunk.face_count += 1
				
				chunk.gen_uvs(Vector4(16, 16, 0, 32))
			
			## POS Y
			if chunk.get_block(i + Vector3i.UP) == 0 or IDs.block_ids[chunk.get_block(i + Vector3i.UP)]["block_type"].has(IDs.block_type.solid) == false:
				chunk.vertices_solid.append_array(gen_vertices(Vector3i.UP, i - Vector3i(chunk.position)))
				chunk.indices_solid.append_array(gen_indices(chunk.face_count_solid))
				chunk.face_count_solid += 1
				
				chunk.vertices.append_array(gen_vertices(Vector3i.UP, i - Vector3i(chunk.position)))
				chunk.indices.append_array(gen_indices(chunk.face_count))
				chunk.face_count += 1
				
				chunk.gen_uvs(Vector4(32, 16, 0, 48))

			
			## POS Z
			if chunk.get_block(i + Vector3i.BACK) == 0 or IDs.block_ids[chunk.get_block(i + Vector3i.BACK)]["block_type"].has(IDs.block_type.solid) == false:
				chunk.vertices_solid.append_array(gen_vertices(Vector3i.BACK, i - Vector3i(chunk.position)))
				chunk.indices_solid.append_array(gen_indices(chunk.face_count_solid))
				chunk.face_count_solid += 1
				
				chunk.vertices.append_array(gen_vertices(Vector3i.BACK, i - Vector3i(chunk.position)))
				chunk.indices.append_array(gen_indices(chunk.face_count))
				chunk.face_count += 1
				
				chunk.gen_uvs(Vector4(16, 16, 0, 32))

apologies for the walls of code, i just wanted to provide something for people to be able to see the main stuff going on to see where i might be able to optimize (probably many areas)

im just grasping at straws here man im just so lost. any help, tips, direction, etc would be greatly appreciated!

Hi!

From what I read, I would say that the generation is slow because there are a lot of loops everywhere. Loops are fine, but let’s say you want to generate a world of 100x100x100 chunks (which, for a Minecraft clone, is very few): just iterating over those chunks with a triple for loop would result in 1,000,000 iterations.
Iterating a lot can be okay though, it all depends on what you do inside of your loops.

Just a quick example, but if you were to print coordinates at each iteration, like this:

for x in 100:
    for y in 100:
        for z in 100:
            print('%s, %s, %s' % [x, y, z])

That would probably be a very slow loop, as printing a million times is not cheap. However, doing something like this:

var counter: int = 0
for x in 100:
    for y in 100:
        for z in 100:
            counter += 1

Would surely be way faster, as adding 1 to an integer is fast.

So it all depends on what you do in your loops, and what I want to put the emphasis on is, once you’re iterating thousands and thousands of times, you need to be extra careful with every instruction, as any of them could be the cause of extreme lag.

I’m absolutely not sure about it, but maybe calling get_parent() is expensive, so doing it a lot may affect performance. So, what you could try is caching some variables, like this:

var parent = get_parent()
for x : int in range(parent.chunk_size.x):
    for y : int in range(parent.chunk_size.y):
        for z : int in range(parent.chunk_size.z):
            # do something

That’s the first thing.

Second, I’ve never looked at any Minecraft procedural generation resources as I’ve never done any implementation myself, but I doubt it’s using straightforward loops over 3 axes, seeing how big Minecraft is compared to the speed of its generation. So there are probably some techniques to re-use that you could find in videos explaining the overall concepts (you may not like it, as you said, but with such a complicated feature, you don’t really have a choice tbh).

Third, gdscript is definitely not the best option if you’re looking for algorithm efficiency, as it’s not a language known for its speed (compared to C++, typically). At some point, even with the most optimized code possible, you’d still have a slow generation on a large map if the tool you’re using cannot go faster, and you could do nothing about it. You can read a bit more info here: https://www.reddit.com/r/godot/comments/ih9bst/how_fast_is_gdscript/
Sometimes, algorithms like this are run on the GPU instead of the CPU (not diving too deep here, but the difference may be summarised as follows: CPUs are smart but slow, GPUs are dumb but fast). However, it’s a very different topic and if you’re still learning on the fly, you probably shouldn’t look at it right now, but just know that implementing such a complex world generation in gdscript may be a lost battle already.

What’s your current world size? Maybe you could start with a smaller world generation, just to learn how to implement a working algorithm, and try to optimize later on as much as you can.

PS: I’m no Godot expert so if anybody reads this and disagrees with anything I’ve said, particularly on the gdscript part, please correct me!


thanks for the in depth reply!

i would try to change the i guess sort of three dimensional for loop, but im not sure how i would when it comes to what im actually doing, since its not as simple as just printing something (also i’d lower the amount of loops there are if i could figure out how to have block/chunk gen without the loops)
i’d probably be using c++, but i only know gdscript, and i cannot understand any other coding languages at all really
as for the world size, ive been mainly using a render distance of 2, or 5 total chunks across in all axes (player chunk in the center with 2 more going in all directions), which is relatively fast with only a few little hiccups, but the performance absolutely tanks as soon as i increase it by really any amount

i’d probably try checking out a bunch of videos of people making minecraft clones, but ive already tried that and i just cant follow, not only due to nobody using godot and gdscript but also due to the fact that i just dont learn well from simply watching someone talk about something

quick edit: i just replaced all instances of get_parent() with a variable equal to get_parent(), and i think its slightly faster. however, all of these are just getting the world node, so i could probably have a variable of the world node in the autoload script and use that


since its not as simple as just printing something

That’s precisely what I meant: even printing is not that simple performance wise. So with all your logic, it’s not that surprising you’re facing performance issues.

quick edit: i just replaced all instances of get_parent() with a variable equal to get_parent(), and i think its slightly faster. however, all of these are just getting the world node, so i could probably have a variable of the world node in the autoload script and use that

I suppose you could try that, yes. I’ll let you do some research, but there are ways of measuring the time an algorithm took to run; you should rely on the computer’s measurement instead of what you think is faster or not when playing.
If optimizing something gains you 10 seconds, of course you’ll see it without needing the computer to tell you, but when it comes to hundredths of a second, maybe even thousandths, you won’t be able to tell yourself.
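
To make the measuring idea concrete, here is a minimal sketch using Godot’s Time singleton. The helper name time_usec and the stand-in workload are made up for illustration; in the real project you would wrap something like a single gen_chunk() call instead.

```gdscript
extends Node

# Hypothetical helper: runs `work` once and returns the elapsed time in microseconds.
func time_usec(work: Callable) -> int:
	var start: int = Time.get_ticks_usec()
	work.call()
	return Time.get_ticks_usec() - start

func _ready() -> void:
	# Stand-in workload (an assumption): a 64^3 loop that only increments a counter.
	var work: Callable = func() -> void:
		var counter: int = 0
		for _i: int in 64 * 64 * 64:
			counter += 1
	print("loop took %d usec" % time_usec(work))
```

Timing each stage separately (block data, visibility check, mesh building) should show which one actually dominates before you start rewriting anything.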

To be perfectly honest with you, I think you may be tackling a task too complex for you. :sweat_smile:
I understand you want to work on something you like, but really there’s no ready-to-use solution here, as this is a complicated topic so if you don’t feel capable of learning with generic explanations but only step by step guides, well… you’ll get stuck a lot.
You do as you want, but I hope you get my point.

i do agree and admit that this is almost certainly too complex for me, but its the only thing i actually want to do and feel motivated to do (ive been wanting to make this at least since october last year, as i think minecraft in concept is really good but just not quite to my personal liking in practice)

That’s precisely what I meant: even printing is not that simple performance wise. So with all your logic, it’s not that surprising you’re facing performance issues.

what i meant was simple to do, not simple in performance. idk how i’d get the chunk/block positions without looping through a bunch of numbers three times to get a vector3. the only thing i can think of would be to loop through the positions and add them as vectors to an array, but then i’d have to loop through each array element afterwards anyway, which wouldnt be any better probably


Here’s a single loop that’s 128 x 128 x 128; we use ** 3 to raise 128 to the third power, which is the same thing.

I’m using 128 because it’s a power of two (2 ** 7), which means I can do bitwise operations on it to speed things up:

for index in 128 ** 3:
    var x = index & 0x7F
    var y = (index >> 7) & 0x7F
    var z = (index >> 14) & 0x7F
    print("x: %d y: %d z: %d index: %d" % [x, y, z, index])

The shifting (>>) and masking (&) are bitwise operations.

# Shifting
128 >> 1 == 64
128 >> 2 == 32
128 >> 3 == 16
[...]

The bitwise and operation & sets only the bits that are set in both arguments:

00000001 & 00000001 == 00000001
00000010 & 00000001 == 00000000
00000011 & 00000001 == 00000001
[...]
11111111 & 00000001 == 00000001
10101010 & 11110000 == 10100000

In the loop scheme above, we’re using the bottom 7 bits of the index to represent the x coordinate, the next 7 bits to represent the y coordinate, and the top 7 bits for the z coordinate. We isolate those by shifting them down to the bottom of the integer (effectively dividing by 128 or 128 ** 2 to get y or z, respectively) and then mask off all but the bottom seven bits to isolate the value we want.

The bottom seven bits are 01111111 in binary, which is 7F in hexadecimal, with the 0x prefix to indicate that we’re using hex, so 0x7F. Hex is useful for this because there are exactly four binary digits per hex digit. 7 here is representing the leading 0111, and F the trailing 1111:

0000 -> 0x0    0100 -> 0x4    1000 -> 0x8    1100 -> 0xC
0001 -> 0x1    0101 -> 0x5    1001 -> 0x9    1101 -> 0xD
0010 -> 0x2    0110 -> 0x6    1010 -> 0xA    1110 -> 0xE
0011 -> 0x3    0111 -> 0x7    1011 -> 0xB    1111 -> 0xF

You can do the same thing with modulo and integer division, but the shift/mask technique is cleaner and faster if you don’t mind working with power of two sizes.
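
For comparison, here is a small sketch of both ways of unpacking the index, plus the reverse direction. The names SIZE, BITS, MASK, unpack_shift, unpack_divmod and pack are made up for this example and don’t come from the project in the question.

```gdscript
const SIZE: int = 128       # edge length; must be a power of two for the shift/mask version
const BITS: int = 7         # 2 ** 7 == 128
const MASK: int = SIZE - 1  # 0x7F

# Shift/mask version, same scheme as the loop above.
static func unpack_shift(index: int) -> Vector3i:
	return Vector3i(index & MASK, (index >> BITS) & MASK, (index >> (BITS * 2)) & MASK)

# Modulo/division version; works for any SIZE. Both operands are ints,
# so / truncates (the editor may show an integer-division warning).
static func unpack_divmod(index: int) -> Vector3i:
	return Vector3i(index % SIZE, (index / SIZE) % SIZE, index / (SIZE * SIZE))

# Reverse direction: coordinates back to a single index.
static func pack(pos: Vector3i) -> int:
	return pos.x | (pos.y << BITS) | (pos.z << (BITS * 2))
```

For any index from 0 up to 128 ** 3 - 1, both unpack functions should return the same Vector3i, and pack should give the original index back.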


Neat project! Minecraft clones are surprisingly tricky precisely because of the voxel meshing. With the caveat that I’m not very familiar with Minecraft’s chunk meshing, at a glance I don’t see anything that jumps out at me as an avoidable problem spot. Because of the nature of voxel worlds, you can’t really get away from the 3-dimensional coordinate traversal (at least in principle), though there are structures that can accelerate this substantially (bitfields and octrees come to mind as possibilities, though the latter are likely going to be difficult to use efficiently here without some very specific techniques) so that you can avoid visiting empty voxels. For the moment I would suggest sticking to your 3D array representation even though it may mean iterating through empty space more than is needed.

I like to think of it as: CPUs are good at doing one thing at a time very, very fast, GPUs are good at doing a bunch of things at a time rather slowly. If your algorithm can be parallelized (able to do a bunch of things at once), you can often get a very nice speedup by writing it for the GPU. Of course, this depends on your algorithm being parallelizable. If your algorithm is not a parallel one, the GPU will most likely be much slower. I’ve never studied it in particular, but due to the nature of mesh generation I would not expect it to parallelize much (if at all) except perhaps in very specific cases.

Now that being said, multi-core CPUs are able to do more than one thing at once - a hyperthreaded quad-core CPU can do 8 tasks at the same time, which won’t compete with a GPU on parallel algorithms but it’s fantastic for avoiding killing your FPS by doing heavy calculations in your main loop. You can take advantage of this with multithreading, which Godot supports in GDScript. Basically what you’ll do is do all the mesh generation in other threads (for simplicity, you can launch 1 thread per chunk being generated), and then hand their results off to your node script to attach into the scene. This means that you’ll see some empty chunks while you wait for the generator threads to complete, but your game will remain playable while this happens (Minecraft does this too - most often seen when distant chunks sort of “tile in”).

GDScript is quite a bit slower than C++ for a comparable algorithm, but with the generation moved out of the game loop the only side effect you’ll see from that is that it will take longer to fill in all the chunks - you won’t freeze up your game by going from 5 chunks to 20, 40, 100, etc.

Edit: looks like I stand corrected in that there do exist fast ways to generate voxel meshes on the GPU: GitHub - artnas/UnityVoxelMeshGPU: GPU voxel mesh generation and drawing in Unity HDRP
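
A rough sketch of the threading approach described above (one thread per chunk, result handed back to the main thread), assuming a chunk script that extends MeshInstance3D. The function names request_mesh, build_quad_arrays, _thread_main and _apply_arrays are hypothetical, and the single hard-coded quad stands in for real mesh generation.

```gdscript
extends MeshInstance3D

var _thread: Thread

# Call this on the main thread when the chunk needs a (re)mesh.
func request_mesh() -> void:
	_thread = Thread.new()
	_thread.start(_thread_main)

# Pure data work, safe to run off the main thread.
# Placeholder geometry (an assumption): one quad on the -X face of a unit cube.
static func build_quad_arrays() -> Array:
	var arrays: Array = []
	arrays.resize(Mesh.ARRAY_MAX)
	arrays[Mesh.ARRAY_VERTEX] = PackedVector3Array([
		Vector3(0, 0, 0), Vector3(0, 1, 0), Vector3(0, 1, 1), Vector3(0, 0, 1),
	])
	arrays[Mesh.ARRAY_INDEX] = PackedInt32Array([0, 1, 2, 0, 2, 3])
	return arrays

# Runs on the worker thread: build the data, then hand it to the main thread.
func _thread_main() -> void:
	var arrays: Array = build_quad_arrays()
	_apply_arrays.call_deferred(arrays)

# Runs on the main thread, so touching the node here is safe.
func _apply_arrays(arrays: Array) -> void:
	_thread.wait_to_finish()  # join the thread; its work is already done
	var array_mesh := ArrayMesh.new()
	array_mesh.add_surface_from_arrays(Mesh.PRIMITIVE_TRIANGLES, arrays)
	mesh = array_mesh
```

The important part is the split: the thread only builds plain arrays, and anything that touches the scene tree or the node’s mesh happens in the deferred call.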


To throw my opinion into this one, as someone who has attempted similar in other engines: I think there are a lot of good points in existing answers. Gdscript is not optimised for this sort of task, and if this is really something you are set on exploring, your time would be better spent learning how to use C++ with godot engine, which will be capable of keeping up with the demanding computations needed for such a project.

You could also consider using an existing voxel library, which could handle these functions for you, such as GitHub - Zylann/godot_voxel: Voxel module for Godot Engine


Perhaps this is of interest to you; it concerns a thread made by somebody who built a voxel-game using GDNative C++. The section on the game’s architecture mentions the use of sparse octrees.

As people have mentioned, this is a big problem. You are running into a scaling issue that programmers describe with Big-O notation. There are things you can do to mitigate it, like @sixrobin explained. @hexgrid’s bit shifting explanation is also an excellent help. Doing that will save a TON of processing cycles.

Ultimately, the more you understand, the more you can optimize. Each loop nested inside another multiplies the work by the inner loop’s iteration count, so three nested loops over a chunk with n blocks per side take n × n × n steps. Learning to minimize your loops can save a bunch of processing time. I recommend you do some reading on Big-O notation. That will help.
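
To put rough numbers on that growth, here is a small sketch. It assumes the cubic render distance described earlier in the thread (distance 2 means 5 chunks across on every axis); the function names and the 16x16x16 chunk size in the comments are assumptions for illustration, since the actual chunk size was never posted.

```gdscript
# Chunks loaded for a cubic render distance r: (2r + 1) ** 3.
static func chunk_count(render_distance: int) -> int:
	var across: int = 2 * render_distance + 1
	return across * across * across

# Blocks visited if every loaded chunk loops over all of its blocks once.
static func blocks_visited(render_distance: int, chunk_size: Vector3i) -> int:
	return chunk_count(render_distance) * chunk_size.x * chunk_size.y * chunk_size.z

# chunk_count(2) == 125, chunk_count(3) == 343, chunk_count(4) == 729
# With an assumed 16x16x16 chunk:
# blocks_visited(2, Vector3i(16, 16, 16)) == 512000
# blocks_visited(3, Vector3i(16, 16, 16)) == 1404928
```

Going from render distance 2 to 3 nearly triples the number of chunks, which fits the report that performance tanks as soon as the distance goes up at all.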

Also keep in mind that Minecraft: Java Edition is written in Java. Its source isn’t officially open, but Mojang publishes obfuscation mappings and the game can be decompiled, so you can go look at how Mojang actually did it.

I recommend watching RachelFTech’s video, which is a DevBlog on running into all the issues you have (and will) run into, and how she solved them. Even though it’s old I think it will help you immensely. Keep in mind it’s like 3 or 4 years old and uses Godot 3.

You can also see the project she made on itch and try it out.

been playing around with the example loop you sent for a bit (like under an hour kind of “a bit”), and i dont quite get it, but i was able to get a cluster of 4x4x4 nodes placed properly with some experimentation. i found an ongoing voxel terrain tutorial series a couple days ago, and i’ll be watching that (as videos come out, so likely over the next few weeks) to see if i can do any better with the actual mesh generation itself while also keeping this new single-loop 3d position information in mind. thank you!

i havent done any kind of research on this bitwise stuff, and ive never even heard of it (i have no even remotely formal education in programming, all just self taught with the occasional tutorial video since about a year ago), but i think i get it after some brute force (i also just now got it working in 8^3, and it seems pretty simple to do now). i might do some research on bitwise operations so i have a better understanding of it, but only if i feel like i’d need a deeper understanding of it



Hi, I was mentioned above by @TokyoFunkScene as the one who created a voxel-based game with a chunk system.
Let me be direct in my answer. Do not use voxels.
Whatever idea of a game, design concept, etc. You have in mind, there is always some other, more suitable and easier way to implement it.
Just think about what is actually special in Your game concept and how to convey it without voxels.
Voxels have their special place, but only in small subset of games.
There is a good reason, why so few games are actually using voxels, including those “survival craft” ones.
I think all those videos of people building voxel survival look-alikes in 5 minutes are highly misleading, unfortunately.
It’s easy to build something similar on a surface, primitive level. Especially if You already know how to.
But quite hard to implement an actual game with it, with proper gameplay, etc.
Hear my advice, avoid using complex systems, such as voxels and You will end up with a much better game at the end.
Games are already notoriously hard to make, imagine having to spend additional thousands of hours tweaking voxels.
Nevertheless, it is, of course, possible to do. And indeed, also possible with Godot.
This requires a lot of low level optimizations, tricks, etc.
If You really want to spend next 3-5 years doing it, I can give You some exact advice.
But You would be much happier implementing the actual game in that time instead.


Some ideas.

Like already mentioned, get_parent() is probably slow. However, all dictionary lookup is quite slow. In my small tests dictionary lookup is 30-50 % slower than array lookup. Object member lookup is faster than dictionary as well.

resize() arrays before using them. It seems like the size of all your arrays should be known ahead of time, yet I see no resize anywhere.

Avoid object creation.

It also seems strange that you can write all that, but cannot write Time.get_ticks_msec(). I mean… You could easily time yourself what is taking so long, instead of having others guess.


my game idea relies on editable voxel terrain (its specifically meant to be like minecraft but with different gameplay), so im gonna spend my time doing that since thats what im wanting to do

One thing that might be slowing it down is calling append all the time. Now your arrays are getting resized for every vertex, but you could just set the size outside all the loops in advance. I don’t know how big of a deal this is.
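
A minimal sketch of the resize-then-index idea, using the same four corners as the NEG X face in the question. The function name build_quads is made up, and whether this is measurably faster than append in your project is something to time rather than assume.

```gdscript
# Builds the four -X face corners for every offset, writing by index
# into an array that was sized once up front.
static func build_quads(offsets: PackedVector3Array) -> PackedVector3Array:
	var vertices := PackedVector3Array()
	vertices.resize(offsets.size() * 4)  # one allocation instead of growing per vertex
	var write: int = 0
	for offset: Vector3 in offsets:
		vertices[write] = offset
		vertices[write + 1] = offset + Vector3(0, 1, 0)
		vertices[write + 2] = offset + Vector3(0, 1, 1)
		vertices[write + 3] = offset + Vector3(0, 0, 1)
		write += 4
	return vertices
```

If the exact face count isn’t known ahead of time, one option is to resize to the worst case first and call resize(write) once at the end to trim the unused tail.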

All right, if You are sure about it.
There is plenty of great advice mentioned here already. You can use all of those.
I will try to make it simple and short, what You can do quickly and dirty, right now, to make it work.
Keep in mind that it’s only a small fraction of what could and must be done.
Be sure to read Godot documentation, if You are not sure about any of this.
First, as already mentioned, consider starting with measuring what exactly is slow and how slow it is.
A simple way to do it is print(Time.get_ticks_msec()) before and after the code You want to measure.

Reduce the size of Your chunk to 64 or 32. Less work, less time required.
Limit chunk updates to N per frame. Less work per frame, more FPS.
Use Array instead of Dictionary, to store Your data. Godot Array is slow. Dictionary is even slower.
Do not use Godot Node system to traverse Your data, e.g. get_parent, etc. It’s not designed for such heavy use.
Okay to call a few times, not okay to call for 100k voxels.
Avoid frequent Object.new calls. Each requires Godot to bother OS for object allocation. Which is very slow and taboo in hot, rendering paths.
Move all block gen stuff to chunk script, to avoid Object.new.
Do as few operations per voxel as possible. Precalculate and reuse values instead of calculating them on the fly each time.
It’s ugly and GDScript fault, but reduce amount of function calls. It’s a death by thousand cuts in case of 100k voxels.
Instead of Array.append/append_array, Array.resize first and access elements by index with [] operator.
Use typed Array[T] instead of Array, if possible. Use PackedArray, instead of Array[T], if possible.
Avoid any additional vars, while processing each voxel, instead, insert data directly to Your chunk data struct.
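
As one possible reading of the "Array instead of Dictionary" and "resize first" points above, here is a sketch of per-chunk block storage in a flat PackedByteArray. Everything here is an assumption for illustration: the 16x16x16 CHUNK_SIZE, the names to_index, set_block and get_block, and block ids fitting in one byte (0 to 255, with 0 as air).

```gdscript
# Hypothetical per-chunk block storage: one byte per block in a flat array
# instead of a Dictionary keyed by Vector3i.
const CHUNK_SIZE := Vector3i(16, 16, 16)  # assumed size, not taken from the thread

var blocks := PackedByteArray()

func _init() -> void:
	blocks.resize(CHUNK_SIZE.x * CHUNK_SIZE.y * CHUNK_SIZE.z)
	blocks.fill(0)  # start with every block as air (id 0)

# Maps local chunk coordinates to a position in the flat array.
static func to_index(x: int, y: int, z: int) -> int:
	return x + CHUNK_SIZE.x * (y + CHUNK_SIZE.y * z)

func set_block(x: int, y: int, z: int, id: int) -> void:
	blocks[to_index(x, y, z)] = id

func get_block(x: int, y: int, z: int) -> int:
	# Anything outside this chunk is treated as air in this sketch;
	# a real world would ask the neighboring chunk instead.
	if x < 0 or y < 0 or z < 0:
		return 0
	if x >= CHUNK_SIZE.x or y >= CHUNK_SIZE.y or z >= CHUNK_SIZE.z:
		return 0
	return blocks[to_index(x, y, z)]
```

Keeping the data inside the chunk also removes the need to reach through get_parent() for every block lookup.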

Sooner or later You will need to implement greedy voxel meshing.
You can use run-length encoding to skip empty voxels and optimize Your data storage.
Those are more advanced topics.
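
For a taste of the run-length encoding idea, here is a minimal sketch that compresses a flat array of block ids into [id, run length] pairs and expands it again. The function names and the choice of PackedByteArray for block ids are assumptions; this only covers storage, not skipping empty voxels during meshing.

```gdscript
# Encodes block ids as [id, run_length, id, run_length, ...].
static func rle_encode(blocks: PackedByteArray) -> PackedInt32Array:
	var out := PackedInt32Array()
	if blocks.is_empty():
		return out
	var current: int = blocks[0]
	var run: int = 1
	for i: int in range(1, blocks.size()):
		if blocks[i] == current:
			run += 1
		else:
			out.append(current)
			out.append(run)
			current = blocks[i]
			run = 1
	# Flush the final run.
	out.append(current)
	out.append(run)
	return out

# Expands [id, run_length, ...] pairs back into one id per block.
static func rle_decode(encoded: PackedInt32Array) -> PackedByteArray:
	var out := PackedByteArray()
	for i: int in range(0, encoded.size(), 2):
		for _n: int in encoded[i + 1]:
			out.append(encoded[i])
	return out
```

For example, the ids [0, 0, 0, 2, 2, 1] encode to [0, 3, 2, 2, 1, 1]. A chunk that is mostly air or mostly stone shrinks to a handful of pairs.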

Take a look at this voxels implementation in GDScript.
It is slow, but maybe would be enough for Your use case.
At least You can use it as an example.
https://github.com/ClarkThyLord/Voxel-Core

Wish You all the best with Your project and insane resolve, to make it to the end)


That was a really informative post. I learned some things about Godot’s architecture.

Even though this wasn’t directed at me, thanks!
