Best practices for spawning a large number of objects?

Godot Version

I’m using 3.6 but this question applies to any version (same issues AFAIK)

Question

I want to spawn a large number of entities. Whenever I do that there is a significant stutter and slowdown that gets even worse in web builds. For the sake of example consider this loop:

func _ready():
	yield(get_tree().create_timer(1.0), "timeout")
	for i in 3000:
		var g = $Ref.duplicate()
		$YSort.add_child(g)

By mostly trial and error I realized I can distribute that spawning over a few frames with something like this:

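var spawned = false

func _ready():
	yield(get_tree().create_timer(1.0), "timeout")
	spawn()

func spawn():
	# Guard so the spawning only ever runs once.
	if spawned:
		return
	spawned = true
	for i in 1000:
		# Wait for the current frame to finish drawing, then add 3 more nodes.
		yield(VisualServer, "frame_post_draw")
		for j in 3:
			var g = $Ref.duplicate()
			$YSort.add_child(g)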

Spreading the spawning over time works really well. The second version gives a much smoother experience and reduces the stuttering, but some remains, not at the start but towards the end of the spawning. I captured two graphs from the profiler: spawning everything in one frame produces a frame-time spike of over 500 ms, while spawning over time keeps the worst spike at around 23 ms.
I still want to know how long I should wait, so that I wait long enough but no longer than necessary.

My question is, is “waiting” (as in the second approach) the only way? Is there a better way? I suspect using call_deferred may help, will it be better? If not, what signal should I wait for to avoid the spike towards the end?
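One way to avoid guessing a fixed wait is to give spawning a per-frame time budget and yield once it is spent. The following is a minimal Godot 3.x sketch of that idea, not a measured recommendation: `BUDGET_MSEC` and `TOTAL` are illustrative values, and `$Ref` and `$YSort` are the nodes from the snippets above.

```gdscript
extends Node

const TOTAL = 3000      # number of entities to spawn (from the example above)
const BUDGET_MSEC = 4   # assumed per-frame spawn budget, tune for your project

func _ready():
	yield(get_tree().create_timer(1.0), "timeout")
	spawn_with_budget()

func spawn_with_budget():
	var remaining = TOTAL
	while remaining > 0:
		var start = OS.get_ticks_msec()
		# Spawn until this frame's budget is used up.
		while remaining > 0 and OS.get_ticks_msec() - start < BUDGET_MSEC:
			$YSort.add_child($Ref.duplicate())
			remaining -= 1
		# Let the engine process and draw a frame before continuing.
		yield(get_tree(), "idle_frame")
```

With this shape the number of nodes added per frame adapts to how expensive each one is, rather than being fixed at 3.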

Adding to or removing from the scene tree is a very taxing operation. There may be a better way to optimize this, depending on what you are trying to achieve.

I understand that, but I still want to spawn objects. What is the best way to do it? And is there a signal that indicates when the current objects are properly added to the tree? It looks like I am not waiting enough, as there is a slope in the second graph. I know I can smooth it further by waiting more, but what is the minimum wait?

A deferred call will not help, as it just postpones the actual call to a later point on the main thread; the work itself stays the same.

I would set up a thread: create all the scenes there, then use call_deferred to hand them over to the main thread.
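The thread suggestion could look roughly like the sketch below for Godot 3.x. It assumes the entities come from a PackedScene assigned in the inspector; `entity_scene`, `_build_entities` and `_on_entities_built` are hypothetical names, and `$YSort` is the container from the question.

```gdscript
extends Node

# Hypothetical export: assign the entity scene in the inspector.
export(PackedScene) var entity_scene

var thread = Thread.new()

func _ready():
	# Godot 3.x signature: Thread.start(instance, method, userdata).
	thread.start(self, "_build_entities", 3000)

# Runs on the worker thread; it must not touch the active scene tree.
func _build_entities(count):
	var built = []
	for i in count:
		built.append(entity_scene.instance())
	# Hand the finished nodes back to the main thread.
	call_deferred("_on_entities_built", built)

# Runs on the main thread.
func _on_entities_built(built):
	thread.wait_to_finish()
	for node in built:
		$YSort.add_child(node)
```

The nodes are only created on the worker thread; adding them to the tree still happens on the main thread, so that part can still be spread over several frames. Thread availability in web exports depends on the export settings, so this may not apply there.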

I suspect the largest drag is allocating the memory.


Yes, I suspect the same. Is there any signal to know when an object is properly allocated, or when add_child has finished allocating and initializing the nodes?

I still need to explore threads, that’s a good idea. I will try to report back once I get some data around that approach.

In the notification cycle, _init is called once memory is allocated, and the script/scene can then initialize its state. That is probably the earliest signal.

When you call duplicate, it performs the memory allocation, _init is called, and then the node is assigned to var g.
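To see where those callbacks fire, a script like the following could be attached to the spawned node. This is only an illustrative Godot 3.x sketch; the print calls are there purely for tracing.

```gdscript
extends Node2D

func _init():
	# Called when the object is constructed, before it is in the tree.
	print("init")

func _enter_tree():
	# Called when the node is added to the scene tree.
	print("enter_tree")

func _ready():
	# Called once the node and its children have entered the tree.
	print("ready")
```

When the parent is already inside the tree, add_child runs _enter_tree and _ready before it returns, so there is nothing further to wait for after the call itself.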

For what it's worth, here is a video that does some hacky things to squeeze performance out of Godot and could give you some extra options.


I am using duplicate for this example, but I saw similar behavior when instancing a packed scene. Are you suggesting I just yield a frame after each instancing?

I had seen that video already, and learning about the different servers is on my list.
I am not destroying nodes at all, but the idea of just disabling nodes instead of removing them from the tree is very counterintuitive. My nodes are pretty lean, and they do include an Area2D with appropriate collision layers. I can now handle 3000 nodes on screen at 90 fps; before doing that I was having issues, with 500 nodes dipping below 1 fps. I was surprised at how well physics objects performed: I was expecting them to be really bad, but I can still handle around 1000 at 60 fps.

Not really, although it is a solution that would prevent blocking the main thread for very long. I think the only downside is that it makes for complicated code. You can easily just spin up a thread, get all your nodes allocated, and then start adding them to the tree. I think adding is pretty cheap, but it will call any _ready or _enter_tree function you may have. So chunking the number of nodes you add per frame is probably a good idea regardless.
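Chunked adding of nodes that were created earlier might be sketched like this for Godot 3.x. `add_in_chunks` and `CHUNK_SIZE` are hypothetical names, and the chunk size is an illustrative value rather than a recommendation.

```gdscript
extends Node

const CHUNK_SIZE = 50  # assumed value, tune against the profiler

# `nodes` is an Array of already-created nodes that are not yet in the tree.
func add_in_chunks(nodes, parent):
	var added = 0
	for node in nodes:
		parent.add_child(node)  # runs the node's _enter_tree and _ready
		added += 1
		if added % CHUNK_SIZE == 0:
			# Let a frame pass before adding the next chunk.
			yield(get_tree(), "idle_frame")
```

It could be called as `add_in_chunks(built, $YSort)` once the nodes have been handed back to the main thread.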

I don't think there is much difference, although there is an implied shallow copy step when you duplicate versus instantiate. My guess is that duplicate may take extra CPU, but it's probably not a big deal, as the memory allocation is the bigger problem.