Callable/lambda performance

qqsu_azure · February 27, 2025, 1:42pm

Godot Version

4.3

Question

Just trying to get some reliable information on the performance of Callable/lambda in GDScript.

Until recently, I frequently used lambdas to filter a node’s children, like the following:

func for_child(filter: Callable) -> Variant:
	for ch in get_children():
		if is_instance_valid(ch): # and any other general checks
			var r = filter.call(ch)
			if r:
				return r
	return null

func _process(delta: float):
	var r = for_child(func(node):
		# Do whatever I want!
		return null
	)

But now I have found that the performance is very bad compared to a non-lambda loop. So I rewrote these lambdas, mostly where they were in _process() and _physics_process(). The FPS increased from 20~30 to 90+ (I know FPS depends on a lot of things, so this just shows how it differs in my case).

I was a Java developer for a decade. In Java, lambdas were very slow when they were first introduced. About an order of magnitude slower, as far as I can remember? Later they improved, but were still 2 or 3 times slower at completing the same task. I’m wondering if it’s a similar story in GDScript, or even in C# or C/C++?

(I really love lambda and it’s a great way to reuse a large chunk of code with some small customized behaviors. Sadly I need to be more careful in practice)

jesusemora · February 27, 2025, 6:06pm

this is premature optimization. stop.

qqsu_azure:

func for_child(filter: Callable) -> Variant:
	for ch in get_children():
		if is_instance_valid(ch): # and any other general checks
			var r = filter.call(ch)
			if r:
				return r
	return null

func _process(delta: float):
	var r = for_child(func(node):
		# Do whatever I want!
		return null
	)

this code makes no sense. I would put the child nodes or any nodes I need to access frequently into an array.
and running the function in process is… just… stupid.

avoid optimizing code, as the only situation where it can matter is with algorithms, and even then there are many more solutions in engineering the way the system works instead of these useless tests.
such as… not running code in process that does get_children()

you can’t solve all problems with a hammer. you need to use the other tools too.

qqsu_azure · February 27, 2025, 11:53pm

Thanks. This was exactly what I did until I hit another weird issue. I described it here: Trying to assign invalid previously freed instance - #6 by qqsu_azure

I’d like to discuss this a bit further here. The above code is just an example of how I use Callable/lambda. My main concern is that the "any other general checks" (algorithm, you mean?) can be complicated, so they should be reused. Another example from my game:

# General algorithm
func search_target(area: Area3D, prioritizer: Callable) -> Node3D:
	# Tons of logic comparing all overlapping objects in the area by distance and other status.
	# The second parameter lets different callers supply their own priority settings.
	return null # placeholder for the elided logic

# One caller, for example:
func _process(delta: float) -> void:
	var result = Helper.search_target(%SearchArea, func(obj):
		if obj is Weapon:
			return HIGH
		if obj is Storage:
			return LOW
		return PASS
	)

Another caller can have very different prioritizing. If search_target is called frequently in many places, what is the best approach when considering performance? If I use a class instead of a lambda, will the performance be much better? I will probably run some tests when I have time.

class Prioritizer:
	func prioritize(obj: Node3D) -> int:
		return PASS

# The caller will be like:

class MyPrioritizer extends Helper.Prioritizer:
	# ...

func _process(delta: float) -> void:
	var result = Helper.search_target(%SearchArea, MyPrioritizer.new())

(Lambda performance was not a big issue when I worked on ERP software, as none of that code is supposed to be called hundreds of times in a few seconds on a single device. Clearly that’s not the case in the game world.)

gertkeno · February 28, 2025, 12:01am

The general advice is to do as little per frame as possible; _process should be avoided where you can and kept trim.

On the overall topic: yes, Callables are fairly slow in GDScript. Adding type hints helps, and it’s always being improved. Chances are there are other, better ways to fix your performance concerns, but if there aren’t, then you should move critical parts of your code to a GDExtension, which can leverage lower-level optimizations in languages like C++.
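
To put a rough number on the Callable overhead in your own project, something like the sketch below could help. It is only a micro-benchmark under assumptions: the constant `ITERATIONS`, the even-number check, and the variable names are made up for illustration, and the result will vary with Godot version, build type, and hardware.

```gdscript
extends Node

# Hypothetical micro-benchmark; the iteration count and the check are illustrative.
const ITERATIONS := 100_000

func _ready() -> void:
	var values: Array[int] = []
	values.resize(ITERATIONS)
	for i in ITERATIONS:
		values[i] = i

	# Variant A: the check is written inline in the loop.
	var start := Time.get_ticks_usec()
	var hits_inline := 0
	for v in values:
		if v % 2 == 0:
			hits_inline += 1
	var inline_usec := Time.get_ticks_usec() - start

	# Variant B: the same check goes through Callable.call().
	var is_even: Callable = func(v: int) -> bool: return v % 2 == 0
	start = Time.get_ticks_usec()
	var hits_lambda := 0
	for v in values:
		if is_even.call(v):
			hits_lambda += 1
	var lambda_usec := Time.get_ticks_usec() - start

	print("inline: %d hits in %d us" % [hits_inline, inline_usec])
	print("lambda: %d hits in %d us" % [hits_lambda, lambda_usec])
```

Attach it to any node and run the scene a few times; a single run is noisy, so compare several runs before drawing conclusions.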

jesusemora · February 28, 2025, 2:04am

there is a lot to unpack here.

that sounds more like you using arrays wrong, and then instead of fixing the issue switching to lambdas for no reason.

the problem with that sounds very simple, you are doing is_instance_valid(plat) and at the same time calling a variable inside of plat.

have you tried:


if plat:
	if plat.grid_position == pos:
		return plat

but also this looks like redundant code since it’s so simple it should not be its own function.

also you are deleting a node but not removing it from the array, resulting in an empty element that causes problems.
you are supposed to remove the element from the array first and then use the temp to delete the node.
and don’t erase elements from an array while iterating through it. that’s why in my case I created a second array to queue elements that have to be deleted.
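
A minimal sketch of that "queue first, delete later" idea, assuming hypothetical names (`platforms`, `pending_removal`, `mark_for_removal`, `flush_removals`) and assuming nodes are only ever freed through `flush_removals()`:

```gdscript
extends Node

# Hypothetical names; adapt to the real project.
var platforms: Array[Node] = []
var pending_removal: Array[Node] = []

func mark_for_removal(plat: Node) -> void:
	# Only queue the node here; `platforms` is never modified while it is being iterated.
	if not pending_removal.has(plat):
		pending_removal.append(plat)

func flush_removals() -> void:
	# Call this outside of any loop over `platforms`, e.g. at the end of a frame.
	for plat in pending_removal:
		# Drop the array reference first, then free the node.
		platforms.erase(plat)
		plat.queue_free()
	pending_removal.clear()
```

This way the main array never holds a reference to a freed node, which is the situation that leads to the "previously freed instance" errors.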

but also also you want to check if pos is equal inside an element of array. array is a basic data type, I suggest you read a book or two on data structures, because there are better ways to handle this.
one would be by using a dictionary to store the positions with references to the nodes


var installed_platforms : Dictionary = {}

var plat : Platform = installed_platforms.get(pos, null)
if plat:
	#do whatever with the plat
	pass

well I don’t know what you are trying to achieve, if these things can move, but I can tell you two things:
1 - there is a billion different ways to do things and they depend on the features needed. so narrow those down and choose one.
2 - there is a reason why games have constraints, it is to make life easier. one example would be using a grid and having all elements the same size. another would be having platforms that can’t move or occupy the same space, so you can store them by position as in the example.

this is, again, premature optimization. stop.
make a prototype. if there are performance issues, you look into optimization. not the other way around.

don’t consider performance.

also there’s no difference, even if one node can hang and slow down the game, there are ways to fix it without having to rewrite everything.
I know because I did it, not in the best way but I did it.
I used await to pause the execution of code in the node every N “steps”, it then resumed in the next frame. as a result, I was running complex algorithms in the node while the game ran smooth and animations continued.


func eval_move() -> bool:
	if moves_evaluated > max_moves:
		moves_evaluated = 0
		return true
	else:
		moves_evaluated += 1
		return false

func evaluate_next_move(body : ControllableUnit, enemies : Array[ControllableUnit], ai_orders : Dictionary) -> void:
	...
					for h in enemy_moves[k]:
						if eval_move():
							await body.next_frame#wait for next frame

signal next_frame

var is_selected : bool = false
func _physics_process(_delta):
	if is_selected:#only works with AI_turn
		next_frame.emit()

this is just an example of how I optimized the code AFTER making something that was working.

it IS very much the case in game dev, it’s just common sense. don’t call a function thousands of times per frame, unless it’s something very simple or has the proper conditionals to prevent code from being executed more than it should.

I honestly don’t understand how you can get performance issues with this. As I said before, don’t call get_children() in process, prepare an array or something with the nodes that must be removed and remove them when a condition is met like collision with an Area3D.

if this is something like a preview for building, it’s the same thing, but you have a start where things are placed, and a condition which could be the mouse moved, when something is recalculated, and you don’t remove the preview blocks you keep them there and move them.
or a main node with the previews that holds the other blocks and that moves to the position of the mouse every physics frame, and if there is a collision with area3D for connection you enter a connected/disconnected state and use a different way of calculating the final position.

qqsu_azure:

class Prioritizer:
	func prioritize(obj: Node3D) -> int:
		return PASS

# The caller will be like:

class MyPrioritizer extends Helper.Prioritizer:
	# ...

func _process(delta: float) -> void:
	var result = Helper.search_target(%SearchArea, MyPrioritizer.new())

finally, don’t do this. this code is confusing and has no reason to be.
just make a prioritize function.

use inheritance, create a main named class and inherit from it and reuse the common functions.
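
One way this could look, as a sketch only: the class name `TargetSearcher`, the `Priority` enum, and the "highest priority wins" rule are assumptions, since the original search logic was not posted.

```gdscript
# Hypothetical base class; names and priority values are assumptions.
class_name TargetSearcher
extends Node3D

enum Priority { PASS, LOW, HIGH }

# Subclasses override this instead of passing a lambda.
func _prioritize(_obj: Node3D) -> int:
	return Priority.PASS

func search_target(area: Area3D) -> Node3D:
	var best: Node3D = null
	var best_priority: int = Priority.PASS
	for body in area.get_overlapping_bodies():
		var p: int = _prioritize(body)
		# Keep the candidate with the highest priority seen so far.
		if p > best_priority:
			best = body
			best_priority = p
	return best
```

A turret script would then use `extends TargetSearcher` and override `_prioritize()` with its own rules, so the shared search code stays in one place and no object or lambda is created per frame.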

I would say if you can’t fix the performance issues in gdscript, there is something wrong with the way you are doing things and won’t be magically fixed by moving to C++.

gertkeno · February 28, 2025, 2:09am

Yeah when I say “Chances are there are other, better ways to fix your performance” the chances are like 99% and you are totally right. You need a plan to leverage moving to C++, bad code is still the same bad code in another language, but C++ gives you more tools that could help (foot guns aside lol)

qqsu_azure · February 28, 2025, 2:39am

Really appreciate the patient and LONG reply!

I haven’t read up on how await works in GDScript, so I might check later. However, my approach (off the lambda topic) is to use a Timer. Whenever I feel a func has some cost and doesn’t need to run more than ten times per frame, I add a Timer and set a proper wait_time. I prefer this way, but if there are a lot of dependencies on internal state like physics stuff, maybe I have to use your way, otherwise async problems would hit me hard. Correct me if I’m wrong.

The game I’m building is a space platform that the player can build with basic blocks at any time (no specific build screen, think of Minecraft in space) and move/rotate freely in space. Platforms also fight other block-built spaceships. Each spaceship has its own turrets that can shoot at each other, so there’s a lot of target searching/prioritizing and raycasting/hit checking.

I usually turn off v-sync to keep an eye on the FPS, so I know which of my changes dramatically impact performance. That’s how I found the lambda performance issue.

Finally, I do agree with you that we should not jump into C# too quickly. As a long-time Java developer, I know most general performance issues are caused by doing things wrong rather than by the language.
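
For reference, the Timer approach described above could be sketched like this. The names `search_interval` and `_on_search_timer_timeout` are hypothetical, and running the timer on the physics step is just one option that might help when the callback reads physics state:

```gdscript
extends Node3D

# Hypothetical setup; the interval and callback name are illustrative.
@export var search_interval := 0.1 # seconds between searches

func _ready() -> void:
	var timer := Timer.new()
	timer.wait_time = search_interval
	timer.one_shot = false
	timer.autostart = true
	# Tick during the physics step so the callback runs alongside physics updates.
	timer.process_callback = Timer.TIMER_PROCESS_PHYSICS
	add_child(timer)
	timer.timeout.connect(_on_search_timer_timeout)

func _on_search_timer_timeout() -> void:
	# Run the expensive search here instead of in _process().
	pass
```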

Thanks again.