Data-oriented design, Object-oriented programming, and GDScript

Edit1: Removed some unused lines of code

Array of Classes vs Class of Arrays

What is this?

The title would in other cases be Array of Structs vs Struct of Arrays but since GDscript don’t have structs I use class in this benchmark. But as I write this, I realize using a Resource would probably be closer to a struct and probably faster too, - but I’ll leave that benchmark for another time as this benchmark could still be useful for those who needs to use classes over resources for any reason.

There are some examples of this online if you want to read or watch some videos on this topic, but to give a brief overview of this concept:
Instead of having an array holding large objects which consists of many members that needs to be looked through in memory even though you only want to extract one member variable from that object, - you create multiple arrays where each array element points to the same individual member variable (or individual components of member variables) across many objects, - for instance the Position property of multiple Node2D scene objects. The main benefit of this concept is speed, and it is particularly useful when batching many objects in a for loop or similar.

In this example I use arrays that points to the properties of the objects, rather than breaking down the data further for simplicity sake. There may or may not be further benefits of taking this further.

Benchmark

Overview

This benchmark instantiates scene node objects in the amount of the property ‘length’. Benchmark_1 goes through an array ant_objects which holds Node2D objects and adjusts each object’s position.y. Benchmark_2 goes through an array objects within a Termite object which holds the position properties of many Node2D objects and adjusts each property’s (position.)y element.

I’ve tried keeping the syntax similar in benchmark_1 and benchmark_2, but there is another for loop commented out if you want to switch it and see if there is any difference yourself.

Settings

Benchmark cycles: 100
Array length: 1000

In other words, there are 1000 objects in each array that gets processed each time benchmark_1 and _2 runs, - and this is repeated 100 times. The results shown in ms is the sum of running benchmark_1 and benchmark_2 100 times, so the actual processing time is somewhere around 100 times shorter.

Conclusion

The benchmark is run 10 times to find an average.

  1. benchmark_1: 27, benchmark_2: 8, difference:19
  2. benchmark_1: 25, benchmark_2: 10, difference:15
  3. benchmark_1: 29, benchmark_2: 7, difference:22
  4. benchmark_1: 27, benchmark_2: 10, difference:17
  5. benchmark_1: 32, benchmark_2: 7, difference:25
  6. benchmark_1: 28, benchmark_2: 8, difference:20
  7. benchmark_1: 29, benchmark_2: 9, difference:20
  8. benchmark_1: 28, benchmark_2: 10, difference:18
  9. benchmark_1: 31, benchmark_2: 6, difference:25
  10. benchmark_1: 30, benchmark_2: 8, difference:22

(sum of difference) / amount of benchmark runs =
203 / 10 = 20.3ms.

In summary, the class of array benchmark is somewhere around 3 times faster than the array of classes.

Code

extends Node2D

@export var benchmark_cycles: int = 100
@export var length: int = 1_000

var benchmark_1_results: Dictionary
var benchmark_2_results: Dictionary
var termites: Termites
var node_termites: Node2D
var node_ants: Node2D
var ant_objects: Array


func _ready() -> void:
	termites = Termites.new()
	node_termites = Node2D.new()
	node_ants = Node2D.new()
	add_child(node_termites)
	add_child(node_ants)

	for i in length: 
		var ant: Node2D = Node2D.new()
		node_ants.add_child(ant)
	ant_objects = node_ants.get_children()

	for i in length: 
		var termite: Node2D = Node2D.new()
		node_termites.add_child(termite)
		termites.positions.append(termite.position)
	termites.number_of_termites = length

	for i in benchmark_cycles: 
		benchmark_1_results[i] = benchmark_1()
		benchmark_2_results[i] = benchmark_2()

	measure_results()


func measure_results() -> void:
	var sum_1: float = 0.0
	var sum_2: float = 0.0
	
	for key in benchmark_1_results:
		sum_1 += benchmark_1_results[key]
	for key in benchmark_2_results:
		sum_2 += benchmark_2_results[key]
	
	print("benchmark_1: " + str(sum_1))
	print("benchmark_2: " + str(sum_2))
	print("difference:" + str(sum_1 - sum_2))


func benchmark_1() -> float: 
	var time_start_benchmark: float = Time.get_ticks_msec()
	#for ant in ant_objects:
		#ant.position.y = 10
	for i in ant_objects.size():
		ant_objects[i].position.y = 10
	return Time.get_ticks_msec() - time_start_benchmark


func benchmark_2() -> float: 
	var time_start_benchmark: float = Time.get_ticks_msec()
	for i in termites.number_of_termites: 
		termites.positions[i].y = 20
	return Time.get_ticks_msec() - time_start_benchmark


class Termites:
	var positions: Array[Vector2]
	var number_of_termites: int

2 Likes