This would require to loop over the whole array anyways, thats why dictionaries are preferred, but if you want to use arrays that might be the best solution
As others already noted, the best would be to just use a dictionary as a set.
If you’re stuck with the array, removing duplicates using a temp dictionary is probably the fastest way to do it from script:
func array_remove_duplicates(array: Array) -> Array:
var output := {}
for element in array:
if not element in output:
output[element] = true
return output.keys()
For context, I’m adding/joining multiple arrays, so I wont’ be able to always guarantee I will avoid duplicates.
Possibly the easiest approach would be to perform the check within the joining function and call it done.
I just hoped there was an easy in-built to spare me that part
Is using a dictionary faster than just using .has():
func array_remove_dupes(array:Array)->Array:
var output :Array = []
for element in array:
if !output.has(element):
output.append(element)
return output
How about using filter?
func array_append_no_dupes(array1:Array, array2:Array)->Array:
var output :Array = []
output = array2.filter(func(n):
if !array1.has(n):
return n)
return array1 + output
Yes. Should be considerably faster. has() is a linear search. It iterates the whole array. The way you’re using it will perform close to O(n²), while dictionary will do it close to O(n).
Yes it is. When optimizing, always measure. Here is a simple test program for testing @sancho2’s, @normalized’s and my unique array functions. Results:
unique_array_normalized: 160.998 ms
unique_array_dizzycaterpillar: 119.757 ms
unique_array_sancho2: 2105.69 ms
I’m quite shocked that @normalized made a newbie mistake of unnecessarily checking if element already exists in the result dictionary
func _ready() -> void:
const COUNT := 2000
const SIZE := 1000
var test_array: Array[int]
test_array.resize(SIZE)
for i in SIZE:
test_array[i] = randi()
if true:
var t0 = Time.get_ticks_usec()
for i in COUNT:
unique_array_normalized(test_array)
var t1 = Time.get_ticks_usec()
prints("unique_array_normalized: ", (t1 - t0) / 1000.0, "ms")
if true:
var t0 = Time.get_ticks_usec()
for i in COUNT:
unique_array_dizzycaterpillar(test_array)
var t1 = Time.get_ticks_usec()
prints("unique_array_dizzycaterpillar: ", (t1 - t0) / 1000.0, "ms")
if true:
var t0 = Time.get_ticks_usec()
for i in COUNT:
unique_array_sancho2(test_array)
var t1 = Time.get_ticks_usec()
prints("unique_array_sancho2: ", (t1 - t0) / 1000.0, "ms")
func unique_array_normalized(array: Array) -> Array:
var output := {}
for element in array:
if not element in output:
output[element] = true
return output.keys()
func unique_array_dizzycaterpillar(array: Array) -> Array:
var output := {}
for element in array:
output[element] = true
return output.keys()
func unique_array_sancho2(array:Array)->Array:
var output :Array = []
for element in array:
if !output.has(element):
output.append(element)
return output
And I made a newbie mistake with the test data. In the previous test the array didn’t have any unique values. Here is better test data and the test results:
const COUNT := 4000
const SIZE := 2000
var test_array: Array[int]
test_array.resize(SIZE)
for i in SIZE:
test_array[i] = randi_range(0, 100)
unique_array_normalized: 205.98 ms
unique_array_dizzycaterpillar: 184.523 ms
unique_array_sancho2: 1188.271 ms
A good performance test would contain multiple test arrays, at least: