Do signal connections prevent RefCounted garbage collection?

Godot Version

4.4.1

Question

Will the garbage collector free RefCounted objects from memory if they are only referenced via a signal?

Imagine I have the following two classes:

class_name Listener
extends Node

func _ready() -> void:
    DataStructure.new(_on_notify)

func _on_notify() -> void:
    pass

class_name DataStructure
extends RefCounted

signal notify

func _init(new_callable : Callable) -> void:
    notify.connect(new_callable)

Now imagine I create a new Listener, i.e.:

var temp = Listener.new()

The Listener creates a new DataStructure object in _ready() (once it has been added to the scene tree), but keeps no internal reference to the DataStructure itself. Is the connection between the signal DataStructure.notify and the callable Listener._on_notify enough to prevent the DataStructure from being freed automatically? I assume the DataStructure must either call:

queue_free()

or otherwise disconnect the signals for it to be freed in memory.

I actually wrote some code to test this out, so I ended up answering my own question. But since the answer wasn’t what I was expecting (or hoping for), I figured I would leave this here in case anyone has the same issue.

I expected the internal signal reference to be enough to prevent the object from being freed, however IT IS NOT enough.

Even if an object’s signal is connected somewhere, if there are no other references to the object, it will be freed, and the signals will be automatically disconnected.
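One way to check this yourself is sketched below, assuming the script is attached to a node in a running scene. It holds a WeakRef to the object, drops the only strong reference while the signal is still connected, and then asks whether the WeakRef still resolves. The inner class Emitter and the variable names are made up for the test.

```gdscript
extends Node

# Hypothetical stand-in for DataStructure: a RefCounted that owns a signal.
class Emitter extends RefCounted:
    signal notify


func _on_notify() -> void:
    pass


func _ready() -> void:
    var emitter := Emitter.new()
    emitter.notify.connect(_on_notify)

    # A WeakRef observes the object without adding to its reference count.
    var watcher = weakref(emitter)

    # Drop the only strong reference while the signal is still connected.
    emitter = null

    # If the connection kept the object alive, get_ref() would still return it.
    print("still alive: ", watcher.get_ref() != null)
```

In line with the result described above, this should report that the object is gone.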

A workaround I found is to manually call the reference() and unreference() methods inherited from RefCounted to keep the object from being freed, i.e. the class bumps its own reference count when a signal is connected and drops it again when the signal is disconnected.

Example:

class_name TestClass
extends RefCounted

signal notify

func connect_signal(_to_callable : Callable) -> void:
    notify.connect(_to_callable)
    reference()  # manually add one to the reference count

func disconnect_signal(_to_callable : Callable) -> void:
    notify.disconnect(_to_callable)
    unreference()  # manually take it back

This honestly sounds like a design issue to me, to which you’re coming up with workarounds. If Listener is creating DataStructures, Listener knows about DataStructures, so why doesn’t Listener simply keep a reference to the data structure it needs?

What you’re basically saying is you want to create a thing, throw away that thing but still let people use the thing. Realistically speaking: if your application has no reference to a thing, how would that thing be used?

There’s no garbage collector in Godot; it uses reference counting. If you don’t keep a reference alive, the object will be freed when it leaves scope. A signal connection is not considered a reference.

I’m with @makaboi; I don’t see the usefulness of your example. How are you going to emit() the notify signal?

This is a good thing!

You could just keep a reference to the object; that is much less hacky. Put it into an array and you are done 🙂 It’s very strange that you are creating objects and not keeping a reference to them anyway.
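As a rough sketch of that suggestion, reusing the DataStructure class from the question: the Listener stores each object it creates in a member array, so the reference count stays above zero for as long as the entry exists. The member name _structures and the release() helper are hypothetical.

```gdscript
class_name Listener
extends Node

# Holding the object in a member keeps its reference count above zero
# for as long as this Listener keeps the entry.
var _structures: Array[DataStructure] = []


func _ready() -> void:
    var structure := DataStructure.new(_on_notify)
    _structures.append(structure)


func _on_notify() -> void:
    pass


# Dropping the entry releases the reference; if it was the last one,
# the DataStructure is freed and its connections go away with it.
func release(structure: DataStructure) -> void:
    _structures.erase(structure)
```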

So I’m actually using this as part of a state machine.

The way it works is that I have a stack of type

Array[State]

Where individual states are script instances that manipulate an injected slave. Calls propagate from the bottom to the top of the stack.

Individual states are responsible for pushing the next state onto the stack. The state machine has no idea what states could potentially end up in the stack. The scripts are initially loaded from the disk, but I want to keep them in memory so I can just duplicate them after they’ve been loaded once.

However, I can’t keep them in the stack because every state in the stack is treated as an active state.

Hence, what I really would like to do is pop them from the stack, keep them in memory, and just push them back onto the stack as needed.

Hence the usefulness of the signals, which I already use to push and pop the stack.

The states have a

signal pop
signal push(state : State) 

Some people think that’s dirty, so I could inject a callable instead, i.e.:

var pop : Callable
var push : Callable

but I do like that signal arguments can be typed.

I was messing around with having State inherit from Resource so I wouldn’t have to worry about the memory nonsense, but then you run into the issue of circular dependencies since some States may reference each other.

There are some posts about resolving the circular dependency issue; there might be a patch that ships with the next version of Godot. An alternative solution is to use UIDs, but I think it’s kind of gross.

These were dummy classes for the question. If you want you can see my reply to the other user to see why I was asking this question.

Your explanation still includes no reason for why you cannot just keep a reference to the thing around. You’re essentially building a refcounting mechanism on top of the existing one for no reason (is my opinion). But, I won’t insist.

Just, based on what I’m reading…

On game load, load the scripts into a dictionary with load or preload and hand them out as needed.

You say you’re duplicating something, which makes it sound like these things are not mutable, and exist only to provide logic by acting on the state of another object. In this case there’s no need to duplicate anything; you can simply stack however many references to the same thing on your stack.
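A minimal sketch of the dictionary idea, assuming the state scripts take no arguments in _init() and that every name here (the cache, get_state_script, make_state) is hypothetical:

```gdscript
extends Node

# Hypothetical cache: script path -> loaded GDScript resource.
var _state_scripts: Dictionary = {}


# Loads the script the first time it is requested, then reuses the cached one.
func get_state_script(path: String) -> GDScript:
    if not _state_scripts.has(path):
        _state_scripts[path] = load(path)
    return _state_scripts[path]


# Creates a fresh state instance from the cached script.
func make_state(path: String) -> Object:
    return get_state_script(path).new()
```

Because the dictionary holds a reference to each script, the scripts stay in memory without being on the state stack.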

But still, I don’t understand… How do you duplicate something you don’t have a reference to? Oh well.

Whilst the process of ref counting is different to classic GC, the effect is the same: unreferenced items are removed from memory, which can lead to the same issues (e.g. frame drops).

So whilst GDScript may not have a GC, it is in effect doing the same thing, with potentially the same drawbacks.

As a side note, if you are using C#, then GC will still be a factor.

You’re mistaken. There are references. They are connected by signals.

If you call

Signal.get_connections()

Each dictionary in the resulting array contains the connected Callable, which provides both the object reference and the method name.
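For illustration, a small sketch of reading that information back, assuming a script attached to a node that connects its own signal to one of its own methods:

```gdscript
extends Node

signal notify


func _on_notify() -> void:
    pass


func _ready() -> void:
    notify.connect(_on_notify)

    # Each connection is a Dictionary with "signal", "callable" and "flags" keys.
    for connection in notify.get_connections():
        var callable: Callable = connection["callable"]
        print(callable.get_object(), " -> ", callable.get_method())
```

Note that this listing records where each connection points; on its own it says nothing about whether the emitting object's reference count was incremented.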

Also, I’m not duplicating anything directly. I call the

load()

function, which duplicates a resource only if it already exists in memory somewhere.

I was not really looking for particular “solutions” to my niche application, I was just looking to have my question answered.

If you just want to tell me that I’m wrong then I’d be happy to send my scripts over and you can scrutinize them with a magnifying glass and tell me everything you would do differently. As it is, you haven’t even seen my code so I really don’t know how you’re making these assumptions. If ragebait, well done.

Also I already explained why I don’t want to push them into the stack you absolute… Nice person. Ever heard of DRY? Don’t repeat yourself. Duplicate references = bad. Although in my case the issue actually goes beyond that.. not that you would know, you haven’t seen the code 🙂

Cool post. You could’ve easily just made a better design and avoided all your raging and namecalling, or not post on a forum at all.

ok. Cool ragebait.

Reference counting doesn’t deal well with cyclic references, and signals would set up cyclic references. I don’t know for sure that’s the reason, but I’d strongly suspect that signals are treated as weak to avoid generating reference cycles.

Consider: A signals B, B signals A:

Let’s say we free(A) and free(B) at the same time. A still has a signal connected to B, so that keeps B’s refcount from dropping to zero. B still has a signal connection to A, so that keeps A’s refcount from dropping to zero.

Remember, there’s nothing magic in free() that can intuit that you really mean it this time, don’t just decrement the refcount, really kill this thing. It can’t tell. All it knows is, if refcount > 0 then keep the thing alive.

Both of them are orphaned; nothing is looking at them any more, they just look at each other, a cycle adrift in the void, but the refcount system has no way of knowing that because all it knows is that for each of them refcount != 0. So they leak, and hang around as zombie objects eating your computer’s brains.

The same thing happens (more slowly) if you free one first, and the other later.

The hacky awful fix for this is weak references; making it so that some forms of reference don’t affect the reference count, so cycles can be broken.
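A sketch of what that looks like in GDScript, using Godot's WeakRef class. The Parent and Child inner classes are hypothetical: the parent holds the child strongly, while the child only points back through a WeakRef, so the pair does not form a counted cycle.

```gdscript
extends Node

# Hypothetical pair: the parent owns the child strongly,
# the child only points back through a WeakRef.
class Child extends RefCounted:
    var parent_ref: WeakRef

class Parent extends RefCounted:
    var child: Child


func _ready() -> void:
    var parent := Parent.new()
    parent.child = Child.new()
    parent.child.parent_ref = weakref(parent)

    var watcher = weakref(parent.child)

    # The back-link does not count, so dropping the parent releases both objects.
    parent = null
    print("child still alive: ", watcher.get_ref() != null)
```

If parent_ref were a plain strong reference instead, each object would keep the other's count at one and neither would be released when parent is set to null.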

The more robust fix is tracing GC. Tracing garbage collection has no problem with cycles. It is, however, difficult to implement in a way that doesn’t result in frame rate hiccups. Since most people are pretty unforgiving of their game stuttering, tracing GC is a bit of a rarity in game engines.

I appreciate you taking the time to think about this seriously, but I guess the thing that is throwing me off is that I didn’t intend to suggest that this is a complete or even a good solution.

My reply was more a general statement about “a possible solution could potentially be implemented using these function calls..”

If I were to fully implement my “workaround”, I could possibly override free() to make sure that all signals are disconnected.

As long as you manage the references properly, there’s no reason that the references must be cyclic. Even if they are called “simultaneously”, they are actually sequential, as you already know. You could use if conditions or probably many other conditional checks to solve the problem.

The main point is to make sure that there are no references and that the internal reference count is zero when you ultimately free() and completely dereference the object so that there is no memory leak.

If you know that you are going to be manually using signals to keep an object referenced via its internal reference counter, then you also know you need to manually deal with that reference counter when you’re finally done with the object and ready to free it.

I didn’t know these were called “weak references” (I don’t see that in the documentation so I’m curious whether this is a term you’ve borrowed from somewhere else!) but I originally just had a technical question about whether weak references would keep an object in memory or not, and instead I was getting criticism for what was essentially pseudocode designed only to communicate a question!

If we were writing in C, nobody would be especially warning me about the dangers of memory leaks. It’s just expected that you have to deal with it!

In any case, thanks for sharing your perspective! I had a friend who suggested a more node-based approach to States, as I wasn’t aware of the significant performance cost that custom functions have relative to built-in node functions (due to the latter being implemented in C++ and the former being interpreted at runtime).

Intuitively I assumed that Scripts would be cheaper since they inherit more directly from RefCounted, but I was mistaken!

Reference counting goes back a long way, and academic papers with it; at one point it was considered as a garbage collection method in some versions of LISP (and may still be; the LISP family tree is vast…). Reference counting’s inability to deal with cycles was identified relatively early, and most LISP stayed with (or went back to) full tracing GC.

I’m not sure if weak references were invented for reference counting; some languages have a similar idea of weak pointers, and in program linking there is a concept of weak symbols, which are somewhat related. The term “weak reference” is old, though, as is the cycle problem it tries to solve.

For a more recent example, consider:

Reference counting (or garbage collection) is typically added to a language to allow the programmer to think less about memory management, and to reduce memory management bugs; LISP (with GC) was a “memory safe” language decades before anyone thought of Rust, and lots of other languages have tried to offer the same; the JVM languages and .Net are at least in part an attempt to build programming environments where memory safety isn’t a primary concern. Ideally, this uses full GC.

Full GC, however, is somewhat difficult to make work well in a high-performance interactive environment; it’s relatively easy to implement if you can afford occasional hard pauses while the GC system cleans up memory, but amortizing that work over time so it doesn’t cause latency spikes is difficult.

Reference counting is much cheaper to implement, nearly as cheap as manual memory management (but not quite…) but it has flaws. Those flaws have to be manually worked around by the programmer, which isn’t ideal, but the idea is that it’s less work than fully manual memory management like in C.

I appreciate you for contextualizing this.

I’m reminded that I need to be more careful when making objective/declarative statements.

I don’t precisely understand what you’re suggesting when you say that the RefCounting is cyclic, or more specifically how that would be an issue in this situation. I have the most experience writing code in Java, which has its own GC, so it’s true I never had to worry about memory management. I’m guessing this “cyclic” problem is some kind of provable mathematical truth that I haven’t quite wrapped my head around.

This confusion might be related to my having misspoken or poorly communicated what I was trying to accomplish.

I don’t know how Godot’s

load()

function is implemented internally, but according to the documentation it will somehow be able to either clone() or otherwise retrieve the packed scene/resource in memory if it exists.

Assuming this works correctly, I don’t actually need a reference to the object at all.
I just need it to exist in memory, so that I don’t have to retrieve the script from disk every time.

When I want to clone it, I just call

load("some_resource.gd")

and because it’s instanced in memory somewhere, it clones it instead of getting it from the disk.
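As far as the ResourceLoader documentation describes it, load() with the default cache mode hands back the cached resource itself rather than a copy, and the cached entry only lasts while something still holds a reference to it. A small sketch to check this, assuming a hypothetical script path that would need to exist in the project:

```gdscript
extends Node

# Hypothetical path; replace with a script that exists in the project.
const STATE_PATH := "res://states/some_state.gd"


func _ready() -> void:
    var first: Resource = load(STATE_PATH)
    var second: Resource = load(STATE_PATH)

    # With the default cache mode, both variables should refer to one instance.
    print("same instance: ", first == second)
    print("cached: ", ResourceLoader.has_cached(STATE_PATH))
```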

That object could then have some kind of “kill signal” that it listens for so that when the data structure using it as a template to copy from doesn’t need it anymore, it can

func _on_kill() -> void:
    self.free()

Again, I’m not suggesting this would be an ideal solution. I would hope that with some careful thought I could think of something better.

Even so, I hope this clears up my imagined use case. I don’t want an explicit reference because the idea is that I don’t know what objects I will need at compile time. I could also just feed them into a dictionary but that seems like a waste of processing, while admittedly being relatively cheap.

At the end of the day it’s kind of moot since I went with a different approach anyway, but I do think this is technically interesting.

That’s exactly the sort of situation reference counting is for; conceptually, it looks something like:

func load(path: String) -> Variant:
    var thing: Variant

    if is_in_cache(path):
        thing = get_from_cache(path)
    else:
        thing = really_load(path)

    thing.refcount += 1
    return thing

func free(thing) -> void:
    thing.refcount -= 1
    if thing.refcount < 1:
        really_free(thing)

That’s not enough, though; if you do something like:

bar.thing = foo.thing

Then you need to bump the refcount on thing so that if foo gets deleted with its children, bar doesn’t wind up pointing at a stale value. Likewise, if you point bar.thing at something else, it needs to decrement the refcount of what it was pointing at before it got changed. All this stuff typically gets added under the hood in things like the assignment operator, variable declaration and scope closing so you don’t have to deal with it yourself.
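In GDScript this bookkeeping can be observed with RefCounted.get_reference_count(). A minimal sketch, assuming a script attached to a node; the variable names foo and bar just mirror the example above:

```gdscript
extends Node


func _ready() -> void:
    var foo := RefCounted.new()
    print(foo.get_reference_count())  # count with one variable holding it

    # Assigning to a second variable is expected to bump the count.
    var bar := foo
    print(foo.get_reference_count())

    # Pointing bar elsewhere is expected to bring the count back down.
    bar = null
    print(foo.get_reference_count())
```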

The cyclic problem happens because of this; if you have two objects referring to each other (that is, a cycle, in the graph sense), the refcounts will not reach zero unless the programmer takes explicit action to force it, or unless there’s something like a weak reference to break the cycle.

You can have far more complex cycles; A refers to B which refers to C… and Z refers back to A. The problem is that as long as A is alive, it keeps the refcount of everything it’s looking at to at least 1. If it’s looking at something that’s looking back at it, A’s refcount will never go to zero either, so the cycle is mutually immortal.

This is a solvable problem, but it turns out that you solve it by doing cycle detection and reference tracing, and… now it’s full GC or close enough, with all the costs associated.

Reference counting can be incredibly lightweight; at the simplest it’s just:

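#include <stdint.h>

/* A pointer bundled with the number of live references to it. */
typedef struct
{
  void    *data;     /* the object being shared */
  uint64_t refcount; /* free data when this reaches zero */
} REFCOUNTED_POINTER;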