Why Scenes and Nodes in Engine Architecture

:information_source: Attention Topic was automatically imported from the old Question2Answer platform.
:bust_in_silhouette: Asked By kewp
:warning: Old Version Published before Godot 3 was released.

This might sound stupid or obvious. Even so, I’d love to hear an explanation.

I started reading the tutorials and docs to get an understanding of the Godot Game Engine and how it works, but they start off with Scenes and Nodes and don’t explain why it has Scenes and Nodes in the first place.

What is the point of a Scene ? Is it … something that can be loaded all in memory at once ? How would you determine what parts of your game belong in which scene ?

Ok, and Nodes - I guess this comes from the idea of an Object, right ? It looks a lot like Unity. Make everything an Object. Ok, but why ? Why does everything need to be an object ? Is it just so that it can be referenced ? Is that just a way of “importing” things into your game ?

I hope this isn’t coming off too vague and strange. My point is, I’ve been programming for a long time and so coming to a new system I want to know - what problem are we trying to solve ? I get that games can be 2d or 3d, and they involve graphics and sounds, and you pretty much always have some kind of event or frame loop (btw where is this discussed for Godot?) …

Maybe this is a stupid question. I dunno. Right in the start of the docs it says “It’s like a recipe ! You need ingredients !”. I’m just coming from a programmer perspective where you have some API into graphics and audio. And here there is this IDE and a node tree … Just hoping someone can get me to understand why this solution is General and Useful …


The whole cooking and ingredients analogy didn’t make much sense to me either. I just glossed over it, said “meh… whatever” and went on with the rest of the docs.

First of all, 99.99% of engines out there use an object-oriented paradigm. Reference counting is just one benefit. Reflection is a benefit. Instancing is another benefit.

I’ve worked with a few engines (well, more like frameworks) lacking these features in the past and it got really ugly real fast.
Sure you could get a “hello world” up and running fast, but beyond that… God help your soul…

It would be great if someone could disrupt the mold with a more efficient, revolutionary paradigm but until that day comes this is what we have.

Back to the scenes: think of each scene as a “tree” made up of “sub-trees” of objects rather than Unity’s concept of a scene.
Not only can you reuse any “sub-tree” in other scenes, but independently, as an actual scene itself.
Personally, I prefer this approach over Unity’s prefabs.

Still, if you’d rather work 2000’s style there’s a somewhat modern library (without ANY fancy features) called raylib.
GameFromScratch covered it a while ago (“raylib 1.6 released”, GameFromScratch.com).
It’s also open source with a BSD-style license (GitHub: raysan5/raylib, “A simple and easy-to-use library to enjoy videogames programming”).

But believe me when I say that once you’ve seen just how much these newer engines can help, you won’t want to go back.

eye776 | 2017-02-25 21:16

Thanks for considering my question.

You know, I got excited about Godot because I felt an itch. I wanted to create something awesome, some new tech or new experience. It seemed like Godot’s creators had the same idea, but when I started reading about it the first thing I wanted to know was ‘What is this?’. And the answer that came back was ‘Game Engine’, which doesn’t really answer the question. I think that because, as you say, most of these ‘game engines’, whatever that means, use the same approach, they all kinda assume this is how you solve the problem (which I think is something like ‘how do you make games better’). I would love to hear what approaches one could take, and where Godot falls into that.

Raylib looks very cool. Thanks for the reference. Def gonna check it out.

kewp | 2017-02-26 11:07

:bust_in_silhouette: Reply From: avencherus

I didn’t find scenes to be terribly descriptive either, nor was that cooking analogy in the tutorials very useful. But they are just containers of nodes that exist to be instanced. They can be, and generally are, used as a game scene, but they can also just contain an item or character and all of its objects. Scenes can also be instanced inside other scenes, giving you a variety of organizational options.

My advice is to just start making small projects with them, and their functionality will become clear.

More directly, the game operates with a scene tree, with a viewport on it that isn’t visible in the editor panel. Then scenes can be loaded and unloaded on to this scene tree. A scene then is a collection of nodes in a hierarchy. The nodes are more directly comparable to classes or objects. With nodes come all your functionality.

If you’re using a Node2D and follow its inheritance, it is ultimately a CanvasItem, giving you a 2D canvas to operate on. Spatial nodes give you a 3D space.

For me, it felt natural, and I quickly got the idea of scenes as a representation of objects ordered in a tree.

This part, about design, may be a bit less vague than the actual explanation

eons | 2017-02-25 00:52

Thanks for your answer. I think I’ll do what you suggest and play around with it.

I like what eons said about objects being ordered in a tree. That was my initial assumption about the design. My thoughts really were why do things that way - why must everything be objects. I suppose it makes creating an IDE easy: it allows you to create reusable, drag-and-drop components. Which I suppose makes for a solidly (simply) composable system …

kewp | 2017-02-25 06:01

Apologies, yes - I just read your answer again. The part you refer to is exactly what clarified it for me. I’ve answered what I think it amounts to above.

kewp | 2017-02-25 08:15

:bust_in_silhouette: Reply From: kewp

I’ve been reading the docs, and they do try to lay this out in the 3rd part of the step-by-step tutorial, where it talks about instancing: http://docs.godotengine.org/en/latest/tutorials/step_by_step/instancing_continued.html

The basic idea, I think, is that Scenes are the logical elements of your game (Player, HUD, Enemy) and Nodes are the atomic building blocks.

So you sort of follow a domain-driven design approach. I can’t find a good definition online, but we used it at my previous company - break everything down based on the words you would use to describe it to a user. “The player runs across the terrain into the building and picks up some ammo”. Each of these is then made into a Scene (which is such a misleading word…) which is pulled in through Instancing.

What do you guys think ?

No offense, but that isn’t “sort of following a domain driven design approach”, that is object orientation.

Warlaan | 2017-02-25 08:38

I’ll write a longer answer later (haven’t been near the laptop so far). This is the short answer: what you see in the godot editor is a visual declarative object-oriented domain-specific language, so scenes are basically classes. They aren’t called classes though because in addition to being classes they have the properties that are typical for objects in a game engine. More details later, smartphone keyboards suck. :wink:

Warlaan | 2017-02-25 09:10

:bust_in_silhouette: Reply From: Warlaan

There are different ways to answer that question, so I’ll just pick one and start from there: “I’m just coming from a programmer perspective where you have some API into graphics and audio.”
The kind of APIs you are used to often refer to themselves as “frameworks” to make a distinction from engines like Godot or Unity.

So what’s the difference between frameworks and engines and what does that have to do with having to use objects?

The difference is that frameworks facilitate typical workflows like creating a window, loading a mesh or texture from diverse file formats, playing audio etc. That helps a lot in getting assets on screen, but it doesn’t do much to improve performance, because it basically just wraps the calls you would be making if you were using DirectX or other more low-level APIs.
Engines on the other hand improve performance by making assumptions about the kind of game you want to create. The more specific these assumptions are the less viable an engine becomes for different kinds of games, but the easier it is to add performance enhancing algorithms that make use of the fact that they can expect a certain kind of scene.

That was a bit theoretical so what does this mean for Godot?

Pretty much any mainstream engine today uses one such assumption, and that is that your game is made of “objects” (sometimes called GameObjects, Entities, Nodes etc.)
That means that every element of your game is expected to have a position and a size. This is the most important assumption you can make, because it allows the engine developers to implement so-called “culling” mechanisms. Culling means that objects that aren’t visible / can’t be heard / can’t affect physics / won’t interest other players on the network are removed from the output stream that is sent to the graphics card / audio device / physics engine / network server.
So if everything is an object and everything has a position and a size it’s easy to detect whether or not the bounding box of an object intersects with the area that is visible from the camera. Anything that doesn’t is not relevant for the graphical output and is not sent to the graphics card.
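To make the bounding box check above concrete, here is a minimal sketch in plain Python. It is not Godot code: `Rect`, `GameObject` and `visible_objects` are hypothetical names made up for this illustration, and a real engine would use far more elaborate data structures. It only shows that once every object has a position and a size, deciding what to skip is a simple rectangle overlap test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: a position (x, y) plus a size (width, height)."""
    x: float
    y: float
    width: float
    height: float

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles overlap only if they overlap on both axes.
        return (
            self.x < other.x + other.width
            and other.x < self.x + self.width
            and self.y < other.y + other.height
            and other.y < self.y + self.height
        )


@dataclass(frozen=True)
class GameObject:
    """Hypothetical game object: just a name and a bounding box."""
    name: str
    bounds: Rect


def visible_objects(objects, camera_view):
    """Keep only the objects whose bounding box touches the camera's view."""
    return [obj for obj in objects if obj.bounds.intersects(camera_view)]


camera_view = Rect(0, 0, 800, 600)
objects = [
    GameObject("player", Rect(100, 100, 32, 32)),
    GameObject("far_away_tree", Rect(5000, 100, 64, 128)),
    GameObject("half_on_screen_rock", Rect(790, 590, 40, 40)),
]

visible_names = [obj.name for obj in visible_objects(objects, camera_view)]
print(visible_names)  # ['player', 'half_on_screen_rock']
```

Only the objects in `visible_names` would be handed on to the renderer in this sketch; the far away tree is dropped before any drawing work is done.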

So that explains “objects”, but why “nodes”?

Let’s say in your game there is a ship. On the ship there are sailors. In your game the sailors can’t leave the ship, so it’s safe to assume that as long as you can’t see the ship you won’t need to check if you can see any of the sailors; you can just assume that all of them are invisible too. That’s how scene graphs work. Objects that are expected to move together are structured in a tree, so that the tree can be culled beginning with the root node and ending with the leaf nodes, and if the root node of a branch is invisible (inaudible, not relevant for physics etc.), none of its child nodes need to be considered. (Disclaimer: it’s a little more complicated than that, but the details won’t help understand the basic idea.)
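The ship and sailors idea can be sketched like this, again in plain Python rather than anything from Godot. `SceneNode`, `boxes_overlap` and `collect_visible` are hypothetical names, and the sketch assumes that a parent's bounding box always encloses its children, which is exactly the simplification the disclaimer above hints at.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def boxes_overlap(a: Box, b: Box) -> bool:
    """Axis-aligned overlap test for two boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


@dataclass
class SceneNode:
    """Hypothetical tree node; bounds are assumed to enclose all children."""
    name: str
    bounds: Box
    children: List["SceneNode"] = field(default_factory=list)


def collect_visible(node, camera_view, visible, tested):
    """Walk the tree from the root; skip a whole branch if its root is off screen."""
    tested.append(node.name)
    if not boxes_overlap(node.bounds, camera_view):
        return  # children are never looked at
    visible.append(node.name)
    for child in node.children:
        collect_visible(child, camera_view, visible, tested)


ship = SceneNode("ship", (2000, 0, 300, 100), [
    SceneNode("sailor_1", (2010, 10, 10, 20)),
    SceneNode("sailor_2", (2100, 10, 10, 20)),
])
island = SceneNode("island", (100, 100, 400, 300), [
    SceneNode("palm_tree", (150, 150, 30, 80)),
])
world = SceneNode("world", (0, 0, 3000, 1000), [ship, island])

visible, tested = [], []
collect_visible(world, (0, 0, 800, 600), visible, tested)
print(visible)  # ['world', 'island', 'palm_tree']
print(tested)   # ['world', 'ship', 'island', 'palm_tree']
```

The sailors never show up in `tested`: once the ship is found to be off screen, its whole branch is skipped with a single check.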

That explains “nodes”, but why make a distinction between “nodes” and “scenes”?

The most intuitive way to write a program is to use the paradigm of “procedural programming”. A procedure is a sequence of commands that alter the state of its environment, like a cooking recipe. A cooking recipe won’t tell you to get a stove first, it will just assume that there is one available. It’s not written in a way that makes sure that you can perform multiple recipes at the same time in the same kitchen. It just gives you a series of tasks that will result in there being a cake or something in your kitchen when you are done.
This is easy to understand, but it is problematic once you have multiple things going on, since you can’t always rely on the state of the environment being what you need, since you can’t know who touched a resource last and what state it has left it in. That’s why the concept of object orientation came up.
In object oriented code you define objects (e.g. a car) and have your commands only access resources that are part of that object (e.g. the combustion process only uses the engine that is part of the car, fuel that is taken from the car’s fuel tank etc.). That way you can easily make sure that two objects can perform their methods without interfering.
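As a small illustration of the car example, here is a sketch in plain Python. The `Car` class, its fuel figures and the `drive` method are all invented for this example. The point is only that each object works on its own state, so two cars cannot disturb each other.

```python
class Car:
    """Hypothetical car that only ever touches its own fuel tank."""

    def __init__(self, fuel_litres: float) -> None:
        self._fuel_litres = fuel_litres  # state owned by this object
        self.distance_km = 0.0

    @property
    def fuel_litres(self) -> float:
        return self._fuel_litres

    def drive(self, km: float, litres_per_km: float = 0.125) -> None:
        """Burn fuel from this car's tank; no shared or global state is used."""
        needed = km * litres_per_km
        if needed > self._fuel_litres:
            raise ValueError("not enough fuel in this car's tank")
        self._fuel_litres -= needed
        self.distance_km += km


first_car = Car(fuel_litres=10.0)
second_car = Car(fuel_litres=10.0)

first_car.drive(40)

print(first_car.fuel_litres)   # 5.0
print(second_car.fuel_litres)  # 10.0
```

In a procedural version with one shared fuel variable, driving the first car would have changed what the second car sees.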

Godot is an object oriented engine and scenes are classes. The root node is the base class of the scene and the immediate children are member variables (I am leaving out scripts for now, which can add members as well).
You can tell that scenes are classes because they

  • can be instantiated
  • can be exchanged as long as they have the necessary base class (e.g. you can exchange any sprite with any scene that has a sprite as a root node, i.e. polymorphism)
  • can inherit from each other

So the reason why they are called scenes is because they are more than just nodes and they are more than just classes as well. They are classes that inherit from Node, so they combine the idea of encapsulation and that of common culling criteria.

If you think about it like that the name “scene” does make sense, because it’s a very unspecific term that describes any group of things that are adjacent and somehow belong to each other.
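The “scenes are classes” reading can be sketched with ordinary Python classes. This is an analogy only, not how Godot stores or loads scenes. `Node`, `Sprite`, `PlayerScene` and `BossScene` are hypothetical names: the root node type plays the role of the base class, the immediate children play the role of member variables, and one scene inherits from another.

```python
class Node:
    """Hypothetical minimal node: a name and a list of children."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child


class Sprite(Node):
    def draw(self):
        return f"drawing {self.name}"


class PlayerScene(Sprite):
    """A 'scene' whose root node is a Sprite, so Sprite is its base class."""

    def __init__(self, name="Player"):
        super().__init__(name)
        # Immediate children behave like member variables of the class.
        self.collision_shape = self.add_child(Node("CollisionShape"))
        self.health_bar = self.add_child(Sprite("HealthBar"))


class BossScene(PlayerScene):
    """A scene that inherits from another scene and adds one more child."""

    def __init__(self, name="Boss"):
        super().__init__(name)
        self.crown = self.add_child(Sprite("Crown"))


def draw_all(sprites):
    """Accepts anything with a Sprite root, whether plain node or whole scene."""
    return [sprite.draw() for sprite in sprites]


# Instantiation, polymorphism and inheritance in one go.
drawn = draw_all([Sprite("Rock"), PlayerScene(), BossScene()])
print(drawn)  # ['drawing Rock', 'drawing Player', 'drawing Boss']
```

`draw_all` does not care whether it gets a bare `Sprite` or a whole scene with a `Sprite` root, which mirrors the exchangeability point in the list above.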

Wow, thank you for your detailed answer.

What you say about optimisation makes sense, specifically choosing assumptions that allow more tightly controlled mechanisms in construction. That seems like a valuable thing to keep in mind when approaching a game engine.

One thing that confuses me with the scene / nodes approach is that everything goes in there. Yes, the game will have physical objects with position and child association but what about code ? Global variables ? Events and the game loop ? What happens on bootup ? Godot solves this with scripts attached to nodes … I dunno. That’s why I said this is like domain driven design. So a node or scene isn’t just a physical object but an idea, some concept like the score … Which plays to classes.

I need to go. Will circle back a bit later.

kewp | 2017-02-26 11:21

Further to your answer, this page makes the distinction between Library, Framework and Game Engine: “GameDev Glossary: Library Vs Framework Vs Engine” (GameFromScratch.com).

They say that a game engine must have a scene graph and a level editor … Which fits precisely with what you are saying.

The other thing I’ve thought about is that Godot has a lot of features. I was reading the latest devblog posts about the new renderer. I suppose something like that wouldn’t be included in a framework ? Or if not - if a game engine is just about scenes and an editor - why not separate out the features like physics and rendering and make them pluggable into the editor ?

kewp | 2017-02-27 07:11

This is all rhetorical. I need to re-read the docs several more times (and use the engine!) to get a sense of what it does and why …

Thanks again.

kewp | 2017-02-27 07:12

I just added a comment to the GameFromScratch article. It’s nice to see that I am not the only one fighting the windmills and trying to keep up that distinction. It’s not out of pedantry, it’s because knowing about this distinction allows you to understand what you are sacrificing by using an engine or by using a framework, since both approaches have their benefits.

As you said you are experienced with frameworks, so you’ll know that there isn’t much of a difference between painting an image on screen once and painting it 10 times, you just call draw…() a couple of times more before clearing the screen.
But since engines need to have some concept that allows them to detect what is visible and what is not they need to have some additional information, so most engines will wrap something like an image in an object, which means that painting 10 copies of an image means that you have to create 10 objects which of course does introduce a certain overhead.
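A rough sketch of that difference, in plain Python with a stand-in `draw_image` function that only records what it was asked to draw. `draw_image`, `SpriteObject` and `render` are hypothetical names and not taken from any real framework or engine.

```python
draw_calls = []


def draw_image(image_name, x, y):
    """Stand-in for a framework draw call; it only records the request."""
    draw_calls.append((image_name, x, y))


# Framework style: painting an image ten times is just ten calls.
for i in range(10):
    draw_image("coin.png", i * 40, 100)

framework_calls = list(draw_calls)
draw_calls.clear()


class SpriteObject:
    """Engine style: each image is wrapped in an object with a position."""

    def __init__(self, image_name, x, y):
        self.image_name = image_name
        self.x = x
        self.y = y


def render(objects, view_width):
    """The 'engine' decides what to draw, skipping objects outside the view."""
    for obj in objects:
        if 0 <= obj.x < view_width:
            draw_image(obj.image_name, obj.x, obj.y)


# Ten objects have to be created up front, which is the overhead...
coins = [SpriteObject("coin.png", i * 40, 100) for i in range(10)]
render(coins, view_width=200)
engine_calls = list(draw_calls)

print(len(framework_calls))  # 10
print(len(engine_calls))     # 5
```

The object version costs ten allocations before anything is drawn, but in exchange the render step has enough information to drop the five coins that are outside the view.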

There are two things that the article got wrong though: you don’t need to have an editor in an engine (the article mentions OGRE as a framework, which actually is an engine - the name stands for object-oriented graphics rendering engine - and which doesn’t have a mandatory editor) and scene trees aren’t the only container suitable for culling. But it’s true that something like a scene tree is a vital part of an engine.

There’s one more distinction we didn’t mention yet. There are graphics engines and game engines. Actually there are even more than that, there are sound engines, physics engines, network engines etc., but as people tend to focus on what you see they usually only care about the distinction between graphics and game engines.
Game engines combine the different types of engines in one, which makes sense because the concept of having objects works about equally well for all the different kinds of engines. An object that is very far from the camera’s location will most likely be not visible, not audible, not relevant for physics and not relevant for network clients.
So making it possible to turn off features like physics doesn’t change much in the workflow, it’s not going to make the work with a game engine a lot easier.

The other reason to make them pluggable would be so you can switch physics engines, audio engines etc.
There are game engines that are built in a way that allow you to exchange these feature engines, but unlike graphics engines those other engines aren’t as strictly targeted at a special use case. For example there are graphics engines that are optimized for 2d rather than 3d, for lots of simple objects rather than a few complex ones, for flat open-world-like environments rather than indoor environments that extend in all three dimensions…
There aren’t as many factors that influence which physics, audio or network engine to use. So being able to exchange these engines is something that helps programmers when after a couple of years there is a better alternative, but it’s not something you would do on a per-project basis.

Warlaan | 2017-02-27 09:07

Very interesting.

Yes - with the frameworks distinction I wonder why not just pull together all the best libraries out there - graphics, physics, sound, networking. What is the point of Godot or Unreal ?

I suppose having everything all under one roof makes it cleaner / quicker. But that all probably happened because people started with several libraries, developed their own best practices and eventually conglomerated it into Their Engine.

To be honest I’ve lost why exactly I was looking at all this in the first place. Godot seemed like a good tool to use. (I especially liked how small it was in MB). Honestly, until I use it and see things through I’ll never really know what it is and what its limitations are …

kewp | 2017-02-27 09:22

I don’t really have anything to add to this conversation but I just wanted to mention I enjoyed reading the questions and responses. It was some interesting context that we don’t normally think about but is useful to have. Thanks!

cowhand214 | 2017-02-28 22:09

Thank you for saying so !

kewp | 2017-03-01 06:15