Crash when running a project that loads a big dataset (word game)

Godot Version

v4.5.1.stable.official [f62fdbde1]

Question

I'm developing a word game that has a pool of ~60,000 words and, for each one of them, various values, such as length, part of speech (POS), and so on.

The data is organized in a .res file with the keys being the words.
I set up an Autoload that loads it in a worker thread and puts the values in a Dictionary.

func start_loading():
	WorkerThreadPool.add_task(load_words)

func load_words():
	var config = ConfigFile.new()
	config.load("res://files/lexo.res")
	lexo_db = config.get_value("data", "words")
	call_deferred("emit_signal", "data_loaded")

I made sure to never ask for this data in the main thread before it's fully loaded in.

The problem is that sometimes (I can't figure out what is different), when I run the project, the application closes while the .res is loading (after calling "start_loading" and before emitting the signal).

I don't get any error messages. I figure it must be a memory problem, but I want to be sure first. How can I investigate that?

If it is indeed a memory problem, does anyone have tips on how to manage this data properly? I used an Autoload because I need access to the whole list at run-time, because the player can type any combination of letters and the game needs to check if it's valid.

Afaik load() is not thread safe. You shouldn’t be doing it like that. Instead, use the threaded loading mechanism provided by ResourceLoader via ResourceLoader.load_threaded_request() and related functions. Look at the ResourceLoader class reference for details.
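A minimal sketch of that threaded-loading flow, assuming the word data has been saved as a real Resource rather than a ConfigFile-formatted text file. The path `res://files/lexo_words.tres` and the `words_resource` variable are hypothetical; only the `ResourceLoader` calls come from the engine API.

```gdscript
extends Node

signal data_loaded

# Hypothetical path: assumes the word data was saved as a real Resource.
const WORDS_PATH := "res://files/lexo_words.tres"

var words_resource: Resource

func _ready() -> void:
	# Polling stays off until a load has actually been requested.
	set_process(false)

func start_loading() -> void:
	# Ask the engine to load the resource on its own background thread.
	var err := ResourceLoader.load_threaded_request(WORDS_PATH)
	if err != OK:
		push_warning("Could not start threaded load: ", error_string(err))
		return
	set_process(true)

func _process(_delta: float) -> void:
	# Poll from the main thread until the load finishes or fails.
	match ResourceLoader.load_threaded_get_status(WORDS_PATH):
		ResourceLoader.THREAD_LOAD_LOADED:
			words_resource = ResourceLoader.load_threaded_get(WORDS_PATH)
			set_process(false)
			data_loaded.emit()
		ResourceLoader.THREAD_LOAD_FAILED, ResourceLoader.THREAD_LOAD_INVALID_RESOURCE:
			push_warning("Threaded load failed for ", WORDS_PATH)
			set_process(false)
```

Because the status is polled in `_process`, the signal is emitted on the main thread and no `call_deferred` is needed.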


How large is your data set lexo.res? Is it really a config file, or are you trying to load a resource? If it’s a resource you should try load() or as @normalized said ResourceLoader.load_threaded_request. If it’s a config file maybe there are better formats for dense data you could utilize, such as resources.

Do you get an error if you remove the threading i.e. run load_words normally?

@normalized

Right, I’ll try doing that. Thx.

@gertkeno

7,684 KB. It has around 60,000 lines. Each key has an array with size 20.

I don’t fully understand the question because I only got into config files recently, but I’ll try to explain what I did. I originally had a .csv which I would read using “FileAccess”, storing its lines (“get_csv_line” returns an array) in a Dictionary. I read the documentation and some posts online and figured that transforming the .csv into a .res structured as a config file would be better. I have the same problem with both approaches. The .res file is organized like this:

[data]

words={
"ABACAXI": ["ABACAXI", "7", "15.0185", "1", "1", "1", "1", "1", "1", "1", "0", "0", "0", "0", "1", "0", "0", "1", "0"],
"ABACAXIS": ["ABACAXIS", "8", "0.7693", "1", "1", "1", "1", "1", "1", "1", "0", "0", "0", "0", "0", "1", "0", "1", "0"]
}

I bet you’re right. Would it be something that I “load up” into a variable, or would I just reference the resource and interact with it directly (like what you can do with a Texture resource in a Sprite node)?

If I do that, I get the same problem (the application closes sometimes), but still no error message.

You could do some of your own error checking too. Currently your code assumes every step works perfectly fine, but config.load can fail and get_value could find nothing.

func load_words():
	var config = ConfigFile.new()
	var success := config.load("res://files/lexo.res")
	if success != OK:
		push_warning("Failed to load file! error: ", error_string(success))
		return

	var words: Dictionary = config.get_value("data", "words", {})
	if words.is_empty():
		push_warning("Failed to get 'words' from config file")
		return

	lexo_db = words
	data_loaded.emit.call_deferred()

A resource would be easier to load and reference, much like Textures for Sprites. You may have to make an importer plugin, which also allows you to keep your data in any format, preferably one you like to edit. If CSV was working well for you then you’re in luck: the example Resource importer plugin is for CSV!


This looks like a config file, so you should save it as a .cfg. The .res extension is for binary Resource files, not config files. It will probably load all the same in debug, but you could run into yet more issues when exporting your game.


This is not a terribly large database; it should load within a second even with a line-by-line CSV format. Still, an importer plugin could speed up that loading and allow easy background loading through ResourceLoader.load_threaded_request.


Hey, coming back here to close the discussion.

The crashes were actually not at all related to this data. I followed the tip from this reddit post and tried running the project through the terminal, following the steps on this page. The output hinted that the problem was related to referencing nodes that no longer exist. I figure it happened when methods were called by “call_deferred”, but by the time they were actually called, the nodes they were referencing were no longer available.
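The offending code is not shown in the thread, so the following is only a sketch of one common guard against this kind of problem: checking `is_instance_valid()` before touching a node from a callback that runs later. The `status_label` variable and `_on_data_loaded` handler are hypothetical names.

```gdscript
extends Node

# Hypothetical node reference, e.g. a label that may be freed on a scene change.
var status_label: Label

func _on_data_loaded() -> void:
	# A deferred call runs later in the frame, so the node may already be gone.
	if not is_instance_valid(status_label):
		return
	status_label.text = "Words loaded"
```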

Despite that, @normalized and @gertkeno showed me how to properly load resources in secondary threads and that the file is actually not that big, so the “line-by-line” method should work fine.

I ended up following the “line-by-line” method, reading each .csv line as an array and storing them in a dictionary. For anyone who will be working with .csv, I would add that you may have to change Godot’s default import setting so that the file is imported as is and not as a “CSV Translation”.
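A minimal sketch of that line-by-line approach, run directly on the main thread since the file was said to load quickly. The path `res://files/lexo.csv` is hypothetical, and using the first column as the key is an assumption based on the sample rows shown earlier.

```gdscript
extends Node

signal data_loaded

# Hypothetical path; the thread does not give the CSV's real location.
const CSV_PATH := "res://files/lexo.csv"

var lexo_db: Dictionary = {}

func load_words() -> void:
	var file := FileAccess.open(CSV_PATH, FileAccess.READ)
	if file == null:
		push_warning("Failed to open CSV: ", error_string(FileAccess.get_open_error()))
		return

	var words: Dictionary = {}
	while not file.eof_reached():
		var row: PackedStringArray = file.get_csv_line()
		# Skip blank lines (including the one usually found at end of file).
		if row.is_empty() or row[0].is_empty():
			continue
		# Assumption: the first column is the word and serves as the key.
		words[row[0]] = row

	lexo_db = words
	data_loaded.emit()
```

Checking whether a typed word is valid is then a single `lexo_db.has(word)` lookup.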
