FastNoise2 in Godot

Godot Version

v4.5.1.stable.mono.official [f62fdbde1]

Question

For a voxel game I am making as a hobby project, I am using the built-in FastNoiseLite for my world generation.

My game is chunk-based and I would like to make chunk generation as fast as possible. One solution I found is replacing FastNoiseLite with FastNoise2, which I have read can be about 10x faster and provides a lot more control.

The only extension for Godot I could find seems to crash Godot when I enable it. I have tried this on 2 separate computers with the same result.

Does anyone know of any addons or ways that I could implement FastNoise2 in Godot?

Are you perhaps optimizing too early? Is the noise generation really a bottleneck in your project? Have you done profiling? I would recommend not bothering with this until you see it’s an actual issue.

Sorry that this isn’t an answer to your actual question…

I might be optimizing a bit early, haha. But I mostly want to see how far I can push my voxel world. There is also the advantage of the node editor in FastNoise2, which can give me a lot more control over the noise I am generating. It’s not just a matter of optimising but also of features :)

What I am mostly trying to find out is whether there is a “plug and play” extension that would let me use this, because I don’t want to do unnecessary work. If not, I will probably write an extension myself that integrates Auburn/FastNoise2 from GitHub (a modular node graph based noise generation library using SIMD, C++17 and templates).

I have no experience with this, so it will take a bit of time, but there seems to be some good documentation in the Godot docs at least.

Have you looked at hidden surfaces and (if you’re using collision) physics objects, yet? My instinct is the noise used for generation will be trivial compared to the time needed for those.

Yes, I have a hidden surfaces algorithm that also hides surfaces at chunk edges when they are not relevant. At the moment my chunk generation takes around 50 ms on average and my actual chunk drawing 15 ms. (I will still improve this part with vertex pulling and compute shaders, but I need to look into this more.)

The chunk generation generates caves, ores and trees as well, so this takes some time. I can’t remember the exact number since it has been a while since I profiled it, but I believe my noise generation was about half of the chunk generation time.

You can do the generation from a thread.

My generation is multithreaded, since I will have a wraparound world of a couple of km in every direction and need it to generate constantly. But I want to use as few threads as possible. At the moment it runs on one parallel thread and I don’t really want to go over 2 if I can avoid it.

If you generate everything in a thread what does it matter if the generation takes 50 ms? It could take much more than that if it doesn’t affect the frame rate.

The fastest option is implementing noise generation in a compute shader.

Because the quicker it runs the bigger the view distance I can get. It’s no fun looking in the distance and slowly seeing chunks loading. So the quicker it generates the world the better.

Also, if I can get this quicker now, I can extend the world generation and make it more interesting (but undoubtedly more complex).

I take it, though, that there is not really any FastNoise2 (or similar) extension available, so I will write one myself and maybe make it available after I have sufficiently tested it in my project, if there is enough interest in something like that.

I would still like to add, though, that FastNoiseLite does not have anything like the node editor from FastNoise2, which is a useful tool I would like to use in my world generation even without taking the better performance into account.

Aren’t chunks supposed to be streamed once the initial generation of everything inside the viewing distance is done? You always generate what’s next to be seen before it actually needs to be seen. The performance of the noise generator (or your worker thread in general) shouldn’t be an issue if you implement the streaming well. The worker thread has as much time as it takes the camera to walk across a chunk.

At the moment I can still outrun my generation. My voxels are only 0.5 m cubed, so I could indeed pregenerate more, but at some point you will outrun that. And when falling straight down over a long distance you just clip through terrain that should be there.

Do you have any techniques you think I should look at? That is certainly a point I would like to improve on as well.

How big is a chunk? You can always make the chunk contain more voxels per dimension so your walkthrough time is always longer than the generation time. The generation thread(s) can then run “slowly” in parallel and always deliver needed chunks on time.

Btw, I don’t know what the rationale would be for limiting this to only one thread. You’ll typically have multiple chunks to generate/stream at all times. Most of today’s CPUs have multiple cores. If you delegate each chunk generation to its own thread, chances are the system will schedule each to run on a different core, parallelizing the generation and reducing the total generation time.

So I have 16 cubed chunks. But increasing their size is problematic, because when I add or remove blocks with a much larger chunk size I get a noticeable delay, which I don’t want.

As for the rationale for one thread, it’s not a hard limit and I should probably increase the number. But I still want a fluid simulation and/or other stuff running in parallel as well. What is a normal number of threads used in games for this? From what I read, I thought games usually limit this to maybe 2 or 3.

What exactly is causing that delay? Find the actual bottlenecks in your generation code by running it through the profiler. Trying to optimize via guessing, without determining the bottlenecks is a waste of time.
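If the generation code ends up in a C++ extension, where the editor profiler may not show a per-step breakdown, a coarse manual timer is one way to get first numbers. This is only a minimal sketch using the standard library; `time_ms` is a hypothetical helper name, and a real profiler will give a far more detailed picture.

```cpp
#include <chrono>
#include <utility>

// Runs `fn` once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is used because it never jumps backwards.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Wrapping each stage separately (noise sampling, cave carving, tree placement, meshing) and averaging over many chunks shows which stage actually dominates before anything gets rewritten.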

There’s no “normal amount of threads”. If you need things done in parallel, put them on separate threads. The OS scheduler will then try to run them in parallel as much as possible. There’s nothing to lose by using many threads. You can only gain, or in the worst case perform the same as running on a single thread.

I’d do it like this. Whenever the player enters a chunk, start generating all missing chunks that may appear when the player enters any of the adjacent chunks. Run each chunk generation on its own thread, or have a thread pool with a fixed number of threads that take generation tasks from the chunk queue. You can use Godot’s WorkerThreadPool so you don’t need to implement your own thread pool but still avoid the perpetual thread creation, which can be relatively costly.
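To show the shape of the fixed-size pool with a chunk queue, here is a minimal sketch in plain standard C++. It is not Godot’s WorkerThreadPool API; inside Godot you would hand the tasks to WorkerThreadPool instead. `ChunkCoord`, `Chunk`, `generate_chunk` and `ChunkGeneratorPool` are hypothetical names, and the generator body is a placeholder for the real noise, cave, ore and tree passes.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

struct ChunkCoord { int x, y, z; };
struct Chunk { ChunkCoord coord; std::vector<unsigned char> voxels; };

// Placeholder generator: solid below y = 0, air above.
Chunk generate_chunk(ChunkCoord c) {
    return Chunk{c, std::vector<unsigned char>(16 * 16 * 16, c.y < 0 ? 1 : 0)};
}

class ChunkGeneratorPool {
public:
    explicit ChunkGeneratorPool(unsigned thread_count) {
        for (unsigned i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }
    ~ChunkGeneratorPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        work_available_.notify_all();
        for (auto& worker : workers_) worker.join();
    }
    // Queue a chunk for generation (called from the main thread).
    void request(ChunkCoord coord) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(coord);
        }
        work_available_.notify_one();
    }
    // Collect every chunk finished so far (e.g. once per frame).
    std::vector<Chunk> take_finished() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<Chunk> out;
        out.swap(finished_);
        return out;
    }

private:
    void worker_loop() {
        for (;;) {
            ChunkCoord coord;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                work_available_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
                if (pending_.empty()) return;  // stopping and queue drained
                coord = pending_.front();
                pending_.pop();
            }
            Chunk chunk = generate_chunk(coord);  // heavy work runs without the lock
            std::lock_guard<std::mutex> lock(mutex_);
            finished_.push_back(std::move(chunk));
        }
    }

    std::mutex mutex_;
    std::condition_variable work_available_;
    std::queue<ChunkCoord> pending_;
    std::vector<Chunk> finished_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last so the other members exist first
};
```

The threads are created once and reused, which avoids the per-chunk thread creation cost. The lock is only held while touching the two queues, never during generation itself.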

Thank you for the help! I will implement that as well! I hadn’t considered doing it that way actually. I am assuming you need to use mutexes as well in that case so I will need to look into this a bit.

And the bottleneck when destroying blocks is the redrawing of the chunk, so mesh generation (more specifically the code deciding whether blocks have neighbours or not). But I will still optimise that in a different way later on.

It would depend on how you set things up. Individual chunk generation could be done without any shared data, so there is no need for mutex locking in that case.
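As a minimal sketch of what generation without shared data can look like: each task below reads only its own arguments and returns its own buffer, so the generation code contains no mutex at all. It uses `std::async` from the C++ standard library rather than any Godot API; `generate_voxels`, `generate_row`, `kChunkEdge` and the hash-based fill are hypothetical placeholders for a real noise function.

```cpp
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

constexpr int kChunkEdge = 16;

// Pure function: the result depends only on the arguments, and the task
// writes only to its own local buffer. Nothing is shared between tasks.
std::vector<std::uint8_t> generate_voxels(int cx, int cy, int cz, std::uint32_t seed) {
    std::vector<std::uint8_t> voxels(kChunkEdge * kChunkEdge * kChunkEdge);
    std::size_t index = 0;
    for (int z = 0; z < kChunkEdge; ++z)
        for (int y = 0; y < kChunkEdge; ++y)
            for (int x = 0; x < kChunkEdge; ++x) {
                // World-space voxel coordinates, hashed as a stand-in for noise.
                const std::uint32_t wx = static_cast<std::uint32_t>(cx * kChunkEdge + x);
                const std::uint32_t wy = static_cast<std::uint32_t>(cy * kChunkEdge + y);
                const std::uint32_t wz = static_cast<std::uint32_t>(cz * kChunkEdge + z);
                const std::uint32_t h =
                    (wx * 73856093u) ^ (wy * 19349663u) ^ (wz * 83492791u) ^ seed;
                voxels[index++] = static_cast<std::uint8_t>(h & 1u);
            }
    return voxels;
}

// Generates `count` chunks along the x axis, one asynchronous task per chunk.
std::vector<std::vector<std::uint8_t>> generate_row(int count, std::uint32_t seed) {
    std::vector<std::future<std::vector<std::uint8_t>>> tasks;
    for (int cx = 0; cx < count; ++cx)
        tasks.push_back(std::async(std::launch::async, generate_voxels, cx, 0, 0, seed));

    std::vector<std::vector<std::uint8_t>> chunks;
    for (auto& task : tasks)
        chunks.push_back(task.get());  // the only synchronisation point
    return chunks;
}
```

Locking only becomes necessary where results are handed over, for example when finished chunks are inserted into a world map that the main thread also reads, or when generation needs to look at voxels from a neighbouring chunk.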