Stutters Generating Terrain Collision on Separate Thread

Solved!

One thing I needed to add before the solution worked was to put the vertices in the correct winding order:

var indices = surface_arrays[Mesh.ARRAY_INDEX]
var verts = surface_arrays[Mesh.ARRAY_VERTEX]
# Expand the indexed vertices into a flat triangle list (three vertices per face).
var wound_verts := PackedVector3Array()
for i in indices:
  wound_verts.append(verts[i])

Then I could call u/Monday’s function like so: create_collision_shape(wound_verts)

Godot Version

4.6.1

Question

I’m making a 3D procedural terrain generation project.

I’ve got the terrain to generate on a separate thread with LOD switching working just fine, but when I try to generate a collision shape via MeshInstance3D.mesh.create_trimesh_shape(), I keep getting stutters.

It’s especially confusing, because these feel like the same stutters I was getting back when I was generating the terrain on the main thread. Generating the terrain on a separate thread fixed that issue.

I’m generating the collision shape on the same thread that is generating the terrain, so I don’t know why it’s causing the main thread to lock up. I don’t see anything obvious in the profiler other than a huge spike when generating the collider.

I’m pretty new to threads so I feel like I’m missing something obvious. I’m not doing the calculations on the main thread, so why is it still causing stutters???

Below is the relevant code. I’ve excluded some sections for brevity, but the basic flow is like this:

  1. TerrainGenerator.gd checks every frame if it should generate another chunk
    1. If it should generate a chunk, it marks the chunk as unready and delegates it to the terrain_thread as soon as it’s available.
    2. This thread creates a new TerrainChunk.gd node and calls add_terrain()
  2. TerrainChunk.gd instantiates the terrain node that will get added to the scene: TerrainScene.gd
    1. This node generates the mesh with SurfaceTool.
    2. It also generates the collider here.
      1. Commenting out the collider code removes the stutters.
  3. TerrainChunk.gd is added to a terrain_container node that lives on the main scene.

TerrainGenerator.gd

func _physics_process(delta: float) -> void:
  clean_up_chunks()
  update_visible_chunks()

func clean_up_chunks():
  for i in range(chunks_last_visible.size()):
    chunks_last_visible[i].set_chunk_visible(false)
  chunks_last_visible.clear()

func update_visible_chunks():
  viewer_position = Vector2(viewer.position.x, viewer.position.z)
  var viewer_position_normalized := Vector2(
    roundi(viewer.position.x / chunk_size), 
    roundi(viewer.position.z / chunk_size)
  )
  var y_offset = - chunks_visible_in_view_dist
  var x_offset = - chunks_visible_in_view_dist
  while y_offset <= chunks_visible_in_view_dist:
    while x_offset <= chunks_visible_in_view_dist:
      # transform world coords to normalized chunk coords
      var chunk_coord_normalized := Vector2(
        roundi(viewer_position.x / chunk_size) + x_offset,
        roundi(viewer_position.y / chunk_size) + y_offset
      )
      var lod = get_lod(chunk_coord_normalized, viewer_position_normalized)
      if chunks.has(chunk_coord_normalized):
        var chunk := chunks[chunk_coord_normalized]
        chunk.update_chunk(viewer_position)
        if chunk.is_chunk_visible():
          chunks_last_visible.append(chunk)
        if chunk.lods.has(lod):
          chunk.update_lods(lod)
        else:
          add_chunk_lod(chunk_coord_normalized, lod)
      else:
        add_chunk(chunk_coord_normalized, lod)
      x_offset += 1
    y_offset += 1
    x_offset = - chunks_visible_in_view_dist

func add_chunk(chunk_coord_normalized: Vector2, lod: int):
  if unready_chunks.has(chunk_coord_normalized):
    return

  if not terrain_thread.is_alive():
    if terrain_thread.is_started():
      return
    var chunk_to_load = load_chunk.bind(chunk_coord_normalized, lod)
    terrain_thread.start(chunk_to_load)
    unready_chunks[chunk_coord_normalized] = 1

func load_chunk(chunk_coord_normalized: Vector2, lod: int):
  var terrain_chunk := TerrainChunk.new(chunk_coord_normalized)
  terrain_chunk.add_terrain(lod)
  call_deferred('load_done', terrain_chunk)

func load_done(chunk: TerrainChunk):
  terrain_container.call_deferred('add_child', chunk)
  chunks[chunk.chunk_position_normalized] = chunk
  unready_chunks.erase(chunk.chunk_position_normalized)
  terrain_thread.wait_to_finish()

TerrainChunk.gd

func add_terrain(lod: int):
  if lods.has(lod):
    return
  var offset := Vector3(position_v3.x, position_v3.z, 0)
  var terrain_node = terrain_scene.instantiate()
  terrain_node.level_of_detail = lod
  terrain_node.chunk_coord_normalized = chunk_position_normalized
  terrain_node.visible = false
  terrain_node.offset = offset
  terrain_node.name = str(lod)
  terrain_node.generate_mesh()
  lods[lod] = terrain_node
 
  lods_container.call_deferred('add_child', terrain_node)

TerrainScene.gd

func generate_mesh():
  noise_texture.noise.offset = offset
  var noise = noise_texture.noise
  var width := chunk_size
  var height := chunk_size

  var mesh_simplification_increment := 1 if level_of_detail <= 0 else level_of_detail * 2

  var plane_mesh = PlaneMesh.new()
  plane_mesh.size = Vector2(width, height)
  plane_mesh.subdivide_depth = height / mesh_simplification_increment
  plane_mesh.subdivide_width = width / mesh_simplification_increment

  var surface_tool = SurfaceTool.new()
  surface_tool.create_from(plane_mesh, 0)
  var plane_array_mesh = surface_tool.commit()

  var data_tool = MeshDataTool.new()
  data_tool.create_from_surface(plane_array_mesh, 0)

  for i in range(data_tool.get_vertex_count()):
    var vertex = data_tool.get_vertex(i)
    var vertex_height = noise.get_noise_2d(vertex.x, vertex.z) * 50
    var curved_height := height_curve.curve.sample(vertex_height / 50)
    vertex.y = curved_height * 50
    data_tool.set_vertex(i, vertex)

  # Remove every surface at once; removing by index inside a loop skips surfaces as the indices shift.
  plane_array_mesh.clear_surfaces()

  data_tool.commit_to_surface(plane_array_mesh)
  surface_tool.begin(Mesh.PRIMITIVE_TRIANGLES)
  surface_tool.create_from(plane_array_mesh, 0)
  surface_tool.generate_normals()
  generated_mesh = surface_tool.commit()
  mesh_instance.mesh = generated_mesh
  mesh_instance.material_override = terrain_material
  call_deferred('add_child', mesh_instance)

  if mesh_simplification_increment == 1:
    add_collision(mesh_instance)

func add_collision(mesh_instance: MeshInstance3D):
  var collision = CollisionShape3D.new()
  collision.shape = mesh_instance.mesh.create_trimesh_shape()
  var static_body = StaticBody3D.new()
  static_body.collision_mask = 2
  static_body.call_deferred('add_child', collision)
  mesh_instance.call_deferred('add_child', static_body)

Any help is appreciated! Let me know if I can provide any more information.

Don’t create colliders from Godot’s mesh objects. For concave colliders, send your generated vertex data directly to the collider using set_faces().

As said above, create_trimesh_shape does the same thing, only more expensively, because it first has to “extract” the triangles from your mesh (though I don’t know why it would cause stutters on a subthread…):

func create_collision_shape(verts: PackedVector3Array) -> ConcavePolygonShape3D:
	var shape := ConcavePolygonShape3D.new()
	shape.set_faces(verts)
	return shape

Also, in my experience you don’t have to use call_deferred for add_child as long as the parent is not inside the tree. So you can do most of the add_child calls on the subthread. (I don’t know what the official stance on this is, but it has never caused any issues for me.)

I always generate the “parts” first and parent them all to a single parent node on a subthread, then at the end add that one node to the tree on the main thread. Though if that one parent node has a lot of child nodes, that can cause a hiccup too.
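
A rough sketch of that pattern, not a drop-in implementation. The function names build_chunk_root and load_chunk_off_tree are made up for illustration, terrain_container is the node from the question, and it assumes the mesh and the flat triangle list (faces) were already generated on the same thread:

```gdscript
# Hypothetical helper, meant to run on the worker thread.
# None of these nodes are inside the scene tree yet, so add_child is called directly.
func build_chunk_root(mesh: ArrayMesh, faces: PackedVector3Array) -> Node3D:
	var root := Node3D.new()

	var mesh_instance := MeshInstance3D.new()
	mesh_instance.mesh = mesh
	root.add_child(mesh_instance)

	# Build the collider from the raw triangle list instead of from the mesh.
	var shape := ConcavePolygonShape3D.new()
	shape.set_faces(faces)
	var collision := CollisionShape3D.new()
	collision.shape = shape
	var static_body := StaticBody3D.new()
	static_body.add_child(collision)
	root.add_child(static_body)

	return root

# Hypothetical worker-thread entry point.
func load_chunk_off_tree(mesh: ArrayMesh, faces: PackedVector3Array) -> void:
	var root := build_chunk_root(mesh, faces)
	# terrain_container is inside the tree, so this one call is deferred to the main thread.
	terrain_container.add_child.call_deferred(root)
```

Only the last add_child touches the live scene tree, so that is the only call that has to go through call_deferred.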

Once you submit surfaces to the ArrayMesh, the vertex data goes to the GPU and no copy is kept on the CPU side. So whenever you retrieve vertex data from any Mesh object, the engine has to get it from video RAM. This will likely cause stutters for large amounts of data, even if done from a thread.
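
One way to avoid reading the data back is to keep the arrays on the CPU side and feed the same arrays to both the mesh and the collider. This is only a sketch under that assumption: build_mesh_and_faces is a hypothetical helper, and it expects a SurfaceTool that already holds the finished terrain triangles.

```gdscript
# Hypothetical helper: returns the mesh plus a flat triangle list for the collider.
func build_mesh_and_faces(surface_tool: SurfaceTool) -> Dictionary:
	# commit_to_arrays() returns the surface arrays without creating a Mesh first.
	var arrays: Array = surface_tool.commit_to_arrays()
	var verts: PackedVector3Array = arrays[Mesh.ARRAY_VERTEX]

	var faces := PackedVector3Array()
	var index_data = arrays[Mesh.ARRAY_INDEX]
	if index_data is PackedInt32Array and not index_data.is_empty():
		# Indexed surface: expand to three vertices per triangle.
		faces.resize(index_data.size())
		for i in index_data.size():
			faces[i] = verts[index_data[i]]
	else:
		# Non-indexed surface: the vertex array already is the triangle list.
		faces = verts

	var mesh := ArrayMesh.new()
	mesh.add_surface_from_arrays(Mesh.PRIMITIVE_TRIANGLES, arrays)
	return {"mesh": mesh, "faces": faces}
```

The faces array can then be passed straight to ConcavePolygonShape3D.set_faces() without ever asking the Mesh for its vertices.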

Thanks! This was definitely a big performance improvement, but didn’t completely eliminate the stutters.

In doing some more debugging, it appears the stutter isn’t caused by the mesh/collider generation, but by the final terrain_container.call_deferred('add_child', chunk) call that adds the mesh and its collider to the main scene. I’m learning the physics server really doesn’t like adding a big collider all at once, even if it’s been created on a separate thread.

My solution is to cut the chunk size in half and double the render distance. I generate the same amount of area but with a greater number of less expensive chunks. This spreads out the add_child call over more time steps so I don’t exceed my frame time budget.

This solution would have solved my problem to begin with… but I learned a lot!

You don’t need to cut the chunk size. Simply split the collider into several smaller ones and “stream” them. The actual chunk size can stay the same.

Good idea. This was my first thought as well, but I’m a little sick of this problem and it’s a lot easier to divide an integer variable by 2 than it is to refactor my collider.

In case I do end up needing to increase the chunk size again, do you have any advice for implementing this?

Concave colliders are just triangle soups without any connectivity or volume information. You can literally split the triangle list into any number of sub-lists, even by randomly picking triangles from the main list, and add each sub-list as a collision shape every other frame. The collision should still work as if it was one big collider.
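
Something along these lines, as an untested sketch. split_faces, stream_shapes and the TRIANGLES_PER_SHAPE value are placeholders to tune, faces is assumed to be the flat triangle list (three vertices per triangle), and stream_shapes is assumed to live in a Node script and run on the main thread:

```gdscript
# Hypothetical batch size; pick whatever keeps each add_child within the frame budget.
const TRIANGLES_PER_SHAPE := 2048

# Can run on the worker thread: cut the triangle list into smaller concave shapes.
func split_faces(faces: PackedVector3Array, triangles_per_shape: int) -> Array[ConcavePolygonShape3D]:
	var shapes: Array[ConcavePolygonShape3D] = []
	var step := triangles_per_shape * 3  # three vertices per triangle
	var start := 0
	while start < faces.size():
		var end := mini(start + step, faces.size())
		var shape := ConcavePolygonShape3D.new()
		shape.set_faces(faces.slice(start, end))  # slice end is exclusive
		shapes.append(shape)
		start = end
	return shapes

# Runs on the main thread: add one sub-collider, then skip a physics frame.
func stream_shapes(static_body: StaticBody3D, shapes: Array[ConcavePolygonShape3D]) -> void:
	for shape in shapes:
		if not is_instance_valid(static_body):
			return  # the chunk was freed while streaming
		var collision := CollisionShape3D.new()
		collision.shape = shape
		static_body.add_child(collision)
		# Waiting two physics frames gives "every other frame".
		await get_tree().physics_frame
		await get_tree().physics_frame
```

A call would look like stream_shapes(static_body, split_faces(faces, TRIANGLES_PER_SHAPE)) once the static body is in the tree.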

Just a side note in case you don’t know about it, WorkerThreadPool makes it really simple to use multiple threads and spread the workload.
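
A minimal sketch of how the chunk loading from the question might look with it, assuming the same unready_chunks, chunks, terrain_container and TerrainChunk members. The function names request_chunk, load_chunk_pooled and pooled_load_done are invented here:

```gdscript
# Main thread: queue the chunk on the pool instead of a single Thread.
func request_chunk(chunk_coord_normalized: Vector2, lod: int) -> void:
	if unready_chunks.has(chunk_coord_normalized):
		return
	var task_id := WorkerThreadPool.add_task(
		load_chunk_pooled.bind(chunk_coord_normalized, lod)
	)
	# Remember the task ID so it can be waited on later.
	unready_chunks[chunk_coord_normalized] = task_id

# Pool thread: same work as load_chunk in the question.
func load_chunk_pooled(chunk_coord_normalized: Vector2, lod: int) -> void:
	var terrain_chunk := TerrainChunk.new(chunk_coord_normalized)
	terrain_chunk.add_terrain(lod)
	pooled_load_done.call_deferred(terrain_chunk)

# Main thread: runs once the deferred call is processed.
func pooled_load_done(chunk: TerrainChunk) -> void:
	var coord: Vector2 = chunk.chunk_position_normalized
	# Every pool task should be waited on at some point so its resources are released.
	WorkerThreadPool.wait_for_task_completion(unready_chunks[coord])
	unready_chunks.erase(coord)
	chunks[coord] = chunk
	terrain_container.add_child(chunk)
```

Unlike the single terrain_thread, several chunks can be in flight at once this way, so it may be worth capping how many are queued per frame.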
