More efficient GPUParticles3D collision detection on the CPU?

Godot Version

4.3

Question

I want to detect when particles in a GPUParticles3D system enter or collide with a certain area, in order to drive other game logic. This isn’t straightforward: the particle position data only exists in GPU memory, so it normally doesn’t interact with the physics system or CPU code at all. It is still technically possible, though. I have a naive solution and I’m hoping for improvements on any axis: efficiency, performance, or elegance.

Naive solution

Essentially, clone the GPUParticles3D (and any attractors affecting it) into a SubViewport with its own World3D, but replace the spatial shader that normally draws the particle system with a special one that does the collision check in the vertex path and writes the answer to the ViewportTexture by drawing specifically colored pixels. Then read the pixels back with get_image() to copy the answer to CPU code. As long as the cloned particle system doesn’t diverge too much from the real one, the collision checks should match the visuals.

Example shader with a simple bounding box defined by uniforms:

shader_type spatial;
render_mode unshaded, depth_draw_never, cull_disabled;

// Uniforms that define the bounding box in world space.
uniform vec3 bbox_min;
uniform vec3 bbox_max;

void vertex() {
    // Compute the particle’s center in world space.
    // (This assumes that your particle’s mesh is centered at the local origin.)
    vec3 particle_center = (MODEL_MATRIX * vec4(0.0, 0.0, 0.0, 1.0)).xyz;
    
    // Check if the particle’s center is within the provided bounding box.
    bool inside = (particle_center.x >= bbox_min.x && particle_center.x <= bbox_max.x &&
                   particle_center.y >= bbox_min.y && particle_center.y <= bbox_max.y &&
                   particle_center.z >= bbox_min.z && particle_center.z <= bbox_max.z);
    
    if (inside) {
        // If the particle qualifies, we override its vertex positions so that
        // regardless of its original quad, it covers the full clip space (i.e. the whole screen).
        // This assumes exactly 4 vertices per instance.
        if (VERTEX_ID == 0) {
            POSITION = vec4(-1.0, -1.0, 0.0, 1.0);
        } else if (VERTEX_ID == 1) {
            POSITION = vec4( 1.0, -1.0, 0.0, 1.0);
        } else if (VERTEX_ID == 2) {
            POSITION = vec4(-1.0,  1.0, 0.0, 1.0);
        } else {
            POSITION = vec4( 1.0,  1.0, 0.0, 1.0);
        }
    } else {
        // Otherwise, move the geometry far off clip space so it isn’t rendered.
        POSITION = vec4(2.0, 2.0, 2.0, 1.0);
    }
}

void fragment() {
    // Output pure red color.
    ALBEDO = vec3(1.0, 0.0, 0.0);
}

And the supporting GDScript to do the cloning and pixel test:

extends Node3D

# Exported node paths for references in the editor.
@export var real_particles_path: NodePath        # The real GPUParticles3D in the scene.
@export var real_attractor_path: NodePath          # The real attractor node.
@export var target_collision_box_path: NodePath    # a node for the particle target, assumed to be a 1x1x1 cube around its origin.
@export var debug_label_path: NodePath             # A Label3D node to show "Target Hit: Yes/No".

# Variables to hold the real nodes.
var real_particles: GPUParticles3D
var real_attractor: Node3D
var target_collision_box: Node3D
var debug_label: Label3D

# Cloned proxy nodes that will live in the offscreen SubViewport.
var clone_particles: GPUParticles3D
var clone_attractor: Node3D

# The SubViewport (and its container) for offscreen rendering.
var sub_viewport_container: SubViewportContainer
var sub_viewport: SubViewport

# Path to your special testing shader resource.
var test_shader_path := "res://scenes/world_grab_demo/particle_test.gdshader"
var test_shader: Shader
var test_shader_material: ShaderMaterial

func _ready():
	# Get the real nodes from the exported paths.
	real_particles = get_node(real_particles_path) as GPUParticles3D
	real_attractor = get_node(real_attractor_path) as Node3D
	target_collision_box = get_node(target_collision_box_path) as Node3D
	debug_label = get_node(debug_label_path) as Label3D

	# Create a SubViewportContainer and a SubViewport.
	sub_viewport_container = SubViewportContainer.new()
	add_child(sub_viewport_container)
	sub_viewport = SubViewport.new()
	sub_viewport_container.add_child(sub_viewport)
	# Set the SubViewport resolution to a very low value (2x2 pixels).
	sub_viewport.size = Vector2i(2, 2)
	sub_viewport.render_target_update_mode = SubViewport.UPDATE_ALWAYS
	sub_viewport.own_world_3d = true
	
	# Add a Camera3D to the SubViewport
	var camera = Camera3D.new()
	sub_viewport.add_child(camera)

	# Clone the particle system and attractor. Using duplicate() with default flags
	# (which duplicates recursively) so that the clones have similar structure.
	clone_particles = real_particles.duplicate() as GPUParticles3D
	clone_attractor = real_attractor.duplicate() as Node3D

	# Add the clones to the SubViewport so they render in its world.
	sub_viewport.add_child(clone_particles)
	sub_viewport.add_child(clone_attractor)

	# Load the testing shader and assign it to the clone particle system.
	test_shader = load(test_shader_path)
	test_shader_material = ShaderMaterial.new()
	test_shader_material.shader = test_shader
	
	# Create a copy of the draw_pass_1 mesh
	var mesh_copy = clone_particles.draw_pass_1.duplicate()
	clone_particles.draw_pass_1 = mesh_copy
	
	# Apply the shader material to the copied mesh
	clone_particles.draw_pass_1.surface_set_material(0, test_shader_material)

func _process(_delta):
	# Synchronize the global transforms of the clones with the real objects.
	clone_particles.global_transform = real_particles.global_transform
	clone_attractor.global_transform = real_attractor.global_transform

	# Update the shader parameters for the target bounds:
	# a 1x1x1 cube centered on the target node's origin.
	var half_extents := Vector3(0.5, 0.5, 0.5)
	test_shader_material.set_shader_parameter("bbox_min", target_collision_box.global_position - half_extents)
	test_shader_material.set_shader_parameter("bbox_max", target_collision_box.global_position + half_extents)

	# Get the rendered image from the SubViewport.
	var viewport_tex := sub_viewport.get_texture()
	if viewport_tex:
		var img: Image = viewport_tex.get_image()
		if img:
			# Check only the first pixel in the 2x2 render output.
			var col: Color = img.get_pixel(0, 0)
			# Our testing shader outputs red (1,0,0) for particles inside the box.
			var hit := col.r > 0.9 and col.g < 0.1 and col.b < 0.1
			# Update the debug label to show whether the target is hit.
			debug_label.text = "Target Hit: " + ("Yes" if hit else "No")

What you have implemented here is basically a one pixel camera. It’s a shame you have to replicate the whole GPUparticle system to do it.

I solved this in a gamejam by adding a second small orthogonal camera to the scene with a very short field of view and a low framerate (and no shadows). This has the advantage of not requiring a clone of the whole GPUparticle system.
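For anyone wanting to try the second-camera approach, here is a minimal sketch rather than the setup actually used in that gamejam. It reads "very short field of view" as a small view volume and assumes a cubic, axis-aligned detection box, particles that live alone on render layer 2, and a SubViewport without its own World3D that renders its parent viewport's world. Every name and value (box_center, box_size, particle_cull_mask, poll_interval, target_hit) is made up for illustration.

```gdscript
extends Node3D

# Every name and value here is an illustrative assumption, not taken from the posts.
@export var box_center := Vector3.ZERO   # World-space center of the detection box.
@export var box_size := 1.0              # Edge length of the cubic detection box.
@export var particle_cull_mask := 2      # Bitmask for render layer 2, assumed to hold only the particles.
@export var poll_interval := 0.1         # Seconds between readbacks (the "low framerate").

var target_hit := false
var _viewport: SubViewport
var _countdown := 0.0
var _busy := false

func _ready() -> void:
	# No own_world_3d here: the SubViewport is expected to show its parent viewport's world.
	_viewport = SubViewport.new()
	_viewport.size = Vector2i(32, 32)  # Larger than 2x2 so small particles still cover a pixel center.
	_viewport.transparent_bg = true    # Pixels nothing was drawn to keep alpha 0.
	_viewport.render_target_update_mode = SubViewport.UPDATE_DISABLED
	add_child(_viewport)

	var camera := Camera3D.new()
	camera.projection = Camera3D.PROJECTION_ORTHOGONAL
	camera.size = box_size             # Width and height of the view volume (square viewport).
	camera.near = 0.01
	camera.far = box_size              # Depth of the view volume.
	camera.cull_mask = particle_cull_mask
	# The camera looks down -Z by default, so place it on the box's +Z face.
	camera.position = box_center + Vector3(0.0, 0.0, box_size * 0.5)
	_viewport.add_child(camera)

func _process(delta: float) -> void:
	_countdown -= delta
	if _countdown > 0.0 or _busy:
		return
	_countdown = poll_interval
	_busy = true
	# Render a single frame, then read it back once it has been drawn.
	_viewport.render_target_update_mode = SubViewport.UPDATE_ONCE
	await RenderingServer.frame_post_draw
	var img := _viewport.get_texture().get_image()
	if img != null:
		# get_used_rect() encloses every pixel with non-zero alpha.
		target_hit = img.get_used_rect().has_area()
	_busy = false
```

Unlike the full-screen trick in the question's shader, a particle only registers here if its mesh covers at least one pixel center, so the viewport resolution has to be chosen with the particle size in mind. get_image() still forces a GPU-to-CPU copy, which is why the sketch polls on a timer instead of every frame.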

I can’t think of any better way of doing it. In a sense any point collision GPUparticle detection system is going to work like a camera.

But what other effects could we do, other than detecting impacts?

We could leave “marks” on a surface from the landing of the particles. If the surface was flat, then its texture could match the plane of an orthogonal camera with a very short far plane (less than 1mm), where you don’t clear the buffer between frames. I wonder if this could be a general trick where you have a wall where anything that hits it (GPUparticle or not) leaves a smear.

Hmm, you’re right that a manual bounds check in the vertex shader is essentially the same as clip space frustum culling, and an orthogonal camera projection does indeed give you a box. So if you ever only need a single camera frustum-sized collision box, the fixed function pipeline is probably as optimized as you’d get. I needed multiple collision boxes in my own gamejam game for this, so it wasn’t possible there.
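One way to sanity-check on the CPU that an orthogonal camera really frames the intended box is to test a known point against its view volume. This is only a debugging sketch, assuming Camera3D.is_position_in_frustum() is used with a camera that is already in the scene tree; detection_camera and probe are hypothetical exported references, not nodes from the scripts above.

```gdscript
extends Node3D

# Hypothetical references, assigned in the editor.
@export var detection_camera: Camera3D  # Orthogonal camera framing the detection box.
@export var probe: Node3D               # Any node you can drag around to test points.

var _was_inside := false

func _process(_delta: float) -> void:
	if detection_camera == null or probe == null:
		return
	# True when the probe's origin lies inside the camera's view volume,
	# which for an orthogonal camera is a box between the near and far planes.
	var inside := detection_camera.is_position_in_frustum(probe.global_position)
	if inside != _was_inside:
		print("Probe inside detection volume: ", inside)
		_was_inside = inside
```

Dragging the probe across the edges of the intended box in the editor or at runtime should flip the printed value at the same places where particles start registering.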

It’d be nicer if you could mask each of a particle system’s draw passes separately, so you could still have a ‘real’ draw pass for visuals and a cut-down draw pass for collision checks while still only running the particle system process shader once. AFAICT though, you can only set the VisualInstance3D’s layer on the particle system as a whole.

I think it’d be ideal if you could access the buffer with the particle system’s positions directly in a compute shader (to then do whatever collision testing logic you want without having to fit it into the fixed function pipeline). I think that buffer is in particle_storage.h’s Particle struct, but I don’t know Godot’s internals well enough to say if it’s possible to access that without deeper code changes.