P2P Networked Physics Inaccuracy. Seeking Help

Godot Version

Godot 4.2.1

Question

Greetings everyone,

I’m currently developing a P2P multiplayer game in Godot utilizing WebRTC. I have successfully set up most of the multiplayer synchronization and authority. In my test scene, I have two connected peers and a ball (all of them being 2D RigidBodies), with the ball controlled by the main/host peer.

The issue I’m facing is occasional jittering of the ball’s position, which seems to be related to the correct_error function. I got this script from a helpful user on this forum and adapted it for a RigidBody2D (special thanks to @pennyloafers). The script pulls the physics object on the main peer, gathers position, rotation, and velocities into an array, and syncs that to the client. The client receives the array and updates its local physics object in reverse, extracting from the array and placing it onto the physics server object.

I’m creating this topic because I’m somewhat confused about what steps I should take to address this issue. The solutions I’ve attempted so far haven’t produced satisfactory results. Should I implement interpolations in my script? What steps can I take to resolve this?

Demonstration

The larger screen represents the host, which operates normally without jittering. However, jittering and inconsistencies are noticeable in the smaller one.

My Script

BallSynchronizer.gd

extends MultiplayerSynchronizer
class_name PhysicsSynchronizer

@export var sync_bstate_array : Array = \
	[0, Vector2.ZERO, Vector2.ZERO, Vector2.ZERO]

@onready var sync_object : RigidBody2D = get_node(root_path)
@onready var body_state : PhysicsDirectBodyState2D = \
	PhysicsServer2D.body_get_direct_state( sync_object.get_rid() )

var frame : int = 0
var last_frame : int = 0

enum { 
	FRAME,
	ORIGIN,
	LIN_VEL,
	ANG_VEL,
}


#copy state to array
func get_state( state, array ):
	array[ORIGIN] = state.transform.origin
	array[LIN_VEL] = state.linear_velocity
	array[ANG_VEL] = state.angular_velocity


#copy array to state
func set_state( array, state ):
	state.transform.origin = state.transform.origin.lerp(array[ORIGIN], 0.5)
	state.linear_velocity = state.linear_velocity.lerp(array[LIN_VEL], 0.5)
	state.angular_velocity = lerpf(state.angular_velocity, array[ANG_VEL], 0.5)


func get_physics_body_info():
	# server copy for sync
	get_state( body_state, sync_bstate_array )


func set_physics_body_info():
	# client rpc set from server
	set_state( sync_bstate_array, body_state )


func _physics_process(_delta):
	if is_multiplayer_authority() and sync_object.visible:
		frame += 1
		sync_bstate_array[FRAME] = frame
		get_physics_body_info()


# make sure to wire the "synchronized" signal to this function
func _on_synchronized():
	correct_error()
	# is this necessary?
	if is_previous_frame():
		return
	set_physics_body_info()

#  very basic network jitter reduction
func correct_error():
	var diff :Vector2= body_state.transform.origin - sync_bstate_array[ORIGIN]
	# correct minor error, but snap to incoming state if too far from reality
	if diff.length() < 10:
		sync_bstate_array[ORIGIN] = body_state.transform.origin.lerp(sync_bstate_array[ORIGIN], 0)


func is_previous_frame() -> bool:
	if sync_bstate_array[FRAME] <= last_frame:
		return true
	else:
		last_frame = sync_bstate_array[FRAME]
		return false

Synchronizer Settings

Screenshot_20240130_212322

Screenshot_20240130_212254

Nice, I like the progression. I have a similar issue on my end with the same script and I plan to make a 2.0 version of this. But with my schedule… it will be a month or two.

I think we are at the whims of the CPU to make sure we get a game update every frame. We are probably getting game states at an inconsistent rate, sometimes more and sometimes less than one per 16 ms (60 fps) frame. That means there is “network” jitter caused by inconsistent packet handling, even on localhost.

One remedy could be to buffer incoming game states and then, on each client physics tick, apply the oldest buffered state, making sure that updates happen on the 16 ms mark. The drawback is that we add a frame of latency on the client for every frame we buffer.
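To put a number on that drawback, here is a rough sketch of the arithmetic. The function name is made up for illustration, and it assumes the client consumes exactly one buffered state per physics tick.

```gdscript
# Hypothetical helper, not part of the script in this thread.
# Extra client-side delay (in milliseconds) caused by holding
# `buffered_frames` states, assuming one state is consumed per physics tick.
static func buffer_latency_ms(buffered_frames: int, physics_fps: int = 60) -> float:
	return buffered_frames * 1000.0 / physics_fps
```

At the default 60 ticks per second, a 3-frame buffer adds about 50 ms and a 4-frame buffer about 67 ms on top of the network latency.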

This is an extreme change…

I will admit the correct_error function is basic, and it is lerping position every frame, so we should try to get away from that. We should let the local physics do its thing and make the corrections smarter.

Maybe we could only start correcting if the error is greater than 10 units, and not assign the sync state at all if it is less than 10 units. Just let the local physics work.
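A minimal sketch of that dead-band idea, for the 2D case from the original script. The helper name and the constant are assumptions for illustration; the 10-unit threshold is just the figure mentioned in this thread and would need tuning per game.

```gdscript
# Hypothetical helper; SNAP_THRESHOLD mirrors the 10-unit figure from this thread.
const SNAP_THRESHOLD := 10.0

# Returns the position the client body should use this tick.
# Small errors are ignored so the local simulation keeps running untouched;
# large errors snap to the authoritative (remote) position.
static func resolve_position(local_pos: Vector2, remote_pos: Vector2) -> Vector2:
	if local_pos.distance_to(remote_pos) < SNAP_THRESHOLD:
		return local_pos
	return remote_pos
```

On the client this could replace the unconditional assignment in set_state: only write the incoming origin to the body state when resolve_position returns the remote value.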

Just so you know networking is hard… If I have a chance to improve the script myself I’ll let you know how it goes.

Did you by chance give this article a read?

@suero
btw, I’m starting on the improvement process now.

I will be deep diving into some specifics about ENet reliable RPC to understand if delta synchronization could be utilized to improve bandwidth.

I will be refactoring the code to make it thread safe, because if anyone decides to use the “physics on a separate thread” option, this does not work atm. (I’m also speculating that this could be the source of the issue: the timing of the sync could be undoing the local physics process.)

I will probably move away from the singular array property, because a MultiplayerSynchronizer will pack all synced properties together anyway. This should be a small performance improvement as the code will not need to handle an untyped array.

I’m going to take the time to understand and test the stutter/jitter.

Maybe as a bonus I will implement better state handling for general network quality issues in the process.

That’s great! Good luck with your improvement process. I’ve been trying some other things but sadly they didn’t work out well. I’ll keep trying and will update this topic if I find a good solution as well.

So I’m still analyzing, but I think I see a pattern at least.

I created an isolated environment scene in Godot with two SceneMultiplayer branches. This provides separate multiplayer APIs for the child nodes, effectively running host and client in the same SceneTree and viewport.

I had to disable collisions on my test object because the two peers share the same world3D. This on paper doesn’t sound ideal, but I have successfully been able to control the issue.

If I run one instance of the test everything works perfectly. And there is a slight error offset between the host and client bodies. This is to be expected as there will be a frame delay for the RPC to happen.

If I run two instances (windows) of the test, things get weird. The first instance runs perfectly, but the second instance exhibits the choppy/jittery issue!

So what is going on?

At first I thought it could be a render issue, but I have data showing the test objects positions are actually jumping around. So I think that is mostly ruled out.

I think the investigation will be diving into the Godot main loop, Enet packet peer, and potentially the OS net layer of ‘localhost’.

If I can pin down the culprit, I hope to be able to resolve in the script, without having to submit a ticket to Godot.

Scratch that. As I was thinking about this I decided to randomize the port for each instance and the problem went away. I think there were some network collisions because the same port was used for the two host-and-client tests.

I don’t think this is the case for a traditional multiplayer setup when client and host are on separate windows.

I’ve done a lot of digging and I think I can confidently say that the issue resides in the jitter of packets between the two process windows.

I’m glad I made the single-window test, because it became a sort of control, giving the most ideal scenario for the net code. With single-window multiplayer there is little to no packet jitter; packet updates happen on time at 60 Hz. But when I set up two-window multiplayer the packet jitter gets a little wonky, sometimes fluctuating by multiple frames (2-4) between packets.

This causes the jump: we haven’t received frames from the server and the local physics has carried on, but once the “expired” frame arrives the current code uses it and pulls the body back to the “expired” frame state.

According to the resource I’m following, we need to implement a buffer of 4-5 frames so we can space the packets out evenly. This can get a little complicated because two machines may not run at exactly the same rate, so occasionally we may need to shift a frame. Or maybe we don’t care and allow for an occasional snap.

Another thing regarding snaps and pops of bodies is that the physics simulation needs to be the same on both ends, i.e. the forces, like friction, need to be the same.

I will be working on an implementation, but I feel like implementing a lot of this behavior in a single script may not scale well for more complicated games.

There are other things like bandwidth optimizations as well, which aren’t very effective if we do them on a per-node basis; we should probably go lower in the API to optimize more effectively.

Here is the refactored code

PhysicsSynchronizer Code
class_name PhysicsSynchronizer
extends MultiplayerSynchronizer

@onready var sync_object : PhysicsBody3D = get_node(root_path)
@onready var body_state : PhysicsDirectBodyState3D = \
	PhysicsServer3D.body_get_direct_state( sync_object.get_rid() )
@export var sync_pos   : Vector3
@export var sync_lvel  : Vector3
@export var sync_avel  : Vector3
@export var sync_quat  : Quaternion
@export var sync_frame : int = 0

var ring_buffer:RingBuffer = RingBuffer.new()

var last_frame = -1
var set_num = 0

enum {
	ORIGIN,
	LIN_VEL,
	ANG_VEL,
	QUAT, # the quaternion is used for an optimized rotation state
}


#func _ready():
	#synchronized.connect(_on_synchronized)

func _exit_tree():
	ring_buffer.free()

# copy the body state into the synced properties
func get_state(state : PhysicsDirectBodyState3D ):
	sync_pos = state.transform.origin
	sync_quat = state.transform.basis.get_rotation_quaternion()
	sync_lvel = state.linear_velocity
	sync_avel = state.angular_velocity


# copy a buffered frame onto the body state
func set_state(state : PhysicsDirectBodyState3D, data:Array ):
	state.transform.origin = data[ORIGIN]
	state.linear_velocity = data[LIN_VEL]
	state.angular_velocity = data[ANG_VEL]
	state.transform.basis = Basis(data[QUAT])


func get_physics_body_info():
	# server copy for sync
	get_state( body_state )


func set_physics_body_info():
	# client: apply the oldest buffered frame from the server, if there is one
	var data :Array = ring_buffer.remove()
	if data.is_empty():
		return
	set_state( body_state, data )


func _physics_process(_delta):
	if is_multiplayer_authority():
		sync_frame += 1
		get_physics_body_info()
	else:
		set_physics_body_info()


# make sure to wire the "synchronized" signal to this function
func _on_synchronized():
	# drop frames that arrive out of order or twice
	if is_previous_frame():
		return
	ring_buffer.add([
		sync_pos,
		sync_lvel,
		sync_avel,
		sync_quat,
	])


func is_previous_frame() -> bool:
	if sync_frame <= last_frame:
		print("previous frame %d %d" % [sync_frame, last_frame] )
		return true
	else:
		last_frame = sync_frame
		return false

class RingBuffer extends Object:
	const SAFETY:int = 1
	const CAPACITY:int = 4 + SAFETY
	var buf:Array[Array]
	var head:int = 0
	var tail:int = 0

	func _init():
		buf.resize(CAPACITY)

	func add(frame:Array):
		if _increment(head) == tail: # full
			_consume_extra()
		if is_low():
			_produce_extra(frame)
		buf[head]=frame
		head = _increment(head)

	# buffer is full: merge the two oldest frames into one to catch up
	func _consume_extra():
		#print( "RingBuffer: consume_extra")
		var next_index = _increment(tail)
		buf[next_index] = _interpolate(buf[tail], buf[next_index],0.5)
		tail = next_index

	func _produce_extra(frame:Array):
		#print("RingBuffer: produce_extra")
		var first_frame = _interpolate(buf[tail],frame, 0.33) # assume only one frame exists tail should point at it
		var second_frame = _interpolate(buf[tail],frame, 0.66) # assume only one frame exists tail should point at it
		buf[head]=first_frame
		head = _increment(head)
		buf[head] = second_frame
		head = _increment(head)

	func _interpolate(from:Array, to:Array, percentage:float) -> Array:
		var frame:Array = [
			from[ORIGIN].lerp(to[ORIGIN], percentage),
			from[LIN_VEL].lerp(to[LIN_VEL], percentage),
			from[ANG_VEL].lerp(to[ANG_VEL], percentage),
			from[QUAT].slerp(to[QUAT], percentage)
		]
		return frame

	func _increment(index:int)->int:
		index += 1
		if index == CAPACITY: # avoid modulus
			index = 0
		return index

	func remove() -> Array:
		var frame : Array = buf[tail]
		if is_empty() or is_low():
			frame = []
		else:
			tail = _increment(tail)
		return frame

	func is_empty() -> bool:
		return tail == head

	func is_low() -> bool:
		return _increment(tail) == head

Make sure to connect the synchronized signal to _on_synchronized and add the script’s exported properties (sync_pos, sync_lvel, sync_avel, sync_quat, sync_frame) to the replication config.

It removes the correct_error function, which was never great to begin with, and adds a ring buffer to manage frames from the server.
It can be tweaked, but it is recommended that the capacity stay at 4+1 at a minimum. This is because I tried to make it lock-free by keeping a space between the head and the tail. (This was never tested.)

Some ring buffer features:

  • If the server packets are coming in too fast, we consume frames faster to catch up by interpolating the two frames at the tail and dropping the oldest frame.
  • If the server is going too slow, we “stretch” two known frames into four. If we only have one frame in the buffer and a new frame comes in, we interpolate two extra frames between the tail and the newest frame, one at 33% and one at 66%, with the newest frame being at 100%. (This should be tweaked depending on the buffer size.)

This is all to maintain a healthy amount of frames without falling behind or getting ahead, but it could be changed based on needs.

I profiled the change and could not detect a meaningful difference in performance (less than ±1%).

There are still some jumps now and then, but overall it is better.

Hello, sorry for being inactive, I got delayed on the development. ASAP I’ll try your new script and read everything you sent so I understand it properly. Tysm for persisting in helping me!

No worries, I benefit from this as well. :grin:

I think it actually works pretty great, but I’ve started to make some tweaks on my end because of some minor consequences.

For example, instead of adding interpolated frames when the buffer is low on init, I just padded the buffer with empty array frames that are skipped when consumed. This should hopefully allow some real frames to buffer naturally without making up frames. I also got rid of the is_low check and allowed the buffer to be empty, returning an empty array frame. (And at least from testing it is very smooth now.)

I did keep the consume extra since I want to stay current with the host.

The drawback of the padded frames, though, is that newly spawned objects are in a default location briefly. I didn’t figure out a way to fix this… At least not yet. So instead I just hid them from view for a few frames.

I also made an optimization: physics objects that can go to sleep reduce their sync interval while sleeping, saving bandwidth.
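The actual change was not posted, so the following is only a guess at how such a sleep-based interval could look. It assumes the synced body is a RigidBody3D and uses the MultiplayerSynchronizer's replication_interval property; the two interval constants are invented values.

```gdscript
extends MultiplayerSynchronizer

# Assumed values for illustration; tune per game.
const AWAKE_INTERVAL := 0.0   # 0.0 means sync every network frame
const ASLEEP_INTERVAL := 0.5  # seconds between syncs while the body sleeps

@onready var body : RigidBody3D = get_node(root_path)


func _ready() -> void:
	# RigidBody3D emits this whenever it falls asleep or wakes up.
	body.sleeping_state_changed.connect(_on_sleeping_state_changed)


func _on_sleeping_state_changed() -> void:
	# Only the sending side needs to change how often it replicates.
	if is_multiplayer_authority():
		replication_interval = interval_for(body.sleeping)


# Pure helper so the choice of interval is easy to test on its own.
static func interval_for(sleeping: bool) -> float:
	return ASLEEP_INTERVAL if sleeping else AWAKE_INTERVAL
```

A client-side frame buffer would see fewer frames while the body sleeps, so it has to tolerate running empty without treating that as packet loss.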

I can share these changes if you like… I need to go to bed. Although at least for your soccer game some of these things probably won’t apply.

You could initialize all the buffer’s frames to contain a single position when an object spawns, or disable interpolation code for the first frame or two after an object spawns.
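A small sketch of the first suggestion, assuming the spawn position and rotation are already known on the client when the buffer is created (which may not hold if the first synced values arrive as zero). The function name is hypothetical; the slot order follows the enum in the refactored script.

```gdscript
# Hypothetical helper: builds a buffer whose every slot holds the spawn state,
# so the client never applies a default (zero) transform right after spawning.
static func make_prefilled_buffer(capacity: int, spawn_pos: Vector3, spawn_rot: Quaternion) -> Array[Array]:
	var buf : Array[Array] = []
	for i in capacity:
		# Same slot order as the script's enum: ORIGIN, LIN_VEL, ANG_VEL, QUAT.
		buf.append([spawn_pos, Vector3.ZERO, Vector3.ZERO, spawn_rot])
	return buf
```

Every padded frame then places the body at its spawn point with zero velocity until real frames from the host replace it.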

So, I’ve tested your script, and things really seem more accurate, but the problem now is bodies interacting with each other. Now only the host can push the ball properly. Also, players can’t push each other. Any suggestions?

Video demonstration:

If you want more details or talk better with me you can call me on Discord or Telegram.

discord: @sueroo
Telegram: @Suerow

I need to do more investigation on this. I don’t remember where I checked in the order of operations, but I recall that the client MultiplayerSynchronizers were getting synchronized callbacks with a position of zero for a few frames. I thought this could be due to some physics object initialization coming from the host, but even when I added a Node3D position to the replication it still didn’t sync immediately after spawn.

I will need to check again, but I had the feeling I would ultimately have to compromise with some timing aspect on the client.

Hmm, that is interesting…

I guess it’s a little hard to wrap my head around individual authority over physics objects… And honestly my implementation works for me because I’m using a central authority server. The two approaches may not be compatible in this regard.

Let me think about that for a bit.

Okay, sure. lmk about any question you may have

So, the two peers are sending frames about their RigidBody to each other, each with a respective buffer on the opposite peer’s machine. Let’s take the point of view of peer A, so peer A is local and peer B is remote.

When peer A collides with a stationary peer B, peer A hits with momentum, but the buffered peer B still has frame states from before the collision. So from peer A’s POV it has hit a brick wall and all momentum goes to zero.

From peer B’s POV, peer A approaches and stops at the point of contact, with no force to push B anywhere because A’s velocities went to zero.

So I think the issue is that we don’t capture the applied forces, which would attempt to move the remote player instances.

Godot classes don’t provide a means to get the “tangential” forces that you may apply frame-to-frame with input. We could get and set constant forces, but that may require a rework of your input system, and we probably don’t want to do this anyway: if a player’s connection is spotty it could cause erratic physics behavior, as the forces would be applied continuously.

So I think the best option would be to also pass input to the remote replica, so it adds the same forces you would apply on the local side and both local and remote see the same physics.

As I think about your soccer ball… I think I suggested that one peer should take authority of it. If the host kicks the ball at a remote peer it will probably act like it hit a brick wall too. So I think we would need to get creative about how to pass the forces to the remote players.

(at least from the video it seems like first contact pushes the remote peer, but cannot continuously push because velocities go to zero and we are not synchronizing the applied forces. So maybe at least for the soccer ball it works as intended? I do see you kick from the match authority to the remote peer and it moves abruptly.)

Well, this approach is starting to get complex for me. I’ll probably need to take some time to do deep research and study all of that.

So, I requested help from some ppl in the past and I have a lot of feedback on this issue. If you want to keep helping me, I think sharing it with you could be helpful. If u want, just hmu on the contacts I sent earlier.

Anyways thanks for keeping in touch till today, I’ll see what I can do but I’ll probably focus on other things about the game and trying to solve this big problem about the networked physics as a side quest XD

Sounds good. Multiplayer isn’t easy. You could also look into the asset library for more feature filled multiplayer api extensions that could take care of all these nuances of net code and state synchronizations.

If you want to implement it, I would create a separate MultiplayerSynchronizer that handles syncing the player input. But to do it cleanly you need to separate your _input from your _physics_process and form raw input values that will be used in the process function. Then send the raw input values to the remote instance to be processed in the same way.

You will also probably need to buffer the input syncer so that the physics state is time aligned with a corresponding input state.

You could also extend this physics syncer to just pull the raw input values off the rigidbody root node.
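A rough sketch of that input/physics split on a RigidBody2D player. The action names, MOVE_FORCE and the input_dir property are all assumptions for illustration; input_dir would have to be added to the replication config of a synchronizer owned by the player's peer, and the buffering needed to time-align it with the physics state is left out.

```gdscript
extends RigidBody2D

# Assumed tuning value and action names; replace with your own.
const MOVE_FORCE := 600.0

# Raw input value. Replicate this property so remote copies receive it.
@export var input_dir : Vector2 = Vector2.ZERO


func _process(_delta: float) -> void:
	# Only the peer that owns this player samples the real input device.
	if is_multiplayer_authority():
		input_dir = Input.get_vector("move_left", "move_right", "move_up", "move_down")


func _physics_process(_delta: float) -> void:
	# Every peer, local or remote, turns the same raw value into the same force.
	apply_central_force(force_from_input(input_dir))


# Pure helper: clamps the direction to unit length and scales it to a force.
static func force_from_input(dir: Vector2) -> Vector2:
	return dir.limit_length(1.0) * MOVE_FORCE
```

Because the remote copy applies the force itself instead of only receiving positions and velocities, it keeps pushing against other bodies during contact rather than stopping like a wall.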

The problem is that I already tried a Godot add-on for multiplayer called netfox. It does the same things you are telling me to do, but the add-on doesn’t support RigidBody2D. I talked with the dev of this add-on for a week or so and he told me about a “lack of engine support” for this type of situation.

Should I still try your implementation? Ppl tell me to do a lot of things, and others say it’s not possible, so I’m getting very confused XD