Posted by JordanFarr on December 30, 2013 at 10:55 PM
This post is a bit of a change of pace. We usually like to showcase features that we've been working on, but I thought it would be interesting, for those thinking about making a networked game, to talk about some of the problems and solutions that come up regularly when attempting something like this. The example I'd like to share with you guys is what we have been calling the unfixable break.
First, a little background: A break is a breach in the submarine’s hull caused by anything from the bite of a shark to the slamming of a tentacle into the submarine. These breaks constantly spew water into the sub’s interior, which fills gradually and can eventually drown your characters. To prevent that drowning death, players can take out a wrench and literally whack the break until the hole is fixed. Once all breaks have been repaired, water begins to drain out of the interior, and the players inside are safe, at least from that threat.
But what if you can’t fix a break?
Worse yet, what if you can’t fix a break, and on someone else’s screen, that break is completely repaired? What we have here is a gigantic desync nightmare, which in testing has led to a player dying on one screen and living on another. Because of how we handle death, this created a “ghost” of the character who had died on one screen, popping in for one frame and disappearing every time a keypress message came in. While this bug, like many others, is hilarious, we had to find a way to prevent it from happening.
One thing I’ve avoided saying thus far is what actually causes a break to be unfixable in the first place. Because we work on a slightly loose client-server model, the player who acts as the server is authoritative on the creation and deletion of breaks for all players in-game. The short version of why a break would become unfixable can be summed up in two words: packet loss. As it stood only a few versions ago, when a client nearly finished repairing a break, that break would only be fixed when and if the server believed it to be fixed, at which point the server would issue a “destroy this break object” command. The server, also carrying a local client, would receive its own mass-issued message, immediately delete the break object, and stop caring about it. But what if one of the other clients didn’t receive that message? The server wouldn’t carry any more information on that break, and would never send that crucial packet again.
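To make the failure concrete, here is a minimal Python sketch of that old behavior. It is not our actual game code: the `Client` and `Server` classes, the `delivered` map standing in for packet loss, and the `simulate_lost_destroy` helper are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """One player's local view of the sub (hypothetical structure)."""
    name: str
    breaks: set = field(default_factory=set)

    def on_destroy(self, break_id):
        # Remove the break locally when the "destroy" message arrives.
        self.breaks.discard(break_id)


@dataclass
class Server:
    """Authoritative host that forgets a break as soon as it announces the fix."""
    breaks: set = field(default_factory=set)

    def fix_break(self, break_id, clients, delivered):
        """Broadcast "destroy" once; delivered[name] says whether that copy arrives."""
        if break_id not in self.breaks:
            return  # the server no longer knows this break, so nothing is sent
        self.breaks.discard(break_id)
        for client in clients:
            if delivered.get(client.name, True):
                client.on_destroy(break_id)


def simulate_lost_destroy():
    host = Client("host", {1})
    guest = Client("guest", {1})
    server = Server({1})
    # The host's copy arrives; the guest's copy is lost in transit.
    server.fix_break(1, [host, guest], {"host": True, "guest": False})
    # A later attempt does nothing: the server has already dropped the break.
    server.fix_break(1, [host, guest], {"host": True, "guest": True})
    return host.breaks, guest.breaks
```

After `simulate_lost_destroy()` runs, the host no longer has break 1, but the guest still does, and nothing the server does afterwards will ever clear it.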
A Pretty Nice Fix
Now, I’m not going to say I’m all-knowing on networking junk (in fact, this is the first multiplayer game I've ever attempted), but I was pretty satisfied with this fix. To remedy the unfixable break, I removed the server’s immediate “destroy” broadcast for any break it saw as ready to be fixed. Instead, I gave authority over the fix to the client whose player is closest to a break that is ready to be destroyed. That is, if you are the player closest to a break ready to be fixed, you contact the server and ask it to mass-message the “destroy this break object” command. If you don’t receive a reply within 0.5 to 1 seconds of your request, you ask again. Finally, if another player never receives this message, then when they approach the break and attempt to fix it, they too will make that request, and will wait the same amount of time before asking again. With this model, there is almost no scenario in which a break object cannot be fixed, even when desync occurs, save for total disconnection.
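Here is a rough Python sketch of that request-and-retry idea, again not our real code. A few things in it are assumptions on my part for the sake of the example: the class and function names are invented, the retry delay is picked at random between 0.5 and 1 second, the server re-broadcasts “destroy” even for a break it has already removed, the "closest player" check is left out, and packet loss is faked by dropping the first few replies.

```python
import random
from dataclasses import dataclass, field

RETRY_MIN, RETRY_MAX = 0.5, 1.0  # seconds; assumed reading of the retry window


@dataclass
class Server:
    breaks: set = field(default_factory=set)

    def handle_fix_request(self, break_id, clients, delivered):
        """Re-broadcast "destroy" on every request, even for a break already removed."""
        self.breaks.discard(break_id)
        for client in clients:
            if delivered(client):
                client.on_destroy(break_id)


@dataclass
class Client:
    name: str
    breaks: set = field(default_factory=set)
    next_retry: dict = field(default_factory=dict)  # break_id -> time of next request

    def on_destroy(self, break_id):
        self.breaks.discard(break_id)
        self.next_retry.pop(break_id, None)

    def update(self, now, break_id, send_request):
        """Call every tick while this player is trying to fix a finished break."""
        if break_id not in self.breaks:
            return  # already fixed locally, nothing to ask for
        if now >= self.next_retry.get(break_id, 0.0):
            # Schedule the next attempt first, then ask the server.
            self.next_retry[break_id] = now + random.uniform(RETRY_MIN, RETRY_MAX)
            send_request(break_id)


def simulate(drop_first_n=2, tick=0.1, max_time=10.0):
    guest = Client("guest", {1})
    server = Server({1})
    attempts = 0

    def send_request(break_id):
        nonlocal attempts
        attempts += 1
        # Fake loss model: the first drop_first_n replies never arrive.
        arrives = attempts > drop_first_n
        server.handle_fix_request(break_id, [guest], lambda c: arrives)

    now = 0.0
    while guest.breaks and now < max_time:
        guest.update(now, 1, send_request)
        now += tick
    return attempts, guest.breaks
```

Calling `simulate(drop_first_n=2)` has the guest ask three times: the first two replies are lost, the third gets through, and the break finally disappears on the guest's side. The important design point is that the client that still sees the break is the one that keeps asking, so a lost packet only delays the fix instead of making it impossible.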