Stardock CEO Brad Wardell exhaustively details the whys and hows behind Demigod's crippling networking issues in a blog post aptly titled "Demigod: So what the hell happened?"
The early release and subsequent rampant piracy of Gas Powered Games' action real-time strategy title were just the beginning of the networking problems that have plagued Demigod since launch. Some bad networking decisions and assumptions on Stardock's part caused the issues to drag on for weeks. As initially set up, the game tried to open far too many sockets at once, the result of a late-2008 decision to have the network library hand sockets off to the game rather than have a single source handle every connection. Wardell's example almost makes it understandable.
...on launch day, Alice would host a game. Tom would be connected to Alice by the network library and then that socket would be handed to Demigod. Then, Alice and Tom would open a new socket to listen for more players to join in. As a result, a user might end up using a half dozen ports and sockets which some routers didn't like and it just made things incredibly complex to connect people and put a lot of strain on the servers to manage all those connections.
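To see how the socket count climbs under that scheme, here is a minimal sketch of the bookkeeping Wardell describes. It is a simplified model, not Stardock's code: the function name `handoff_sockets` and the starting port 6000 are made up for illustration, and it assumes each join consumes the current listening socket and opens exactly one new one.

```python
from itertools import count


def handoff_sockets(peers, first_port=6000):
    """List the ports one player holds open after `peers` others have joined.

    Hypothetical model of the launch-day design: every time a peer connects,
    the listening socket is handed to the game and a fresh one is opened
    to listen for the next player.
    """
    ports = count(first_port)          # pretend ports are handed out in order
    listener = next(ports)             # the socket currently waiting for joins
    handed_off = []
    for _ in range(peers):
        handed_off.append(listener)    # connected socket now belongs to the game
        listener = next(ports)         # open a new socket to keep listening
    return handed_off + [listener]


# A player connected to five others ends up holding six sockets.
print(len(handoff_sockets(5)))
```

Under these assumptions a six-player game already puts each player at the "half dozen ports and sockets" Wardell mentions, whereas a design that routes every connection through one source would keep that number at one.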
So what was happening? When I tried to play, two or three players would connect successfully and then the slow, agonizing wait would kick in. Brad explains why:
Alice hosts a game. In doing so, she sends a message to the NAT server (as well as our servers). Tom wants to join so Tom clicks join and it tells the NAT server to begin connecting them. But, it turned out that a relatively small number of people online at once would quickly result in a huge delay in messages being sent back and forth. For instance, when Tom clicks join it sends a message to the server to tell it to start connecting Tom and Alice. But Alice might not get that message for 30 or 40 seconds. That means, for that entire time, Tom and Alice are "attempting to connect" but haven't even really started because Alice hasn't even gotten the message. As more people tried to join the game, that delay could get worse and worse. If someone left the game, it could take that amount of time for the server to realize that player had left (meanwhile it was trying to connect them).
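Wardell doesn't say exactly why the messages backed up, but the symptom looks like a queue that fills faster than it drains. The sketch below is a generic illustration of that effect, not a model of Stardock's actual server: it assumes a single relay that forwards messages one at a time, in order, and the arrival rate and per-message cost are made-up numbers.

```python
def delivery_delays(arrival_times, service_time):
    """Return how long each message waits before a one-at-a-time relay forwards it.

    Assumes messages are handled strictly in arrival order and each takes
    `service_time` seconds to forward (both are illustrative assumptions).
    """
    delays = []
    server_free_at = 0.0
    for arrival in arrival_times:
        start = max(arrival, server_free_at)   # wait if the relay is still busy
        server_free_at = start + service_time  # relay is busy until this time
        delays.append(server_free_at - arrival)
    return delays


# 100 messages arriving 0.1 s apart, each costing 0.5 s to forward.
arrivals = [i * 0.1 for i in range(100)]
delays = delivery_delays(arrivals, 0.5)
print(f"first message: {delays[0]:.1f} s, last message: {delays[-1]:.1f} s")
```

With these numbers the first message goes out in half a second, while the hundredth waits roughly 40 seconds, the same order of delay Wardell reports for Alice. The backlog only grows as long as messages keep arriving faster than they can be forwarded.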
At this point, the people inside the room waiting for the game to start will have resorted to cannibalism, and most of them weren't even hungry. It's just that frustrating.
Brad goes on to detail the changes that have been made, and those still to come, as Stardock continues to polish the network experience, along with plans for downloadable content and an eventual demo. Most importantly, his post leaves us with the lesson Stardock learned from releasing Demigod.
We've learned that you can't treat networking as just another thing to plug in like you would a sound library or even a 3D engine. It's a whole different animal. With Elemental (our next game), it's single-player focused but its MP will be server based (and I mean we literally host the game). After Demigod, I don't ever want to hear the words "socket" or "port" again.
Demigod: So what the hell happened? [Brad Wardell's Impulse Blog]