When everybody shows up at once to play games like Palworld and Helldivers 2, it’s like a thunderous stampede for the companies trying to keep those games up in the face of unexpected popularity.
When this happens to a game, it’s a lot worse than a concert ticket launch for an act like Bruce Springsteen, where the tickets for a show are gone in seconds. That’s because the game has to be supported by an army of servers, not just when players are downloading it all at once, but also when they’re trying to play for hours at a time.
It’s a complicated problem, and the launches of recent hits like Helldivers 2 and Palworld show that the technological challenge of keeping a game operating as a live service isn’t solved yet. But there are vendors who swear they have this problem under control, if only game developers will offload their backend services to them. I interviewed a number of game launch service providers about this.
Pocket Pair’s Palworld, which was described by many as Pokemon with guns, got more than 25 million players in just a month, including two million on its first day. The result was that many players couldn’t get into the game as demand just kept growing.
“There’s that question: How much opportunity cost do you have when those failures occur? And how many players will never give you that second chance?” said Chris Cobb, CTO of Pragma, a third-party backend services company, in an interview with GamesBeat.
Pocket Pair’s Palworld used Epic Games for part of its solution, and it had to hold an emergency meeting to resolve one of its problems. Palworld also received backend support from Xbox, whose parent Microsoft owns Playfab. Xbox actively collaborated with the development team to enable dedicated servers, expedite updates and optimize the overall gaming experience across Xbox platforms.
Arrowhead Game Studios’ Helldivers 2 used Microsoft’s Playfab. The explosive launch of Helldivers 2 placed a strain on the game’s servers.
As a result, players encountered stability and connectivity issues while trying to play. The servers reached capacity, leading to difficulties in joining the game and receiving mission rewards or purchasing new weapons. In both cases, the developers were small companies.
Even so, there are third-party service providers like Pragma, Hathora, Amazon Web Services, Google Cloud, Microsoft’s Playfab, AccelByte, Improbable, Coherence, Multiplayer, Unity, RallyHere, Hadean and more that have tried to tackle the problems of too many players.
But no one has published a post-mortem on what actually happened with those games and their backend services. Call of Duty uses an internal team dubbed Demonware. Roblox also has its own internal team to handle 72 million monthly active players. But not everyone is big enough to handle its own online services.
“By now, you would think as an industry we would be better than this,” said Mark Jacobs, CEO of Unchained Entertainment, who is attempting to build a game that can support a ton of players. “But it’s hard. Especially for teams who haven’t experienced that joy and sorrow and absolute chaos of overnight success, of unexpected success.”
Cobb added, “To be honest, I don’t think it’s gotten better. And I think the reason for that is, again, for a triple-A publisher, they’re rebuilding this from scratch every time.”
To get a feel for the state of the market in 2024, I talked to a range of experts at the third-party providers.
Reality check
So what’s the reality?
The concurrency problem requires you to work smarter. Rather than buying a few expensive servers, the Google approach is to take a lot of cheap servers and write the software to spread the load across many machines in what is known as horizontal scaling, Cobb said. It’s essential to architect the code horizontally from the start. You can do stress testing, but if you don’t have enough people doing it, you may never find the problem that only manifests when there are a million people in the game.
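To make “architecting horizontally” concrete, here’s a rough sketch in Go (illustrative only, not drawn from any vendor’s code): the service keeps no player state in its own memory, so identical copies can be added behind a load balancer as demand grows. The Redis store here is an assumption for illustration; any shared external store plays the same role.

```go
package main

// Minimal sketch of a horizontally scalable service: the process holds no
// player state of its own, so any number of identical replicas can sit
// behind a load balancer. State lives in a shared store (Redis here is an
// assumption for illustration; any external store works the same way).

import (
	"context"
	"fmt"
	"net/http"
	"os"

	"github.com/redis/go-redis/v9"
)

func main() {
	// Every replica connects to the same external store.
	rdb := redis.NewClient(&redis.Options{Addr: os.Getenv("REDIS_ADDR")})

	http.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		playerID := r.URL.Query().Get("player")
		// Session data is written to the shared store, never to local memory,
		// so the player's next request can land on a different replica.
		if err := rdb.Set(context.Background(), "session:"+playerID, "online", 0).Err(); err != nil {
			http.Error(w, "store unavailable", http.StatusServiceUnavailable)
			return
		}
		fmt.Fprintf(w, "welcome %s\n", playerID)
	})

	// Scaling out is then just starting more copies of this binary.
	http.ListenAndServe(":8080", nil)
}
```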
“That points to something that we talk to our customers a lot about when they engage with Google Cloud,” said Jack Buser, director of games, strategic partnerships, at Google, in an interview with GamesBeat. “Launching a game is very difficult. And it’s very complex. There are topics like concurrency, matchmaking, AI, crashes, cheating. Each one of those issues is worthy of discussion.”
Buser believes that it’s important for game developers “not to try to reinvent the wheel.” Yet he sees many companies going with a vendor who can handle part of the services and then trying to do the rest on their own. The problem is what to do when your game succeeds beyond your wildest dreams.
“Every engineering hour you’re spending reinventing technology that already exists, you could spend making your game better and also preparing for the launch,” he said.
The time to address problems isn’t after the launch. It’s before, he said, with something like an infrastructure review. And it requires a combination of technology — having enough servers to handle the load of users — as well as the human experience of game developers who have faced these problems before.
Some games like Call of Duty and Battlefield have learned to stage their users. They have a VIP class of players who get early access to a game’s multiplayer play. The team can examine the gameplay and make adjustments by the time the masses of players sign on a week later. That spreads out the queuing.
“It’s a great technique and one of many tools in the toolbox,” Buser said.
Traffic issues can stem from the way the game developers architect the technology, or how they have leveraged third-party solutions, or how they have done capacity planning. Google has a team of experts that offer what they call “launch counsel.”
Google uses its Kubernetes engine for scaling traffic and security. The team looks at load testing and at the database, which is often the point of failure because it’s cobbled together. For that, Google offers Spanner, the database it uses for its own internal live services.
“We built it because traditional databases couldn’t keep up with the velocity of our businesses at Google,” he said. “It was built for things like at-scale live services.”
Matchmaking is another category where Google built and open sourced a solution dubbed Open Match.
One of the most common reasons for games struggling with new player influx is inadequate server capacity. If the servers can’t handle the sudden surge in players, it can lead to connection issues, lag, and overall poor performance.
New games or updates may also have unforeseen technical issues that only become apparent when large numbers of players start using them simultaneously. Bugs, glitches, and compatibility issues can all contribute to a rocky launch.
One-in-a-thousand bug
Pragma’s Cobb said that a one-in-a-thousand bug could be a small problem in a single-player game. But online, it can bring the whole game down. Too often, he sees teams that don’t grasp the gravity of the decision to do the backend work themselves rather than hand it over to (possibly expensive) backend services teams.
“If you have a one in a thousand bug on your game client, that means the game might crash for a player rarely,” Cobb said. “But if you have a one in a thousand bug on your backend at all, it is 100% guaranteed to happen at some point pretty quickly because you got a lot of players in there at once. And once that crashes, you’ve just knocked out every player in the whole game. The quality bar for the backend is higher.”
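A little back-of-the-envelope math shows why (the numbers here are illustrative, not Cobb’s): if a failure hits one in a thousand sessions, the chance that it hits somebody climbs toward certainty as concurrency grows, and on the backend that one hit can take everyone down.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Illustrative numbers only: if a bug fires with probability 1/1,000 per
	// player session, the chance that at least one session hits it is
	// 1 - (1-p)^n. On a client, that is one unlucky player; on a shared
	// backend, that one hit can knock out everyone at once.
	p := 1.0 / 1000.0
	for _, n := range []float64{1, 100, 1_000, 10_000, 100_000} {
		atLeastOnce := 1 - math.Pow(1-p, n)
		fmt.Printf("%8.0f concurrent sessions -> %.4f chance the bug fires\n", n, atLeastOnce)
	}
}
```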
Normally, big companies are the ones who can afford the internal teams to do the customized work for particular game backends. But Cobb has seen big messes in the past like EA’s Anthem, and that’s at a big company.
“Azure has plenty of servers available, but how the code is written and architected affects your ability to plug in another server,” Cobb said. “We call that horizontally scalable, which means we can drop in another server and it just scales out. But if the software didn’t get written that way, you might have to spend six months or a year rewriting the software to make it work in that manner.”
Sometimes internal game company engineers will want access to source code for the third-party provider’s tools. Unreal Engine grants them access to the source code. Other service providers may not do that, and that presents small companies with a dilemma.
Unchained Entertainment’s Jacobs, who will speak about this subject at our GamesBeat Summit 2024 event, wants to get to thousands of players in each battle. He’s using Azure, Google Cloud Platform and Amazon Web Services to help with the load of possible players coming into the multiplayer game.
But his own team is working on the challenge of getting lots of players into one space. Unchained Entertainment is making Final Stand: Ragnarok, an upcoming online fantasy game with massive battles, and it just went into early access yesterday.
Jacobs described how hard it is for so many companies to get multiplayer gaming right at the launch. Not so long ago, his team encountered the kind of one in a thousand bug that Cobb mentioned.
Through years of development, he said, the seasoned team had “avoided one of the dreaded multiplayer bugs, which is all of a sudden someone dies and their body disappears. And the game thinks they’re dead, right? And you see that in multiplayer game after game, and we had avoided it. We’re in good shape. We had a report of that. And I’m like, ‘Oh my god, no. Not now.’”
In testing, they saw the bug three times in a month of gameplay. But fortunately, they fixed the problem. The lesson is that some bugs appear only when large numbers of players engage with a game and push it to its limits. Much of the company’s work is figuring out how to accommodate more and more players inside a single massive online battle.
Underestimating demand
Jacobs said he is thankful he has veterans in place.
“But for a lot of teams, you get hit with this just crazy amount of traffic. And if you haven’t had it before, if you don’t have somebody on your team who’s had to deal with it before, you can be in a lot of trouble,” Jacobs said.
Others have similar ambitions. Jenova Chen, CEO of Thatgamecompany, announced last summer at Gamescom that the company’s Sky: Children of the Light had 10,061 people on screen at the same time in the same server for its latest in-game concert from superstar singer Aurora. And that enabled the company to officially win a Guinness World Record title of “Most users in a concert-themed virtual world.”
Thatgamecompany was able to do that by working with characters that were not particularly sophisticated when it came to 3D imagery. If characters were far from your position, you couldn’t really see them or the art quality was fuzzy. And the team limited chat communication to people who were near each other. So it made a lot of tradeoffs on the quality levels that made sense. The interactions were also quite limited in terms of the kinds of actions each character could take. But nobody complained because the characters were able to do cool things like fly through the air.
“Their work is a great example of clever game design,” Cobb said. “They simplified the technical challenges to accommodate a large number of players.”
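That kind of tradeoff is often called interest management: the server only sends full-detail updates about players who are actually near you. Here’s a rough, generic sketch of the idea in Go (illustrative only, not Thatgamecompany’s code):

```go
package main

import (
	"fmt"
	"math"
)

// Player is a stripped-down representation of a connected character.
type Player struct {
	ID   string
	X, Y float64
}

// nearbyPlayers returns only the players within radius of p. A server that
// sends full-detail updates (and chat) only for this subset does far less
// work per player, which is one way thousands can share a space.
func nearbyPlayers(p Player, all []Player, radius float64) []Player {
	var out []Player
	for _, other := range all {
		if other.ID == p.ID {
			continue
		}
		if math.Hypot(other.X-p.X, other.Y-p.Y) <= radius {
			out = append(out, other)
		}
	}
	return out
}

func main() {
	players := []Player{{"a", 0, 0}, {"b", 5, 5}, {"c", 900, 900}}
	// Player "a" gets detailed updates about nearby "b" but only a coarse
	// (or no) view of the distant "c" -- the tradeoff described above.
	fmt.Println(nearbyPlayers(players[0], players, 50))
}
```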
Sometimes, developers may underestimate the level of interest or demand for their game, leading to a shortage of resources or infrastructure to support the influx of players.
Can you get a million players on day one?
“You certainly hope for it, but it’s really expensive to try to over engineer to plan for it, especially if you’ve just increased your budget by a considerable amount,” he said.
On another front, if the community management team is unprepared for the influx of new players, it can lead to chaos in forums, social media, and customer support channels. Lack of timely responses to player concerns or feedback can exacerbate the situation.
Poor communication from the developers about known issues, ongoing fixes, or estimated resolution times can frustrate players and exacerbate negative sentiments.
Advance preparation
Frost Giant Studios’ Stormgate is getting ready for its launch in the real-time strategy genre. It launched an open beta in February during Steam’s Next Fest, and it took advantage of that to test the servers of Hathora, a multiplayer game server hosting company, in collaboration with Pragma’s backend engine.
Stormgate, dubbed by many as the spiritual successor to StarCraft II, is being built by an independent studio with a stated goal of building community and getting feedback from players before the public launch. The test was able to determine the responsiveness of the gameplay. It tested SnowPlay, Frost Giant’s in-house gameplay engine that includes rollback netcode, and it attacked the problem of latency, or a lack of responsiveness in a game’s interactions.
“We’ve seen that demand for multiplayer games is growing and the player base is demanding better and bigger types of multiplayer games, and what we see is that technology hasn’t really kept up,” said Siddharth Dhulipalla of Hathora, in an interview with GamesBeat. “Trying to meet this demand is challenging. It is seen as a special art only accessible to the largest of teams with the most funding and the longest runway. We’re trying to drastically lower that barrier for multiplayer development.”
Pragma handled tasks such as matchmaking, cross-platform accounts, social, meta game systems, live operations, monetization, telemetry and analytics. Once Pragma handles matchmaking, it turns the game over for hosting on Hathora’s servers.
“There’s obviously a lot that goes into making a successful launch day. On the technology side, your login system needs to be working. Your matchmaker needs to be operational. Your live services, your telemetry, and our platform provides a very critical component, which is server hosting and orchestration,” Dhulipalla said.
The game turned out to be the No. 2 most-played game at Steam Next Fest. In a talk at DreamHack Atlanta last year, Frost Giant’s lead server engineer Austin Hudelson explained how Stormgate’s matchmaking leverages Hathora’s global regions, and they did that throughout Next Fest.
Frost Giant Studios made an early strategic decision to partner with Hathora and Pragma to handle its backend stack. The teams spent time using the connected suite of tech ahead of launch to build confidence that the stack would be rock-solid.
For Stormgate in particular, Pragma’s tech provided matchmaking and a ranking system that players saw in Next Fest, and Hathora waited for Pragma’s signal to spin up servers just in time for each match in the region that optimized each player’s ping.
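A simplified sketch of that just-in-time flow might look like the following; the provisionServer call below is a hypothetical stand-in for whatever allocation API the hosting provider actually exposes.

```go
package main

import (
	"errors"
	"fmt"
)

// pickRegion chooses the region with the lowest average ping across the
// matched players. pings maps region name -> each player's measured ping in ms.
func pickRegion(pings map[string][]int) (string, error) {
	best, bestAvg := "", -1.0
	for region, samples := range pings {
		if len(samples) == 0 {
			continue
		}
		sum := 0
		for _, ms := range samples {
			sum += ms
		}
		avg := float64(sum) / float64(len(samples))
		if bestAvg < 0 || avg < bestAvg {
			best, bestAvg = region, avg
		}
	}
	if best == "" {
		return "", errors.New("no ping data")
	}
	return best, nil
}

// provisionServer is a hypothetical stand-in for the hosting provider's real
// API call that allocates a game server on demand and returns its address.
func provisionServer(region string) string {
	return fmt.Sprintf("%s.example-host.net:7777", region)
}

func main() {
	// After the matchmaker forms a match, spin up a server only then,
	// in whichever region minimizes the group's latency.
	pings := map[string][]int{
		"us-east": {40, 55, 60},
		"eu-west": {120, 110, 130},
	}
	region, err := pickRegion(pings)
	if err != nil {
		panic(err)
	}
	fmt.Println("connect to:", provisionServer(region))
}
```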
Hathora’s role was similarly specialized. Its tech enabled Stormgate to launch game sessions globally on its network and automatically turn off servers that weren’t being utilized. That made server provisioning and orchestration one less thing the Stormgate team needed to worry about in advance of the launch.
When the game went live during Next Fest, there were team members from Hathora, Pragma, and Frost Giant all together in the war room to ensure that Stormgate had all it needed in the event of an unforeseen hiccup.
When it comes to load testing, launch days can be prepped months in advance. In the weeks prior to a launch, Hathora helps studios run tests to build confidence that launch day will scale smoothly to targeted loads. Hathora’s own tests showed it could scale the game to a million concurrent users.
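The shape of such a test is simple even if the scale isn’t: simulate many clients hitting the service at once and count what breaks. A toy sketch in Go (the endpoint and client count are made up for illustration; real tests ramp gradually and run from many machines):

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	// Toy load test: fire N concurrent "clients" at a login endpoint and
	// count failures. This only shows the shape of the exercise.
	const clients = 500
	target := "http://localhost:8080/login?player=test" // placeholder endpoint

	var failures int64
	var wg sync.WaitGroup
	start := time.Now()

	for i := 0; i < clients; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := http.Get(target)
			if err != nil {
				atomic.AddInt64(&failures, 1)
				return
			}
			defer resp.Body.Close()
			if resp.StatusCode != http.StatusOK {
				atomic.AddInt64(&failures, 1)
			}
		}()
	}
	wg.Wait()

	fmt.Printf("%d clients in %v, %d failures\n", clients, time.Since(start), failures)
}
```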
And Hathora said it recommends teams playtest their games directly on Hathora, where the game will go live. The company has servers in 10 regions around the world.
Customizing the backend
Besides Hathora, another backend-as-a-service company is Snapser, headed by CEO Ajinkya Apte. The company makes it possible to easily customize multiplayer game services. Customers can “snap in” their own services alongside prefabricated services. It splits the difference between one-size-fits-all services like Microsoft’s Playfab and custom services like Pragma or Accelbyte.
Apte earned his stripes in running massive games at Zynga, helping games like FarmVille get off the ground with tens of millions of players. He built a central team to handle the online services needed by Zynga’s games.
His solution at Snapser is to give game developers plenty of options for customizing their own backend services.
He said the matchmaker is a hot potato. As far as the backend is concerned, it’s easy to scale up services like leaderboards. But the matchmaker is almost like an evolving organism. It’s never static. As the number of players grows, it becomes a mathematical problem, with more attributes and more things to figure out between players, Apte said.
One of the problems is a kind of herding problem. When a player can’t get into a match quickly, that player will log out and log back in. That leaves the first match in the lurch, requiring it to seek another player. Then the original player logs back in and complicates things further. The effect cascades, delaying matches from starting because of matchmaking problems.
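One common mitigation is to treat each queued player as a ticket that is removed cleanly when they bail, so a half-formed match gets backfilled rather than stalling on a ghost. Here’s a generic sketch of that idea in Go (not Snapser’s implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// Matchmaker holds a queue of waiting player IDs. Removing a player who
// disconnects (instead of leaving a stale entry) keeps a half-formed match
// from waiting on someone who is already gone.
type Matchmaker struct {
	mu    sync.Mutex
	queue []string
}

// Enqueue adds a player and returns a full match once enough are waiting.
func (m *Matchmaker) Enqueue(id string, matchSize int) []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.queue = append(m.queue, id)
	if len(m.queue) >= matchSize {
		match := append([]string(nil), m.queue[:matchSize]...)
		m.queue = m.queue[matchSize:]
		return match
	}
	return nil
}

// Cancel removes a player who logged out so they are not matched while gone.
func (m *Matchmaker) Cancel(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for i, q := range m.queue {
		if q == id {
			m.queue = append(m.queue[:i], m.queue[i+1:]...)
			return
		}
	}
}

func main() {
	mm := &Matchmaker{}
	mm.Enqueue("p1", 3)
	mm.Enqueue("p2", 3)
	mm.Cancel("p2") // p2 got impatient and logged out; drop the stale ticket
	mm.Enqueue("p3", 3)
	fmt.Println(mm.Enqueue("p4", 3)) // prints [p1 p3 p4]
}
```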
“Our architecture is completely different. What we do is build Lego building blocks, and it’s a fully containerized, microservices-driven framework. And so you will actually see people just adding custom code to Snap,” he said.
So far, five games have launched on the Snap platform, which is going to be an end-to-end ecosystem for the backend.
Future problems
Are there other problems to solve besides the volume of users arriving to get into a multiplayer game? Yes, there’s cheating, toxicity and figuring out how to get more players into the same server, or shard, at the same time.
Companies like Unchained Entertainment are trying to solve that problem in PC games with massive battles, while Thatgamecompany is trying to solve it across multiple platforms with massive concerts for the game Sky: Children of the Light. Each company has to prioritize different things — like speed, concurrency, matchmaking, interactivity and 3D imagery detail in order to strike the right balance.
“We set out right from the beginning to have an engine that would do one thing better than anyone else. And that is to deliver large-scale battles,” said Jacobs. “That has been a mantra that I keep repeating. And, you know, that’s the one thing that nobody else has proven they can do in a game.”
Cobb thinks it’s more fun to have small groups of players in their own instances rather than tons of players in a single instance. That’s more like real life, except in situations like concerts. And cramming lots of players into a single space is also just too hard a problem to solve.
“My adage is the most fun game is the one that you’re able to play,” he said.
Sometimes it takes new infrastructure, like improvements in bandwidth and latency at companies like Comcast Cable, to address the problems. Having good hardware matters, said Dhulipalla at Hathora.
“For us, obviously, compute is a huge part. But on top of that, the network layer is also very important. So we’ve actually benchmarked a bunch of the public infrastructure providers. And we’ve chosen ultra-premium networks, specifically ones that offer edge acceleration,” said Dhulipalla.
Riot Games tried to build alternative internet infrastructure, as did Subspace (the latter shut down).
But one thing is clear. Before we move on to the metaverse, we have to solve a lot of these problems in gaming first.