Taking Games' Commentary Beyond Repeat Performances

Even if your wideout didn't see that linebacker coiled up between the hashmarks in Madden, ready to hit him like a lifetime supply of bad news, anyone who's played the game can quote from memory what comes next.

It's when Cris Collinsworth saunters up like your uncle with three or four in him before Thanksgiving dinner, hangs a hammy arm over your shoulder and dislodges something sounding like a Rotary Club anecdote that everyone heard a zillion times by 1985.

"Let me tell youuuuuu," Collinsworth assures yet again, "you haven't lived until you've caught one of those high hard ones and had one of those clowns flip you upside down." It's a piece of dialogue so comically overplayed it's a wonder Collinsworth hasn't been autotuned into a Steve Porter remix by now.

I think Collinsworth's an outstanding commentator - real life or in Madden. Where most analyst voice talents end up doing an impersonation of themselves in video games, Collinsworth's lines, unto themselves, do come out with Sunday authenticity. He just sounds out of it because, given the limitations of either the technology or the cost of development, sports games apply a brute-force approach to the dialogue: record a bunch of lines and then apply them to the action, a kind of rhetorical card trick that disagrees with the live broadcast standards that games seek to emulate.

Is there any solution, without completely devoting the game's disc to a commentary engine? Well, maybe - and, please, stop me if you've heard this one before - it's not what you say, but how you say it.

That's one of the propositions of Phonetic Arts' "PA Commentator" software, something that's been likened to "the vocal equivalent of motion capture," for its potential transformative effect, the company's cofounder and CEO says. It's not only a more sophisticated engine for managing a game's dialogue tree - making "X gets the ball to Y" sound more natural - it'll manipulate the audio itself to put different inflections on the same recorded lines, so that a player might hear the same words but not the same commentary.

"Games have never been perfect; they've never been complete models for reality," Paul Taylor, Phonetic Arts' cofounder, said from the studio's headquarters in Cambridge, England. "But I think in the so-called sports broadcast genre, Madden, FIFA and the 2K Sports programs, the point is to make it like a TV program. I'd say we're 20 percent there with the effect we want to achieve."

That's the state of the art right now, which is still governed by throwing a ton of dialogue into a game and having it pick out the useful circumstances to which it might apply. Even if, for example, MLB 2K10 landed the Dodgers' golden-pipes announcer Vin Scully as its voice, it would still be Vin Scully working off a script, which no one wants to hear. And here's the counterintuitive part: the more he or any other talent tried to ad-lib or make it sound like it wasn't a script, the worse it would be.

"In general, the longer the line or the more novel it is, the more unwelcome it is to hear more than sparingly," said Joel Simmons, the audio director for 2K Sports. "Short play-by-play bits that get repeated aren't noticed nearly as much as color analysis. Fans get especially tired of hearing the same line about a specific player, team, or situation."

That brings us back to Collinsworth's thankless job as a Madden boothman, trying to inject natural-sounding analysis into a game that, featuring 22 players on the field at the same time, will spit back an incalculable number of outcomes that simply can't be described naturally without a live person calling the action.

P.A. Commentator doesn't approach infinity with the possibilities it can describe. But neither can the existing studio methods. And if a game's commentary structure were solely about the number of unique phrases it could execute with the help of an audio tree, every sports publisher could do that on its own right now, up to the budget it has for recording sessions and the memory limit of the disc it's stamping.

What P.A. Commentator seeks to do is streamline the branching of that dialogue. Consider Kevin Harlan in NBA 2K10, who breaks his cadence from player name ("Bryant!") to shot type ("for three!") to outcome ("got it!") in a way that can sound like automated phone support when your cable is out.

Funny enough, that's where Phonetic Arts has its roots. Taylor's worked in speech synthesis for 20 years, building phone announcement systems that would process credit card numbers and what have you. "The main uses for this were very boring, shall we say," Taylor said. After a few conversations with colleagues in sports studios, he realized the technology he understood could be brought to bear on a field that badly needed it.

And that's where P.A. Commentator's inflection is supposed to come in. It'll take a studio's recorded dialogue and, using concepts like "signal processing" that'll probably fly over a layperson's head anyway, blend it and flow it into a reaction more appropriate than a one-size-fits-all name and description.

"It's not a huge effect, but it's just enough that it sounds different," Taylor says. "Part of the reason repetition is such an issue is because our brains remember things quite well." Translation: The same sound file played twice is by definition repetitive. Going into it and placing varied emphasis on a player's name, or his action, or the outcome, overcomes that.

That addresses play-by-play, which is more linear and hews to predictive X plus Y equals Z models that sound engineers instinctively know how to build. Color analysis is much more open-ended and, unfortunately, defines an increasing percentage of a real-life sports broadcast. EA Sports and 2K Sports have built dynamic commentary into their games, drawing on the player progressions either in real life, through the roster update put out by the studio, or the context of the season simulation underway on a user's console.

Taylor's company has analyzed sports broadcasts to a very granular level, so they know what they're up against. "Commentators are fairly constantly talking," Taylor says, and the difference when you get to a video game is they're not. There's a lot of dead space, especially in the replays, because games either haven't been developed for or haven't been asked to respond to situations in broader contexts.

"Pre-recorded commentary triggered by events and conditions in the game is the way things have always worked," 2K's Simmons said. "The difference is definitely the amount of context you have. Before MLB Today and NBA Today, there was no context for a quick game you just pick up and play with a friend. Especially with the MLB title, most of the gamers are big baseball fans. They don't want to pick up a game in July or August and hear commentators using canned stuff about last year. What's going on this season? What's going on this month, or this week? Who's injured and how is it affecting my team?"

That requires an awareness, and as Taylor was explaining all this, it briefly occurred to me that the first time we see that in an artificial intelligence might not be in Skynet. It might be in a sports game, programmed to see the action on its field of play and interpret it in context and memory.

But that is far, far down the road. Taylor estimates that P.A. Commentator's capability, as it exists this year, would bring sports game commentary only halfway to the realism of a live broadcast. That's still valuable enough that his company is "actively working with many of the top studios," and Taylor's hoping to see his technology implemented in a sports title by the end of the year.

The next leap, maybe you're talking about a technological advance where the CPU is capable of mimicking a known announcer's voice and inflection, and it's a matter of telling it what to say, a process that scales much, much higher than pre-recorded audio, even if it's manipulated on the back end. Roger Ebert, with an immense library of spoken performances, has had his voice nearly replicated by computer, but that technology is years, if not decades, away from being cost-effective and usable in a video game.

Regardless, this is a situation peculiar to sports games, and sports games' audio must deliver a verisimilitude demanded by no other community of gamers.

"If you've just gotten into an adventure game with a clear set of levels, there's a good chance you will go through that whole game without hearing anything twice," Taylor said. "But with a sports game, you know, there aren't any levels. You're playing the same game again and again and again, and you're quite often playing it with your favorite team."

The challenge, then, is as old as storytelling itself: How to tell people, sitting around the glow of a campfire or a television, the things they've heard a thousand times before, but in a new and interesting way.

Stick Jockey is Kotaku's column on sports video games. It appears Saturdays at 2 p.m. U.S. Mountain time.