Amazon Adding Metacritic Scores Is Bad News For Everyone

The immensely powerful aggregation website Metacritic just got a little more powerful, partnering up with Amazon, the biggest online retailer in the world, to display Metascores on video game pages.

This appears to be a quiet launch—rolling out gradually over the course of the week—but that Metascore is anything but quiet. It’s right in your face, and it will likely have a significant impact on Amazon’s game sales.

That’s bad news for anyone who cares about video games, for a number of reasons:

1) Metacritic’s system is faulty. I’ve written extensively about the problems with Metacritic: how its scores remove nuance and ambiguity; how game publishers have influenced and tampered with scores; how Metascores affect which game studios stay afloat; how Metacritic culture has actively changed the way some developers make games. Check out my full report from last year to read just how Metacritic affects the video game industry. It’s not comforting.

http://kotaku.com/metacritic-matters-how-review-scores-hurt-video-games-472462218/

2) Games change; Metascores don’t. No matter how many times a game is patched or tweaked or improved in any way, that big ol’ number won’t change. Let’s say reviewers give something a low score because of bugs, and over the next few months, the developers squash all those issues in subsequent patches. The Metascore will stay the same. “Metacritic scores really are that snapshot in time when a game is released, or close to after it’s released,” Metacritic boss Marc Doyle told me last year.

That might be a helpful number when a game first comes out, but for older games, Metascores aren’t just obsolete—they can be actively misleading. Maybe Amazon should warn readers that Metascores represent reviews as they were when the game was released?

3) Polarizing games are treated as “average.”

Look at Nier, an action-RPG with a 68 on Metacritic:

That big yellow number is supposed to mean “average,” but really, despite the collection of 7/10s, it’s hard to find people who look at Nier as an “average” game. Nier is polarizing. People either love it or hate it. Saddling the game with a 68, a bad score by most accounts, does a disservice to people who might love the weirdness of a game like this, or many other bizarre titles that sit in the 60s and 70s on Metacritic.
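To see why a single average can hide a split like this, here is a minimal sketch in Python. The scores are invented for illustration and are not Nier’s actual reviews, and the plain average below is an assumption for simplicity, not a claim about how Metacritic computes its Metascores. Two hypothetical games land on the same number even though the opinions behind that number look nothing alike.

```python
from statistics import mean, pstdev

# Hypothetical review scores, invented for illustration only.
polarizing = [95, 90, 90, 48, 45, 40]  # reviewers split between love and hate
lukewarm = [70, 69, 68, 68, 67, 66]    # reviewers all agree the game is so-so

for name, scores in [("polarizing", polarizing), ("lukewarm", lukewarm)]:
    # The average is identical for both lists; only the spread tells them apart.
    print(name, "average:", mean(scores), "spread:", round(pstdev(scores), 1))
```

Both lists average out to the same score, but the spread (the standard deviation) is far larger for the polarizing game. A lone number on a store page throws that difference away.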

4) Score aggregation poisons discussions and invites unfair comparisons.

While it is impossible to compare, say, Super Mario 3D World to The Last of Us, Metacritic invites us to do just that. Mario has a 93; The Last of Us has a 95. By Metacritic’s—and now Amazon’s—definition, The Last of Us is two points better than Super Mario 3D World, even though one game is a cartoon platformer and the other is a cinematic zombie adventure game. Trying to quantify a video game’s quality encourages absurd conversations and comparisons, and teaches readers to focus on the wrong things. It’s discouraging to see Amazon participate in that culture.

https://kotaku.com/the-problem-with-review-scores-part-v-1326561822

As the critic and games writer Tom Bissell once told me in an e-mail: “Metacritic encourages the fallacy that all opinions should be weighted equally, and that a ‘bad’ review is an unenthusiastic review. But that’s not true. There are some games I am *more* likely to play when a certain critic gives them what Metacritic regards as a ‘bad’ review. Metacritic leaves no room to discuss, much less pursue, guilty-pleasure games, noble failure games, or divisive games. Everything’s just a 7, or an 8, or a 6.5. That’s the least interesting conversation I can imagine.”

5) Review scores mean different things to different people.

Here’s how the gaming website Polygon describes a 7/10:

Sevens are good games that may even have some great parts, but they also have some big “buts.” They often don’t do much with their concepts, or they have interesting concepts but don’t do much with their mechanics. They can be recommended with several caveats.

And here’s how the magazine Game Informer describes a 7/10:

Average. The game’s features may work, but are nothing that even casual players haven’t seen before. A decent game from beginning to end.

Those are two drastically different ways to define the same score, which renders the two numbers meaningless when averaged or stacked up against one another. Polygon’s 7 is different from Game Informer’s 7. Yet review roundups and aggregation websites like Metacritic don’t take that into account. How can you trust an average when everyone’s working on a different scale?
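The mismatch can be sketched in a few lines of Python. The labels below are my own shorthand for the two definitions quoted above, and the plain average is an assumption for illustration, not Metacritic’s actual formula.

```python
from statistics import mean

# Shorthand for what a 7/10 means at each outlet, paraphrased from the
# definitions quoted above. These labels are illustrative, not official.
MEANING_OF_SEVEN = {
    "Polygon": "good, recommended with caveats",
    "Game Informer": "average",
}

# Two hypothetical reviews of the same game, both scored 7/10.
reviews = [("Polygon", 7), ("Game Informer", 7)]

# A naive aggregate sees two identical numbers...
naive_average = mean(score for _, score in reviews)

# ...while the outlets themselves are saying two different things.
verdicts = {MEANING_OF_SEVEN[outlet] for outlet, _ in reviews}

print("aggregate score:", naive_average)
print("distinct verdicts behind it:", len(verdicts))
```

The aggregate comes out as a tidy 7, yet it is built from two verdicts that do not agree with each other.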

I don’t think there are ill intentions here. Amazon is likely embracing Metacritic as a way to serve its users. After all, these scores are designed to help people sort out what’s worth their time and money. But the consequences could be really bad news: video game publishers and developers working even harder, not to experiment or make games better, but to improve their Metascores.