Data Reaper Report – FAQ
Vicious Syndicate’s Data Reaper is the first report in the Hearthstone scene that maps the meta based on actual games played. Over the past several months we have built a network of players who track their games, along with software that identifies the decks played and aggregates all the game data into a weekly picture of what the meta game looks like.
Since we began publishing our reports, many have asked questions about the methodology. In this FAQ, we provide answers to the most commonly asked questions.
Q: Can you tell us more about how the data is collected?
A: We collect the data from players who have Track-O-Bot installed on their PC or Mac. We then compile the information into a large database, which forms the basis for our reports. If you are interested in contributing your data, be sure to fill out this form:
For more information about how we prepare the Data Reaper Report, you can listen to the interview with our data engineer and our data analyst on the ID DM podcast:
Q: Do you only consider the decks your contributors play or do you also try to infer the decks of their opponents?
A: We only report on the decks of the opponents. This way, we provide an unbiased picture of what the meta looks like to a random ladder player. That is, it shows which decks a player should expect to face.
Q: How do you identify the opponents’ deck?
A: We use algorithms that ID decks based on the cards played, and we continually monitor those algorithms for accuracy. Of course, not every game can provide a definitive ID, and some decks do overlap. Still, we believe that our algorithms provide an accurate overall picture of what has been played over the past week. As we continue with this project, we will experiment with various approaches and algorithms in order to keep improving all aspects of the Data Reaper Report.
Q: I would like to see real time meta reports. Would that be possible with your data?
A: We now have a Data Reaper (Live), which provides a picture of the meta game over the last few hours. Currently, it is in a Google Sheet. We are developing a new interface for it.
Q: How do you compute win rates in decktype vs. decktype matchups?
A: Win rates are tricky, so it is important to explain our methodology and how to interpret the numbers. If a player has experience with a deck and has tracked their own win rates, it is worth comparing those rates to the ones we publish. Differences may indicate whether the player is more or less proficient with the deck than the population we track. Win rates will also vary with the particular tech choices a player has included that differ from the “average” deck.
To compute the matchups, we evaluate them from two perspectives. We compile the win percentages of all our tracker players who play a particular matchup. For example, let’s suppose that our players win 65% of their games piloting a Zoo Warlock deck against Midrange Shamans. We then evaluate the same matchup from the other side. That is, what happens when our opponents play Zoo Warlock and our trackers play Midrange Shaman. Let’s suppose that this win rate is 55%. Assuming that the average builds are similar, and that the sample size is sufficiently large, these differences may suggest that our players are more proficient at Zoo, or our opponents are less proficient in Midrange Shaman, or both. To correct for these discrepancies, we take the simple average of the two win rates, and conclude that in this matchup Zoo is favored and the expected win rate is 60%.
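The two-sided averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Game` record and the `two_sided_win_rate` function are hypothetical names, and the game counts are made up to reproduce the 65% and 55% figures from the example. It is not the code behind the report.

```python
from typing import Iterable, NamedTuple


class Game(NamedTuple):
    # Hypothetical record layout: one row per tracked game.
    tracker_deck: str
    opponent_deck: str
    tracker_won: bool


def two_sided_win_rate(games: Iterable[Game], deck: str, versus: str) -> float:
    """Win rate of `deck` against `versus`, averaged over both perspectives."""
    wins_as_tracker = games_as_tracker = 0
    wins_as_opponent = games_as_opponent = 0
    for g in games:
        if g.tracker_deck == deck and g.opponent_deck == versus:
            # Perspective 1: our trackers pilot `deck`.
            games_as_tracker += 1
            wins_as_tracker += g.tracker_won
        elif g.tracker_deck == versus and g.opponent_deck == deck:
            # Perspective 2: opponents pilot `deck`; it wins when the tracker loses.
            games_as_opponent += 1
            wins_as_opponent += not g.tracker_won
    if games_as_tracker == 0 or games_as_opponent == 0:
        raise ValueError("need games from both perspectives")
    rate_tracker = wins_as_tracker / games_as_tracker
    rate_opponent = wins_as_opponent / games_as_opponent
    # Simple (unweighted) average of the two win rates.
    return (rate_tracker + rate_opponent) / 2


# Made-up sample: trackers on Zoo win 13 of 20 (65%);
# opponents on Zoo win 11 of 20 (55%).
sample = (
    [Game("Zoo Warlock", "Midrange Shaman", True)] * 13
    + [Game("Zoo Warlock", "Midrange Shaman", False)] * 7
    + [Game("Midrange Shaman", "Zoo Warlock", False)] * 11
    + [Game("Midrange Shaman", "Zoo Warlock", True)] * 9
)
print(f"{two_sided_win_rate(sample, 'Zoo Warlock', 'Midrange Shaman'):.0%}")
```

Note that the average is unweighted: each perspective counts equally, even if one side has many more recorded games than the other.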
Q: As far as I know track-o-bot data doesn’t include the mode that the game was played in (wild/standard). Are you guys filtering in a way that notices banned cards to exclude this tainted data?
A: Track-O-Bot (TOB) does not distinguish between Wild and Standard. However, we filter Wild games out of the data set, which is not very difficult to do.
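One simple way to do this kind of filtering, shown here only as a hedged Python sketch and not as the exact rule used for the report, is to drop any game in which a card that is not legal in Standard was played. The `WILD_ONLY_CARDS` set below is a small illustrative subset, and the function names are hypothetical.

```python
# Illustrative subset only; a real list would cover every card
# from the sets that are not legal in Standard.
WILD_ONLY_CARDS = {"Piloted Shredder", "Dr. Boom", "Sludge Belcher"}


def is_standard_game(cards_played):
    """True if none of the cards seen in the game are Wild-only."""
    return WILD_ONLY_CARDS.isdisjoint(cards_played)


def drop_wild_games(games):
    """Keep only games with no Wild-only card; each game is a list of card names."""
    return [cards for cards in games if is_standard_game(cards)]
```

A rule like this can only catch Wild games in which a Wild-only card was actually played, so a short Wild game with no such card would slip through.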
Q: Are you mixing together results from the European and Americas servers, or just reporting one?
A: Through the first 10 Data Reaper Reports, we have tracked games from both servers together. At this point, the split of games between the two servers is fairly even. We constantly watch for differences in deck distributions across the two servers and have found none so far. If that changes, we will certainly let everyone know.
Q: How do the card usage Radar Maps work?
A: We scan the database of games from a particular week and run it through our code. The product is a chart full of circles and links between them. Each circle on the chart is a card that an opponent has played. The circle size indicates the number of opponents that have played this card. Two cards are linked if they have been played by the same opponent. These links operate like springs: the larger the number of opponents that have played two cards together, the stronger the spring tension, and the closer the cards are on the chart.
Conversely, cards that have no link between them tend to repel each other. Applied to our data, these conflicting forces result in a visualization where core class cards shared by most decks (e.g. [Fiery War Axe], [Execute]) have a central location, while cards that characterize a specific archetype (e.g. [Alexstrasza’s Champion], [Blackwing Corruptor]) are clustered in a peripheral area.
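To make the spring analogy concrete, here is a minimal force-directed layout in plain Python. It is a sketch under several assumptions, not the code that draws the radar maps: every pair of cards repels with an inverse-square force, linked pairs are additionally pulled together in proportion to their link weight, and weights are assumed to be normalized to the range 0 to 1 (for example, the share of opponents who played both cards). The function name, parameters, and example weights are all hypothetical.

```python
import math
from itertools import combinations


def spring_layout(cards, links, iterations=500, repulsion=1.0,
                  step=0.05, max_move=0.1):
    """Return {card: [x, y]} after a simple spring/repulsion simulation."""
    n = len(cards)
    # Start on a unit circle so the result is deterministic.
    pos = {c: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, c in enumerate(cards)}
    for _ in range(iterations):
        force = {c: [0.0, 0.0] for c in cards}
        for a, b in combinations(cards, 2):
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            # A link may be keyed in either order; unlinked pairs get weight 0.
            weight = links.get((a, b), links.get((b, a), 0.0))
            # Positive pull draws the pair together, negative pushes it apart.
            pull = weight * dist - repulsion / dist ** 2
            fx, fy = pull * dx / dist, pull * dy / dist
            force[a][0] += fx
            force[a][1] += fy
            force[b][0] -= fx
            force[b][1] -= fy
        for c in cards:
            for axis in (0, 1):
                move = step * force[c][axis]
                # Cap each move so a near-collision cannot blow up the layout.
                pos[c][axis] += max(-max_move, min(max_move, move))
    return pos


# Hypothetical weights: the first two cards are played together far more often.
layout = spring_layout(
    ["Fiery War Axe", "Execute", "Blackwing Corruptor"],
    {("Fiery War Axe", "Execute"): 1.0,
     ("Fiery War Axe", "Blackwing Corruptor"): 0.1},
)
```

In this toy layout, the strongly linked pair settles close together, while the weakly linked card drifts toward the edge.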
With as many games as our data contains, it looks almost as if every possible pair of tech cards has been played together at least once, so the visualization tends to be cluttered with a lot of irrelevant information. To reduce this noise, we exclude from the charts the cards and links that fall below a frequency threshold (namely 5% of games for cards and 1% for links).
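The counting and thresholding can be sketched as follows. This is an illustrative Python example rather than the report's actual code: it assumes each game is represented by the list of cards the opponent played, and `build_radar_graph`, `card_min`, and `link_min` are hypothetical names. It also assumes a link is dropped whenever either of its cards is dropped.

```python
from collections import Counter
from itertools import combinations


def build_radar_graph(games, card_min=0.05, link_min=0.01):
    """Count cards and card pairs, then drop the ones below the thresholds.

    games: one collection of card names per game (the opponent's cards).
    Returns (cards, links) as dicts mapping to the number of games.
    """
    total = len(games)
    card_counts = Counter()
    link_counts = Counter()
    for played in games:
        unique = sorted(set(played))  # count each card once per game
        card_counts.update(unique)
        # Sorting keeps every pair keyed in a single, consistent order.
        link_counts.update(combinations(unique, 2))
    cards = {c: n for c, n in card_counts.items() if n / total >= card_min}
    links = {
        pair: n for pair, n in link_counts.items()
        if n / total >= link_min and pair[0] in cards and pair[1] in cards
    }
    return cards, links
```

The card counts would drive the circle sizes, and the link counts would drive the spring strength between two circles.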
Black circle = Neutral card
Colored circle = Class card (color is different for every class)
Ring color = Rarity
Also, if some of the cards appear to be outside of the canvas’ range, you can click on a card and drag the cluster around to adjust your view. Note that currently, the radar maps are not mobile friendly.