Steam Spy


Steam Spy is a website created by Sergey Galyonkin and launched in April 2015. The site uses an application programming interface (API) to the Steam software distribution service, which is owned by Valve, to estimate the number of sales of software titles offered on the service. Estimates are made by polling Steam user profiles through the API to determine which software titles they own and then using statistics to estimate overall sales. Software developers have reported that Galyonkin's algorithms can provide sales numbers that are accurate to within 10%, though Galyonkin cautions against using his estimates in financial projections and other business-critical decisions. After changes to Steam's privacy features in April 2018, Galyonkin anticipated that he would need to shut down the service because he could not estimate accurate numbers from other sources, but later that month he revealed a new algorithm using publicly available data, which, while producing a larger number of outliers, he still believes is reasonably accurate.

Concept and history

Tracking of video game sales is of strong interest to the video game industry, but it is less robust than in other industries, such as television with the Nielsen ratings system or music with the Billboard charts. Though the NPD Group does track retail and digital sales of video games, access to this data requires payment and it does not typically break down the distribution of sales across platforms. Sites like VGChartz have attempted to collect more detailed sales figures based on external data, but there have been reported problems with how this data is aggregated. Valve's Steam client is the largest outlet for digital sales of games for Microsoft Windows, OS X, and Linux platforms. Normally, sales of video games and other software offered on Steam are kept confidential between Valve and the publishers and developers of the titles, though developers and publishers are free to release these numbers to the public if they wish. Valve does offer statistics on the most bought and played games, but otherwise does not provide any sales figures publicly. Galyonkin noted that whereas the film industry receives funding from financing companies that are more open about sharing their financial results, funding within the video game industry comes from a variety of non-traditional sources, leaving the market reluctant to report on the sales of a game.
The idea for Steam Spy originally came from a similar approach used by Kyle Orland and the website Ars Technica for their "Steam Gauge" feature starting in April 2014. Steam Gauge uses the Steam API to access publicly available user profiles and obtain the list of games that each user owns. At the time of its creation there were over 170 million Steam accounts, making it impractical to poll the game list of every account. Instead, they opted to poll between 80,000 and 90,000 accounts each day to collect the game lists, and then used sampling statistics to estimate the total ownership of each game. Ars Technica estimated at the outset that the margin of error was within 0.33%.
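The arithmetic behind this kind of sampling estimate can be sketched in a few lines of Python. The sketch is illustrative only and is not Ars Technica's actual code: it assumes a simple random sample of accounts and a normal approximation at 95% confidence, and the function names are hypothetical. Under those assumptions, a daily sample of 90,000 accounts gives a worst-case margin of about 0.33 percentage points on a game's ownership share, which is consistent with the figure quoted above, although the source does not say how that figure was derived.

```python
import math


def estimate_owners(sample_size, owners_in_sample, total_accounts, z=1.96):
    """Estimate total owners of a game from a random sample of accounts.

    Returns (estimate, margin), where margin is the half-width of an
    approximate 95% confidence interval, expressed in accounts.
    Assumes a simple random sample (an assumption, not stated in the source).
    """
    # Share of sampled accounts that own the game.
    p = owners_in_sample / sample_size
    # Standard error of a sample proportion (normal approximation).
    se = math.sqrt(p * (1 - p) / sample_size)
    # Scale both the share and its uncertainty up to the full population.
    return p * total_accounts, z * se * total_accounts


def worst_case_margin(sample_size, z=1.96):
    """Largest possible margin on the ownership share, reached at p = 0.5."""
    return z * math.sqrt(0.25 / sample_size)


if __name__ == "__main__":
    # Hypothetical game owned by 900 of 90,000 sampled accounts,
    # scaled to the 170 million accounts mentioned in the text.
    estimate, margin = estimate_owners(90_000, 900, 170_000_000)
    print(f"estimated owners: {estimate:,.0f} +/- {margin:,.0f}")
    print(f"worst-case margin on share: {worst_case_margin(90_000):.4%}")
```

Note that the margin is an absolute figure on the ownership share, so it is small relative to a widely owned game but large relative to a game owned by only a tiny fraction of accounts.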
Galyonkin was inspired by Steam Gauge to create Steam Spy. At the time, Galyonkin was a Senior Analyst at Wargaming. Steam Spy uses the same approach of sampling a small percentage of Steam accounts, approximately 100,000 to 150,000 per day, with a rolling sampling approach. The collected data is processed nightly to create the visualizations used on the site, which therefore also show historical trends for games. As with Steam Gauge, Galyonkin notes that Steam Spy is subject to sampling errors, so estimates for newly released games or for games with low sales are unlikely to be accurate. The polling approach is also skewed by promotions that Valve runs, such as when a game is offered for free over a weekend; during this time, the game will appear as owned on every Steam profile, artificially inflating the sales numbers. Galyonkin's method also polls the amount of time that each profile has played a particular game, allowing him to collect estimated playtime statistics on a per-game basis.
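Why estimates for low-selling games are unreliable can be illustrated with a short Python sketch. It is not Galyonkin's algorithm: it assumes a single day's simple random sample of 150,000 accounts, reuses the 170 million account total mentioned earlier, applies a normal approximation at 95% confidence, and uses a hypothetical function name. Because a rolling sample accumulates accounts over several days, the real errors may be smaller than this sketch suggests.

```python
import math


def relative_margin(true_owners, total_accounts, sample_size, z=1.96):
    """Approximate 95% margin of error as a fraction of the estimate itself.

    Assumes a simple random sample of accounts; all inputs here are
    illustrative values, not figures published by Steam Spy.
    """
    # True share of accounts that own the game.
    p = true_owners / total_accounts
    # Standard error of the sampled share (normal approximation).
    se = math.sqrt(p * (1 - p) / sample_size)
    # Dividing by p expresses the margin relative to the estimate.
    return z * se / p


if __name__ == "__main__":
    for owners in (30_000, 300_000, 3_000_000):
        margin = relative_margin(owners, 170_000_000, 150_000)
        print(f"{owners:>9,} owners: roughly +/- {margin:.0%}")
```

Under these assumptions a game with 30,000 owners is expected to appear in only a few dozen sampled profiles, so its estimate can be off by tens of percent, while a game with millions of owners is estimated to within a few percent.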
As of September 2016, Galyonkin continues to operate the site and has plans for several major features, alongside his current position as Head of Publishing for Eastern Europe for Epic Games.
However, in April 2018, Valve announced a change to Steam's privacy policies, giving users the ability to hide games, friend lists, and other elements of their profile as a means to aid user privacy. Galyonkin announced that with this change, specifically because the default settings for all users would hide these profile elements, he would be unable to collect the data needed to run Steam Spy, and that he planned to shut down the service. Galyonkin had other data sources available to pull estimates from, but he did not feel that they provided the same accuracy as his approach with Steam Spy; while he would use these sources for personal research as well as for his role at Epic Games, he did not plan to publish the results in depth. Galyonkin planned to maintain the archives of Steam Spy indefinitely. The change in Steam's privacy settings came at a time of heightened awareness of personal data security, including the Facebook–Cambridge Analytica data scandal and the pending enforcement of the European Union's General Data Protection Regulation, and brought Steam's privacy settings in line with those offered by game console services. While journalists and game developers believed that the changes in Steam were for the best, they feared that the closure of Steam Spy would have a significant impact on independent game developers, who had used the service to gauge potential markets and make sales projections.
Later in April 2018, Galyonkin reported that he had returned to some earlier algorithms he had developed, which used other publicly available data, as a possible replacement. He tested the algorithm on 70 games, with most of his estimates coming within 10% of known sales figures, but he also observed more outliers, making this algorithm less accurate than his previous one, though still sufficient for market research purposes. Galyonkin plans to refine this algorithm and use it to continue operating Steam Spy.
In December 2018, Epic Games announced their plans to start a storefront similar to Steam, the Epic Games Store. Galyonkin stated shortly after the announcement that he had been working with Epic for some years to help them develop the store, using the information and analysis he gained in running Steam Spy to help establish some of the features and policies that the Epic Games Store will use, such as providing developers with as much sales data as they reasonably can. Galyonkin will still continue to operate Steam Spy.

Impact

Galyonkin says that several developers have confirmed his sales estimates as close. Gamasutra says that developers that they have spoken to also agree that the numbers from Steam Spy are "in the right ballpark". Several developers speaking to PCGamesN stated that Steam Spy is accurate to within 10% of actual sales figures for games with more than a few thousand sales, while the accuracy drops for low-selling titles; Galyonkin himself says data for games under 30,000 sales should be considered suspect. Dave Gilbert of Wadjet Eye Games noted that developers should be cautious about using Steam Spy's data for financial projections, as the analysis does not factor in the price paid when a game was purchased or obtained, which can fluctuate due to sales, gifts, developer promotions, and other situations. An undisclosed video game publisher, speaking to Gamasutra, praised Steam Spy as a go-to tool used frequently in their industry to track trends, stating that "each game of ours tracks, broadly speaking, within its stated margin of error". Galyonkin does caution about the accuracy of Steam Spy data, equating it to the accuracy of political surveys, but believes it is sufficiently accurate for general analysis of trends and broad distributions. The aforementioned undisclosed publisher noted that Steam Spy data should never be used in isolation to make critical development or publishing decisions.
Prior to August 2016, Galyonkin honored all requests from developers and publishers to remove games from his tracking system; examples include Kerbal Space Program and all games published by Paradox Interactive. When Paradox requested the removal in June 2016, Paradox's Shams Jorjani noted that they had seen "flawed" business plans from developers seeking their publishing support that were based solely on ownership data published by Steam Spy, prompting their request. However, in August 2016, Galyonkin reversed this policy, retroactively reinstating the stats for games previously removed. Galyonkin made this decision when Techland asked him to remove their games from the site, prompting him to reassert that his site was meant to be a polling tool for game developers and not to capture accurate sales data, and that "removing several important independent games from the service will hurt everyone else while not necessarily benefitting the publishers of the removed games". Galyonkin noted there was no legal requirement for him to hide this otherwise-public and non-confidential data, and felt no developer was harmed by revealing this information. One developer from a Latin American company did tell Gamasutra that they were concerned about Steam Spy's reporting of the high sales of their game, as it could potentially lead to their offices becoming a target for theft in their region.
Steam Spy has been used to help quantify certain trends in video game buying behaviors by both Galyonkin and other sources. For example, when Valve introduced the ability for buyers to request a refund for any game within certain time constraints in mid-2015, Galyonkin observed that most games received a small increase in sales, proposing that the refund policy made users more open to trying games. Galyonkin also observed that games using the Steam Early Access program typically had their largest sales at the point of their Early Access release as opposed to their completed release. Galyonkin's data also shows, based on a two-week period in August 2015, that Steam users spend the most time playing games published by Valve, including Dota 2 and Team Fortress 2. Galyonkin is able to track the number of concurrent players for a game over time, allowing him to determine a "Hype Factor" and "Surprise Factor" for games based on how much their player base shrank or grew, respectively, after release, and correlate that to sales estimates.
In June 2018, Valve said that they were also looking to offer tools similar to Steam Spy directly to others through APIs, with more accurate reporting than Steam Spy provides.