How Is Data Transforming Football? The Complete Guide

In 2019, Dundee United, then playing in the Scottish Championship, hired a teenager living in Bangalore, India as a part-time scout and analyst. Through his blog and Twitter account, Ashwin Raman had built a reputation for himself as a shrewd interpreter of the avalanche of data that is collected in modern football. His work came to the attention of Dundee United’s Head of Analysis and Opposition Scouting, Stevie Grieve, who offered him a job. The club are now in the Scottish Premiership.

The club’s on-the-field success cannot simply be attributed to one teenager living on another continent. But the way in which Dundee United discovered and hired Raman is a striking example of how data is transforming football. For the modern football club, being able to collect and correctly analyse data about player recruitment, tactics and training is essential, and this need is changing the kind of people clubs employ and the way they approach the game.

Moneyball

It was in 2003, with the release of Michael Lewis’s book Moneyball: The Art of Winning an Unfair Game, that the idea that data analysis could play an important role in sporting success first came to prominence. The book, which was made into a film starring Brad Pitt and Jonah Hill, showed how data analysis was essential to the unexpected success of the Oakland Athletics baseball team under the management of Billy Beane.

It took a while for football to follow suit, but within the last decade the growth of data analysis in the sport has been phenomenal. American owners, such as the Fenway Sports Group who own Liverpool, have helped import the model into the English game. The group’s co-founder John W. Henry first saw the potential of a data-driven approach as owner of the Boston Red Sox baseball team. He offered Billy Beane $12.5 million to become their General Manager, an offer Beane declined.

Since Fenway Sports Group’s takeover of Liverpool in 2010, the club have developed an impressive team of data scientists to improve their performance on the pitch and enhance their recruitment of players. Traditionally, coaches and scouts were former players with first-hand experience of professional football, but Liverpool’s research team includes people with PhDs in Physics and Astronomy. Indeed, all the top clubs now have a raft of data analysts and technical scouts, as well as deals with technology firms to collect, store and interpret data.

Smaller clubs, such as Brentford and Barnsley, have also used data to drive their recruitment strategies, enabling them to rise to the cusp of promotion to the English Premier League on modest budgets. The Moneyball connection is particularly strong at Barnsley, where Billy Beane was part of the consortium led by Chinese-American businessman Chien Lee that bought the club in 2017.

But not everyone is keen on the connection. Matthew Benham, the owner of Brentford, has distanced himself from the term Moneyball, saying that, ‘The Moneyball label can be confusing because people think it is using any stats rather than trying to use them in a scientific way.’[1] The way in which data is used is crucial and there is a myriad of different ways in which the scientific use of data is having a significant impact both on and off the pitch.

On the Pitch

What matters most in football is what happens on the pitch. A huge amount of data about player performance is now recorded in the major leagues and tournaments from cameras around the stadium and GPS vests worn by the players. This provides a broad and detailed range of statistics including how much ground a player has covered during a match, their average position on the pitch in relation to others, and where they are having the most touches of the ball. The data can also be used to calculate measurements such as how many Expected Goals a team should have scored based on the quality of the chances they created.
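As a rough illustration of the Expected Goals idea, the sketch below adds up a scoring probability for each shot a team takes. The probabilities are invented for this example; real models estimate them from large sets of historical shots, and the function name is only a placeholder.

```python
# Hypothetical shots: each value is an assumed probability that the shot is scored.
shots = [0.76, 0.08, 0.03, 0.31, 0.12]


def expected_goals(shot_probabilities):
    """Sum the per-shot scoring probabilities to get a team's Expected Goals."""
    return sum(shot_probabilities)


print(round(expected_goals(shots), 2))
```

A team that creates one clear chance and several speculative efforts can therefore end up with a similar total to a team that creates a handful of moderate chances, which is why the metric is read alongside the actual scoreline rather than instead of it.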

This data has had an impact both on the tactics managers employ against particular opponents and on the overall way the game is played.

Data analysis played a crucial role in a famous 4-0 victory for Hearts against Celtic in the Scottish Premiership in 2017, a win which ended Celtic’s record-breaking run of 69 domestic games unbeaten. The Hearts manager, Craig Levein, revealed after the match that the club had analysed Celtic’s defeat against Anderlecht in the Champions League and based their tactics on what had worked for the Belgian side.

Anderlecht had played a 4-3-3 formation with pressing from the midfield and forward players. The data from Anderlecht’s win revealed that the three midfielders and three attackers managed to cover an average of 11.8 km each. Knowing this specific information meant that Hearts could aim to cover the same sort of distance during the game, giving them a target as they pressed Celtic high up the pitch, and the result was a famous and unexpected victory.

As well as having an impact on specific matches, the use of data is transforming the way the game is played overall. Since clubs have started analysing data in depth, the data itself has started to change, as clubs attempt to gain an advantage.

In every season since the Expected Goals metric began to be used in the English Premier League, the average distance from goal that players shoot from has decreased. The data suggests that shooting from long range is less likely to lead to a goal and so teams are waiting until they are closer to goal before shooting.
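A trend like this can be checked with a very simple calculation: group shots by season and average their distance from goal. The sketch below does that on a handful of invented shot records; the seasons and distances are placeholders, not real Premier League figures.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical shot records: (season, distance from goal in metres).
shots = [
    ("2015-16", 22.0), ("2015-16", 14.0), ("2015-16", 18.0),
    ("2016-17", 20.0), ("2016-17", 12.0), ("2016-17", 16.0),
    ("2017-18", 17.0), ("2017-18", 11.0), ("2017-18", 14.0),
]


def average_shot_distance(records):
    """Return {season: mean shot distance}, ordered by season label."""
    by_season = defaultdict(list)
    for season, distance in records:
        by_season[season].append(distance)
    return {season: mean(values) for season, values in sorted(by_season.items())}


averages = average_shot_distance(shots)
for season, distance in averages.items():
    print(season, distance)
```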

In Training

The stadium is not the only place in which vast amounts of data are collected on the performance of players. Performances are also tracked on the training pitch and the technology is impressive.

One of the market leaders is the STATSports Apex Athletic Series, a FIFA-approved GPS tracker vest which measures many physical metrics including speed, acceleration, heart rate and distance covered. The data can be viewed during training sessions so that coaches can see not only how far and fast a player is running but also the effect that it is having on their body.
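To give a sense of how metrics such as distance covered and top speed can be derived from position data, here is a minimal sketch. It assumes the tracker reports a player's position as pitch coordinates in metres at a fixed interval; the sample values, the one-second interval and the function names are all assumptions for illustration, not a description of how any particular vendor's device works.

```python
import math

# Hypothetical tracker samples: (x, y) pitch positions in metres.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0), (12.0, 16.0)]
SAMPLE_INTERVAL_S = 1.0  # assumed time between samples, in seconds


def step_distances(positions):
    """Straight-line distance between each pair of consecutive samples."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


def total_distance(positions):
    """Total ground covered, in metres."""
    return sum(step_distances(positions))


def top_speed(positions, interval_s):
    """Highest speed between consecutive samples, in metres per second."""
    return max(step_distances(positions)) / interval_s


print(total_distance(samples))
print(top_speed(samples, SAMPLE_INTERVAL_S))
```

Real tracking data is sampled many times per second and is noisy, so in practice the raw positions would usually be smoothed before speeds and accelerations are calculated.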

This data can be used to pinpoint areas in which a player needs to improve their game. The data might show that they are unable to cover as much ground in a match as required and so need to improve their stamina. Or the GPS positional data from matches might reveal that they are often too far wide on the pitch and need to improve their positional play. Or the cameras might reveal that their passing accuracy drops over a certain distance. Weaknesses can be more easily spotted and coaching can be tailored to the specific requirements of a player.
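The passing example can be made concrete with a small sketch that groups a player's passes into length bands and works out the completion rate in each. The pass log and the band boundaries below are invented for illustration.

```python
# Hypothetical pass log: (pass length in metres, whether it was completed).
passes = [
    (8, True), (12, True), (15, True), (22, True),
    (27, False), (31, False), (35, True), (40, False),
]


def accuracy_by_band(pass_log, band_edges):
    """Completion rate for each [low, high) pass-length band that has passes."""
    result = {}
    for low, high in zip(band_edges, band_edges[1:]):
        outcomes = [done for length, done in pass_log if low <= length < high]
        if outcomes:
            result[(low, high)] = sum(outcomes) / len(outcomes)
    return result


bands = accuracy_by_band(passes, [0, 20, 30, 50])
for band, rate in bands.items():
    print(band, round(rate, 2))
```

With this made-up data the completion rate falls as the passes get longer, which is the kind of pattern a coach could then address in training.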

It is not only the coaching team who use this data. The data gathered on and off the pitch is also used by medical staff to keep players fit. Top players are valuable assets to their clubs and keeping them free from injury can mean the difference between a successful and unsuccessful season. Using the health data provided by companies such as STATSports, clubs can monitor their players’ fitness, enabling their medical teams to accurately assess the risk of injury and spot any problems before they become serious. The high workload of modern players makes this kind of monitoring essential.

Player Recruitment

Scouting for new talent, uncovering the next top-scorer or midfield dynamo at a bargain price, is no longer reliant on sending out scouts to sit in rain-swept stands or receiving recommendations from agents eager to make a quick buck. Instead, clubs can search through detailed databases and uncover hidden gems, using performance statistics to find the player with the skillset the club is looking for.
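In its simplest form, a database search of this kind is a filter followed by a ranking. The sketch below shows the idea with a few invented player records; the fields, thresholds and the shortlist function are hypothetical and are not taken from any of the commercial scouting platforms.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position: str
    age: int
    key_passes_per_90: float


# Invented records standing in for a scouting database.
players = [
    Player("Player A", "attacking midfielder", 21, 2.8),
    Player("Player B", "attacking midfielder", 29, 3.1),
    Player("Player C", "centre-back", 22, 0.4),
    Player("Player D", "attacking midfielder", 23, 1.2),
    Player("Player E", "attacking midfielder", 20, 2.3),
]


def shortlist(pool, position, max_age, min_key_passes):
    """Keep players who fit the profile, best creators first."""
    matches = [
        p for p in pool
        if p.position == position
        and p.age <= max_age
        and p.key_passes_per_90 >= min_key_passes
    ]
    return sorted(matches, key=lambda p: p.key_passes_per_90, reverse=True)


targets = shortlist(players, "attacking midfielder", max_age=23, min_key_passes=2.0)
print([p.name for p in targets])
```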

Over the last few years, the media have often published stories about clubs using the database of the Football Manager game franchise to unearth talent, but platforms such as InStat, The Football Player Database, Scoutpad, Stats Perform and Wyscout are now competing in the growing marketplace of scouting databases and aids.

This method of scouting has allowed clubs to sign players from markets they would not usually buy from. Emi Buendia, who has shone at Norwich City in the Championship and the Premier League for the past three seasons after being signed from the second tier of Spanish football, was first identified as a potential signing by analysts at the club looking to find someone to fit a specific role within manager Daniel Farke’s tactics. He was subsequently watched in person by the club, but the data analysis allowed the club to identify the player as a target without the expense of flying pitch-side scouts around the world in the hope of coming across such a talent. It is surely no coincidence that Norwich’s Sporting Director, Stuart Webber, who has overseen the club’s innovative recruitment strategy, worked as Director of Recruitment at Liverpool during the first couple of years of Fenway Sports Group’s ownership of the club.

With so much data available about players, knowing what to look for is crucial. The clubs who have successfully used data in their player recruitment have not just picked the players with the best stats. Barnsley chairman Chien Lee has described how the club looks ‘for players and coaches that play the style that we want.’[3] Having a strong identity about the club’s style of play, like Liverpool under the management of Jürgen Klopp, is essential to plucking the right player from the flood of data.

Knowing how to interpret the data in a broader context is also important. One good stat on its own might not reveal the full story. As Brentford chairman Matthew Benham points out, if a player is making an impressively high number of tackles, it is worth asking, ‘Is a player making a lot of tackles because he is badly positioned in the first place?’[3]

It is not just a potential signing’s past performances which are analysed. When Wayne Rooney moved to MLS club DC United, the club used the available data to try to predict how he would perform in America. Dan Altman, a data analyst who was involved in the transfer, has explained that a key way in which statistical analysis was used in the decision to sign the player ‘was the ability to adjust his performance metrics for the Premier League to an MLS standard – to try and simulate how he might perform in that league.’ This use of data to predict Rooney’s performance in the MLS was so successful that in his first season at the club ‘he hit almost all the numbers exactly.’[4]
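The article does not say how this adjustment was actually done, so the sketch below shows only the simplest possible version of the idea: multiplying each per-90 figure by a conversion factor for the target league. The metrics, the factors and the function name are all assumptions made up for illustration, and a real model would be considerably more sophisticated.

```python
# Hypothetical per-90 figures for a player in his current league.
current_league_per_90 = {"shots": 2.0, "key_passes": 1.5, "goals": 0.30}

# Assumed conversion factors from the current league to the target league.
# A factor above 1 assumes the metric is easier to accumulate in the target league.
conversion_factors = {"shots": 1.2, "key_passes": 1.1, "goals": 1.5}


def adjust_for_league(metrics, factors):
    """Scale each metric by its factor; metrics with no factor are left unchanged."""
    return {name: value * factors.get(name, 1.0) for name, value in metrics.items()}


projected = adjust_for_league(current_league_per_90, conversion_factors)
for name, value in projected.items():
    print(name, round(value, 2))
```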

The Data Crunchers

The increased use of data and the transformative effect it is having on the game has led to the development of important new roles behind the scenes. Clubs are looking outside football and hiring people from a range of different backgrounds.

At Liverpool, the Lead Data Scientist is William Spearman, who has a PhD from Harvard in particle physics. He has helped pioneer the use of pitch-control models, which combine the in-match data about players’ positions relative to the ball with information about their speed and acceleration to calculate how likely the team is to control the ball at each location on the pitch.

The models factor in the importance of control in different areas of the pitch, such as the space behind the opposition’s defence, and provide data via a tablet to managers and coaches in the dugout so they can see where there is space to exploit on the pitch and where their team might be vulnerable. The data is reprocessed after the game, allowing for a more accurate analysis of a team’s own performance and for scouting future opponents.
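The basic intuition behind pitch control can be shown with a heavily simplified sketch: divide the pitch into a grid and give each cell to the team whose player could get there first. This is not the model used at Liverpool. It ignores the ball, players' current movement, acceleration and the relative value of different areas, and the pitch size, player positions and speeds below are assumptions for illustration only.

```python
import math

PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # metres, an assumed pitch size

# Hypothetical players: (team, x, y, top speed in metres per second).
players = [
    ("home", 30.0, 34.0, 8.0),
    ("home", 50.0, 20.0, 7.0),
    ("away", 70.0, 34.0, 8.0),
    ("away", 55.0, 50.0, 7.5),
]


def arrival_time(player, point):
    """Time for a player to reach a point, running in a straight line at top speed."""
    _, x, y, speed = player
    return math.dist((x, y), point) / speed


def controlling_team(point, squad):
    """Team of the player who would reach the point first."""
    return min(squad, key=lambda p: arrival_time(p, point))[0]


def control_share(squad, team, step=5.0):
    """Fraction of grid cells (by cell centre) that the team would reach first."""
    xs = [i * step + step / 2 for i in range(int(PITCH_LENGTH // step))]
    ys = [j * step + step / 2 for j in range(int(PITCH_WIDTH // step))]
    points = [(x, y) for x in xs for y in ys]
    owned = sum(controlling_team(point, squad) == team for point in points)
    return owned / len(points)


print(round(control_share(players, "home"), 2))
print(round(control_share(players, "away"), 2))
```

Published pitch-control models replace the hard "who gets there first" rule with probabilities, so that contested areas are shared between the teams rather than awarded outright to one of them.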

Also with a Harvard connection, Laurie Shaw, who has worked as a Research Scientist and Lecturer at the Ivy League university, has recently been appointed by City Football Group as Lead AI Scientist. City Football Group is the company which owns Manchester City, and his hiring is part of an attempt to match Liverpool’s successful use of data. His CV includes five years as Head of Statistics and Model Development for the UK Treasury and a PhD in astrophysics from Cambridge.

City Football Group have also recently appointed a Director of Insights and Decision Technology. Football’s top clubs are understandably reluctant to share all of the data science they are developing as they seek to gain an advantage over their rivals, but these kinds of job titles, which are reminiscent of the roles in big tech companies, and the calibre of person they are hiring show how seriously data science features in their plans.

Before joining City Football Group, Shaw ran the EightyFivePoints blog which used data science to analyse football. One of the things that is most striking about the growth of data science in football is the way in which people who started off using Twitter or blogs to share their insights are now being hired by clubs.

It is not just people with Harvard on their CV. The case of the Indian teenager Ashwin Raman, who was hired as a scout and analyst at Dundee United, shows that clubs are willing to look to people from backgrounds outside professional football and without a formal background in computer science or statistical analysis.

Jay Socik is another example of someone who made a name for themselves online before finding work as a data expert for a football club. A Sheffield United fan, he began by sharing his analyses of their matches on Twitter as they rose up the divisions from League One to the English Premier League. He is now Head of Recruitment Analysis for Luton Town in the English Championship.

Socik predicts that more clubs will be looking to social media to find the data expertise they need to compete: ‘I’m one myself and I could reel off 10, 12 examples in the past 12 months. Right now club staff are educated to know about data but the level to which they understand it, because they’re so busy with everything else, is not the same. So I think we will continue to see more and more paid positions going to people who are out there doing the work on social media. It’s where the expertise lies right now.’[5]

This work on social media by fans is being encouraged by Dan Altman, mentioned earlier for his work on the transfer of Wayne Rooney to DC United. Altman has developed the platform smarterscout, a football analytics provider with a free tier that anyone can use to access data about football. Having access to the data means that anyone can analyse the statistics and start to construct models to understand why a team is having success with a new formation, why a player is out of form since changing position, or to judge the suitability of a transfer target.

The Future of Data

Data will never provide a magic formula for success. If the players are not good enough, a team is always going to find it hard to win. Nor will data science ever make football predictable. A moment of genius from a player, an overhead kick from the edge of the box or lob from the half-way line, or a costly mistake by a goalkeeper who lets a soft shot through his grip will always be able to change the outcome of a game.

But data will continue to transform football in ways we cannot predict. Data analysis will alter tactics and the tactics will alter the data and new ways of interpreting the data will develop. After all, the data itself is only raw information. It is the questions that people ask of the data and the metrics they apply to it which determine how it is used. The influence of people from backgrounds outside traditional computer science and data analysis, people who come to the data with fresh eyes and a passion for the sport, is sure to have a lasting impact as football is transformed both on and off the pitch.
