How Is Data Used in the Premier League?

By Adam Warne
When Liverpool were crowned champions of the English Premier League in 2020, their data team received plaudits from across the world for their work on player recruitment and match analysis. It was thirty years since the club had last won the league. Back then, data analytics was pretty much unheard of in football.

In recent years, data analysis has become a crucial part of the modern game. In the top tier of English football, clubs are investing millions to gain an advantage over their rivals. The Premier League has been at the forefront of modern football since it was founded in 1992 and it is now at the cutting edge of applying data analytics to football.

This has not been an overnight change. Data has slowly been making its presence felt in the Premier League, growing gradually despite resistance in some quarters. It has taken a while for data analysis to really make an impact, but now it is an essential part of the game.

The First Companies to Gather Premier League Data

Data collection and analysis in the Premier League did not start with the clubs. In the 1990s, managers and coaches still mostly relied on methods that would not have been out of place in the pre-Premier League era.

It was two sports analytics companies that first saw the opportunity of applying data analysis to Premier League football: Opta and Prozone.

Opta was founded in 1996. It won a contract with Sky Sports to provide stats for their television coverage of the 1996-97 season and signed deals with newspapers such as The Observer. The media immediately saw the value of such data as both a tool for their pundits and journalists to use when analysing the game and as something that the average football fan might find interesting.

The following year, Opta became the official player performance statisticians for the Premier League, a relationship that has continued to this day. Since 2013 it has been the league’s official data partner. Over the years the data it collects has become more detailed and it now includes advanced metrics such as expected goals.

Prozone was formed around the same time. Their work in football started at Derby County, where assistant manager Steve McClaren, who would go on to manage the England national team, was keen to use video technology. The company helped McClaren put together a Portakabin with a video screen and 22 massage chairs. The players would sit in the chairs whilst McClaren would give presentations about their game plan.

From this, they started to develop video-analysis technology to allow McClaren to analyse past performances and identify areas of improvement. They installed eight cameras around the club’s stadium to record matches and bought a share in a French firm called Video Sports who had developed tracking software. This software was originally unreliable.

‘Sometimes we’d get the analysis back and there’d be players missing,’ remembers Prozone’s founder Ramm Mylvaganam.[1]

Despite these early problems, the technology developed and allowed more data to be collected and scrutinised. When he joined Manchester United as assistant manager to Alex Ferguson in 1999, McClaren brought his enthusiasm for Prozone to the club. But McClaren wasn’t the only one interested in the potential of Prozone. Another club in the North West of England was to be the first to fully embrace the potential of data analysis in football.

The First Premier League Clubs to Use Data

Sam Allardyce, most recently manager of West Bromwich Albion, is now seen as a reliable veteran, the type of manager clubs often employ to steady the ship. He was an old-fashioned, tough-tackling defender in his playing days, and as a manager he has a reputation, which he disputes, for building sides that play a rudimentary and unattractive long-ball game. One of his teams was described by Jose Mourinho as playing ‘19th-Century football.’[2]

Allardyce may seem an unlikely person to have once been a pioneer of a radical new approach to football, but that is exactly what he was.

During his playing career, he had a spell in America playing for Tampa Bay Rowdies. The team shared facilities with the NFL’s Tampa Bay Buccaneers and Allardyce was impressed by how much attention was paid to statistics and sports science in America compared to England.

In 1999, he was hired by Bolton Wanderers, then in the division below the Premier League. They won promotion in his second season in charge. Once in the Premier League, he became one of the first managers in the competition to take data science seriously. He was one of Prozone’s most enthusiastic clients and hired a team of sports science experts to analyse the data and find ways to improve Bolton’s performances on the pitch.

The data team at Bolton calculated that the side whose players ran faster and further than their opponents over the course of a match won or drew at least 80% of the time. They also identified Positions of Maximum Opportunity (POMOs), the positions in which a goal was more likely to be scored. This led to the team focussing on long throws, corners and free-kicks. Using data, they analysed where best to position players at set-pieces in order to increase the chances of the ball falling to them.

Bolton, despite their modest resources, finished in the top half of the Premier League every season from 2003 until 2007. They matched a physical, hard-working style with a very organised tactical structure and an above-average scoring rate from set-pieces. The term Position of Maximum Opportunity never caught on, but something of the idea can still be seen in the expected goals metric, which estimates how likely a shot is to result in a goal.
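
The idea behind expected goals can be sketched in a few lines of Python. In this minimal sketch, each shot carries a probability of being scored, and a team's expected goals for a match is the sum of those probabilities. The `Shot` class, the `expected_goals` function and the shot values are hypothetical and made up for illustration; real data providers estimate each probability with their own models.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    team: str
    goal_probability: float  # hypothetical value, as if produced by an xG model


def expected_goals(shots, team):
    """Sum the per-shot goal probabilities for one team."""
    return sum(s.goal_probability for s in shots if s.team == team)


# Made-up shots, for illustration only
shots = [
    Shot("Home", 0.76),  # a very good chance, such as a penalty
    Shot("Home", 0.10),
    Shot("Home", 0.04),
    Shot("Away", 0.30),
    Shot("Away", 0.05),
]

home_xg = expected_goals(shots, "Home")
away_xg = expected_goals(shots, "Away")
print(f"Home xG: {home_xg:.2f}, Away xG: {away_xg:.2f}")
```

A side that takes many low-quality shots can therefore end up with a lower expected goals total than a side that creates one or two clear chances.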

Two of Bolton’s performance analysts, Edward Sulley, who joined in 2001, and Gavin Fleig, who joined in 2004, would both later go on to work for Manchester City. Another pioneering analyst who started at a smaller club before moving to Manchester City was Simon Wilson, hired by Southampton as Head of Performance Analysis in 2002. The previous year, the 23-year-old had completed a degree in Science and Football from Liverpool John Moores University and worked as a consultant for Prozone. Wilson’s move to Southampton was one of the first times a Premier League club had hired a full-time performance analyst.

2002 was also the year in which the American baseball side the Oakland Athletics achieved unexpected success using their ‘Moneyball’ approach, showing that data analysis could have a real impact in sport. The approach was popularised the following year in Michael Lewis’s book, Moneyball: The Art of Winning an Unfair Game.

Southampton did not achieve the same level of success, although things went well at first. In 2003 they reached the FA Cup final, where they lost to Arsenal, and finished eighth in the league. After that high point, the club’s fortunes declined. They were relegated from the Premier League in 2005. A year later, Simon Wilson left the club to join Manchester City.

By the end of the decade, data analysts had become common in the Premier League and managers were starting to make more use of their work.

Roberto Martinez, who was manager of Wigan Athletic from 2009 to 2013 and later boss of Everton until 2016, had a 60-inch pen-touch TV screen installed at his home which he used to access player-tracking and other data from Prozone.

Data analysis may have been becoming common, but there was still some scepticism. For Martinez, it was important to limit its use. ‘There’s a big danger of getting inundated with data and letting it affect your play,’ he explained when manager of Everton. ‘A player can have 10 shots and all of them are on target but he doesn’t score a goal. Or he can have 10 shots and nine of them are off target, but then the last one goes in the top corner. So which stat do you prefer?’[3]

Image Credit: Adobe Stock: ink drop

How Liverpool Have Used Data

Despite the scepticism, the use of data has grown gradually in the Premier League. Bolton, under the management of Sam Allardyce, were the first club to embrace it and achieved some modest success, but it is Liverpool who have really taken data analysis to the next level.

In 2010, Liverpool were bought by the Fenway Sports Group, who also own the American baseball side the Boston Red Sox. Under the group’s ownership, the baseball club had been an early adopter of the Moneyball approach. The owners brought a similar emphasis on data analysis to English football.

In 2012, Liverpool hired Ian Graham, who has a PhD in theoretical physics from Cambridge, as Director of Research. The research team has expanded over the years and now includes Tim Waskett, who has a degree in astrophysics; William Spearman, who has a PhD from Harvard and previously worked for the European Organization for Nuclear Research (CERN); and former junior chess champion Dafydd Steele.

When Bolton were hiring their behind-the-scenes experts, they looked for people from a sports science background. Liverpool went for people with expertise outside the sport who had the ability to understand data and interpret it in complex ways.

Back in 2015, after the departure of Brendan Rodgers, the data team at Liverpool played an important role in the club’s decision to hire Jürgen Klopp as the new manager. Klopp had left his previous club, the German side Borussia Dortmund, after a poor season in which the club were briefly bottom during the winter, before a late run of form pulled them up to seventh. Klopp had previously achieved success at the club, twice winning the Bundesliga and once reaching a Champions League final, but something had gone badly wrong.

For the Liverpool data team, the statistics suggested that what had gone wrong at Dortmund was not due to Klopp, but was instead caused by bad luck.

‘I analysed 10 seasons of Bundesliga performances, and Dortmund were the second-unluckiest team in that 10-year history,’ Ian Graham has explained. ‘It was just some terrible luck that cost Jürgen that one season.’[4]

Klopp has been an enthusiastic collaborator with the data team and receptive to their ideas. As well as helping with recruitment, the data team also crunch the data recorded during Liverpool’s matches and produce complex data maps to illustrate the team’s pitch control. For a team like Liverpool which plays a high-pressing game, this kind of data is crucial for getting their tactics spot on.

Manchester City’s Response to Liverpool’s Data Success

Liverpool’s success with data analysis has started an arms race with other Premier League clubs.

Since Liverpool won the Premier League title in 2020, City Football Group, the organisation which owns Manchester City, has been investing heavily in their data analysis department. Ravi Mistry has joined as Football Intelligence Officer. John-Mark Sisman has arrived as Performance Physicist. Laurie Shaw, who has a PhD in computational astrophysics and worked as a lecturer at Harvard University, has joined as Lead AI Scientist.

Before making these appointments, Manchester City had not neglected data analysis. In 2006 they had hired pioneer analyst Simon Wilson to start a new analytics department and by 2014 they had as many as eleven analysts working at the club. The success of Liverpool made them realise they needed to raise their game even further.

As well as hiring top talent, the club are also aiming to develop new technology to give them an edge. In October 2020, Manchester City and Google Research launched a competition for designing AI agents which can simulate the behaviour of players during a football match. An AI agent is an autonomous entity which uses its artificial intelligence to decide which actions to take in its environment. The aim is to be able to use these AI agents to simulate matches and accurately test tactics before trying them with real players.

Brian Prestidge, Director of Data Insights and Decision Technology at City Football Group, has explained that, ‘Learning is all about harnessing failure, but failure in football is seldom accepted. Working with Google Research’s physics based football environment provides us with a new place to learn through simulation and offers us the capabilities to test tactical concepts and refine principles so that they are strong enough for a coach to stake their career on.’[5]

Premier League Clubs Embracing Data

Despite the early pioneering work by Steve McClaren, Manchester United have fallen behind other big clubs when it comes to using data. In the summer of 2020, it was rumoured that they were looking to hire a team of eight data analysts as part of a new analytics department. Whether these intentions result in an analytics team that can match Liverpool or Man City remains to be seen.

Arsenal, who like Manchester United used to be dominant but have failed to win the title in recent years, have their own in-house data company, StatDNA, which they bought in 2012. One of the jobs carried out by the club’s analysts is calculating their expected win percentage based on match statistics and previous data from the history of the Premier League. Manager Mikel Arteta has used data from StatDNA to defend the club against critics following their poor form in the 2020-21 season.

Following a 0-1 defeat at home to Burnley in December 2020, Arteta claimed that the data showed they had just a three per cent chance of losing that game.

Arteta’s comments were mocked by many fans on social media. It is true that Arsenal dominated possession in that match and had three times as many shots on target as Burnley. Analysing the match, the data company Understat gave Arsenal an expected goals value of 1.85, compared to Burnley’s 0.82. Arsenal were certainly unlucky. What was perceived as ridiculous was the precision of the three per cent figure. It reduced the complexity of the game to a simple percentage which seemed disconnected from the actual events on the pitch. It felt to many that data was being used as a cloak to hide behind rather than as a tool for improving the team’s performance. Despite these criticisms, Arsenal are sticking with the approach.
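
One common way to turn expected goals into outcome probabilities, sketched here in Python, is to treat each side’s goals as an independent Poisson variable whose mean is its expected goals total. This is an assumption made for illustration, not the method used by StatDNA, Understat or Arsenal, and the function names below are hypothetical. The two inputs are the Understat values quoted above.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def outcome_probabilities(home_xg, away_xg, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Assumes the two sides' goals are independent Poisson variables with
    means equal to their xG totals. Scorelines above max_goals are ignored,
    so the three values sum to very slightly less than 1.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Expected goals values for Arsenal (home) and Burnley (away)
arsenal_win, drawn, burnley_win = outcome_probabilities(1.85, 0.82)
print(f"Arsenal win: {arsenal_win:.1%}")
print(f"Draw:        {drawn:.1%}")
print(f"Burnley win: {burnley_win:.1%}")
```

A club’s in-house model is likely to use different inputs and assumptions, so a simple calculation like this should not be expected to reproduce a figure such as the one Arteta quoted.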

Clubs outside the traditional big teams are also looking to expand their use of data. Leicester City appointed their first Head of Football Analytics in 2019. Burnley’s new American owners have stated their aim to use more data and have recently advertised for a new Technical Analyst to join their expanding analytics team.

Data analysis is now firmly established in the Premier League. In the early days of the competition, clubs relied on old-fashioned methods, simple tactics, and autocratic managers. The game in England lagged behind the more professional and sophisticated methods used abroad in countries like Italy and France. That has now changed. When it comes to modern data science and applying it to football, Premier League clubs are among the world leaders.
