Juventus Analytics

The motivation
I have been a big fan of Juventus ever since Gianluigi Buffon was transferred from Parma in summer 2001 (there's a bonus picture of me in front of the stadium). As an avid supporter, I am also quite active in the r/Juve subreddit, where I discuss the team and keep up to date with the latest news. Unfortunately, social media can also be a toxic environment for discussion, especially since the significant drop in the team's performance after dominating the Italian league for nine consecutive years has angered a lot of fans. At the end of 2021, some fellow Redditors and I started a mini data analytics project dedicated to all things Juventus. We hoped that, backed by data, we could start civilized and respectful discussions in the subreddit. Additionally, it would be fun to check whether popular opinions in the subreddit are actually justified.

Midfielders' goal contribution
Midfield used to be Juventus' strongest area in the early 2010s, with the likes of Andrea Pirlo, Paul Pogba, Claudio Marchisio, and Arturo Vidal dictating the play. Since they left, Juventus have struggled to find adequate replacements and, as a result, look toothless in the offensive phase. However, one can also ask: is the criticism really justified? Or are opinions biased simply because Juventus no longer have superstar midfielders?
To check this, we first looked at the goals contributed by midfielders over the years compared to the other positions, shown in the left plot below. At the time this plot was created, the 2021/2022 season was not yet finished, which explains the drop in goals at the end of the plot. The different background colors indicate the managers in charge of the team at the time. Unsurprisingly, most goals came from the forwards, and Juventus received a boost in forward goals after the arrival of Ronaldo in 2018. However, it is unfortunate to see that the goals contributed by our midfielders have declined significantly, to the point of being similar to the number of goals scored by the defenders (full-backs included).
Next, we compared the expected goals (xG) with the actual goals scored by our midfielders in the 2017/2018 season (the first season for which xG data is available on fbref.com) and in the 2020/2021 season. Judging by the linear trend in the plot on the right, Juventus' midfielders seem to have become less effective at converting chances: in the more recent season they scored fewer actual goals than their expected goals suggested. In short, the view that the midfielders have recently underperformed compared to the beginning of the 2010s appears to be more than a subjective opinion.
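As a rough illustration of the xG comparison, here is a minimal Python sketch that fits an ordinary least-squares line through (xG, goals) pairs and sums the difference between goals and xG. The function names and the numbers in the usage example are made up for illustration; they are not the fbref.com figures and not necessarily how the original plot was produced.

```python
def linear_trend(xg, goals):
    """Ordinary least-squares fit of goals = slope * xG + intercept."""
    n = len(xg)
    mean_x = sum(xg) / n
    mean_y = sum(goals) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xg, goals))
    s_xx = sum((x - mean_x) ** 2 for x in xg)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def goals_minus_xg(xg, goals):
    """Total over/underperformance: negative means fewer goals than expected."""
    return sum(goals) - sum(xg)


# Toy values only (one entry per midfielder), NOT real player data.
toy_xg = [1.0, 2.0, 3.0, 4.0]
toy_goals = [1, 1, 2, 2]

slope, intercept = linear_trend(toy_xg, toy_goals)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"goals - xG = {goals_minus_xg(toy_xg, toy_goals):.1f}")
```

With one point per player, a flatter line and a negative goals-minus-xG total both suggest that the players score less than their chances would predict.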


We know now that the midfielders truly underperformed. But what about the team as a whole?
This question was difficult to quantify at first, but then I found an interesting data source to exploit. I decided to use the betting odds of past matches, starting from the 2007/2008 season, when Juventus were back in Serie A after promotion. I reasoned that betting odds are a good representation of the expected performance of the team, since the bookies want their odds to reflect the actual results as closely as possible in order to make money. To process the data, the betting odds for each match were first normalized into probabilities of winning, drawing, or losing. Each probability was then multiplied by the points awarded for that outcome (win = 3, draw = 1, loss = 0), and the sum gives the expected points for the match.
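The calculation described above can be sketched in a few lines of Python. This is only an illustration under two assumptions that the text does not specify: that the odds are in decimal format, and that they are normalized by simple proportional scaling (dividing each inverse odd by their sum to remove the bookmaker's margin). The function names are hypothetical.

```python
def implied_probabilities(odds_win, odds_draw, odds_loss):
    """Turn decimal odds into win/draw/loss probabilities that sum to 1.

    Assumption: proportional normalization. The inverse odds add up to
    more than 1 because of the bookmaker's margin, so each is divided
    by their total.
    """
    raw = [1.0 / odds_win, 1.0 / odds_draw, 1.0 / odds_loss]
    total = sum(raw)
    return [p / total for p in raw]


def expected_points(odds_win, odds_draw, odds_loss):
    """Expected points for one match: 3 for a win, 1 for a draw, 0 for a loss."""
    p_win, p_draw, p_loss = implied_probabilities(odds_win, odds_draw, odds_loss)
    return 3 * p_win + 1 * p_draw + 0 * p_loss


def season_expected_points(matches):
    """Sum the expected points over an iterable of (win, draw, loss) odds."""
    return sum(expected_points(w, d, l) for w, d, l in matches)


# Illustrative odds only, not taken from the real dataset.
print(round(expected_points(1.8, 3.6, 3.6), 2))
```

Summing the per-match values over a season gives a total that can be compared with the points the team actually collected.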
The first plot, on the left, shows the density distribution of Juventus' points at the end of each season compared to the expected points. The orange density plot is more concentrated at higher point totals, indicating that Juventus actually overperformed against the bookies' expectations. The second plot, on the right, shows how the total points for each season evolved in comparison to the expected points. It is clear that the bookies did not expect too much while Juventus were still recovering from their relegation to Serie B and had only just returned to Serie A in 2007, as shown by the lower expected points. Juventus actually underperformed significantly back then, during Ranieri's and Zaccheroni's eras (a dark period that all the fans would rather forget). The turning point came in the 2011/2012 season, when Conte led the team to a clear overperformance, most notably the record 102-point Scudetto in 2013/2014. Juventus only began to underperform again in the 2020/2021 season, and even then by a very small margin.


TLDR
This is a fun personal project that some fellow Redditors and I started with the goal of encouraging meaningful, logical, evidence-based discussion, and ultimately of creating a respectful environment for discussion, at least in the subreddit. The project is still relatively new, so we have not explored many topics so far. In the examples shown here, we were able to back up a popular opinion that is a recurring theme in the subreddit: the underperformance of the midfielders. Simply by collecting data from past seasons and plotting the trend, we showed that the midfielders have indeed underperformed in recent seasons. The team as a whole, however, has not underperformed against expectations, although the trend suggests that it is heading in that direction.