Hi TOC. We recently ran our offensive line and quarterback previews and thought this might be an interesting follow-up. It does get kind of long, however, so here is the TLDR summary:

Some key statistics were examined in an attempt to figure out how the Spartans underachieved offensively in 2024.

The results of this very basic analysis reveal that the number of times Aidan Chiles was sacked in a game and the support, or lack thereof, that he received from the Spartans’ run game were larger factors in determining a win or loss than any of the quarterback-specific statistics that were examined. In other words, the number of sacks Chiles took and the lack of an effective run game in most of the team’s losses may be factors that are more attributable to the offensive line than Chiles’ individual play.

The factors involving Chiles that were analyzed were:

Quarterback Rating (QBR)
Completion Percentage
Total Turnovers (interceptions and fumbles lost)
Yards per Passing Attempt
Passing Yards
Sacks

The support Chiles received from MSU’s running game was also factored in by looking at Michigan State’s:

Yards per Carry
Rushing Yards

A T-test using a Google Sheets formula, logistic regression using a Google Sheets add-on, and logistic regression in Posit Cloud’s RStudio were used to analyze the importance of the above variables.

The whole thing is below, if interested.

Introduction

The 2024 Michigan State football team featured a new head coach in Jonathan Smith, who the Spartans hired away from Oregon State. Quarterback Aidan Chiles followed Smith from Corvallis to East Lansing via the NCAA transfer portal. Chiles served as Oregon State’s backup quarterback in 2023, his freshman season, but he did see meaningful playing time in several games. The OSU coaching staff often played Chiles in one planned series per game during the 2023 season and he was able to post some impressive statistics in those appearances.

In 9 games for Oregon State, Chiles completed 24-35 passes (68.6%) for 309 yards with 4 touchdowns and no interceptions, which produced a QBR of 85.6. The Beavers went 7-2 in those games.

In most of these outings, however, Chiles was only throwing between 1 and 3 passes. The only contests where he exceeded 3 attempts were against UC Davis, where he went 9-13 for 74 yards and a TD, and then 5-8 for 75 yards and a TD against Stanford. Both of these were lopsided Beavers’ victories.

Still, this was enough for the Spartan fans to be excited about given that MSU was coming off a couple of dismal seasons in a row. After a Kenneth Walker III-led 11-2 season in 2021 that featured a victory over rival Michigan and a Peach Bowl win to end the season, the Spartans had limped to a 5-7 finish in 2022 and an even worse 4-8 in 2023. The 2023 season saw head coach Mel Tucker get suspended after the second game of the season and eventually fired. Harlon Barnett led the team the rest of the way but, almost as soon as the season ended, the Spartans announced the hiring of Smith, with Chiles to follow later through the portal.

The Spartans’ hopes didn’t quite pan out in 2024 though. The general expectation was that, although some struggles behind a first-year starting quarterback were to be expected, the team should be improved and make a post-season bowl game. In the end, those bowl game hopes were dashed in a 41-14 loss to Rutgers at home to end the Spartans’ season at 5-7.

Chiles was up and down from the very beginning. His first pass as a Spartan was intercepted and he completed just 10-24 passes for 114 yards in the season opener against Florida Atlantic but followed that up by going 24-38 for a season high 363 yards in a week 2 win at Maryland. After a blowout win against Prairie View A&M, Chiles threw 3 interceptions in a disappointing 23-19 loss at Boston College – a game that would have moved the Spartans to 4-0 if they had been able to hang on.

MSU was still 3-1 after the BC game though, which was probably in line with preseason expectations. Chiles did cut down on his turnovers as the season went on and didn’t commit any over the last three games of the season. However, the Spartans went just 1-2 over those last three games with the only victory being a narrow 24-17 win at home over last place Purdue.

Even though Chiles showed improvement in certain areas as the season went on, the Spartans as a whole seemed to get worse. So how much blame really rests with Chiles and what other factors could be at play in the Spartans’ lackluster 2024 season?

Some key statistics were examined in an attempt to figure out how the Spartans underachieved in 2024.

The results of this very basic analysis reveal that the number of times Chiles was sacked in a game and the support, or lack thereof, that he received from the Spartans’ run game were larger factors in determining a win or loss than any of the quarterback-specific statistics that were examined. In other words, the number of sacks Chiles took and the lack of an effective run game in most of the team’s losses may be factors that are more attributable to the offensive line than Chiles himself.

Additionally, although not studied in this analysis, the “hangover” effect from two tough losses the Spartans suffered in 2024 may have played a large role in the overall season outcome. As mentioned above, the Spartans dropped a tough game to Boston College after winning their first three games of the season. Following the loss to BC, MSU was blown out by Ohio State and Oregon – the two toughest opponents they would face in 2024.

After a bye week following the Oregon loss, MSU rebounded to beat Iowa at home, restoring hope leading into the annual rivalry game against Michigan. The Spartans jumped out to a lead in Ann Arbor but ultimately fell to the Wolverines 24-17. Blowout losses to Indiana and Illinois followed before MSU squeaked by Purdue only to then get blown out again by Rutgers to end the season.

It could be argued that the Spartans never really recovered and moved on from the losses to BC and Michigan. Again, however, the focus here is on determining Chiles’ role in MSU’s 2024 season outcome based on common quarterback statistics, and the conclusion is that other factors played a larger role than Chiles’ play in the team not meeting expectations.

Methods

This analysis looks at commonly used and freely available football statistics, each of which is easily found on espn.com. There may be many more stats that could better describe the situation but they are not as freely available. For example, Pro Football Focus and Sports Info Solutions provide more advanced metrics like pressure-to-sack rate, accuracy under pressure, and on-target percentage, but access to these statistics requires a subscription that this author is not in a position to invest in at this time!

Additionally, since this analysis attempts to look at Chiles’ role in the Spartans’ 2024 performance, only offensive statistics are analyzed. Factors such as injuries (particularly on the offensive line) and defensive metrics like third down conversion percentage (the Spartans defense often struggled to get off the field in key situations in 2024) were not analyzed.

The factors involving Chiles that were analyzed were:

Quarterback Rating (QBR)
Completion Percentage
Total Turnovers (interceptions and fumbles lost)
Yards per Passing Attempt
Passing Yards
Sacks

The support Chiles received from MSU’s running game was also factored in by looking at Michigan State’s:

Yards per Carry
Rushing Yards

Each of these factors was broken down for MSU’s wins and losses in 2024 to see if there was a correlation between any of these metrics and the Spartans’ ability to win a game.

It should also be noted that 12 games is an extremely small sample size and any conclusions should keep this in mind.

Data

The data used in this analysis can be seen in Table 1:

Table 1: Data used; source: espn.com

Without going into any statistical analysis and just from looking at the table, it appears that Chiles’ completion percentage, yards per attempt, and total turnovers were not significant factors in the outcome of games during the 2024 season. Chiles actually completed a higher percentage of his passes in games the Spartans lost (60.2%) than in games they won (58.5%). His yards per attempt (7.5) was the same in both wins and losses. Turnovers also do not appear to be a factor, with Chiles committing an average of 1.2 in Spartan wins and 1.3 in losses.

It also appears that passing yards per game (213 in wins vs. 192.9 in losses) and QBR (62.3 in wins vs. 54.0 in losses) may be contributing factors.

Sacks taken and support from the run game, which are probably not as directly attributable to Chiles, may be the bigger factors in game outcomes. For example, Chiles may be blamed for taking a sack if he holds the ball too long but the primary responsibility may rest with the offensive line if a lineman misses a blocking assignment or can’t hold his block.

Similarly, in the run game, Chiles may be responsible if he fails to check to the right play at the line of scrimmage but the offensive line and running backs may also come into play here. Again, if a lineman does not complete his assignment, or a running back does not go to the right hole then a run play is less likely to succeed.

Chiles was sacked an average of 1.6 times per game in wins but this nearly doubled to 3.1 times per game in losses. In the five games that MSU won, they were able to run for 166.8 yards per game and 4.6 yards per carry. But in the seven losses this dropped dramatically to 78.6 yards per game and only 2.6 yards per carry. This was punctuated by very low rushing outputs against the best teams MSU faced last year in Ohio State, Oregon, and Indiana.

The data in the above table can be seen more visually in the charts below:

Figure 1: Aidan Chiles Completion Percentage and QBR by game

Figure 2: Aidan Chiles Negative Plays by game

Figure 3: Aidan Chiles Yards per Pass Attempt and MSU Yards per Rushing Attempt by game

Statistical Analysis

Statistical analysis was used in an attempt to determine if any of the variables here were more likely to affect the outcome of MSU games in the 2024 season. To review, the 8 variables analyzed are:

Chiles Quarterback Rating (QBR)
Chiles Completion Percentage
Chiles Total Turnovers (interceptions and fumbles lost)
Chiles Yards per Passing Attempt
Chiles Passing Yards
Chiles Sacks Taken
Team Yards per Carry
Team Rushing Yards

The Independent Samples T-Test and Logistic Regression, using both a Google Sheets add-on and RStudio in Posit Cloud, were used to test the variables. Perhaps interestingly, the different methods produced different levels of significance between some of the variables above and game outcome.

The Independent Samples T-Test

The independent samples t-test is a statistical test that compares the means of two independent groups to determine if there is a statistically significant difference between them. In this case, the two groups are “wins” and “losses.”

How the T-Test works:

Hypothesis: The t-test works by setting up a null hypothesis (e.g., there is no difference in the mean QBR between wins and losses) and an alternative hypothesis (there is a difference).
Calculation: The test calculates a t-statistic and a p-value. The p-value indicates the probability of observing a difference at least as large as the one actually observed if the null hypothesis were true.
Conclusion: If the p-value is below a chosen significance level (usually 0.05), the null hypothesis can be rejected and it can be concluded that the difference is statistically significant. This would provide evidence that the statistic in question is associated with the game outcome.

A separate t-test was performed for each statistic (QBR, Completion Percentage, Sacks, etc.) to see which ones show a significant difference between wins and losses.

Calculating p-Value

The T.TEST Google Sheets function was used to calculate the p-value. The syntax for the T.TEST function is:

=T.TEST(range_for_wins, range_for_losses, 2, 2)

Where:

range_for_wins: The range of cells containing the QBR values for games won
range_for_losses: The range of cells containing the QBR values for games lost
2: This specifies a two-tailed test. A two-tailed test checks if the means are different, without specifying which mean is expected to be larger.
2: This specifies an equal variance (two-sample type) test. This is a common assumption when the variances of the two groups are similar.
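
For readers who would rather see the arithmetic than trust a spreadsheet cell, here is a rough Python sketch of what a two-tailed, equal-variance test like the formula above computes. The sack counts are made-up placeholders, not the values from Table 1, and the function names (pooled_t_test, two_tailed_p) are only illustrative. The p-value is approximated by numerically integrating the t-distribution, so it should land close to the Sheets result but is not guaranteed to match it digit for digit.

```python
import math
from statistics import mean, variance


def pooled_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_err, df


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=10000):
    """Approximate the two-tailed p-value with Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Everything outside [-|t|, |t|] is the two-tailed p-value.
    return max(0.0, 1 - 2 * area_0_to_t)


# Hypothetical sacks per game (NOT the real Table 1 data).
sacks_in_wins = [1, 2, 1, 3, 1]
sacks_in_losses = [3, 4, 2, 3, 4, 3, 2]

t_stat, df = pooled_t_test(sacks_in_wins, sacks_in_losses)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

With 5 wins and 7 losses there are only 10 degrees of freedom, which is part of why the small-sample caveat mentioned earlier matters.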

The calculated p-values for each of the eight variables analyzed are shown in Table 2:

Table 2: p-values for variables analyzed

Interpreting p-Value

The p-value is a number between 0 and 1. To determine if the difference is statistically significant, the p-value is compared to a significance level (alpha), which is most commonly 0.05. For example:

If the p-value is less than 0.05, the difference in mean QBR between wins and losses is considered statistically significant. It can be concluded that QBR is a significant factor.
If the p-value is greater than 0.05, the difference is not considered statistically significant. It cannot be concluded that QBR is a significant factor in predicting game outcomes based on this data.

Based on this analysis, the number of times Chiles was sacked in a game and MSU’s yards per carry and rushing yards per game were statistically significant factors in determining if the team won while the other five variables tested (QBR, completion percentage, turnovers, yards per pass attempt, and passing yards) were not.

Logistic Regression

Logistic regression is a more advanced and powerful statistical method for this type of analysis. It is used when the outcome variable is binary, meaning it has only two possible values (in this case, win or loss).

How it works:

Modeling: Instead of just comparing means, logistic regression builds a model that predicts the probability of a team winning based on all the statistics you provide simultaneously.
Importance of Variables: The model assigns a coefficient to each statistic. A larger absolute value of a coefficient indicates that a one-unit change in the statistic has a stronger effect on the probability of winning or losing. When the statistics are on comparable scales, comparing the coefficients can help you determine which statistics are more “important” or have a greater impact on the outcome.
Hypothesis Testing: The model also provides p-values for each statistic. A statistically significant p-value for a coefficient means that the statistic is a significant predictor of the game’s outcome, even when accounting for the other variables in the model.
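
To make the modeling idea a little more concrete, here is a minimal Python sketch that fits a one-variable logistic regression by plain gradient ascent. This is not how XLMiner or R fit the model internally (that is not covered here), the yards-per-carry numbers are invented for illustration, and fit_logistic is a hypothetical helper name. It also skips p-values entirely and only shows how an intercept and a coefficient come out of win/loss data.

```python
import math


def sigmoid(z):
    """Map log-odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, lr=0.1, iters=20000):
    """Fit P(win) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    n = len(x)
    x_mean = sum(x) / n
    # Centering the predictor keeps the gradient steps well behaved.
    x_centered = [xi - x_mean for xi in x]
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        preds = [sigmoid(b0 + b1 * xi) for xi in x_centered]
        grad0 = sum(yi - pi for yi, pi in zip(y, preds)) / n
        grad1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, preds, x_centered)) / n
        b0 += lr * grad0
        b1 += lr * grad1
    # Convert the intercept back to the original (uncentered) scale.
    return b0 - b1 * x_mean, b1


# Hypothetical yards per carry and outcomes (1 = win, 0 = loss); NOT the Table 1 data.
yards_per_carry = [4.8, 5.1, 3.9, 2.9, 4.6, 4.4, 2.1, 3.0, 2.4, 1.9, 3.3, 3.6]
win = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

intercept, slope = fit_logistic(yards_per_carry, win)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"P(win) at 5.0 ypc: {sigmoid(intercept + slope * 5.0):.2f}")
print(f"P(win) at 2.0 ypc: {sigmoid(intercept + slope * 2.0):.2f}")
```

The made-up wins and losses overlap on purpose. If every win had a higher yards per carry than every loss, the coefficient would keep growing without settling, which is a real risk with only 12 games.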

Logistic regression allows for the testing of a hypothesis directly by comparing the predictive power of the statistics believed to be more important (sacks, rushing yards per attempt, rushing yards per game in this case) against those believed to be less important (QBR, completion percentage, passing yards per attempt, passing yards per game, and turnovers).

Logistic regression was done via the XLMiner Analysis ToolPak, a free Google Sheets add-on, and in RStudio.

Logistic Regression Using the XLMiner Analysis ToolPak in Google Sheets

To perform the logistic regression with the XLMiner ToolPak in Google Sheets, data was organized as follows in Table 3 where, in the “Win” column, a “1” represents a game won and “0” is a game lost.

Table 3: Data organized for Logistic Regression using the XLMiner ToolPak in Google Sheets
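
As a small illustration of that layout, the Python sketch below turns W/L results into the 1/0 “Win” coding and writes the rows out as CSV text. The opponents, column names, and numbers are placeholders invented for the example, not the contents of Table 3.

```python
import csv
import io

# Placeholder rows; the real table has 12 games and eight stat columns.
games = [
    {"opponent": "Opponent A", "result": "W", "sacks": 1, "rush_yards": 180},
    {"opponent": "Opponent B", "result": "L", "sacks": 4, "rush_yards": 60},
    {"opponent": "Opponent C", "result": "W", "sacks": 2, "rush_yards": 150},
]


def encode_rows(rows):
    """Replace the W/L result with a numeric win column (1 = win, 0 = loss)."""
    encoded = []
    for game in rows:
        row = {key: value for key, value in game.items() if key != "result"}
        row["win"] = 1 if game["result"] == "W" else 0
        encoded.append(row)
    return encoded


def to_csv(rows):
    """Render the encoded rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


encoded_games = encode_rows(games)
print(to_csv(encoded_games))
```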

After the ToolPak was added, the Logistic Regression option was selected where Y and X ranges needed to be specified. The Y range was the “Win” column with wins and losses. The X ranges were input individually for each of the 8 variables tested.

Inputting the Y range and the eight different X ranges produced a summary output for each variable as shown in Tables 4-11 below:

Table 4: XLMiner Summary Output for QBR Variable

Table 5: XLMiner Summary Output for Completion Percentage Variable

Table 6: XLMiner Summary Output for Turnovers Variable

Table 7: XLMiner Output for Yards per Pass Attempt Variable

Table 8: XLMiner Output for Passing Yards per Game

Table 9: XLMiner Output for Sacks Taken per Game

Table 10: XLMiner Output for Yards per Rushing Attempt

Table 11: XLMiner Output for Rushing Yards per Game

Interpreting the Results of the XLMiner ToolPak Output

Key values to look for are the coefficients and p-values for each statistic.

Coefficients: These indicate the direction and strength of the relationship. For example, a positive coefficient for QBR means that as QBR increases, the probability of a win also increases.
P-values: Just like with the t-test, if a statistic’s p-value is less than 0.05, it is considered a statistically significant predictor of the outcome. You can use these p-values to assess whether some statistics are better at predicting wins and losses than others.

In this case, we are looking at the X Variable 1 coefficient to determine if the likelihood of a win increases as QBR increases.

Using the QBR variable as an example:

X Variable 1 Coefficient (0.0199…): This value directly represents the relationship between QBR and the outcome of the game. A positive coefficient indicates that as the value of the QBR statistic increases, the likelihood of a win also increases. The larger the positive number, the stronger this relationship.
Intercept Coefficient (-1.498…): The intercept is a baseline value for the model. It represents the logarithm of the odds of winning when the value of the QBR statistic is zero. It is a necessary part of the logistic regression equation but does not describe the relationship between a predictor variable and the outcome.
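
To see what those two numbers mean in practice, the short Python sketch below plugs them into the logistic equation to get a predicted win probability. The coefficients are the truncated values quoted above (0.0199 and -1.498), so the outputs are approximate, and win_probability is just an illustrative helper name.

```python
import math

# Truncated values from the QBR summary output; the full-precision
# numbers are not reproduced here, so results are approximate.
INTERCEPT = -1.498
QBR_COEFFICIENT = 0.0199


def win_probability(qbr, intercept=INTERCEPT, slope=QBR_COEFFICIENT):
    """Convert a QBR value to a predicted win probability."""
    log_odds = intercept + slope * qbr
    return 1.0 / (1.0 + math.exp(-log_odds))


# Multiplicative change in the odds of winning per one-point increase in QBR.
odds_ratio_per_point = math.exp(QBR_COEFFICIENT)

# Average QBR in wins (62.3) and losses (54.0), plus the intercept-only case.
for qbr in (62.3, 54.0, 0.0):
    print(f"QBR {qbr:5.1f} -> predicted win probability {win_probability(qbr):.3f}")
print(f"odds ratio per QBR point: {odds_ratio_per_point:.4f}")
```

Using Chiles’ average QBR in wins (62.3) and losses (54.0), the model’s predicted win probability only moves from about 40% to about 44%, which fits with QBR not coming out as a significant predictor.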

The p-value is still the most important value for determining a variable’s impact on game outcome. It tells you if the relationship between the statistic and the game outcome is statistically significant.

If the p-value is less than 0.05, the statistic is a statistically significant predictor of the outcome. This is evidence that the statistic is related to the outcome.
If the p-value is greater than 0.05, the relationship is not considered statistically significant.

None of the p-values were less than 0.05 in the XLMiner analysis, but rushing yards per game, sacks taken per game, and yards per carry had the lowest p-values.

Logistic Regression Using R

Finally, to run a logistic regression using R, Table 3, above, was imported into RStudio as a CSV file to run the analysis. The full Markdown file for this analysis is available here.

Interestingly, this analysis did not show a significant correlation between any of the variables and game outcome.

Conclusions

In full disclosure, I looked to Google Gemini at several points during this analysis for help, including asking why the results were different for each of the tests used in this analysis. The Gemini response to this prompt is below:

The difference in your results is due to the fundamental differences in what each statistical test is designed to measure.

The T-Test and Univariate Analysis

The T-Test is a univariate test. When you run a T-Test on a variable like sacks, you are asking one simple question: “Is there a statistically significant difference in the mean number of sacks between games that were won and games that were lost?”

The T-Test only looks at a single variable at a time, ignoring all the other variables in your dataset.

Logistic Regression and Multivariate Analysis

Logistic regression (in XLMiner and R) is a multivariate test. When you run a logistic regression model with multiple variables (sacks, yards_per_rush, rush_yards, etc.), you are asking a more complex question: “Is a variable a statistically significant predictor of a win or loss after accounting for the effects of all the other variables in the model?”

This is the key difference. A variable that appears significant on its own (in a T-Test) may become insignificant in a full logistic regression model because:

Collinearity: Another variable in the model is already capturing the same information. For example, yards_per_rush and rush_yards are likely highly correlated. The model may see that yards_per_rush is a stronger predictor and effectively “ignore” rush_yards, causing its p-value to be high.
Small Sample Size: With only 12 games, the small differences that appear significant in a single variable test may not hold up when you add other variables to the model.

In short, the logistic regression p-values are a more accurate representation of the predictive importance of a variable because they consider all the variables in the model together. The T-Test results are a good starting point, but they don’t give you the complete picture.
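
The collinearity point in the response above is easy to check by hand. Here is a minimal Python sketch that computes the Pearson correlation between rushing yards and yards per carry. The numbers are invented placeholders, not the Table 1 values, and pearson_r is an illustrative helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-game values (NOT the real data).
rush_yards = [180, 150, 95, 60, 210, 75, 110, 40]
yards_per_carry = [4.9, 4.2, 3.1, 2.2, 5.5, 2.6, 3.4, 1.6]

r = pearson_r(rush_yards, yards_per_carry)
print(f"correlation between rushing yards and yards per carry: {r:.3f}")
```

A correlation close to 1 would mean the two rushing variables carry mostly the same information, which makes it hard for a model that includes both to separate their effects.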

All of that being said, the T-Test results indicated that sacks taken by Chiles as well as MSU’s rushing yards per game and yards per rushing attempt had the most impact, of the variables studied, on the Spartans’ ability to win games in 2024. The logistic regression results did not support this, however, although the logistic regression using the Google Sheets add-on came much closer to the T-Test results than the logistic regression run in RStudio.

If we are to go just off of the T-Test results, the statistics attributed most closely to Aidan Chiles’ play (QBR, completion percentage, passing yards per game, yards per pass attempt, and total turnovers) in 2024 were not the most significant factors in the Spartans’ disappointing 5-7 record. The offensive line’s inability to protect Chiles and generate a more effective running game were likely more significant factors.