Michael McKenzie A Statistical Analysis of the Backyard Brawl
Since the rivalry returned in 2022 with a four-game series, the Backyard Brawl, a recurring matchup between the University of Pittsburgh and West Virginia University, has been won by the home team every year. From the Pittsburgh victory at Acrisure Stadium in 2022 to the 2025 Mountaineer victory in Morgantown, hometown fans have celebrated a win each season.
The two schools announced on Sept. 12, the day before this year’s match, that the next series will extend to eight seasons from 2029 to 2036. With that, now is the perfect time to check whether a history of home team advantage has existed throughout the course of this rivalry.
Using data from Winsipedia that lists both the game location and the final scores, it can be determined that in the 130-year history of the Backyard Brawl, there have been 63 home wins, 42 home losses, and three ties. For the sake of this argument, ties will be discarded as they have neither a winner nor a loser. This takes the total number of usable games from 108 to 105.
To test whether a home-field advantage exists in the current data, a one-proportion Z test can be used to compare the observed proportion of home team wins with the proportion expected if there were no advantage, which would be 50%.
At the beginning of a statistical test, there are a number of conditions that must be met to ensure proper sampling. However, for the sake of this conversation and knowing that we have the correct and complete history, we are going to assume that the conditions are met.
After the conditions are satisfied, two hypotheses have to be stated: the null hypothesis (H0), which is the default assumption of no advantage, and the alternative hypothesis (Ha), which is that a home-field advantage exists. As stated above, H0 is that the home team wins 50% of the time, and Ha is that the true home win rate is greater than 50%.
Beyond this point is a lot of calculator work, but there are three main factors to understand: the significance level, the Z score and the p-value.
At the end of all these calculations, we are looking for the p-value to fall below the significance level, the standard of which is 0.05. If it does, the difference is statistically significant, meaning the null hypothesis can be rejected in favor of the alternative.
We get there using the Z score, which measures how far the observed home win rate sits from the expected 50%. In this case, the Z score is 2.0494, which means the observed rate of 63 wins in 105 games, or 60%, is that number of standard deviations above what the null hypothesis predicts.
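For readers who want to check the calculator work, here is a minimal sketch of the Z score calculation, assuming the standard one-proportion formula and the 63 home wins in 105 decided games cited above. The function name `one_proportion_z` is our own label for illustration, not part of any library.

```python
import math


def one_proportion_z(wins: int, games: int, p0: float = 0.5) -> float:
    """Z score for an observed proportion against a hypothesized proportion p0."""
    p_hat = wins / games  # observed home win rate
    # Standard deviation of the win rate if the null hypothesis (p0) were true
    standard_error = math.sqrt(p0 * (1 - p0) / games)
    return (p_hat - p0) / standard_error


# 63 home wins in 105 decided games (ties excluded), tested against 50%
z = one_proportion_z(wins=63, games=105)
print(round(z, 4))
```

Run as written, this should land on the Z score reported in this article.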
The p-value, which is what we compare against the significance level, can be found from the Z score. It is the probability of seeing a home win rate this high or higher if no advantage actually existed. In this case, the p-value is 0.0202, which is well below 0.05.
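The step from Z score to p-value can also be sketched in a few lines, assuming a one-sided test (we only care whether the home team wins more than half the time) and the normal distribution. The helper name `upper_tail_p_value` is hypothetical, and the only library call used is `math.erfc` from Python's standard library.

```python
import math


def upper_tail_p_value(z: float) -> float:
    """Probability that a standard normal value is at least z (one-sided test)."""
    # The normal upper tail written with the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2))


SIGNIFICANCE_LEVEL = 0.05  # the standard cutoff used in this article

p_value = upper_tail_p_value(2.0494)  # Z score reported above
print(round(p_value, 4))
print(p_value < SIGNIFICANCE_LEVEL)  # True means the result is statistically significant
```

A two-sided test, which would ask whether the home win rate differs from 50% in either direction, would double this p-value.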
We can now conclude that, based on the history of the Backyard Brawl, there is statistical evidence of a home-field advantage: the home team’s win percentage is significantly above 50%.
Some caution is needed, because we assumed the test’s conditions were met rather than verifying them. The test also cannot be replicated, since it uses the entire history of the series instead of a sample. However, for the sake of looking at it historically, the findings are interesting enough to discuss.
