Determinants of Scoring in the Alfred Dunhill Links Championship
This analysis is based on scores and stats from individual rounds played in the last ten editions of the Alfred Dunhill Links Championship: 5,229 rounds in total.
Section 1: Absolute Correlation Coefficients with Score
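Key Points (Strokes Gained metrics):
- SGP (Strokes Gained Putting) has the highest correlation with Score of the Strokes Gained metrics, making putting the strongest single indicator of scoring at this event.
- SGApp (Strokes Gained Approach to Green) has the second-highest correlation among the SG metrics.
- SGTee (Strokes Gained Off the Tee) and SGATG (Strokes Gained Around the Green) have lower correlations, showing less impact than the putting and approach metrics.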
Key Points (traditional metrics):
- Greens in Regulation and PPGIR (Putts per Greens in Regulation) show the highest correlations, highlighting their importance in determining scores.
- Driving metrics, while still relevant, have lower correlations than the putting and greens-related metrics.
- The importance of Scrambling varies from year to year, depending on conditions.
Key Points (scoring by par type):
- Par 4 scoring has the highest and most consistent correlation, indicating its critical role in determining overall score.
- Par 5 scoring fluctuates more, with a less consistent impact on overall score.
- Par 3 scoring generally shows a moderate correlation but is less significant than Par 4 scoring.
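As an illustration of how absolute correlations of this kind can be computed, here is a minimal sketch. It assumes the Pearson coefficient (the analysis does not state which coefficient was used) and uses made-up round values rather than the championship data. Strokes Gained metrics correlate negatively with Score, since gaining strokes lowers the score, so taking the absolute value puts all metrics on the same scale for comparison.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: six rounds, one score and two metrics per round.
scores = [67, 69, 70, 72, 73, 75]
metrics = {
    "SGP": [2.1, 1.0, 0.4, -0.5, -1.2, -2.0],
    "SGTee": [0.5, -0.3, 0.8, -0.1, 0.2, -0.6],
}

# Drop the sign so that metrics can be compared by strength of association.
abs_corr = {name: abs(pearson(values, scores)) for name, values in metrics.items()}

for name, value in sorted(abs_corr.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

In this toy data SGP tracks Score much more closely than SGTee, which mirrors the kind of comparison made in the key points above.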
Section 2: Importance of Each Metric in Determining Score
Random Forest Regressor and Feature Importance
A Random Forest Regressor is an ensemble learning method that constructs multiple decision trees during training and outputs the average of their predictions. Averaging over many trees smooths out the errors of any single tree, which improves accuracy and robustness.
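The averaging step can be checked directly. The sketch below assumes scikit-learn's RandomForestRegressor (the analysis does not name the library used) and fits it to synthetic data standing in for round-level stats, then confirms that the forest's prediction equals the mean of its individual trees' predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for round-level data: three metrics and a score per round.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 71.0 - 1.0 * X[:, 0] - 0.9 * X[:, 1] - 0.6 * X[:, 2] + rng.normal(scale=0.5, size=300)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

new_rounds = X[:5]

# Predict with every tree separately, then average across trees.
per_tree = np.stack([tree.predict(new_rounds) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# The forest's own prediction is that same average.
forest_prediction = forest.predict(new_rounds)
print(np.allclose(manual_average, forest_prediction))
```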
Feature importance is a way of interpreting a machine learning model: each feature receives a score that quantifies its contribution to the predictions made by the model.
In a Random Forest, the importance of a feature is computed by looking at how much the feature decreases the impurity (e.g., variance for regression tasks) across all the trees in the forest. The more a feature decreases the impurity, the more important it is considered.
The calculated importance scores for all features are then normalized to give relative importance as a percentage. This shows the relative contribution of each feature to the prediction task.
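A minimal sketch of this calculation follows, again assuming scikit-learn and synthetic data. The feature names are borrowed from this report, but the values and weights are invented for illustration. scikit-learn's feature_importances_ attribute holds the impurity-based importances already normalized to sum to 1, so multiplying by 100 gives the relative percentages.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

feature_names = ["SGP", "SGApp", "SGTee", "SGATG"]

# Synthetic data with invented weights; not the championship data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (
    71.0
    - 1.5 * X[:, 0]
    - 1.0 * X[:, 1]
    - 0.6 * X[:, 2]
    - 0.3 * X[:, 3]
    + rng.normal(scale=0.3, size=500)
)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; scale them to percentages.
importance_pct = {
    name: 100.0 * importance
    for name, importance in zip(feature_names, forest.feature_importances_)
}

ranked = sorted(importance_pct.items(), key=lambda kv: kv[1], reverse=True)
for name, pct in ranked:
    print(f"{name}: {pct:.2f}%")
```

Because the percentages always sum to 100, they describe how importance is shared among the features in one model, not how well that model predicts.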
Interpreting Feature Importance
Features with high relative importance percentages have a strong impact on the model's predictions. They are crucial for accurate predictions and indicate key areas where performance matters most.
Features with low relative importance have a minimal impact on the model's predictions. While they can still contribute, they are less critical.
Key Points (Strokes Gained metrics):
- SGP (Strokes Gained Putting) is the most important metric at 31.56%, higher than the DP World Tour average.
- SGApp (Strokes Gained Approach) is the second most important metric, with a relative importance of 30.09%.
- SGTee (21.74%) is less important at this championship than the DP World Tour average.
Key Points (traditional metrics):
- PPGIR (Putts per Greens in Regulation) is the most important traditional metric, with a relative importance of 34.28%.
- Greens in Regulation remains a significant factor at 24.92%, though lower than the DP World Tour average.
- DrivingDistance and DrivingAccuracy have lower importance, reflecting their smaller impact on score at this event.
Key Points (scoring by par type):
- Par 4 holes are by far the most important factor, with 78.50% relative importance, higher than the DP World Tour average.
- Par 5 holes, at 12.39%, have less impact than expected, below the DP World Tour average.
- Par 3 performance plays a minor role, with a relative importance of just 9.12%.
Top 5 Ranked Players - 2024 Alfred Dunhill Links Championship
The table below shows the five top-ranked players by average predicted score across the three Random Forest models above.
Rank | Surname | First name | Avg Predicted Score
1 | Rahm | Jon | 69.15
2 | Koepka | Brooks | 69.47
3 | Hatton | Tyrrell | 69.48
4 | McKibbin | Tom | 69.51
5 | McIlroy | Rory | 69.56
Rankings and estimated scores for all players can be found here.
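A ranking like the one in the table can be sketched as follows. The sketch assumes that "Avg Predicted Score" is the plain mean of the three models' predicted scores for each player, which the report does not spell out. The player labels and numbers are hypothetical.

```python
# Hypothetical predicted scores from three models (SG, traditional, par type).
model_predictions = {
    "Player A": [69.1, 69.3, 69.2],
    "Player B": [69.6, 69.4, 69.5],
    "Player C": [70.2, 69.9, 70.1],
}

# Average the three predictions for each player.
avg_predicted = {
    player: sum(preds) / len(preds) for player, preds in model_predictions.items()
}

# Lower scores are better in golf, so rank in ascending order.
ranking = sorted(avg_predicted.items(), key=lambda kv: kv[1])

for rank, (player, score) in enumerate(ranking, start=1):
    print(f"{rank} | {player} | {score:.2f}")
```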