Determinants of Scoring at Keene Trace Golf Club

This analysis is based on scores and stats from individual rounds in the last 5 Tour events at Keene Trace Golf Club: 2,141 rounds in total.

Section 1: Absolute Correlation Coefficients with Score

Chart 1(a)

SGTee: The correlation between Score and SGTee fluctuates from year to year, indicating that the importance of off-the-tee performance at Keene Trace varies. This variation might be due to changes in course setup or weather conditions that affect driving differently each year.

SGApp: The absolute correlation for SGApp remains relatively high and consistent, suggesting that approach shots significantly influence the score each year. This highlights the importance of precise approach shots at Keene Trace, where hitting greens in regulation likely leads to better scores.

SGATG: The correlation for SGATG shows variability, indicating that short game performance around the green has a varying impact on scores. This could be due to changes in green speed or rough conditions, making around-the-green play more or less critical in different years.

SGP: The correlation between Score and SGP varies, but it generally shows a significant impact on the score. This reflects the importance of putting performance at Keene Trace.

Chart 1(b)

DrivingDistance: The correlation with Score varies, suggesting that while driving distance is important, its impact changes yearly. This may be due to course length changes or weather conditions that make driving distance more or less critical.

DrivingAccuracy: The correlation shows some fluctuation, indicating that accuracy off the tee has a varying effect on scores. This could reflect changes in rough severity or fairway width, influencing how critical accuracy is each year.

GreensInRegulation: The absolute correlation is consistently high, underscoring its importance. Hitting greens in regulation is a key factor for low scores at Keene Trace.

Scrambling: The correlation varies, suggesting that the ability to recover from missed greens fluctuates in importance. This could be due to changes in rough conditions or green difficulty, affecting scrambling success.

PPGIR: The correlation is relatively consistent, indicating that putting performance on greens hit in regulation is crucial. This highlights the importance of converting opportunities into scores at Keene Trace.

Chart 1(c)

Par3: The correlation varies, reflecting the changing impact of par 3 performance on overall scores. This might be influenced by the difficulty of par 3 holes, which can vary significantly from year to year.

Par4: The correlation is generally high, indicating that performance on par 4 holes is crucial for scoring well. Par 4s often make up the majority of holes on a course, making consistent performance on these holes vital at Keene Trace.

Par5: The correlation shows some variability, but par 5 performance generally has a significant impact on scores. Scoring well on par 5s leads to lower overall scores, reflecting their importance in strategy and scoring at Keene Trace.
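The yearly values discussed in this section are absolute correlation coefficients between each metric and Score. The sketch below shows one way such values could be computed, assuming Pearson correlation (the report does not name the coefficient used) and a hypothetical list of round records with Year, SGApp, and Score fields. The sample rounds are illustrative and are not taken from the Keene Trace data.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def abs_corr_by_year(rounds, feature, target="Score"):
    """Group rounds by year and return {year: |r(feature, target)|}."""
    grouped = defaultdict(list)
    for r in rounds:
        grouped[r["Year"]].append(r)
    return {
        year: abs(pearson([r[feature] for r in rows],
                          [r[target] for r in rows]))
        for year, rows in sorted(grouped.items())
    }

# Hypothetical rounds for illustration only (assumed field names).
rounds = [
    {"Year": 2022, "SGApp": 1.0, "Score": 68},
    {"Year": 2022, "SGApp": 0.0, "Score": 70},
    {"Year": 2022, "SGApp": -1.0, "Score": 72},
    {"Year": 2023, "SGApp": 1.5, "Score": 69},
    {"Year": 2023, "SGApp": 0.5, "Score": 69},
    {"Year": 2023, "SGApp": -0.5, "Score": 71},
    {"Year": 2023, "SGApp": -1.5, "Score": 73},
]
result = abs_corr_by_year(rounds, "SGApp")
```

Taking the absolute value discards the sign, so a metric where higher is better (such as SGApp) and one where lower is better (such as PPGIR) can be compared on the same scale.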

Section 2: Partial Dependence Plots against Score

Partial dependence plots (PDPs) are a tool used in machine learning and statistical modeling to illustrate the relationship between a target variable and one or more features (e.g. SGApp, SGATG, DrivingDistance, GreensInRegulation). They show the marginal effect of a feature on the predicted outcome of a model. PDPs are particularly useful for understanding how individual features affect the target variable, which makes the model easier to interpret.

In determining the value of Score, PDPs can help visualize how changes in each feature affect the predicted score once the effects of the other features have been averaged out. This can provide insights into which features are most influential and in which direction they move the score.
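As a rough illustration of how a partial dependence curve is obtained, the sketch below fixes one feature at each value on a grid, predicts for every row with that value substituted, and averages the predictions. The toy_model function is a hypothetical stand-in for a fitted model and the rows are made-up values; neither comes from this analysis.

```python
def partial_dependence(model, rows, feature, grid):
    """For each grid value, overwrite `feature` in every row,
    predict, and average the predictions over all rows."""
    curve = []
    for value in grid:
        preds = [model({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve

# Hypothetical stand-in for a fitted model (NOT the model used in this report).
def toy_model(row):
    return 71.0 - 1.0 * row["SGApp"] - 0.5 * row["SGP"]

# Hypothetical rounds for illustration only.
rows = [
    {"SGApp": 0.8, "SGP": -0.4},
    {"SGApp": -0.3, "SGP": 1.0},
    {"SGApp": 0.1, "SGP": 0.0},
]
pdp_sgapp = partial_dependence(toy_model, rows, "SGApp", [-1.0, 0.0, 1.0])
```

Each point on the curve is an average over the observed values of the other features, so a downward-sloping curve for SGApp means the predicted Score falls as SGApp rises, whatever the other metrics happen to be. Libraries such as scikit-learn provide ready-made partial dependence tools in their inspection module.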


Partial Dependence Plot 2(a)

SGTee: The plot suggests that as SGTee increases, Score tends to decrease, indicating that better off-the-tee performance generally leads to lower scores. This is expected, as strong play off the tee helps reduce overall strokes.

SGApp: Similarly, an increase in SGApp results in a lower Score. This reinforces the importance of approach shots at Keene Trace, where accurate approach shots can significantly reduce the number of putts needed.

SGATG: The plot for SGATG shows a negative relationship with Score, but it appears less steep compared to SGTee and SGApp. This indicates that while the short game around the green is important, its impact on the score is relatively moderate.

SGP: The plot shows a clear negative relationship, highlighting that better putting performance leads to lower scores. Good putting is essential in converting birdie opportunities and saving pars.

Partial Dependence Plot 2(b)

DrivingDistance: There is a slight negative relationship between DrivingDistance and Score, indicating that longer drives contribute to lower scores, though the effect is not very strong.

DrivingAccuracy: A clearer relationship is observed, emphasizing the importance of hitting fairways. Accurate drives set up better approach shots and can avoid trouble.

GreensInRegulation: This plot shows a strong negative relationship, reinforcing that hitting greens in regulation is crucial for scoring well. It provides more opportunities for birdies and reduces the need for scrambling.

Scrambling: The plot indicates a negative relationship, though less pronounced. Good scrambling helps save pars when greens are missed, which is essential for maintaining low scores.

PPGIR: There is a strong positive relationship, indicating that fewer putts per green in regulation lead to lower scores. Efficient putting on greens hit in regulation is a key factor in scoring well.

Partial Dependence Plot 2(c)

Par3: The plot shows a negative relationship, indicating that better performance on par 3s contributes to lower scores. Par 3s can be challenging, and scoring well on these holes can boost overall performance.

Par4: There is a strong negative relationship, highlighting that performance on par 4s is crucial. Since par 4s make up a significant portion of the course, consistent play on these holes is vital for a good score.

Par5: The plot shows a negative relationship as well, though slightly less steep. Scoring well on par 5s, which are often seen as scoring opportunities, is important for reducing overall scores.

Section 3: Importance of Each Metric in Determining Score

Random Forest Regressor and Feature Importance

A Random Forest Regressor is an ensemble learning method that builds many decision trees during training and outputs the average of their predictions. Averaging across trees improves accuracy and robustness compared with a single tree.

Feature importance is a technique used to interpret a machine learning model. It refers to the score that quantifies the contribution of each feature to the prediction made by the model.

In a Random Forest, the importance of a feature is computed by looking at how much the feature decreases the impurity (e.g., variance for regression tasks) across all the trees in the forest. The more a feature decreases the impurity, the more important it is considered.

The calculated importance scores for all features are then normalized to give relative importance as a percentage. This shows the relative contribution of each feature to the prediction task.
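The impurity-based calculation described above can be sketched in a few lines. This is a deliberately simplified illustration, not the model used in this report: each "tree" is a single split (a depth-one stump) fitted on a bootstrap sample with one randomly chosen candidate feature, the data are synthetic, and all function names are hypothetical. In practice a library implementation would be used; scikit-learn's RandomForestRegressor, for example, exposes normalized importances through its feature_importances_ attribute.

```python
import random

def variance(ys):
    """Population variance, used as the impurity of a regression node."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_split(rows, ys, candidates):
    """Find the single split with the largest variance decrease.
    Returns (feature, decrease); feature is None if no split helps."""
    n = len(ys)
    parent = variance(ys)
    best_feature, best_decrease = None, 0.0
    for f in candidates:
        values = sorted({r[f] for r in rows})
        for t in values[:-1]:  # rows with value <= t go left, the rest right
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            children = (len(left) * variance(left)
                        + len(right) * variance(right)) / n
            if parent - children > best_decrease:
                best_feature, best_decrease = f, parent - children
    return best_feature, best_decrease

def forest_importance(rows, ys, features, n_trees=50, max_features=1, seed=0):
    """Sum each feature's impurity decrease over all stumps,
    then normalize the totals to percentages."""
    rng = random.Random(seed)
    totals = {f: 0.0 for f in features}
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        candidates = rng.sample(features, max_features)  # random feature subset
        feature, decrease = best_split(
            [rows[i] for i in idx], [ys[i] for i in idx], candidates)
        if feature is not None:
            totals[feature] += decrease
    total = sum(totals.values())
    if total == 0:
        return {f: 0.0 for f in features}
    return {f: 100.0 * v / total for f, v in totals.items()}

# Synthetic data for illustration only: Score depends strongly on SGApp
# and only weakly on DrivingDistance.
data_rng = random.Random(1)
rows, scores = [], []
for _ in range(60):
    row = {"SGApp": data_rng.uniform(-2, 2),
           "DrivingDistance": data_rng.uniform(280, 320)}
    rows.append(row)
    scores.append(71.0 - 1.5 * row["SGApp"]
                  - 0.01 * (row["DrivingDistance"] - 300)
                  + data_rng.gauss(0, 0.3))
importance = forest_importance(rows, scores, ["SGApp", "DrivingDistance"])
```

Because the percentages are relative, they always sum to 100 within a model; they indicate how the model's explanatory power is shared among the features supplied to it, not how strong each effect is in absolute terms.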

Interpreting Feature Importance

Features with high relative importance percentages have a strong impact on the model's predictions. They are crucial for accurate predictions and indicate key areas where performance matters most.

Features with low relative importance have a minimal impact on the model's predictions. While they can still contribute, they are less critical.

Using Random Forest Regressor, the relative importance of each factor on Score is quantified as follows:

Feature Importance of SGTee, SGApp, SGATG, and SGP on Score

SGApp is the most critical factor, indicating the importance of approach shots. SGP also has a significant impact, highlighting the need for effective putting. SGTee and SGATG, while important, have a lower impact compared to SGApp and SGP.

Using Random Forest Regressor, the relative importance of each factor on Score is quantified as follows:

Feature Importance of DrivingDistance, DrivingAccuracy, GreensInRegulation, Scrambling, and PPGIR on Score

PPGIR is the most important factor, underscoring the critical role of putting. GreensInRegulation and Scrambling are also vital, indicating the importance of hitting greens and recovering effectively. DrivingDistance and DrivingAccuracy have a lower impact.

Using Random Forest Regressor, the relative importance of each factor on Score is quantified as follows:

Feature Importance of Par3, Par4, and Par5 on Score

Par4 performance has the most significant impact on the overall score, emphasizing the need for effective management of these holes. Par5 holes also play a substantial role, while Par3 holes, though important, are less influential.

Top 5 Ranked Players - 2024 ISCO Championship

The table below shows the five top-ranked players and, for each, the average of the estimated scores from the three Random Forest models above.

Player               Score
Jannik De Bruyn      70.25
Austin Smotherman    70.27
Aaron Baddeley       70.30
Seonghyeon Kim       70.32
Russell Knox         70.33

Estimated scores for all players can be found here.