Tracking Pace's Influence in an Offensive System
Preliminary Visualization
To look for a link between winning and pace, several regressions were run. The data come from the most recent college basketball season, 2021-22. They were taken from College Basketball Reference, cleaned in Excel (both manually and automatically), and analyzed in R. As a preliminary test, win percentage and point differential were each regressed on pace. A local regression plot can be seen below.
No obvious relationship appears in the scatterplot: teams that play at a higher pace do not clearly win more. This lack of a relationship can be seen more directly in the plot below. The regression line is relatively flat, indicating that a higher pace is not associated with a better or worse point differential.
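As a rough illustration of what a "flat" regression line means, here is a minimal sketch in Python (the original analysis was done in R, and used a local regression rather than the plain straight-line fit shown here). The function name `fit_line` and all of the numbers are hypothetical placeholders, not the 2021-22 data.

```python
# Ordinary least squares fit of point differential on pace.
# All values are made-up placeholders, not the actual season data.

def fit_line(x, y):
    """Return (slope, intercept, r) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    return slope, intercept, r

pace = [64.0, 66.0, 68.0, 70.0, 72.0]      # possessions per game (hypothetical)
point_diff = [3.0, -2.0, 1.0, -1.0, 2.0]   # points per game (hypothetical)

slope, intercept, r = fit_line(pace, point_diff)
print(f"slope={slope:.3f}, r={r:.3f}")
```

A slope and correlation both close to zero are what a flat line looks like numerically: moving along the pace axis barely changes the fitted point differential.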
Next, let us check whether offensive rating changes with pace. Teams that play at a quicker pace will likely score more total points, so offensive rating is normalized by possessions to keep it from being influenced by these aggregate numbers. Clustering is used to look for groups of teams with similar offensive ratings, to identify trends and patterns.
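The normalization can be sketched as follows, assuming the common definition of team offensive rating as points scored per 100 possessions. The function name and the season totals are hypothetical and only illustrate why a faster team can score more points and still be less efficient.

```python
def offensive_rating(points, possessions):
    """Points scored per 100 possessions (assumed definition of offensive rating)."""
    return 100.0 * points / possessions

# Hypothetical season totals for a fast-paced and a slow-paced team.
fast = offensive_rating(points=2400, possessions=2300)
slow = offensive_rating(points=2100, possessions=2000)

print(f"fast team: {fast:.1f}, slow team: {slow:.1f}")
```

In this made-up case the fast team scores 300 more points over the season, yet the slow team has the higher rating because it needs fewer possessions per point.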
There is some clustering, but it is negligible, as it sits in the middle of the graph. There is no clustering near either extreme, which undercuts the hypothesis that the two are linked. The plots show no apparent link between pace and offensive efficiency, or between pace and anything that pertains to winning.
Subset Selection
What does a high-pace team look like, though? Let us find the factors most strongly tied to pace by building an easily visualized tree-based model. The first step is to reduce the model: we have 63 variables, and the model needs to be reduced to 20.
Subset selection gave us the 20 best variables. Next, we use RSS, adjusted R^2, AIC, and BIC to find that the average preferred number of variables is 14. We will move forward with this number in mind. The 14 best variables are shown below.
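One way to read "average preferred number of variables" is to let each criterion pick its own best model size and then average those picks. The sketch below assumes the standard Gaussian forms of AIC and BIC (up to an additive constant) and uses invented RSS values, so the result is not meant to reproduce the 14 found in the actual analysis. The sample size, total sum of squares, and RSS table are all hypothetical.

```python
import math

def adjusted_r2(rss, tss, n, p):
    """Adjusted R^2 for a model with p predictors fit on n observations."""
    return 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))

def aic(rss, n, p):
    """Gaussian AIC up to a constant; p + 1 counts the intercept."""
    return n * math.log(rss / n) + 2 * (p + 1)

def bic(rss, n, p):
    """Gaussian BIC up to a constant; the penalty grows with log(n)."""
    return n * math.log(rss / n) + math.log(n) * (p + 1)

n, tss = 300, 1000.0  # hypothetical sample size and total sum of squares
# Hypothetical RSS of the best model at each size (RSS can only fall as p grows).
rss_by_size = {10: 400.0, 12: 380.0, 14: 370.0, 16: 368.0, 18: 367.0, 20: 366.5}

best = {
    "rss": min(rss_by_size, key=lambda p: rss_by_size[p]),
    "adj_r2": max(rss_by_size, key=lambda p: adjusted_r2(rss_by_size[p], tss, n, p)),
    "aic": min(rss_by_size, key=lambda p: aic(rss_by_size[p], n, p)),
    "bic": min(rss_by_size, key=lambda p: bic(rss_by_size[p], n, p)),
}
average_size = sum(best.values()) / len(best)
print(best, average_size)
```

RSS alone always favors the largest model, while BIC penalizes extra variables most heavily, which is why averaging across the criteria lands somewhere in between.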
Tree Model
To see what drives pace, let us build a tree and check whether pace is linked to something obviously related to winning that we may have missed previously. Working with this reduced dataset, the tree fit on the set of selected variables is shown below. While it may be difficult to read because of its size, it isn’t as confusing as it may seem. The first node splits on whether the opponent scored more or less than 2,280 points.
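For readers unfamiliar with how a regression tree picks its first node, here is a minimal sketch of a single split: it tries every midpoint between neighboring values of one predictor and keeps the threshold that minimizes the squared error around the two group means. This is a generic illustration, not the R routine used for the tree in this post, and the function name and data are hypothetical.

```python
def best_split(x, y):
    """Return (threshold, error) for the single split on x that minimizes
    the summed squared error of y around the mean of each side."""
    def sse(values):
        if not values:
            return 0.0
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    xs = sorted(set(x))
    best_threshold, best_error = None, float("inf")
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi < threshold]
        right = [yi for xi, yi in zip(x, y) if xi >= threshold]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error

# Hypothetical opponent point totals and team pace values.
opp_points = [2000, 2100, 2200, 2300, 2400, 2500]
pace = [65.0, 65.5, 66.0, 69.0, 69.5, 70.0]

threshold, error = best_split(opp_points, pace)
print(threshold, error)
```

A full tree repeats this search over every predictor and then recurses into each side, which is how a variable like opponent points can end up at the root.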
This idea is not complex. A team’s pace was heavily influenced by how much its opponents scored. A team that plays at a higher pace presents its opponents with more scoring opportunities, so those opponents score more. Let’s visualize this fairly simple concept.
Clearly, there is a relationship between opponents’ total points and pace. Presenting your opponent with more scoring opportunities gives them a greater likelihood of scoring more points. Let us end this experiment by subsetting the data.
Teams that allow more points than average (2,219) have a higher pace on average than those that don’t, by about two possessions per game.
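The subsetting step amounts to splitting teams at the average of points allowed and comparing mean pace on each side. A minimal sketch, with hypothetical field names and made-up values chosen only to show the mechanics:

```python
from statistics import mean

# Hypothetical teams: season points allowed and pace (possessions per game).
teams = [
    {"opp_points": 2000, "pace": 66.0},
    {"opp_points": 2100, "pace": 67.0},
    {"opp_points": 2300, "pace": 68.5},
    {"opp_points": 2400, "pace": 68.5},
]

avg_opp_points = mean(t["opp_points"] for t in teams)

# Split at the average and compare mean pace between the two groups.
above = [t["pace"] for t in teams if t["opp_points"] > avg_opp_points]
below = [t["pace"] for t in teams if t["opp_points"] <= avg_opp_points]
pace_gap = mean(above) - mean(below)

print(f"average points allowed: {avg_opp_points}, pace gap: {pace_gap:.1f}")
```

The same split can then be reused with win percentage in place of pace to check whether the two groups differ in how often they win.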
As shown above, this doesn’t affect winning percentage at all when we subset the data again.
Concluding Thoughts
Pace does not influence winning percentage or offensive rating. The majority of the statistics that are linked to pace are clearly multicollinear with it, such as the number of points a team’s opponents score.
Pace is not the solution
Data
“2021-22 Advanced Opponent Stats: College Basketball at Sports.” College Basketball Reference. Sports Reference, April 4, 2022.
“2021-22 Advanced Stats: College Basketball at Sports.” College Basketball Reference. Sports Reference, April 4, 2022.
“2021-22 Opponent Stats: College Basketball at Sports.” College Basketball Reference. Sports Reference, April 4, 2022.
“2021-22 Stats: College Basketball at Sports.” College Basketball Reference. Sports Reference, April 4, 2022.