Undershooters and Overshooters

Much has been made of the fact that our methodology finds Kobe Bryant, LeBron James, and many other high-usage players to be undershooters. Over on TrueHoop, Devin Kharpertian wrote a post about this result of our research which, unsurprisingly, led to a good old-fashioned Kobe vs. LeBron flame war, with each side refusing to believe the other’s hero is not a good-for-nothing chucker. When understood correctly, this should not be a surprising result at all.

 

The fact is, very few elite players in our data were not undershooters. Most teams would be better in the short term if their superstars were more aggressive with the ball. In fact, almost without exception, these superstars get much more aggressive with the game on the line and, I suspect, this is mostly in their teams’ best interest.

Being an undershooter in our research simply means that if you were to create an additional scoring opportunity on the margin, in any given period of the shot clock, that shot would be of better value than what your team usually gets when you pass it up and your team looks for an opportunity later. (Estimating the usage curve for each player is very important here; I detail the math in the paper and will make an effort to post on the intuition soon.) Being an undershooter or an overshooter does not necessarily mean you are a good or a bad player, or even an efficient or inefficient offensive player. It simply means that your team might see a boost in overall efficiency if you altered your threshold for taking a shot by a little bit. Given that players have limited amounts of energy and are exposing themselves to injury risk on every drive, it is only natural that elite players should err on the side of letting their teammates do more than their “optimal” share of the work. In fact, we should not be surprised that many of the names at the top of the undershooting list are guys who have had or are having serious knee problems and may face an especially steep trade-off in terms of their own health.

Sadly, it is far beyond the power of this research to determine whether or not these superstars are optimally trading off their teams’ immediate offensive efficiency for their own long-term health, but it is clear that such a trade-off exists.

Just for fun, I am attaching the entire list of NBA players and our estimate of how much their shooting behavior deviates from dynamic efficiency. The numbers on the right are t-statistics, which show how many standard deviations each player is from optimality. Just like with adjusted plus/minus, our results are not based on an infinite sample size, and any given player’s t-statistic comes with a confidence interval of roughly +/- 2 (this means that players between -2 and 2 might still be optimal). Thus, we hang our hats NOT on our results for any given player, but on the idea that the biggest deviation from optimality we observe (elite guys undershooting) can be easily justified and that the NBA appears to get shot selection mostly right.
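To make the +/- 2 rule of thumb concrete, here is a minimal sketch in Python. The t-statistics below are made-up placeholders, not values from our list, and the function name is hypothetical.

```python
def distinguishable_from_optimal(t_stat, threshold=2.0):
    """Return True when |t| falls outside the rough +/- 2 band described above."""
    return abs(t_stat) > threshold


# Hypothetical t-statistics, for illustration only (not taken from our list).
example_t_stats = {"Player A": 3.4, "Player B": -0.8, "Player C": -2.6}

verdicts = {
    name: distinguishable_from_optimal(t) for name, t in example_t_stats.items()
}

for name, flagged in verdicts.items():
    label = "deviates from optimal" if flagged else "could still be optimal"
    print(f"{name}: t = {example_t_stats[name]:+.1f} -> {label}")
```

The sketch only says whether a deviation is statistically distinguishable from zero; it says nothing about whether the player is good or bad.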

Building Bayesian Plus-Minus (Part 1): APM and SPM estimates 2007-11

Adjusted Plus-Minus is the only well-known statistical method of evaluating players that even attempts to capture aspects of player value that are not associated with the accumulation of traditional box-score statistics. It does this by attempting to quantify the actual impact that the presence of each player on the court has on the average number of points a team gets and gives up. And unlike unadjusted +/-, it does so via a regression analysis which controls for all the other teammates and opponents on the court by estimating the following equation:

Points = \alpha Home + \sum_{i \in Off} \beta_{i,O} + \sum_{i \in Def} \beta_{i,D} + \epsilon

where Points represents the number of points scored on any given possession, and Off and Def represent the sets of players on offense and defense during that possession. Thus, \beta_{i,O} and \beta_{i,D} represent the average impact that each player i’s presence on the court has on his team’s offensive and defensive performance.
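As a rough illustration of how the equation above can be estimated, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for the demo: the eight-player "league", the two-on-two lineups, the made-up true impacts, and the use of plain least squares via `np.linalg.lstsq`. It is not the code behind the estimates in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players, n_poss, per_side = 8, 20000, 2

# Hypothetical "true" impacts in points per possession (made up for the demo).
true_off = np.array([0.70, 0.60, 0.55, 0.50, 0.50, 0.45, 0.40, 0.30])
true_def = np.array([-0.10, 0.00, 0.05, -0.05, 0.10, 0.00, -0.15, 0.15])
true_home = 0.03

# Design matrix columns: [Home, offensive indicators..., defensive indicators...]
X = np.zeros((n_poss, 1 + 2 * n_players))
for row in range(n_poss):
    on_court = rng.choice(n_players, size=2 * per_side, replace=False)
    X[row, 0] = rng.integers(0, 2)                         # 1 if offense is at home
    X[row, 1 + on_court[:per_side]] = 1.0                  # players on offense
    X[row, 1 + n_players + on_court[per_side:]] = 1.0      # players on defense

beta_true = np.concatenate(([true_home], true_off, true_def))
points = X @ beta_true + rng.normal(0.0, 1.0, size=n_poss)

# Least-squares fit of Points on Home and the player indicators.
beta_hat, *_ = np.linalg.lstsq(X, points, rcond=None)
off_hat = beta_hat[1:1 + n_players]
def_hat = beta_hat[1 + n_players:]

# Compare each player to the group average (see the note below).
print(np.round(off_hat - off_hat.mean(), 3))
print(np.round(true_off - true_off.mean(), 3))
```

In this toy setup every lineup has the same size, so a constant can be shifted between the offensive and defensive coefficients without changing the fit. That is why the comparison is made relative to the group average rather than in levels.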

To see both the value and the pitfalls of this approach, consider the following example. Intuitively, we might observe that when Kevin Garnett and Kendrick Perkins are on the court together the Celtics’ defense is much better than when both players are on the bench. We can look, then, and see that each player has a very good unadjusted defensive +/-, but this does not give us any insight into how to split the credit between these two players for the Celtics’ defensive dominance. To speak in econometrics parlance, it is easy to identify the joint effect of Kevin Garnett and Kendrick Perkins, but perhaps more difficult to separate out their individual effects.

To tackle this problem, we use Adjusted +/- to exploit whatever information we have about those instances when only one of Kevin Garnett or Kendrick Perkins was on the court. In such cases, if Adjusted +/- is splitting the credit inappropriately, it will do a bad job of retrodicting the Celtics’ defensive performance in these situations. Thus, whoever’s presence on the court is more strongly correlated with the Celtics’ defensive prowess will receive the lion’s share of the credit. In general, this is an entirely valid procedure, but in many cases it leans heavily on limited samples of data. And since every player’s value must be estimated while controlling for the impact of every other player, all it takes is a few player pairs that are very hard to separate (highly collinear) and uncertainty will propagate throughout all of our results. This does not invalidate our analysis, but it does make it noisier than we would like.
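The point about collinear pairs can be seen in a toy calculation. The sketch below is an assumption-laden illustration, not our actual model: it uses two hypothetical players who share the floor on about 98% of possessions, a plain intercept, a noise level of 1, and the textbook OLS formula for standard errors.

```python
import numpy as np


def coef_std_errors(X, sigma=1.0):
    """Classical OLS standard errors: sigma * sqrt(diag((X'X)^-1))."""
    cov = sigma**2 * np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(cov)), cov


rng = np.random.default_rng(1)
n = 10000
a_on = rng.random(n) < 0.5                 # player A is on court half the time
together = rng.random(n) < 0.98            # B mirrors A on ~98% of possessions
b_on = np.where(together, a_on, ~a_on)

X = np.column_stack([np.ones(n), a_on, b_on]).astype(float)
se, cov = coef_std_errors(X)
se_a, se_b = se[1], se[2]

# Standard error of the joint effect (beta_A + beta_B) includes the covariance.
se_joint = np.sqrt(cov[1, 1] + cov[2, 2] + 2 * cov[1, 2])

print(f"SE of A alone: {se_a:.4f}")
print(f"SE of B alone: {se_b:.4f}")
print(f"SE of A + B:   {se_joint:.4f}")
```

The joint effect comes out far more precisely estimated than either individual effect, which is the Garnett and Perkins situation in miniature: the few possessions where only one of them plays carry all the information about how to split the credit.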

There are two basic solutions to this problem that have been previously employed:

1) The first is to use large samples of data. You can compute Adjusted Plus-Minus values over 6 years of data. Using a bigger sample allows you more opportunities to observe players playing with lots of different teammates. This, in turn, allows for stronger identification of players’ individual contributions. The estimates we used here are based on a five-year data sample starting with the 2006-07 season and ending with yesterday’s games and are presented below. Unfortunately, there is a natural trade-off with this kind of analysis, namely that one must assume constancy in a player’s impact across all five years of the data set. It is not implausible, for example, that Kevin Garnett and Tim Duncan are not making the contribution they were five years ago and that their modern-day teammates are being short-changed a little more than is fair.

Note that for the estimates presented below, players who played fewer than 5,000 offensive possessions were omitted. Thus a typical player in that category should be considered to have a score of zero in both offensive and defensive APM. We should not be surprised that most players in our sample tend to make better contributions on both ends of the floor than these “replacement” players. In fact, only about 10 players in our sample were found to have joint APM values below the replacement level in a statistically significant way.

 

Player Off APM S.E. Def APM S.E. Joint APM S.E. Off SPM Def SPM Joint SPM
Kevin Garnett 5.033 1.312 -10.304 1.313 15.337 1.856 3.879 -3.231 7.111
LeBron James 10.095 1.077 -4.381 1.075 14.475 1.522 9.401 -0.806 10.206
Dirk Nowitzki 9.825 1.180 -3.526 1.178 13.351 1.667 6.524 -0.724 7.249
Steve Nash 12.386 1.255 -0.562 1.252 12.948 1.773 7.805 2.705 5.100
Amir Johnson 4.188 1.385 -6.965 1.386 11.153 1.960 2.609 -2.152 4.760
Dwyane Wade 10.103 1.145 -0.324 1.144 10.426 1.618 7.562 0.239 7.324
LaMarcus Aldridge 5.621 1.275 -4.796 1.275 10.417 1.803 3.757 -0.214 3.970
Dwight Howard 6.306 1.308 -3.896 1.305 10.203 1.848 3.247 -3.579 6.826
Tim Duncan 3.339 1.387 -6.750 1.387 10.089 1.961 5.065 -2.982 8.047
Andrew Bogut 2.301 1.257 -7.590 1.253 9.892 1.775 1.557 -2.705 4.262
Chris Paul 9.172 1.337 -0.261 1.340 9.433 1.893 9.791 -2.075 11.866
Manu Ginobili 7.033 1.124 -2.310 1.124 9.344 1.590 7.299 -0.673 7.972
Blake Griffin 5.712 1.947 -3.121 1.949 8.832 2.754 5.426 -0.571 5.996
Roy Hibbert 2.177 1.524 -6.169 1.526 8.346 2.157 1.456 -1.828 3.284
James Harden 5.685 1.712 -2.601 1.713 8.286 2.422 3.602 -0.563 4.165
Luol Deng 3.213 1.138 -4.964 1.137 8.177 1.609 2.467 -0.064 2.532
Jeff Foster 4.399 1.249 -3.616 1.247 8.015 1.765 1.742 -2.449 4.190
Chris Bosh 5.915 1.022 -2.071 1.021 7.986 1.445 5.160 -0.940 6.100
Lamar Odom 3.561 1.171 -4.379 1.170 7.940 1.656 3.007 -1.394 4.401
Baron Davis 6.789 1.036 -1.054 1.038 7.843 1.466 5.552 -1.728 7.280
Ron Artest 2.563 0.982 -4.789 0.982 7.352 1.388 2.986 -1.886 4.872
Paul Pierce 5.160 1.275 -2.177 1.273 7.337 1.802 4.818 0.137 4.681
Mike Conley 7.187 1.439 -0.126 1.438 7.312 2.034 2.912 -0.183 3.096
Vince Carter 6.547 1.011 -0.761 1.010 7.307 1.429 5.939 0.253 5.686
Nene Hilario 2.110 1.193 -5.196 1.194 7.306 1.688 4.914 -2.735 7.648
Kobe Bryant 9.195 1.390 1.945 1.389 7.250 1.965 7.296 0.174 7.122
Russell Westbrook 6.655 1.558 -0.560 1.559 7.215 2.204 4.312 1.276 3.036
Peja Stojakovic 5.771 1.271 -1.363 1.274 7.133 1.800 3.787 0.983 2.804
Joe Johnson 7.062 1.344 0.139 1.339 6.922 1.897 5.405 0.840 4.565
Yao Ming 2.836 1.422 -4.049 1.421 6.885 2.011 3.342 -0.916 4.258
Rashard Lewis 4.312 1.012 -2.568 1.011 6.880 1.430 3.885 -0.438 4.322
Thaddeus Young 3.950 1.202 -2.901 1.203 6.851 1.701 1.873 0.105 1.768
Paul Millsap 4.845 1.174 -1.984 1.174 6.830 1.660 2.881 -2.448 5.329
Pau Gasol 5.943 1.068 -0.693 1.069 6.636 1.511 5.449 -1.008 6.458
Chuck Hayes 0.749 1.188 -5.749 1.187 6.498 1.679 0.659 -3.208 3.866
Carlos Boozer 5.971 1.169 -0.455 1.168 6.426 1.653 3.852 -1.004 4.856
Shane Battier 3.720 1.159 -2.588 1.161 6.308 1.641 1.655 -1.737 3.393
Anderson Varejao 3.116 1.202 -3.178 1.199 6.294 1.698 1.363 -3.026 4.389
Chauncey Billups 7.811 1.161 1.602 1.163 6.209 1.643 7.766 0.957 6.808
Omri Casspi 4.685 1.598 -1.484 1.599 6.170 2.261 1.426 0.646 0.780
Brad Miller 4.052 1.107 -2.111 1.107 6.163 1.566 2.393 -1.926 4.319
Josh Howard 5.174 1.232 -0.777 1.231 5.952 1.742 3.577 -0.588 4.164
Jrue Holiday 7.610 1.622 1.753 1.624 5.857 2.295 2.334 0.389 1.945
Kyle Lowry 4.080 1.166 -1.752 1.168 5.832 1.650 4.086 -0.804 4.890
Deron Williams 6.488 1.325 0.694 1.324 5.794 1.873 7.299 0.626 6.674
Josh Smith 2.059 1.227 -3.719 1.224 5.778 1.733 1.989 -2.817 4.806
Jared Jeffries 4.098 1.202 -1.658 1.203 5.756 1.700 -0.927 -1.222 0.296
Nick Collison 2.225 1.121 -3.473 1.121 5.698 1.586 0.869 -1.749 2.617
Anthony Tolliver 4.317 1.669 -1.374 1.671 5.690 2.361 2.655 -0.366 3.021
Chris Andersen 1.129 1.606 -4.510 1.605 5.638 2.271 1.460 -4.004 5.464
Gerald Wallace 2.933 1.085 -2.670 1.086 5.604 1.535 2.798 -3.059 5.856
Anthony Parker 2.678 1.092 -2.917 1.092 5.595 1.544 2.381 -0.419 2.800
Jamario Moon 4.182 1.183 -1.390 1.180 5.572 1.671 1.275 -2.455 3.730
Marcus Camby 1.109 1.076 -4.377 1.078 5.486 1.523 1.080 -5.198 6.278
Ryan Anderson 2.733 1.555 -2.688 1.557 5.421 2.200 5.048 0.661 4.388
Julian Wright 3.407 1.611 -1.994 1.611 5.401 2.278 -0.373 0.086 -0.459
Joakim Noah 3.927 1.282 -1.469 1.280 5.396 1.812 2.051 -2.356 4.406
Devin Harris 5.838 1.096 0.450 1.096 5.387 1.550 4.850 0.419 4.431
Rasheed Wallace 1.614 1.230 -3.759 1.230 5.373 1.740 2.419 -3.895 6.314
Tim Thomas 3.080 1.323 -2.285 1.326 5.364 1.873 2.469 -0.263 2.732
Matt Bonner 4.238 1.274 -1.032 1.273 5.271 1.801 3.732 -0.070 3.802
Raymond Felton 4.928 1.084 -0.336 1.085 5.264 1.534 3.125 -0.020 3.145
Tracy McGrady 3.963 1.166 -1.271 1.167 5.234 1.649 4.688 1.351 3.337
Hedo Turkoglu 5.194 1.017 -0.012 1.018 5.206 1.439 3.627 -0.133 3.760
Kevin Love 4.917 1.288 -0.205 1.289 5.122 1.822 5.028 -0.192 5.220
Antawn Jamison 6.287 1.076 1.215 1.075 5.072 1.521 4.626 -0.620 5.246
Mike Dunleavy 3.280 1.085 -1.671 1.085 4.951 1.534 2.882 0.962 1.920
Toney Douglas 4.649 1.674 -0.256 1.676 4.906 2.369 4.449 0.601 3.848
Tony Allen 1.091 1.285 -3.720 1.283 4.811 1.815 0.316 -1.723 2.039
Stephen Jackson 4.416 0.979 -0.380 0.980 4.796 1.385 2.809 -0.281 3.091
Thabo Sefolosha 1.711 1.177 -3.059 1.176 4.770 1.664 -0.735 -1.794 1.059
Elton Brand 1.338 1.197 -3.419 1.198 4.758 1.693 2.565 -2.078 4.643
Carmelo Anthony 6.628 1.097 1.927 1.097 4.701 1.552 4.999 0.208 4.791
Jared Dudley 3.877 1.308 -0.810 1.304 4.688 1.847 2.900 -0.224 3.124
Kenyon Martin 1.782 1.226 -2.876 1.227 4.657 1.735 0.522 -3.483 4.005
Tyson Chandler 0.824 1.151 -3.819 1.152 4.643 1.628 1.242 -1.852 3.094
Danilo Gallinari 4.503 1.379 -0.104 1.380 4.607 1.951 4.037 -0.317 4.354
Andre Iguodala 1.679 1.212 -2.894 1.211 4.572 1.713 3.912 -1.926 5.838
Ben Wallace -0.011 1.110 -4.497 1.108 4.487 1.568 0.281 -5.027 5.309
Joel Przybilla -2.795 1.355 -7.218 1.355 4.423 1.916 -1.463 -3.283 1.821
Tony Parker 4.747 1.325 0.342 1.323 4.405 1.872 5.390 2.063 3.327
Ime Udoka 2.275 1.244 -2.129 1.241 4.404 1.757 0.579 -0.574 1.153
Andrei Kirilenko 1.852 1.180 -2.542 1.178 4.394 1.667 2.562 -2.055 4.616
Nicolas Batum 4.969 1.545 0.598 1.540 4.371 2.182 3.324 -0.057 3.381
Jason Kidd 3.637 1.061 -0.708 1.061 4.345 1.500 4.986 -2.348 7.334
Andre Miller 5.352 1.069 1.034 1.068 4.318 1.511 4.498 -0.013 4.510
Antonio McDyess 2.526 1.122 -1.763 1.121 4.289 1.586 1.339 -2.122 3.461
Ersan Ilyasova 2.527 1.481 -1.758 1.480 4.285 2.093 2.633 -0.069 2.703
Kurt Thomas -0.366 1.169 -4.606 1.167 4.240 1.652 -0.096 -2.713 2.617
Gilbert Arenas 2.400 1.225 -1.813 1.227 4.212 1.734 4.982 0.653 4.329
Stephen Curry 5.980 1.471 1.787 1.473 4.192 2.082 3.912 0.102 3.810
Brandon Roy 6.862 1.169 2.690 1.170 4.172 1.654 6.397 0.467 5.930
Eduardo Najera 1.187 1.364 -2.892 1.362 4.079 1.927 0.731 -1.503 2.234
J.R. Smith 4.797 1.149 0.731 1.149 4.066 1.625 4.449 0.501 3.948
Tayshaun Prince 4.287 1.238 0.247 1.239 4.040 1.752 3.346 0.360 2.986
Rasho Nesterovic 1.999 1.396 -2.036 1.395 4.034 1.973 1.655 -0.725 2.379
Danny Granger 5.545 1.091 1.588 1.089 3.957 1.541 4.134 -0.238 4.371
Rafer Alston 1.676 1.236 -2.235 1.234 3.910 1.747 2.424 -0.285 2.709
Jamal Crawford 6.255 0.974 2.347 0.976 3.908 1.379 4.142 2.044 2.098
Reggie Evans -0.304 1.306 -4.186 1.306 3.882 1.847 -0.411 -3.002 2.592
Renaldo Balkman 0.249 1.651 -3.602 1.648 3.852 2.332 0.389 -2.550 2.939
Marc Gasol 1.509 1.512 -2.334 1.509 3.844 2.137 2.351 -2.759 5.110
Quentin Richardson 2.869 1.114 -0.969 1.114 3.838 1.575 1.582 -0.575 2.157
Ronny Turiaf 2.221 1.179 -1.594 1.180 3.815 1.668 1.510 -2.646 4.156
Caron Butler 4.001 1.118 0.206 1.119 3.795 1.582 2.662 -1.316 3.978
Leon Powe 1.882 1.610 -1.893 1.607 3.775 2.275 3.579 -0.375 3.954
Chase Budinger 4.374 1.676 0.606 1.678 3.769 2.372 3.571 2.383 1.189
Ty Lawson 7.192 1.695 3.426 1.695 3.766 2.397 4.799 1.510 3.289
Kyle Korver 2.744 1.046 -1.019 1.046 3.763 1.479 2.696 1.442 1.254
Quinton Ross 0.544 1.256 -3.214 1.252 3.758 1.773 -1.056 -0.508 -0.548
Marquis Daniels 2.597 1.200 -1.073 1.200 3.670 1.697 0.211 0.419 -0.208
Amare Stoudemire 4.436 1.061 0.779 1.059 3.657 1.499 4.481 -1.047 5.528
C.J. Watson 2.242 1.325 -1.370 1.329 3.611 1.877 2.264 -0.089 2.353
Shaun Livingston 2.945 1.444 -0.599 1.447 3.544 2.044 1.374 0.104 1.271
Rudy Fernandez 3.410 1.356 -0.118 1.357 3.528 1.919 4.003 -0.144 4.147
Richard Hamilton 2.515 1.263 -0.955 1.262 3.470 1.785 4.502 2.336 2.166
David Lee 4.643 1.056 1.175 1.057 3.469 1.494 3.331 -0.873 4.204
Al Horford 2.216 1.431 -1.230 1.427 3.446 2.021 2.955 -2.374 5.329
Anthony Morrow 4.368 1.305 0.936 1.308 3.432 1.848 3.273 1.377 1.896
Taj Gibson 0.597 1.645 -2.829 1.646 3.427 2.327 0.420 -1.941 2.361
Dorell Wright 1.523 1.220 -1.845 1.217 3.368 1.723 2.094 -0.922 3.016
Luc Richard Mbah a Moute 0.325 1.396 -2.981 1.393 3.306 1.972 -0.018 -1.629 1.611
Jason Thompson 3.038 1.324 -0.268 1.325 3.306 1.873 0.874 -0.685 1.559
Trevor Ariza 0.482 1.051 -2.799 1.052 3.281 1.487 1.583 -1.911 3.494
Jason Terry 4.610 1.094 1.347 1.092 3.263 1.545 5.504 0.909 4.595
David West 4.555 1.289 1.309 1.290 3.246 1.824 3.186 -0.830 4.016
George Hill 3.648 1.386 0.432 1.384 3.216 1.958 2.940 -0.006 2.947
Zydrunas Ilgauskas 0.876 1.209 -2.285 1.208 3.161 1.709 1.425 -1.208 2.633
Ramon Sessions 4.482 1.276 1.351 1.278 3.130 1.806 3.641 1.732 1.909
Brendan Haywood 0.117 1.227 -3.009 1.228 3.127 1.736 1.380 -2.012 3.391
Luis Scola 2.578 1.214 -0.512 1.217 3.090 1.719 2.567 -1.176 3.742
Shaquille O’Neal 2.391 1.127 -0.668 1.126 3.059 1.594 3.045 -2.482 5.527
Leandro Barbosa 5.475 1.063 2.428 1.062 3.047 1.503 4.708 1.098 3.611
Ray Allen 5.564 1.212 2.564 1.212 3.000 1.714 5.096 0.950 4.147
Dan Gadzuric 0.082 1.673 -2.907 1.671 2.989 2.365 -0.599 -1.872 1.273
Michael Redd 5.756 1.418 2.788 1.417 2.968 2.005 5.432 1.598 3.834
Ronnie Price 1.114 1.564 -1.826 1.561 2.940 2.210 0.269 -0.760 1.030
Andrew Bynum 0.553 1.252 -2.380 1.252 2.933 1.770 3.185 -1.656 4.840
Stephon Marbury 3.089 1.791 0.170 1.793 2.919 2.535 3.329 1.907 1.422
Bruce Bowen -1.092 1.451 -4.002 1.448 2.910 2.050 -0.953 -0.715 -0.239
Zach Randolph 2.172 1.007 -0.717 1.007 2.889 1.424 3.853 -0.379 4.232
Luke Ridnour 3.799 1.073 0.911 1.073 2.887 1.518 3.123 0.188 2.935
Donte Greene 1.355 1.571 -1.519 1.570 2.874 2.221 -0.271 1.131 -1.402
Ronald Murray 2.867 1.162 0.008 1.162 2.860 1.644 1.469 1.014 0.455
Jordan Farmar 0.549 1.319 -2.290 1.319 2.840 1.865 2.331 0.945 1.386
Zaza Pachulia 0.131 1.299 -2.675 1.297 2.805 1.836 0.974 -1.475 2.449
Darius Songaila 2.248 1.306 -0.555 1.306 2.803 1.846 0.088 -0.618 0.706
Rudy Gay 2.927 1.161 0.146 1.160 2.781 1.641 1.721 -1.145 2.866
Derek Fisher 1.943 1.298 -0.809 1.298 2.751 1.835 1.958 -0.653 2.611
Nenad Krstic -1.088 1.340 -3.835 1.341 2.747 1.896 0.852 -0.357 1.209
Ben Gordon 4.953 1.019 2.216 1.017 2.737 1.440 3.776 2.472 1.303
Marko Jaric 2.327 1.477 -0.366 1.476 2.693 2.088 0.242 -0.806 1.048
Cuttino Mobley 4.824 1.518 2.134 1.520 2.690 2.148 0.911 0.035 0.875
C.J. Miles 2.837 1.295 0.162 1.293 2.675 1.829 2.505 0.089 2.416
Kevin Durant 3.888 1.342 1.261 1.345 2.627 1.900 4.504 0.906 3.598
Rajon Rondo 3.126 1.341 0.501 1.343 2.625 1.898 3.755 -2.028 5.783
Vladimir Radmanovic 2.556 1.175 -0.060 1.174 2.616 1.660 1.148 -0.012 1.160
Chucky Atkins 4.182 1.527 1.582 1.531 2.600 2.162 3.743 2.050 1.693
Emeka Okafor -0.649 1.069 -3.204 1.071 2.555 1.514 1.442 -2.816 4.257
D.J. Augustin 4.144 1.355 1.603 1.356 2.541 1.917 4.601 2.210 2.391
Wally Szczerbiak 2.954 1.403 0.427 1.402 2.527 1.984 2.347 2.052 0.295
Daniel Gibson 2.700 1.169 0.181 1.166 2.519 1.652 2.646 0.870 1.776
Luke Walton 1.094 1.343 -1.415 1.343 2.510 1.899 1.436 0.684 0.752
Tyrus Thomas -0.240 1.171 -2.722 1.170 2.482 1.656 -0.145 -3.490 3.345
Ronnie Brewer 0.688 1.132 -1.752 1.133 2.440 1.601 2.184 -1.745 3.929
Craig Smith 1.557 1.240 -0.876 1.241 2.434 1.754 2.742 -0.087 2.829
Corey Brewer 1.045 1.359 -1.378 1.361 2.423 1.924 -0.502 -1.179 0.677
Channing Frye 1.072 1.104 -1.334 1.103 2.406 1.561 1.741 -1.126 2.868
Shawn Marion 0.889 0.925 -1.498 0.924 2.388 1.308 2.164 -2.178 4.342
Bobby Jackson 2.839 1.408 0.474 1.411 2.365 1.993 3.141 -0.215 3.356
T.J. Ford 1.412 1.185 -0.872 1.183 2.284 1.674 3.497 0.460 3.037
Andris Biedrins 0.384 1.129 -1.841 1.131 2.225 1.598 1.359 -4.030 5.388
Louis Amundson 0.998 1.685 -1.206 1.685 2.205 2.383 0.383 -0.841 1.224
Daequan Cook 2.273 1.432 0.073 1.433 2.200 2.025 1.785 1.233 0.551
Eric Gordon 4.577 1.363 2.400 1.364 2.177 1.928 3.924 1.538 2.387
Rashad McCants 2.994 1.543 0.826 1.545 2.167 2.183 1.916 1.390 0.526
Beno Udrih 2.280 1.090 0.167 1.090 2.114 1.541 2.527 0.565 1.962
Jameer Nelson 1.283 1.350 -0.822 1.346 2.105 1.906 4.043 1.261 2.782
Serge Ibaka 1.113 1.795 -0.966 1.795 2.079 2.539 1.471 -2.351 3.822
Raja Bell 1.615 1.095 -0.389 1.095 2.005 1.549 2.034 0.424 1.610
Wilson Chandler 1.234 1.192 -0.763 1.193 1.997 1.686 1.799 0.085 1.713
Fabricio Oberto -0.333 1.482 -2.317 1.477 1.983 2.092 0.554 -0.495 1.049
Rodney Stuckey 3.283 1.241 1.303 1.241 1.979 1.755 3.313 0.853 2.460
Udonis Haslem -0.055 1.261 -2.029 1.260 1.974 1.782 0.621 -0.752 1.373
Jason Richardson 3.990 0.935 2.019 0.935 1.971 1.322 4.561 0.144 4.417
Keith Bogans 2.872 1.092 0.935 1.091 1.937 1.544 1.298 0.125 1.173
Brandon Bass 0.457 1.252 -1.472 1.254 1.930 1.772 1.772 0.707 1.065
Keyon Dooling 2.468 1.115 0.547 1.113 1.921 1.576 2.551 0.589 1.962
Marvin Williams 1.903 1.128 0.001 1.128 1.902 1.595 1.605 -0.529 2.133
DeSagana Diop -1.113 1.487 -3.007 1.485 1.893 2.102 -1.537 -2.732 1.195
Michael Beasley 2.693 1.274 0.802 1.274 1.892 1.802 1.683 0.337 1.345
Brandon Jennings 3.456 1.755 1.566 1.751 1.890 2.479 3.825 0.639 3.186
Jason Collins -2.991 1.467 -4.877 1.467 1.886 2.075 -3.458 -1.528 -1.930
Allen Iverson 2.503 1.156 0.643 1.156 1.860 1.635 4.549 1.019 3.530
Corey Maggette 2.978 1.016 1.133 1.017 1.845 1.438 3.805 0.932 2.873
Jason Williams 2.184 1.329 0.348 1.327 1.836 1.879 3.352 1.040 2.312
James Jones 1.539 1.269 -0.235 1.269 1.773 1.795 2.622 0.911 1.711
Charlie Bell 2.395 1.165 0.632 1.164 1.763 1.647 1.802 0.720 1.082
Monta Ellis 2.722 1.071 0.960 1.074 1.762 1.517 2.808 -0.337 3.146
Louis Williams 2.844 1.467 1.092 1.466 1.752 2.074 4.862 2.234 2.628
James Posey 0.454 1.043 -1.290 1.041 1.744 1.474 1.526 -1.962 3.488
Al Harrington 2.090 0.964 0.352 0.965 1.738 1.364 3.313 -0.563 3.876
Marcus Thornton 1.523 1.586 -0.214 1.587 1.737 2.244 4.694 2.049 2.645
Mo Williams 4.231 1.043 2.500 1.043 1.730 1.475 4.634 1.261 3.373
Matt Barnes 2.686 1.002 0.963 1.003 1.723 1.418 1.990 -0.422 2.412
Mike Bibby 2.404 1.083 0.714 1.083 1.689 1.532 4.113 1.069 3.044
Samuel Dalembert 1.964 1.148 0.277 1.149 1.688 1.624 -0.394 -3.233 2.839
Matt Harpring 1.523 1.536 -0.133 1.534 1.655 2.171 1.470 1.376 0.094
Travis Outlaw 2.245 1.110 0.606 1.110 1.640 1.570 1.483 -0.028 1.511
Mike Miller 3.222 1.019 1.623 1.020 1.599 1.442 2.512 0.997 1.515
Boris Diaw 0.719 1.046 -0.868 1.045 1.587 1.479 1.521 0.141 1.379
Kirk Hinrich 1.219 1.028 -0.366 1.027 1.586 1.453 3.203 -0.352 3.555
Kevin Martin 6.501 1.015 4.928 1.016 1.573 1.436 5.792 1.553 4.239
Andray Blatche 0.470 1.119 -1.093 1.121 1.563 1.584 0.624 -1.378 2.002
Earl Watson 2.573 1.045 1.015 1.045 1.558 1.478 1.838 0.107 1.731
Carlos Delfino 0.550 1.194 -1.007 1.193 1.557 1.688 2.389 -0.627 3.016
Richard Jefferson 1.736 1.009 0.185 1.009 1.551 1.427 3.243 0.649 2.594
Kendrick Perkins -0.024 1.299 -1.566 1.298 1.542 1.836 -0.966 -2.686 1.720
Nate Robinson 2.508 1.131 0.977 1.135 1.531 1.602 4.700 1.264 3.436
Kris Humphries 1.925 1.318 0.410 1.316 1.516 1.862 1.516 -0.946 2.461
Carl Landry 3.805 1.216 2.328 1.218 1.477 1.721 3.381 0.414 2.966
Jermaine O’Neal -1.352 1.162 -2.792 1.163 1.440 1.643 0.741 -2.302 3.043
DeJuan Blair 2.180 1.721 0.750 1.719 1.430 2.433 1.653 -1.501 3.154
Sebastian Telfair 2.246 1.187 0.829 1.191 1.417 1.682 1.252 1.097 0.155
Courtney Lee 3.255 1.269 1.867 1.268 1.389 1.793 1.413 0.047 1.366
Al Jefferson 3.106 1.060 1.732 1.061 1.375 1.500 3.392 -2.134 5.527
Jose Calderon 4.544 1.188 3.195 1.187 1.349 1.679 6.735 0.710 6.025
Jason Smith -0.569 1.677 -1.913 1.682 1.344 2.375 -0.673 -0.660 -0.013
J.J. Redick 4.360 1.342 3.028 1.343 1.332 1.899 3.448 2.816 0.632
Marco Belinelli 2.673 1.429 1.344 1.431 1.329 2.022 2.119 1.916 0.203
Chris Quinn 4.255 1.691 2.934 1.690 1.321 2.390 3.716 1.865 1.851
Mikki Moore 1.201 1.209 -0.114 1.209 1.315 1.710 0.320 -1.331 1.652
Steve Blake 3.197 1.055 1.915 1.053 1.282 1.491 2.728 0.837 1.891
Goran Dragic 4.142 1.822 2.877 1.819 1.265 2.574 2.270 1.691 0.579
Francisco Garcia 2.120 1.122 0.862 1.121 1.258 1.586 1.902 -0.483 2.385
Arron Afflalo 1.331 1.249 0.088 1.248 1.243 1.766 1.923 0.905 1.018
Derrick Rose 3.658 1.381 2.422 1.380 1.236 1.952 5.163 2.728 2.435
Francisco Elson -1.720 1.497 -2.950 1.496 1.230 2.116 -2.097 -1.275 -0.823
Mehmet Okur 3.026 1.285 1.847 1.284 1.179 1.817 4.064 -1.038 5.102
Jason Maxiell 1.924 1.274 0.751 1.272 1.173 1.800 1.377 -0.698 2.075
Delonte West 0.674 1.134 -0.492 1.134 1.166 1.604 1.889 -0.459 2.348
Chris Wilcox 1.208 1.214 0.068 1.213 1.140 1.716 1.258 -0.603 1.861
Tyreke Evans -1.054 1.598 -2.191 1.598 1.138 2.260 2.906 0.165 2.741
Antonio Daniels 2.846 1.388 1.714 1.390 1.133 1.964 3.559 0.908 2.651
Sam Young 2.340 1.770 1.238 1.768 1.101 2.502 -0.177 1.065 -1.242
Luther Head 1.285 1.263 0.237 1.262 1.049 1.785 1.915 1.590 0.325
Glen Davis -0.889 1.377 -1.912 1.378 1.023 1.948 0.683 -1.292 1.974
Grant Hill 0.352 1.058 -0.660 1.056 1.012 1.495 1.874 -0.698 2.572
JaVale McGee 3.913 1.476 2.903 1.476 1.010 2.087 1.366 -1.880 3.246
Mardy Collins 0.538 1.696 -0.460 1.697 0.998 2.399 -1.592 0.377 -1.969
Dahntay Jones 0.703 1.194 -0.275 1.194 0.978 1.688 -0.231 0.658 -0.890
Nick Young 1.170 1.280 0.226 1.281 0.943 1.811 1.773 2.516 -0.743
Brook Lopez -0.065 1.394 -0.931 1.394 0.866 1.972 2.752 -1.227 3.980
Linas Kleiza 3.113 1.212 2.266 1.213 0.846 1.715 2.104 1.190 0.914
Chris Duhon 1.280 1.103 0.437 1.102 0.843 1.559 2.210 1.430 0.780
Mario Chalmers 1.290 1.468 0.458 1.467 0.832 2.076 1.850 -1.935 3.785
Jamaal Magloire -0.777 1.667 -1.591 1.664 0.814 2.355 -1.891 -1.222 -0.669
Josh Boone 1.610 1.466 0.829 1.462 0.781 2.071 1.133 -0.814 1.946
DeShawn Stevenson 2.503 1.226 1.725 1.227 0.779 1.735 1.786 0.781 1.005
Chris Douglas-Roberts -0.231 1.632 -0.914 1.630 0.683 2.307 0.141 1.485 -1.345
Juan Dixon 0.763 1.503 0.111 1.505 0.651 2.127 0.299 1.245 -0.947
O.J. Mayo 4.131 1.448 3.489 1.447 0.642 2.047 2.170 1.383 0.787
Randy Foye 2.699 1.100 2.079 1.102 0.620 1.557 2.557 1.599 0.958
Kwame Brown -0.899 1.337 -1.481 1.336 0.582 1.890 -0.286 -1.558 1.272
Andrea Bargnani 3.423 1.134 2.852 1.133 0.572 1.603 2.322 0.284 2.038
Antoine Wright 1.469 1.253 0.954 1.249 0.514 1.769 0.012 -0.180 0.192
Joey Graham 0.215 1.318 -0.275 1.314 0.490 1.861 0.223 1.274 -1.051
Tony Battie -3.947 1.600 -4.434 1.598 0.487 2.261 -1.135 -0.686 -0.448
Jannero Pargo 0.587 1.490 0.106 1.494 0.481 2.110 1.337 2.675 -1.337
Chris Kaman -1.621 1.163 -2.093 1.164 0.472 1.646 -0.366 -1.871 1.505
Darren Collison 3.299 1.557 2.870 1.559 0.429 2.203 2.599 1.898 0.701
Sasha Pavlovic -1.452 1.312 -1.872 1.309 0.419 1.853 -0.580 1.143 -1.723
Joe Smith 0.734 1.245 0.352 1.245 0.382 1.761 1.461 -0.256 1.716
Ryan Gomes 0.631 1.077 0.282 1.077 0.349 1.523 1.485 0.135 1.350
Marcin Gortat -0.607 1.614 -0.940 1.609 0.333 2.279 1.427 -1.631 3.058
Jarvis Hayes 1.931 1.229 1.634 1.230 0.296 1.739 1.270 0.447 0.824
Andres Nocioni 2.258 1.081 2.002 1.082 0.256 1.530 1.542 0.523 1.019
Charlie Villanueva 3.084 1.120 2.859 1.118 0.225 1.582 3.165 0.055 3.110
Roger Mason 1.604 1.165 1.412 1.165 0.192 1.648 2.774 2.234 0.541
Brevin Knight -2.196 1.511 -2.355 1.511 0.160 2.137 1.332 -2.291 3.624
Rasual Butler -0.599 1.080 -0.755 1.082 0.156 1.528 1.408 0.993 0.416
Mickael Pietrus -0.782 1.111 -0.881 1.112 0.099 1.572 1.629 -0.571 2.200
Marcus Williams 1.820 1.713 1.777 1.714 0.043 2.424 0.984 2.656 -1.672
Malik Allen 0.438 1.601 0.432 1.603 0.006 2.266 -1.119 0.852 -1.971
Spencer Hawes 1.183 1.202 1.215 1.203 -0.032 1.701 0.060 -0.384 0.444
Eddie House 2.167 1.246 2.229 1.250 -0.062 1.765 3.614 1.514 2.101
Brent Barry 3.672 1.681 3.766 1.681 -0.094 2.377 3.384 1.156 2.229
Maurice Evans 2.481 1.054 2.584 1.052 -0.103 1.489 2.320 1.048 1.273
Bobby Simmons 1.217 1.524 1.325 1.528 -0.107 2.158 1.227 -0.116 1.343
Darko Milicic -1.008 1.130 -0.818 1.128 -0.191 1.597 -1.403 -2.200 0.796
Matt Carroll 2.906 1.317 3.104 1.318 -0.198 1.863 1.472 1.586 -0.114
Kelenna Azubuike 2.245 1.332 2.459 1.334 -0.214 1.886 2.351 1.007 1.344
Hilton Armstrong -2.796 1.633 -2.573 1.638 -0.223 2.313 -2.299 -0.713 -1.586
Lou Williams 5.445 1.720 5.701 1.724 -0.256 2.436 4.359 1.411 2.949
Shawne Williams 1.937 1.609 2.195 1.610 -0.258 2.276 1.785 0.331 1.454
Jamaal Tinsley -1.659 1.588 -1.397 1.587 -0.262 2.245 1.710 -0.043 1.753
Larry Hughes 1.007 1.177 1.289 1.176 -0.282 1.664 1.117 -0.088 1.205
Troy Murphy 1.484 1.107 1.773 1.107 -0.289 1.565 3.502 -1.225 4.726
Devean George 0.402 1.509 0.742 1.507 -0.340 2.133 -0.240 -0.950 0.710
Willie Green 0.187 1.110 0.528 1.111 -0.342 1.570 1.177 2.062 -0.886
Brandon Rush 0.011 1.330 0.530 1.330 -0.519 1.881 -0.085 0.775 -0.860
Aaron Brooks 4.017 1.365 4.541 1.370 -0.524 1.934 4.500 2.787 1.713
Martell Webster 0.524 1.214 1.091 1.216 -0.567 1.718 1.547 0.920 0.627
Jonny Flynn 1.836 1.842 2.441 1.844 -0.605 2.606 -0.061 2.720 -2.781
Jarrett Jack 2.133 0.969 2.764 0.969 -0.631 1.370 2.651 0.757 1.894
Rodney Carney 0.270 1.291 1.003 1.291 -0.733 1.826 1.305 0.593 0.711
Wayne Ellington -0.783 1.816 -0.033 1.814 -0.749 2.567 0.165 2.942 -2.777
Shelden Williams -0.390 1.363 0.363 1.361 -0.754 1.927 -0.201 -1.576 1.376
Josh Childress 2.424 1.369 3.197 1.367 -0.773 1.935 2.464 0.450 2.015
John Salmons 0.338 0.957 1.214 0.956 -0.876 1.353 2.128 -0.122 2.250
Darrell Arthur -3.959 1.626 -3.070 1.624 -0.889 2.298 0.020 -1.890 1.910
Greg Buckner -0.457 1.681 0.486 1.671 -0.942 2.370 -1.400 -0.828 -0.572
Will Bynum 2.868 1.620 3.815 1.621 -0.947 2.292 2.893 1.766 1.127
Sasha Vujacic 4.080 1.336 5.038 1.338 -0.958 1.891 3.243 0.861 2.382
Jerryd Bayless 1.766 1.592 2.730 1.589 -0.964 2.250 2.350 2.255 0.096
Stephen Graham -1.390 1.663 -0.294 1.659 -1.096 2.349 -1.189 1.869 -3.058
Jerry Stackhouse 0.653 1.472 1.816 1.474 -1.163 2.083 2.105 1.857 0.248
Devin Brown 0.255 1.265 1.457 1.264 -1.202 1.789 1.531 0.806 0.725
Jason Kapono 0.854 1.173 2.077 1.174 -1.223 1.659 0.942 2.709 -1.767
Morris Peterson 0.074 1.373 1.316 1.374 -1.242 1.942 2.185 0.236 1.949
Michael Finley 3.401 1.274 4.649 1.271 -1.248 1.799 2.447 1.574 0.872
Al Thornton 1.604 1.092 2.888 1.094 -1.284 1.546 0.768 0.168 0.599
Erick Dampier -3.495 1.296 -2.175 1.296 -1.320 1.833 0.689 -1.908 2.597
Yi Jianlian 0.154 1.289 1.506 1.289 -1.352 1.823 -0.434 -0.708 0.274
Ryan Hollins -0.207 1.516 1.237 1.514 -1.444 2.142 -0.374 0.492 -0.866
Mike James 0.565 1.549 2.021 1.549 -1.456 2.191 1.637 1.759 -0.123
Joel Anthony -3.844 1.513 -2.360 1.511 -1.485 2.138 -1.599 -1.774 0.175
Carlos Arroyo -0.262 1.345 1.258 1.343 -1.520 1.901 2.312 1.755 0.557
Desmond Mason -0.909 1.324 0.627 1.326 -1.536 1.874 -1.374 0.518 -1.892
DeMar DeRozan -0.591 1.595 1.000 1.593 -1.591 2.254 1.273 0.573 0.700
Juwan Howard -0.696 1.332 0.951 1.328 -1.647 1.881 -0.287 0.978 -1.265
Nazr Mohammed -1.248 1.295 0.413 1.294 -1.660 1.831 1.310 -1.118 2.428
Anthony Johnson -0.883 1.369 0.823 1.367 -1.706 1.934 1.415 0.891 0.524
Tyronn Lue 0.650 1.748 2.413 1.745 -1.763 2.470 3.155 2.540 0.615
DeAndre Jordan 0.920 1.633 2.708 1.632 -1.787 2.309 0.123 -1.630 1.753
Mark Blount -1.540 1.736 0.261 1.736 -1.801 2.455 -0.410 0.173 -0.584
Jeff Green 0.944 1.218 2.817 1.218 -1.873 1.723 0.975 -0.759 1.734
Shannon Brown -1.251 1.504 0.662 1.505 -1.913 2.127 1.334 0.677 0.657
Sonny Weems 2.123 1.750 4.045 1.748 -1.922 2.474 -0.298 1.954 -2.252
Anthony Carter 0.369 1.378 2.433 1.375 -2.064 1.947 0.728 -1.179 1.907
Drew Gooden -0.566 1.131 1.825 1.130 -2.390 1.598 1.107 -1.061 2.168
Fred Jones -0.776 1.373 1.768 1.376 -2.543 1.944 1.281 -0.337 1.618
Marreese Speights 0.592 1.796 3.142 1.797 -2.550 2.541 2.682 -0.157 2.839
Sergio Rodriguez 0.961 1.509 3.528 1.509 -2.567 2.134 2.167 1.256 0.911
Bostjan Nachbar -0.734 1.718 1.915 1.718 -2.649 2.429 2.686 0.627 2.059
Solomon Jones -1.519 1.695 1.149 1.692 -2.668 2.395 -0.495 -0.231 -0.264
Trenton Hassell -0.191 1.394 2.723 1.393 -2.914 1.971 -0.984 1.076 -2.060
Smush Parker -1.341 1.838 1.695 1.842 -3.036 2.602 0.107 -0.250 0.357
Jose Juan Barea 0.833 1.505 4.029 1.500 -3.196 2.125 2.793 2.715 0.078
Royal Ivey 0.984 1.536 4.292 1.533 -3.308 2.170 0.278 0.763 -0.485
Damien Wilkins -0.618 1.163 2.760 1.163 -3.379 1.645 0.623 -0.017 0.640
Adam Morrison -0.079 1.625 3.342 1.628 -3.422 2.300 -0.323 3.043 -3.366
Ricky Davis -0.177 1.327 3.263 1.328 -3.440 1.877 1.638 1.493 0.146
Dominic McGuire -1.505 1.539 2.059 1.536 -3.565 2.174 -1.501 -1.001 -0.500
Wes Matthews -1.454 1.813 2.313 1.810 -3.768 2.561 2.837 0.396 2.441
Johan Petro -4.458 1.331 -0.446 1.328 -4.012 1.881 -1.764 -1.235 -0.530
Earl Boykins -0.396 1.323 4.037 1.324 -4.433 1.872 3.463 2.891 0.572
Brian Skinner -2.865 1.560 1.729 1.555 -4.594 2.203 -2.135 -1.821 -0.314
Hakim Warrick -1.782 1.034 3.215 1.033 -4.997 1.461 1.982 0.642 1.340
Josh Powell -2.804 1.501 4.116 1.496 -6.920 2.120 -1.421 1.807 -3.228
Eddy Curry -4.447 1.773 2.743 1.777 -7.190 2.510 0.289 1.794 -1.505
J.J. Hickson -3.677 1.508 4.297 1.505 -7.975 2.130 0.357 0.397 -0.040

*95% Confidence Intervals for each player’s “impact” can be calculated as 1.96 standard errors above and below our coefficient estimates.
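
As a quick illustration of that footnote, here is a minimal sketch of the interval calculation. The estimate and standard error below are illustrative inputs only; which table columns they would correspond to is an assumption on my part.

```python
def confidence_interval_95(estimate, std_error):
    """Return (lower, upper): the estimate plus or minus 1.96 standard errors."""
    margin = 1.96 * std_error
    return estimate - margin, estimate + margin


# Illustrative inputs only: an estimate of -0.591 with a standard error of 1.595.
low, high = confidence_interval_95(-0.591, 1.595)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

When the interval comfortably straddles zero, as it does for these inputs, we cannot say with much confidence whether the player's impact is positive or negative.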

2) Additionally one can attempt to form a prior belief about the value of each individual player based on his statistical contributions. We can easily measure the accumulation rate of points, rebounds, and assists over a long period of time for any given player and our measurements will quickly converge to the truth with little noise. If we can measure the value of an individual statistical contribution, it is then straightforward to find a much lower noise measurement of the part of a player’s value that is correlated with his statistical contribution. This is known as statistical +/- and can be derived via a secondary regression of the previously computed Adjusted +/- values onto the statistical contributions of all the players in our sample.

Explaining Offensive Adjusted +/-

Variable Impact on OAPM Standard Error t-statistic
Points/Poss .578 .047 12.35
Shots/Poss -.363 .061 -5.99
TO/Poss -1.3637 .163 -8.37
Ast/Poss .657 .048 13.65
Offensive Rebounding Rate .263 .044 5.97
3PA/Poss .230 .041 5.63
Constant -.047 .006 -8.2

RMSE=.01511
R^2=0.6322

Explaining Defensive Adjusted +/-

Statistical Contribution Impact on DAPM S.E. t-statistic
Defensive Rebounding Rate -0.1136786 0.0252884 -4.5
Block/Possessions -0.5996897 0.1475999 -4.06
Turnover/Possession 0.4969019 0.1447022 3.43
MPG -0.0000246 5.60E-06 -4.39
Steal/Possession -1.524569 0.2020461 -7.55
PF/Possession -0.224203 0.1035915 -2.16
Bail Out Rate -0.1696974 0.0470056 -3.61
Shots/Possession 0.2726532 0.0549351 4.96
Constant 0.0513824 0.0088427 5.81

RMSE=.01771
R^2=0.4305

*Bail out rate is the fraction of shots a player takes for his team in the last five seconds of the shot clock on offense. Interestingly it was not predictive of OAPM, but in combination with total shot attempts it seems to be predictive of DAPM. I think the intuition here is that the fraction of your shots that occur at the end of the shot clock is a good proxy for your athletic ability and in turn for your defense. Players who take many shots, but not at the end of the shot clock, most likely do so due to a limited ability to create separation from the defense, which correlates with low athletic ability and generally poor defense.

The value of APM that this regression predicts for each player based on the relevant statistics is known as Statistical Plus Minus (SPM) and is indicated for each player on the table above. Standard errors for SPM are not reported but they are generally very low (~.3 pts/100 possessions) because both the coefficients and the covariates in the above regressions tend to have very little noise. However, this does not mean that SPM is a nearly perfect measure of player value. It is merely a very precise measure of the part of player value that is correlated with statistical contributions.
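
To make the prediction step concrete, here is a rough sketch of how an offensive SPM value falls out of the offensive regression. The coefficients are copied from the table above; the player's stat line is entirely hypothetical, and I am assuming the prediction is in points per possession (which is what the size of the RMSE suggests).

```python
# Coefficients copied from the "Explaining Offensive Adjusted +/-" table.
OFFENSE_COEFS = {
    "points_per_poss": 0.578,
    "shots_per_poss": -0.363,
    "turnovers_per_poss": -1.3637,
    "assists_per_poss": 0.657,
    "off_rebound_rate": 0.263,
    "threes_att_per_poss": 0.230,
}
OFFENSE_CONSTANT = -0.047


def offensive_spm(stats):
    """Predicted OAPM: the constant plus each coefficient times the matching stat."""
    return OFFENSE_CONSTANT + sum(
        coef * stats[name] for name, coef in OFFENSE_COEFS.items()
    )


# Hypothetical stat line, not a real player.
hypothetical_player = {
    "points_per_poss": 0.25,
    "shots_per_poss": 0.20,
    "turnovers_per_poss": 0.03,
    "assists_per_poss": 0.05,
    "off_rebound_rate": 0.05,
    "threes_att_per_poss": 0.04,
}

spm = offensive_spm(hypothetical_player)
print(f"Predicted OAPM: {spm:.4f} per possession ({100 * spm:.2f} per 100)")
```

The defensive version would work the same way with the second table's coefficients.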

But where does this leave us? We have one noisy measure of a player’s complete ability (APM) and another much more precise measure of only the parts of a player’s ability that can be correlated with their statistical contribution.

As a hint as to where this is going I will point out that the RMSE on our Statistical Plus Minus regression is significantly larger than the standard errors in our Adjusted Plus Minus regression (especially for defense). The difference is precisely the non-statistical component of player value that drives us to look at APM in the first place. An economist might call it a player-specific fixed effect and a basketball nerd might call it an “intangible”, but it is clearly there and it is clearly important. Unfortunately our only measure of it is bundled up with the natural noise of Adjusted Plus/Minus.

*This post is mostly building blocks for future work and very little of it is wholly original. Steve Ilardi and Dan Rosenbaum (perhaps among others) have done similar analysis.

 

Small N Decisions and the Data Deluge

I was recently asked in a podcast, “what is the next generation in statistics?” My answer is it will all be about N.

In statistics, N is the sample size. The whole “advanced stats” debate in basketball and other sports seems focused not on N, but on which stats best quantify player value. This is important, don’t get me wrong, but it’s more important from a contracting perspective than for actively making coaching decisions. The reason is that nearly all of these advanced stats need a large sample size, large N, to function properly. The same is true in the paper Matt and I wrote, which he has discussed heavily on this blog. For line-ups that have fewer than, say, 4,000 possessions together, we are wary of making any strong claims about efficiency. Now show me the coach that can wait 4,000 possessions to make a decision and I’ll show you a guy who has a job at FIU.

So while the “what’s the best metric” debate rages, coaches and other insiders still feel safe in not giving too much play to stats, because they are not very useful in making decisions in the short or medium run. Take a very simple example. Suppose a hitter in baseball hit .300 last year. Over his first 100 official at bats this year he has 23 hits, so he’s batting .230. Now if he is truly a .300 hitter, he would have 23 or fewer hits about 8% of the time (this can be found in a binomial table with N=100, M=23, P=.3). Ok, 8% says it’s pretty unlikely, but if the player has a strong reputation, or is otherwise important to the team, maybe it’s worth keeping him in the lineup. Mind you, this is already about 1/6th of the way through the season and the guy is hitting a puny .230. Is he now “worse”? Should he be benched?
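
If you don't have a binomial table handy, the 8% figure can be checked in a few lines. This is just a sketch of the textbook binomial calculation, treating each at bat as an independent trial with a 30% chance of a hit (a simplifying assumption).

```python
from math import comb


def binomial_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# A true .300 hitter getting 23 or fewer hits in 100 at bats.
prob = binomial_cdf(23, 100, 0.3)
print(f"P(23 or fewer hits in 100 at bats) = {prob:.3f}")
```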

Well if you were the coach, how moved would you be by 8%? Perhaps a little. But probably you’d weigh it against what you see in his swing, what you see in practice, reports from the hitting coach, etc. If those subjective signals are positive, you keep him in the line-up. But these days you don’t have to make decisions that way. Baseball teams know exactly how hard the ball was hit and its trajectory. So the statement “he’s swinging the bat well, but has had bad luck” can be tested. If true, we should see more line-outs and deep fly balls caught than expected. The more detailed data “takes the noise” out of the raw statistic of batting avg. Given how much luck is associated with hits vs. outs (many ground balls, for instance, rely on “hitting holes”, etc.), this new data might give you the same confidence with 50 at bats that you previously had with 100.

In basketball the same can also be done. By looking at shooting zones, defense on shots, and various other metrics, coaches can isolate poor decisions quite quickly. These poor decisions might take a long time to “significantly” show up in performance, because of the noise associated with the randomness of shot success.

So next time you are thinking about your favorite stat, think about N and how new data sources can help reduce the number of trials you need to act. Being “right” about a player after the season’s over is little consolation when you are watching the playoffs on TNT.

Marginal Shots and Usage (Skill) Curves

If you have been reading some of our other posts (this is post 3 in our history :) ), you know that our entire analysis of optimal shot selection hinges on our ability to infer a particular player’s “marginal shot” in each period of the shot clock. This is not trivial. It is really easy to measure a player’s average efficiency, but players do not tell us which of their shots they just barely decided to shoot so that we can tell them if those were good or bad shots.

Our approach to this difficulty is to infer the value of these “marginal” scoring opportunities by comparing player performance at different points in the shot clock. The actual technique we use is called Maximum Likelihood and is quite complicated because it has to deal optimally with the limited amount of data we have. However, the intuition behind its performance is pretty straightforward.

 

Suppose, for example, we observe Kobe Bryant play for 1,000,000 years and we observe that with 10 seconds on the shot clock Kobe uses a possession exactly 4% of the time and his team scores exactly 1 point/possession when he does so. Then with 9 seconds remaining on the shot clock, Kobe gets more aggressive, shoots 5% of the time but is only able to average .98 points/possession.

We assume that with 10 seconds left on the clock, Kobe also had the option to perform as he did with 9 seconds on the shot clock, but is choosing not to. Is his choice optimal or should Kobe be more aggressive? A little calculation shows that the additional opportunities Kobe takes in period 9, but not in period 10, are worth $\frac{0.98 \times 0.05 - 1.00 \times 0.04}{0.05 - 0.04} = 0.9$ points on average. This is the value of Kobe’s marginal shot with 10 seconds on the shot clock. If the Lakers’ possessions are only worth an average of .85 at this point in time, the Lakers are better off with Kobe being more aggressive and slightly less efficient, and Kobe is undershooting in this period of the shot clock.
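
The same arithmetic as a small sketch, using the made-up Kobe numbers from this example. The function and variable names are mine, purely for illustration.

```python
def marginal_shot_value(usage_low, eff_low, usage_high, eff_high):
    """Average value of the extra possessions used when moving from the
    less aggressive (usage_low) to the more aggressive (usage_high) setting."""
    extra_points = usage_high * eff_high - usage_low * eff_low
    extra_usage = usage_high - usage_low
    return extra_points / extra_usage


# 4% usage at 1.00 points/possession vs. 5% usage at 0.98 points/possession.
marginal = marginal_shot_value(0.04, 1.00, 0.05, 0.98)

# Value of a Lakers possession if Kobe passes up the shot (from the example).
continuation_value = 0.85
is_undershooting = marginal > continuation_value

print(f"Marginal shot value: {marginal:.2f}")
print(f"Undershooting: {is_undershooting}")
```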

More generally, if we plot all the combinations of usage and efficiency which Kobe is capable of (in one period of the shot clock), we see a generally linear trend as Kobe is forced to accept lower and lower quality shots on the margin. We call the line we fit to these points Kobe’s usage curve, and we ask whether the Lakers would be better off if Kobe were to slide up or down that line.
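
For intuition only, here is a sketch of what fitting such a line might look like. To be clear, this uses plain least squares on invented data points, not the Maximum Likelihood procedure from the paper, and it assumes efficiency is exactly linear in usage. Under that assumption, if efficiency is a + b*u at usage u, then points produced are u*(a + b*u) and the marginal shot is worth a + 2*b*u.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented (usage, efficiency) combinations for one period of the shot clock.
usages = [0.03, 0.04, 0.05, 0.06]
efficiencies = [1.02, 1.00, 0.98, 0.96]
intercept, slope = fit_line(usages, efficiencies)


def marginal_value(usage):
    """Derivative of usage * (intercept + slope * usage) with respect to usage."""
    return intercept + 2 * slope * usage


print(f"Usage curve: efficiency = {intercept:.2f} + ({slope:.2f}) * usage")
print(f"Marginal shot value at 4.5% usage: {marginal_value(0.045):.2f}")
```

Note that the marginal value falls twice as fast as average efficiency along the line, which is why a player can look efficient on average while his last few shots are not worth taking.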

 

 

Suppose the Lakers could trick Kobe into thinking there was slightly less time on the shot clock than there really was. If this would make the Lakers a more efficient team, then Kobe would be an under-shooter. If the opposite is true, our model would find Kobe to be an over-shooter.

Finally it is worth noting that a very important assumption is buried in this post. This analysis only works if we believe that in shot clock period 10, Kobe is capable of playing the way he does in shot clock period 9. Basically, we assume the shot clock matters for what a player decides to do, but not for what he is capable of doing. If you believe the abilities of the offense and the defense fundamentally change across values of the shot clock, you should have some serious questions about our analysis.

Why Better Teams Lose

Matt once told me that “the biggest fallacy in sports is that the better team will win.”  This is an interesting statement, especially in light of the unexpected results from the early rounds of the NBA Playoffs this year.  Matt’s statement can be understood in two ways.

First, variance is king.  The idea that the winner of a 7-game playoff series is better than the loser either ignores or significantly discounts the role of variance in the outcome of that series.  The Lakers won it all last year, but do you remember how close the Finals were?  When a few inches could have changed the outcome, does it really make sense to say the Lakers were somehow better than the Celtics based on the outcome?  If you think the Lakers were actually better, would you feel differently if the ball had bounced the Celtics’ way and the Lakers had lost?

 

In any game and in any series the better team (the team that would win the majority of games if they played an infinite number of games) can very easily lose.  This is precisely why we hold players and teams that win repeatedly in such high esteem; bigger sample sizes make us more confident that they deserve the praise.  Still, variance is never given enough credit.  Fans and commentators are beset by hindsight biases, and as soon as an outcome is known these biases are deployed to explain exactly what happened and why it was actually the only possible outcome.  This is irrational thinking.
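
To put rough numbers on this, here is a sketch of how often the better team wins a best-of-7. It assumes every game is independent and the better team has the same chance of winning each one (so no home court, injuries, or adjustments), which is obviously a simplification.

```python
from math import comb


def series_win_probability(p_game, wins_needed=4):
    """Chance of winning a series (first to wins_needed) given a constant
    per-game win probability p_game and independent games."""
    q = 1 - p_game
    # Win the clinching game after losing exactly k games along the way.
    return sum(
        comb(wins_needed - 1 + k, k) * p_game**wins_needed * q**k
        for k in range(wins_needed)
    )


for p in (0.55, 0.60, 0.65):
    upset = 1 - series_win_probability(p)
    print(f"Per-game edge {p:.2f}: better team loses the series {upset:.1%} of the time")
```

Even a team that would win 55% of an infinite number of games loses a 7-game series close to four times in ten under these assumptions.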

The second way to look at Matt’s fallacy is actually more interesting.  Variance tells us that the better team won’t always win, but the truth is we don’t know who the better team actually is.  To explain this point, it will be helpful to use the playoffs as a vehicle.

Before the Spurs entered their first round matchup against the Grizzlies, who did we think was the better team?  The Spurs had the second best regular season record in the league at 61-21, and were the 1st seed in their conference.  The Grizzlies were the 8th seed with a record of 46-36.  We can reduce the role of variance slightly by looking at their average regular season point differentials (instead of just wins and losses, which are noisier): the Spurs’ average point differential was +5.7 and the Grizzlies’ was +2.3.  Yet the Grizzlies wound up beating the Spurs 4-2 in the first round of the playoffs, and even 2 wins for the Spurs was a stretch.

We can draw two conclusions from this information.  First, the Spurs were better than the Grizzlies at winning games against the rest of the NBA (we can say this with high confidence), and second, the Grizzlies were better than the Spurs at winning games against each other (we can say this with much lower confidence).  These results are seemingly incoherent.

This means that win-loss records and point differentials are imperfect exogenous indicators of who the better of two teams is.  The very nature of being “better” than another team is relative; it depends entirely on the comparison between the two teams.  Knowing how well each team played against the rest of the league tells us something, and is often a good indicator of who will win between two teams, but it’s not perfect, and nuance escapes it.

Look at the Lakers and the Mavericks for another example.  Those teams had identical records and similar average point differentials, yet the Mavericks swept the Lakers out of the playoffs.  If we stipulate that the Mavericks were actually the better team of the two (still not a given, considering the role of variance) what can explain the disparity?  Matchups.  Neither of the Lakers’ big men could guard Dirk Nowitzki and that turned out to be pivotal in the series.  This is the kind of information that doesn’t come out when simply looking at how each team fared against the rest of the league.

We could look at how two teams played against each other during the regular season, but the sample size is just too small to say anything meaningful.  The 4 games two conference rivals play during the regular season do not give us enough data to say meaningful things about which team is superior, and we have even less data on individual player matchups.  After factoring in other variables like injuries, fatigue, and varying degrees of effort we aren’t left with much information to make meaningful predictions about a playoff matchup.

As Justin has pointed out, we need better ways to manipulate smaller data sets; until we can draw meaningful conclusions from fewer observations we are handicapped in predictive analysis.  That doesn’t mean we won’t get there, and this is certainly something we’ll be working on in the future.

A Collective Bargaining Puzzle

With the NFL in the midst of a labor standoff and a similar dispute looming in the NBA, an interesting question recently occurred to us: why don’t the Owners offer to double the League minimum salary in return for a hard cap on max contracts at a substantial discount to what the highest paid players are currently making? Since there are only a few players earning huge contracts and the vast majority of players earn the minimum salary or slightly more, the Owners could easily set the new minimum and maximum figures in a way that would save them a lot of money and would be beneficial to the majority of the players.

For example, consider a hypothetical league with 100 players. 60 players are making $10, and 40 players are making $100. The players are making $4,600 collectively. If an agreement was reached where the minimum salary was doubled but there was a hard max salary cap at $50, you’d have 60 players making $20 and 40 players making $50. The players are now making $3,200 collectively. The owners are saving $1,400 (30%) and the majority of players (the 60 low earners) just got their salaries doubled. The owners would all vote for this plan, as would the 60 low-earning players. So as long as the players’ union has a majority voting rule we would expect to see this bargain agreed upon. You could take this to the extreme and have the new agreement stipulate a single $11 contract price for all players. The majority of players would still be better off and the Owners would save even more money.
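
Here is the same toy league worked through as a sketch, using only the numbers from the example above.

```python
def total_payroll(salary_groups):
    """Sum payroll over (number_of_players, salary) groups."""
    return sum(count * salary for count, salary in salary_groups)


current = [(60, 10), (40, 100)]   # 60 players at $10, 40 players at $100
proposed = [(60, 20), (40, 50)]   # minimum doubled, hard max of $50
flat = [(100, 11)]                # the extreme case: everyone gets $11

savings = total_payroll(current) - total_payroll(proposed)
savings_share = savings / total_payroll(current)

print(f"Current payroll:  ${total_payroll(current):,}")
print(f"Proposed payroll: ${total_payroll(proposed):,}")
print(f"Owner savings:    ${savings:,} ({savings_share:.0%})")
print(f"Flat $11 payroll: ${total_payroll(flat):,}")
```

In both alternative deals the 60 low earners come out ahead, which is all a simple majority vote requires.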

So why don’t we see an agreement like this coming forth?

This situation is hardly unique to professional sports, and we see similar dynamics in a variety of union negotiation settings. Teachers unions could presumably also get better deals for the majority of their members by sacrificing the interests of a few high earners. The difference between a teachers union and the NFL Players Union, however, is that in a teachers union pay is determined by seniority. So while the majority of union members may be low earners, they may want to preserve the salaries of the high earners since they hope to eventually have those salaries themselves.

But this logic doesn’t work in the NFL. One might argue that the low-earning players want to preserve the maximum contracts possible in the hopes that they will one day be the stars earning those big contracts. But considering the average duration of an NFL career (around 3.6 years), the mode NFL career (1 year), and the mode NFL contract (1 year), saying that the majority of these low earners are expecting to make it to the top of the pile someday takes a pretty dim view of their intelligence and of the intelligence of those advising them (their agents). Add on to this the variation of contract value by position (the highest earners are all quarterbacks) and you’re just not going to convince me that an undrafted rookie center who was signed for 1 year at the league minimum wouldn’t prefer to have his salary doubled for that one year over preserving the maximum earning potential of the top 1% of players in the league, none of whom even play his position. Something else must be going on.

Our Answer

There are 2 answers to this puzzle, both lying in the internal dynamics of the players union.

First, while the union probably doesn’t employ a system of weighted voting rights, it does have a clear incentive to keep its high-earning members happy. If the high-earning players become sufficiently dissatisfied they may well opt out of the collective bargaining agreement. Since high-earning players drive revenue, the Owners could then scrap the CBA with the union and sign the star players along with a host of replacement players. The overall quality of the League would diminish some, but would its revenues? Maybe not. Alternatively, the high-earning players could take their talents elsewhere: would we see a mass exodus of NFL stars to the Canadian Football League? Probably not, but the point remains that the union derives much of its bargaining power from its inclusion of the star players that drive League revenue. The union is strongly incentivized to keep its star players happy, which probably cashes out in de facto weighted voting rights where the median earning members are compelled to vote in line with the high earners.

Second, high earners likely comprise a disproportionate share of the union leadership, as compared to the distribution of salaries in the League. With an average NFL career of 3.6 years and a mode career of 1 year, the union is a very amorphous body. In such a body, the leadership positions are going to be held by members that have been around the longest and that are likely to be around in the future. So we’re not going to see a lot of minimum contract players rising to leadership roles within the union, simply because they aren’t going to be around long enough. Since high earners (especially quarterbacks) will have the longest careers, it should not surprise us that we see high earners occupying leadership roles within the union. Indeed, in the “class-action” suit recently brought by ten NFL players, 3 were high-earning quarterbacks (Peyton Manning, Drew Brees, and Tom Brady), and 6 out of the 10 players who brought the suit are high-earning household names (the other 3 notables were Vincent Jackson, Osi Umenyiora, and Mike Vrabel).

Conclusion

We have argued that high-earning star players have outsized influence in the collective bargaining process compared to the vast majority of the players in the NFL, who are comparatively low earners. This is meaningful in at least one respect. The players and their representatives want fans to blame the Owners for the Lockout. They say the Owners are greedy and self-interested, and that they are trying to rip off the players, and I’m not here to dispute those claims. But the above analysis suggests that the players, in particular the high earners, are acting with the same greed and self-interest they are decrying. The high earners want to vilify the Owners for an unfair distribution of revenue between the owners and the players, yet these same high earners won’t accept a more equitable distribution of salaries within the ranks of the players.

The players could, after all, accept a reduction in maximum contracts in exchange for better salaries for low earners and better league-wide benefits like health insurance and retirement funds. The Owners would save money, the majority of players would make more money, and the only players who would lose money would be the already mega-millionaires. Most importantly, we’d have an NFL season. But the high earners won’t make that bargain, and in refusing to do so are just as much responsible for the low salaries and poor benefits afforded low-earning NFL players as the Owners are. If we don’t have a 2011 NFL season the high-earning players will be just as responsible as the Owners are for that as well.

Parity in the NBA

In 2007, Kevin Garnett, Paul Pierce, and Ray Allen teamed up in Boston and Pau Gasol arrived in L.A. where he joined Kobe Bryant.  In the three years since that blockbuster offseason, the Celtics and the Lakers have combined for all three NBA Championships, and only one other team has even appeared in the finals (the Magic in 2009).  While 3 seasons is a pretty small sample size, I don’t think anyone will dispute that when the best players in the league get together on the same teams, those teams are likely to be successful.  Then came “the Decision,” thrusting this question to the forefront of NBA discourse: is this really good for the League?

A common criticism is that when star players cluster together on a few major market teams it weakens competition, which is bad for the NBA.  To evaluate this criticism empirically, you need some sense of how competitive the League is today and how the current level of parity in the league stacks up historically.  This is pretty easy to look at.  We took the win/loss records of all the teams for the past 40 NBA seasons and simply looked at the standard deviation of the number of games won during each season.  This is basically just the typical distance of a team’s win total from the mean, which is 41 games.  The higher the standard deviation in a given season, the more uneven the teams were in that season, or, equivalently, a higher standard deviation corresponds with lower League parity.  Here are our results…
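
As a sketch of the measure, here it is applied to two invented six-team leagues, both averaging 41 wins. The win totals are made up, and using the population standard deviation (rather than the sample version) is my assumption here.

```python
from statistics import mean, pstdev

# Invented win totals; both leagues average 41 wins per team.
balanced_league = [41, 43, 39, 45, 37, 41]
top_heavy_league = [65, 60, 41, 41, 22, 17]

for name, wins in (("Balanced", balanced_league), ("Top-heavy", top_heavy_league)):
    print(f"{name}: mean wins = {mean(wins):.0f}, std dev = {pstdev(wins):.2f}")
```

The top-heavy league has the much larger standard deviation, which in our terms means lower parity.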

 

So what does this graph tell us?  For starters, everyone’s intuition about the past few seasons seems right: parity has declined as talent has become more unevenly dispersed in the post KG/Pau NBA.  We might also observe a general decrease in parity since unrestricted free agency emerged in 1983, though this impact seems slight and is likely mitigated by the salary cap that came around at the same time.  Finally, while blockbuster free agency signings or trades are obviously not necessary conditions for low parity in the league, it is pretty apparent that dominant players playing together on dominant teams always drive down parity (as we would expect), from Larry Bird’s Celtics to Magic Johnson’s Lakers to Michael Jordan’s Bulls…

Of course this raises the question: is parity even good for the League?  The NBA exploded in popularity during the Magic/Bird rivalry and MJ’s decade of dominance, and these were the years during which parity was at its all-time low.  Check out the NBA Finals TV ratings for the past 23 years.

So what can we take away from this?  The best players playing together on dominant teams are definitely bad for parity, but they’re also great for ratings.  This doesn’t necessarily mean that parity is bad for the league, as there are also costs associated with disparity.  For all the renewed interest in the Miami Heat and this year’s Eastern Conference Playoffs, there are probably fewer NBA fans in Cleveland this year than last year.  What level of parity is most efficient for the league is an empirical question, and the presence of star players is a hugely confounding factor since stars make the game more fun to watch and draw in fans while simultaneously decreasing parity in the league with their dominance. Still, if we overlay the above two graphs, comparing levels of parity with NBA finals ratings over time, we can at least take a first crack at the question.

The positive slope of this graph is very statistically significant, with a p-value of .001 and a 95% confidence interval of .85 to 3.9. This means it is very unlikely (roughly a 1-in-1,000 chance) that we would observe a relationship this strong by random chance alone if there were no underlying relationship. We can say pretty unequivocally that, over the last 25 years, the NBA finals have been better TV in the years when the NBA has had less parity (higher standard deviation of win totals). This is not necessarily a statement of causation (as lots of important things have changed over those 25 years that we might not be accounting for), but it is certainly good reason to question the dogma of parity.

What I think we can safely say is that perfect parity is not the most efficient level of competition for the league.  While we don’t know what the ideal level of parity is for the league, we know we are not there now, and we can be pretty sure that the trend of star players clustering in major market teams is, in fact, moving the NBA in the right financial direction.

Sorry Chuck.

Allocative and Dynamic Efficiency in NBA Decision Making

Part 1: Half-Court Offense, An Optimal Stopping Problem.

Welcome to Hoop Theory!  For those of you who are familiar with the MIT Sloan Sports Analytics Conference, I am very excited to announce that I have been given the opportunity to present some of my joint research with Justin Rao on Allocative and Dynamic Efficiency In NBA Decision Making at their prestigious venue.   I hope that this blog can be a convenient place to discuss the intuition behind the paper and to have a broader conversation about the application of game theoretic models to our understanding of sports.

We are often critical of players who attempt difficult shots early in the shot clock.   That player’s team may have had many more opportunities to create a high value shot, and settling for such a poor opportunity may have been an especially bad choice.   Conversely, as the shot clock ticks toward zero we become progressively more permissive of players settling for difficult shots.   Intuitively, we know that every second on the shot clock that passes is one less second left with which our team can try to create another scoring opportunity if we pass this one up.  The value of not shooting – of continuing the possession – is steadily declining with every second that we do not shoot.

 

If our team’s players are responding to this dynamic optimally, they should be settling for shots more frequently and for progressively lower value opportunities with each passing second.   It should come as no surprise that NBA players behave accordingly.  The blue line on the graph above indicates how the value of holding onto the ball decreases over the shot clock.  Players should shoot when they have opportunities that are more valuable than holding onto the ball.  The values of possessions used at any particular period of the shot clock are expressed by the green line.

Finally, the orange line indicates the hazard rate with which NBA teams attempt to use a possession in any given second.  A hazard rate expresses a conditional probability that an event will happen at a given instant.   Here, the hazard rate expresses the probability that a basketball team that has the ball with t seconds left on the shot clock will use the possession (shoot) before the shot clock gets to t-1. Since you always use the possession with zero seconds on the shot clock (by a shot clock violation, if nothing else) we should not be surprised to see the hazard rate go to 1 on the left end of the graph.
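
For readers who have not worked with hazard rates, here is a sketch of how one could be computed from possession counts. The counts below are invented and cover only the last four seconds of a toy shot clock; the function name is mine.

```python
def hazard_rates(used_by_second):
    """Map seconds remaining -> share of still-alive possessions used in that second.

    used_by_second maps seconds remaining on the shot clock to the number of
    possessions that were used (shot, turnover, violation) at that point.
    """
    still_alive = sum(used_by_second.values())
    rates = {}
    # Walk the clock down from the most time remaining to zero.
    for t in sorted(used_by_second, reverse=True):
        if still_alive > 0:
            rates[t] = used_by_second[t] / still_alive
        still_alive -= used_by_second[t]
    return rates


# Invented counts: 1,000 possessions that reach 3 seconds on the shot clock.
hypothetical_counts = {3: 100, 2: 150, 1: 250, 0: 500}
rates = hazard_rates(hypothetical_counts)

for t in sorted(rates, reverse=True):
    print(f"{t} seconds left: hazard rate = {rates[t]:.3f}")
```

Every possession that survives to zero seconds gets used there, so the hazard rate at zero is 1 no matter what the earlier counts are.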

Based on this graph, it is clear that the time remaining on the shot clock is an important determinant of how offensive players should and do behave.   In our paper we model player i as being willing to use a possession with t seconds remaining on the shot clock only if he realizes an opportunity of greater value than some fixed cut-off threshold (c_{i,t}).   We posit two intuitive conditions that a player’s chosen cut-off level must satisfy to be optimal for his team.



Dynamic Efficiency:
A player should shoot if and only if he realizes a scoring opportunity of larger value than the continuation value of the possession.   That is, c_{i,t} should be exactly equal to the continuation value of the possession for player i’s team.


Allocative Efficiency:
If two players, i and j, share a court together, player i should not be passing up shots that are better than some of the ones that player j takes. More formally, for all shot clock periods t, c_{i,t}=c_{j,t}.
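
The two conditions can be sketched as simple checks. Everything here is hypothetical: the cut-off values, the continuation value, and especially the tolerance, which I have added only because estimated cut-offs will never be exactly equal. The paper's actual tests are statistical and are not reproduced here.

```python
def is_dynamically_efficient(cutoff, continuation_value, tolerance=0.01):
    """Dynamic efficiency: the player's cut-off matches the continuation value."""
    return abs(cutoff - continuation_value) <= tolerance


def is_allocatively_efficient(cutoffs, tolerance=0.01):
    """Allocative efficiency: all players on the floor share the same cut-off."""
    return max(cutoffs) - min(cutoffs) <= tolerance


# Hypothetical cut-offs c_{i,t} for five teammates in one shot clock period.
cutoffs_at_t = {"A": 0.90, "B": 0.90, "C": 0.85, "D": 0.90, "E": 0.90}
continuation_value_at_t = 0.90

for player, cutoff in cutoffs_at_t.items():
    ok = is_dynamically_efficient(cutoff, continuation_value_at_t)
    print(f"Player {player}: dynamically efficient = {ok}")

print("Allocatively efficient:",
      is_allocatively_efficient(list(cutoffs_at_t.values())))
```

In this toy line-up, player C accepts shots worth less than both the continuation value and his teammates' worst shots, so he fails the first check and the team fails the second.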

It should come as no surprise that the conditions on optimal shot selection are conditions on the worst shot a player is willing to take.   When a player has a very good scoring opportunity (say, he is wide open), he has an easy and uninteresting decision to make.  Just as in economics, it is his choices on the more difficult (marginal) shots that are interesting.   Unfortunately there is no simple way to determine the value of the worst shot that a player takes in a given period of the shot clock.   We can only observe each player’s average efficiency over all possessions used, and we really have no idea which attempts may have been wide open jumpers and which may have been closely contested by an elite defender.

The main innovation of our research is that we present a novel scheme for identifying and estimating the value of the worst shot a player takes in each period of the shot clock.   In doing so we must address the fundamental issue of how basketball players trade off between usage and efficiency….

Stay tuned for Part 2.