The Hard Problem of Basketball

Basketball is hard. Figuring out exactly how to translate the total results of a game into explanations of individual performance is even harder. Let's identify how.

Dec 22, 2024

The more complex a system is, the more difficult it is to translate the sum of the individual components into the experience of the entire system. Cognitive psychologists will often distinguish between the “easy problem” - explaining something such as executive function based on the neural signals in the frontal lobe - and the “hard problem” - taking the sum of all neural signals in the brain and trying to translate that into conscious experience. The idea is that specific tasks and functions are actually measurable and therefore explainable while the total experience is still almost entirely a mystery.

Basketball is on some level infinitely complex, since it is composed of 13 players per team a night coming in at varying orders with varying inputs from coaching staffs. This is, at minimum, the combination of 30 people’s distinct actions coming together to produce one end result: the final score. However, the more granular we get, the easier it is to explain why a specific action went the way it did. One-on-one basketball, for example, can mostly be explained in simple terms. While NBA basketball is never truly one-on-one, many actions bear a strong resemblance to it. Consider a high post-up resulting in a midrange shot 10 feet from the basket. It is quite easy to talk about the entry pass, how physical the defender was on the catch, the players’ footwork as they jockey for position, how much separation the offensive player was able to create, and the mechanics of the shot itself in order to explain the outcome (a made or missed shot). Of course, all of these components have dozens if not hundreds or thousands of micro-components, but we can mostly understand those as sums resulting in the final outcome: a made or missed shot.

However, even this very simple action becomes a little more complicated in five-on-five. Unless both teams commit to allowing the one-on-one action, the defender must worry about a cutter or a kick-out as additional avenues to cover. Perhaps the defensive player can be more physical and aggressive because they know an elite rim protector is standing behind them. Maybe the offensive player is only interested in getting past the defender off the dribble to draw help and create an opening for teammates. The point is that the considerations grow nearly infinitely. The action itself, up until the point when the ball leaves the offensive player’s hands, can probably still be explained in very simple terms according to the skills and actions of the two specific people involved, but the different permutations of how the next 10 seconds play out create outlets for impact from the other 8 players on the court, even if none of it is recorded in the box score, tracking stats, or anywhere else.

Of course, we cannot just throw up our hands, say it is impossible to isolate individual impact, and use only team stats. We couldn’t explain the post-up action without understanding what the cutters were capable of or how effective the rim protector was. Player movement dictates that GMs and hobbyists must have some way to understand player impact separate from context. And quite frankly, westerners tend to think of themselves as individuals and as such find it easier to relate to and cheer for specific players than for teams. These needs have led to the creation of composite statistics, usually referred to as advanced statistics. Composite statistics usually aim to take a wide array of different measurable outcomes (“basic statistics” such as points, assists, missed shots, etc.) as inputs and output a single number that evaluates a player’s impact. The important takeaway is that these numbers are still ultimately a human evaluation. They are “objective” only in the sense that they are consistent. They are, however, at their core, mere numerical representations of the evaluation of the designer. That is to say, the decision to weigh any given component of the composite heavier or lighter is subject to whatever biases and misconceptions the designer of the statistic brings to the table. To illustrate this point, we’ll use Box Plus-Minus (BPM).

I won’t go too deep explaining the mechanics of BPM - there are plenty of resources online written by people far smarter than I breaking down exactly how it makes calculations and the math used. The gist is that BPM takes plus-minus data for each player and modifies it up or down based on a player’s box score stats. In general, volume is weighted more than efficiency (to account for pressure on defenses), stats are given more weight if a player absorbs a higher percentage of them than their teammates, and if those stats come “out of position.” The first statistical modifier essentially means a player averaging 8 assists per game on a team averaging 23 assists per game will have greater weight put on their assist stat than a player averaging 8 assists on a team averaging 29 assists per game. The second modifier means that a guard recording a block is given more weight than a big recording a block.
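To make the first modifier concrete, here is a minimal sketch of the share-of-team idea using the assist numbers above. This is not the actual BPM formula, and the helper name `share_of_team` is made up for illustration. All it shows is that the same 8 assists make up a larger slice of a 23-assist team than of a 29-assist team, which is the kind of difference the modifier is described as rewarding.

```python
def share_of_team(player_stat: float, team_stat: float) -> float:
    """Fraction of a team's total for one stat that a single player accounts for.

    Hypothetical helper for illustration only; BPM's real weighting is more involved.
    """
    if team_stat <= 0:
        raise ValueError("team_stat must be positive")
    return player_stat / team_stat


# Same 8 assists per game in two different team contexts (numbers from the text).
low_assist_team = share_of_team(8, 23)
high_assist_team = share_of_team(8, 29)

print(f"{low_assist_team:.1%} of team assists on the 23-assist team")
print(f"{high_assist_team:.1%} of team assists on the 29-assist team")
```

The first player accounts for roughly 35% of their team's assists and the second for roughly 28%, even though their raw totals are identical.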

I’ll start with the very easy claim: this tends to favor players on worse teams. Theoretically, starting with plus-minus as a baseline should account for this, but there is some indication that high BPMs don’t translate to team success. Despite the common adage that the best player in a series wins more often than not, only two of the top-10 BPM seasons in history resulted in titles, and only two of the remaining eight even made it out of the second round. The major caveat here is that four of the top-10 (with a likely fifth after this season) and three of the top-4 are owned by Nikola Jokic, and his Nuggets have not been winning much during that time. Of course, this doesn’t negate the broader point that BPM is inflated by weaker teams, but it does diminish the sample size we are working with. It also raises a separate question that is central to this essay: I do not believe Nikola Jokic is the greatest player in history, and neither does almost anyone else, so why is the BPM statistic so certain that he is?

More interesting data can be drawn from the most recent playoffs. In the 15 series played, 11 had the individual player with the highest BPM on the team with the lower net rating. This is the highest number of such series in the last ten years, and the overall trend is up: 2015-17 saw 4, 3, and 5 such match-ups while 2022-24 saw 10, 6, and 11, with the intervening years rising steadily. It’s also worth noting that BPM was initially created in 2014, meaning that since its inception the correlation between an individually high BPM and a successful team has gradually been decreasing.
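For anyone who wants to check the trend, a quick sketch of the arithmetic is below. It uses only the six counts quoted above, assigned to years in the order given. The intervening years are left out because their numbers are not listed here, and the variable names are mine.

```python
from statistics import mean

# Series per postseason in which the highest-BPM player was on the team
# with the lower net rating. Only the counts quoted in the text are used;
# the intervening years are omitted rather than guessed.
early_window = {2015: 4, 2016: 3, 2017: 5}
late_window = {2022: 10, 2023: 6, 2024: 11}

early_avg = mean(early_window.values())
late_avg = mean(late_window.values())
share_2024 = late_window[2024] / 15  # 15 series in the most recent playoffs

print(f"2015-17 average: {early_avg:.1f} series per year")
print(f"2022-24 average: {late_avg:.1f} series per year")
print(f"Share of 2024 series: {share_2024:.0%}")
```

On these six data points the average goes from 4 to 9 such series per postseason, and the most recent year comes out to about 73% of all series. Six points is a small sample, which is worth keeping in mind when reading the rest of the argument.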

We can only speculate and check our work when trying to appreciate this trend and explain causality. One explanation we could take (and the one I am most partial to) is “what gets measured is what gets done.” Since its creation, the league leader in BPM has been MVP 8 out of 10 times. It is quite possible that players have shifted their styles to fit better into BPM and other composite statistics, either in pursuit of individual accolades or because misguided coaches believe maximizing these statistics to be the best way to win games.

Here we bump into a major contradiction. BPM rewards higher usage (holding onto the ball more and taking more shots at the expense of efficiency) and punishes players for the success of their teammates. “Sacrifice” - a trait that is nearly universally agreed upon as a massive benefit towards actually winning - is actively discouraged on a team trying to maximize BPM. Compounding this, BPM does a very poor job assessing defense. If a player were dedicated to maximizing it, their incentives would be to slack off on defense, make that a role player job, and play heliocentric offense. If a team were treating BPM as an accurate reflection of output, they would systematically undervalue defense, particularly in stars, and we’d see a similar effect.

Reasonable people could come to different conclusions. Alternative explanations are that the last ten years are a small sample and the correlation is basically random, or that there is some sort of alternate mechanism that sorts great players onto bad teams (the draft could be one version of this mechanism, though I find this unpersuasive given that players like Jokic were very late draft picks). As previously stated, this is ultimately an exercise in plausibility: we likely cannot refute most explanations, so it is about picking what sounds right while allowing for the possibility that we are wrong. To me, the one that rings most true is my initial proposal - what gets measured is what gets done, whether by executives, coaches, or even players themselves.

This puts us in a weird place. To recap - we have a stat that tends to overweight individual output, especially when it comes at the expense of team success. Because people tend to value their individual success more highly than winning, we see some combination of players and teams prioritize this stat, again hurting their teams in the process. We have not only made a slight error in observation (the stat itself), but this error seems to drive action, further warping our ability to use the metric descriptively.

Up to this point I’ve built the case on the proposition that the BPM stat makes errors in its assessment of impact. This feels fairly irrefutable: it is unlikely players suddenly become worse when they play for better teams. But perhaps we could reframe this statistic as the proportion of on-court actions a player is carrying. This is similar to the original intent, but it makes the stat less transplantable. That is to say, we cannot necessarily use cross-team BPM comparisons to compare players to each other; the stat instead appears to be a slightly more complex variation of usage percentage or perhaps on-off splits. I’ll now pivot to identifying the biggest aspect I think composite statistics miss: roster and rotation construction.

Composite statistics tend to focus only on things players do on the court and ignore how this affects macro strategy. To illustrate this, let’s look at the “out of position” modifier for stats. This modifier makes a lot of sense if you treat the other four players on the court as randomly selected within their position. Having a rim protector is important, but we generally assume the Center will be at least somewhat competent in this regard and the Power Forward may be able to supplement it. If we assume this, a guard who can block shots will probably increase the odds any given shot gets blocked by more than increasing the competence at the correct position would. This logic is coherent, but it does not reflect how teams in general, or even in-game rotations, are constructed. If a team has, for example, a Derrick White, they may be more willing to make different trade-offs with their Center, perhaps sacrificing some degree of rim running for other offensive skills such as 3-point shooting. Of course, this rightfully should be seen as a strength for these players: Derrick White’s shot-blocking ability gives more options to a team and allows them to play a greater variety of styles. However, this may become overstated depending on how the front office or coaching staff decide to embrace this skill.

Let’s switch over to Mr. BPM, Nikola Jokic. I’ll be very clear to start that Nikola Jokic is a fantastic player and among the top of the league, but I don’t think he is the best player of all time - something BPM is absolutely certain of. He’s also an excellent example of “out of position” stats; he is perhaps the best passer in the league right now despite being a Center, traditionally a non-passing position. His front office has largely tried to construct a team around him that maximizes players who are effective receiving passes out of the high post (think Aaron Gordon slashing to the hoop or Jamal Murray coming off a handoff into the paint). However, Jokic has notable weaknesses under the basket on the defensive end that the team also looks to compensate for. This places some limitations on the players they’re able to get, but they have mostly found a formula for maximizing the impact of the titanic offensive weapon that is Nikola Jokic. We see the weakness of this approach reflected in his on/off splits. Since the basic requirement to be in the rotation is to play effectively off of Jokic (and I want to emphasize, this is not a dumb strategy or a conspiracy to inflate his numbers, it’s a reasonable approach to a unique superstar that happens to show up in the numbers), those role players are naturally less effective when the team cannot replicate what Jokic does.

Here we have the double-edged sword of a unique superstar: maximizing their specific traits often means making sacrifices without them. Because there is no Center in the world, perhaps in history, quite like Jokic (not even one with a similar skillset scaled back in terms of talent), it is nearly impossible to have a backup that will slot in effectively with the role players. The last two seasons, their backup was DeAndre Jordan, a prototypical rim runner who almost makes more sense to imagine catching lobs from Jokic than filling in for him during rest. Naturally, this results in a massive drop-off in team performance when the team’s linchpin is off the court: you simply can’t recreate Jokic, and so all the players who have been selected specifically to maximize him see an immense drop-off without him.

We could also ask why a team cannot simply get players who both complement Jokic and function without him, but the answer is that the more flexible a player’s skillset, the rarer they are and the harder they are to acquire. Consider that the Nuggets gave max contracts to Jamal Murray and Michael Porter Jr., neither of whom is a max-quality player. Part of this is the small market tax - you end up having to pay more to bring guys to a less desirable city - but part of it is also that the type of players they want to surround Jokic with are not readily available, and they have to hang on tight to keep them. Celtics fans may find this ridiculous (simply acquire all the good players and win), but this is how most teams operate. Making this puzzle trickier is that Jokic’s main defect - his rim protection - creates weird scenarios for the team where they must choose between having another rim protector (who can help rebound as well, but may sacrifice perimeter defense) and increased offensive output.

Compare this to a player like Jayson Tatum, roundly criticized for not doing anything special on the court and simply being good at a lot of different things other guys do. It’s often said he’s not the best in the world at any single thing, and that throws people off. However, his gift to the team is being pretty good at everything and so being extremely flexible. Tatum plus four bench players has been one of the best lineups in the world for the last 5ish seasons. Part of this is Tatum’s ability to carry, but part of it is that there are almost no combinations of players that he cannot maximize by simply shifting his style to meet the needs of that particular lineup. This also lets the front office throw caution to the wind and just acquire talented players without worrying tons about fit. Of course, this largely does not get represented effectively in BPM. Because his style shifts to accommodate different lineups, you won’t see huge modifiers for “other players don’t generate this stat,” and you won’t see tons of “out of position” stats because he is normally adapting his game to his assigned role.

Let’s walk through a few more examples of where we see this benefit. In the 2023-24 regular season, Tatum averaged roughly 9.1 drives per game, scoring 7.9 points and passing out of them 2.3 times. In the playoffs, teams were more focused on taking away his 3 (going from 5.4 open and wide open attempts per game down to 4.5) and his drives spiked to 13.4, scoring 8.8 points and passing out of 4.6 of them. In the Finals, this skyrocketed to 19.2 drives, scoring 9 points on them and passing out of 8.4 of them. The Finals saw a similar rate of open 3s (4.4) as the playoffs overall, but the liability of having Luka Doncic on the court called for more pressure into the paint to open up the floor for the rest of the team, something Tatum was able to deliver.
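The shift is easier to see if the per-game numbers are turned into per-drive rates. The sketch below uses only the figures quoted above; the helper `per_drive_rates` and the dictionary layout are my own, and since the inputs are rounded per-game averages the outputs should be read as approximate.

```python
# Per-game drive numbers for Tatum quoted in the text:
# (drives, points scored on drives, passes out of drives).
drive_splits = {
    "regular season": (9.1, 7.9, 2.3),
    "playoffs": (13.4, 8.8, 4.6),
    "Finals": (19.2, 9.0, 8.4),
}


def per_drive_rates(drives: float, points: float, passes: float) -> tuple[float, float]:
    """Return (points per drive, share of drives that end in a pass-out)."""
    return points / drives, passes / drives


rates = {stage: per_drive_rates(*nums) for stage, nums in drive_splits.items()}

for stage, (points_per_drive, pass_rate) in rates.items():
    print(f"{stage}: {points_per_drive:.2f} points per drive, {pass_rate:.0%} passed out")
```

Read this way, the share of drives ending in a pass climbs from roughly a quarter in the regular season to well over 40% in the Finals, while points per drive fall. That is consistent with drives being used more to collapse the defense for teammates than to score.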

Next, let’s take a look at some of the players that have played parts of their careers with Tatum. The most obvious selections are on the current team: Kristaps Porzingis and Derrick White both saw revolutions in their game after joining the Celtics. Some other notable examples are Terry Rozier never quite seeing the same heights after leaving and Semi Ojeleye going from a 4-year, 36% 3-point shooter with Tatum to out of the league not even one full season after leaving. Even Kyrie Irving made his only All-NBA second team next to Tatum and never rose that high again. Perhaps the explanation is that the Celtics organization is just miles ahead of everyone else, but what is notable is that there are simply no exceptions to this rule, apart from a handful of players who couldn’t make the rotation on the Celtics (more a commentary on the talent in front of them) and became OK bench players elsewhere, and players who got injured upon their arrival in Boston. It doesn’t matter where they came from or where they went: their best years were in the situation best fit to maximize their skills rather than bending them to meet what the team needed.

This is not to say Jayson Tatum is better than Nikola Jokic; I think reasonable people can reach different conclusions on the topic. What I am trying to highlight is how BPM measures the success of these players. Over the last three seasons, both players have made the All-NBA first team every year and both have won championships. Jokic has boasted BPM marks of 13.7, 13.0, and 13.2, taking home two MVPs (and quite frankly was more deserving of the third than Joel Embiid). Tatum, meanwhile, has posted 5.0, 5.5, and 5.1 but has additional trips to the Conference Finals and Finals, while Jokic otherwise hasn’t been out of the second round. As stated above, team success is an incredibly complex calculation and the variables and permutations are near infinite, but it doesn’t pass the smell test that the Celtics could have better success with a superstar less than half as impactful.
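The “less than half” figure comes straight from the marks above. The sketch below averages the three seasons for each player and takes the ratio. It deliberately treats BPM as a simple linear scale of impact, which is exactly the reading this section is questioning, so take the ratio as a description of what the stat claims rather than a measurement of the players.

```python
from statistics import mean

# BPM marks for the last three seasons, as quoted in the text.
jokic_bpm = [13.7, 13.0, 13.2]
tatum_bpm = [5.0, 5.5, 5.1]

jokic_avg = mean(jokic_bpm)
tatum_avg = mean(tatum_bpm)

# Naive reading: treat BPM as directly proportional to impact.
ratio = tatum_avg / jokic_avg

print(f"Jokic three-year average BPM: {jokic_avg:.1f}")
print(f"Tatum three-year average BPM: {tatum_avg:.1f}")
print(f"Tatum as a share of Jokic: {ratio:.0%}")
```

The averages come out to about 13.3 and 5.2, so on this naive reading Tatum is worth a bit under 40% of Jokic.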

Conventional wisdom is that having the “best player in the series” gives you at least a fighting chance and often an advantage. In an overview of the last ten years, if the weaker team has said player, they win nearly half the time. The exact definition of best player gives you varying results, but all hover around 50%: the weaker team wins 49% of the time when it has the highest BPM, 44% of the time when it has the highest award winner, and 54% of the time when it has my estimation of the public’s perceived best player. In 69% of series, the high BPM mark is on the higher-seeded team (71% and 68% for awards and reputation), all of which suggests that having the best player usually translates to having the best team. Once again, we are left with the conclusion that there is a gap between BPM’s assessment and impact, based on the wild discrepancies in results.

Some of this is compensated for by defense - BPM does almost nothing to measure it effectively despite defense being 50% of the game and possibly more indicative of playoff success (the median title team has the 5th best offense and 4th best defense in the league) - but still more must be compensated for by the above-mentioned flaws. Analytics-minded people (including the inventor of BPM) will now take the time to point out that BPM is a flawed stat and that more accurate metrics require the use of tracking data. No doubt the more inputs we can put into a calculator, the more refined our answer will be. However, the broad premises of this post can still be extrapolated to almost any composite statistic. The fact remains that most of what happens on a basketball court is not recorded anywhere, and none of how a player’s skills impact team building and coaching is recorded anywhere. We simply don’t have the capability to solve The Hard Problem numerically yet.

This post should not be read as a refutation of analytics or an endorsement of the eye test. The eye test of course has significant flaws and suffers even further from not being transmissible to others, except by those who spend extensive time reflecting on their eye test and even more time recording film breakdowns to explain what they see. Analytics remain an immensely useful tool for helping us to understand the game, and composite metrics often tell us more than their basic inputs. However, we need to be very careful not to treat these as “catch-alls” that input a bunch of data and output who the “better” player is. The game is changing: we’re seeing in real time that increasingly gaudy individual outputs receiving high composite scores are negatively correlated with team success. The key here is to stay flexible and open, and to consider the inputs as well as the calculation presented by a talented programmer.

Thanks for reading! If you are a big composite stathead and think I’m misstating aspects, please let me know! Being wrong is how we learn. Next post will be about the true catch-all statistic: results, and how varying measures of success stack up against each other. Happy holidays, stay warm!

RedactedIguana’s Substack
