I built an updated free agency model. It is a relatively simple OLS model, perhaps slightly less accurate than the fancier machine learning model I built last year, but it is much easier to understand and easier to update live.

None of these models are particularly good at projecting value, but they give a broad shape of how the league evaluates players. In simple terms, Win Shares with a Usage adjustment, plus age and games started, has mimicked the market better than any other set of metrics I have tried, every year for the last three years. I suspect that if you converted PER to a win shares style metric with a team adjustment included, as Justin Willard has discussed, it might predict the market even better.

I also added a cap spike variable, which helped a bit, including with this year. But I couldn't track down a consistent measure of available cap space at the start of free agency for prior years, so I doubt I have captured this year's tight post-July 4th market.

In any case, here's a scatter of the model's estimates compared to the overall market. Above the line are "overpays" according to the model (it underestimates max players a bit), and below the line are "bargains" according to the model (click to expand).

And here is a link to the full results, data via Basketball Reference and Spotrac.

The model is high on more of the big men left in the free agent pool. I suspect it and the players' agents are going to have to make an adjustment next year.

As part of prep work for a possible new draft model, I have been exploring some play-by-play data. As a part of that, I wanted to do a quick hitter on those numbers. Using the data via Will Schreefer, I was able to look a little more at how players score, how that changes as they age, how it varies by height, and how guys who eventually become NBA players differ from those who top out at a lower level.

The two breakdowns I was interested in were the split between transition and half court scoring, and the split between unassisted scoring and points coming off an assist. Transition Ratio is simply the points a player scored in transition divided by their total points scored. Creation Ratio is the unassisted points from the field divided by total points from the field, as the play-by-play data does not track free throw assists.
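As a minimal sketch, both ratios can be computed the same way; I am reading Creation Ratio as a share of total field-goal points, parallel to Transition Ratio, and the field names below are made up rather than the actual play-by-play schema:

```python
def transition_ratio(transition_pts, total_pts):
    """Share of a player's points that came in transition."""
    return transition_pts / total_pts if total_pts else 0.0

def creation_ratio(unassisted_fg_pts, total_fg_pts):
    """Share of field-goal points the player created himself;
    free throws are excluded since FT assists aren't tracked."""
    return unassisted_fg_pts / total_fg_pts if total_fg_pts else 0.0

print(transition_ratio(120, 480))  # 0.25
print(creation_ratio(200, 400))    # 0.5
```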

First thing is that Transition Ratio goes down slightly with age in college, while Creation Ratio tends to be even more stable:

The slight decline in Transition Ratio is due to a larger increase in half court scoring, not a decline in transition scoring. Meanwhile, it's somewhat interesting to me that, on average at least, college players increase their assisted scoring at a rate comparable to their self-created scoring.

Then I looked at the numbers by player height. As you can see below, there is a clear downward trend in percent of points scored in transition by height, and NBA-bound college players persistently score a higher percentage of their points in transition once height is accounted for.

There is a similar downward trend in Creation Ratio by player height. Again, the future NBA players tend to create a higher percentage of their own points when accounting for height. Here, however, the gap pretty much disappears for taller players. For guards, showing the ability to score with the ball in your hands at the college level is very important for getting into the NBA.

A couple of last nuggets: I was interested to see that there is a bigger gap in Transition Ratio between NBA-bound bigs and non-NBA-bound bigs than there was in Creation Ratio, while for guards it was the opposite. The biggest growth in raw transition scoring per forty minutes, by a good distance, came from the guards, especially those who eventually make the NBA.

Last year I explored a benchmark system for draft prospects by Ed Weiland. The benchmarks track whether a prospect reaches minimum statistical marks in a number of different categories. In looking at Weiland’s benchmarks, I found that the benefit of a benchmarking system is that it highlights potential red flags and rewards versatility in a way most draft models do not. The downside is the loss of information in a simple pass/fail test: outstanding efficiency is treated the same as adequate efficiency. Another important factor may be the simplicity of conveying the information; meeting or not meeting specific benchmarks may be easier to explain to less analytically inclined readers than the output of a regression or ML model.

This year I am introducing my own benchmarking system designed to work without picking a position for every player, a complication that can make the system less reliable for players with positional versatility. I am using four benchmarks:

- Scoring: Does the player score at least 21 points per 80 possessions?
- Efficiency: Does the player’s two-point percentage plus free throw percentage add up to at least 1.25? (This allows positional diversity; players need to demonstrate either monstrously efficient inside scoring or some evidence of an ability to shoot.)
- Offensive Activity: Do assists plus offensive rebounds equal at least 6? (There is a strong negative correlation between the two, indicating a trade-off of size and skill.)
- Defensive Activity: Do steals plus blocks equal at least 2.5?
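The pass/fail logic above can be sketched in a few lines; I am assuming the non-scoring stats are on the same per-80-possession basis as the scoring mark, which the list doesn't specify:

```python
def benchmarks_met(pts, fg2_pct, ft_pct, ast, oreb, stl, blk):
    """Return each benchmark's pass/fail plus the share of benchmarks met.
    Counting stats assumed per 80 possessions (my assumption)."""
    checks = {
        "scoring": pts >= 21,
        "efficiency": fg2_pct + ft_pct >= 1.25,
        "offensive_activity": ast + oreb >= 6,
        "defensive_activity": stl + blk >= 2.5,
    }
    return checks, sum(checks.values()) / len(checks)

# Hypothetical prospect line: 24 pts, .620 2P%, .750 FT%, 4 ast, 3 oreb, 1.2 stl, 1.5 blk
checks, pct = benchmarks_met(24, 0.62, 0.75, 4, 3, 1.2, 1.5)
print(pct)  # 1.0
```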

**Comparison to Draft Model**

While the point of the benchmarks is not to displace or mimic the methodology of a trained model, I did look at both the individual benchmarks and the total percent of benchmarks met against the data used for my draft models. In both cases the benchmark data, combined with age, had significant positive correlations with measures of success in the NBA (though less than the more detailed model). Offensive Activity showed the strongest relationship, followed by Defensive Activity, Efficiency, and Scoring in descending order. Also notable: when combined with the data for the draft models, the percent of benchmarks reached had a modest positive effect, consistent with other measures of versatility I have tested in the past.

In a boosting regression using the two models as predictors, the relative importance in predictive power was 80% for the trained model and 20% for the benchmarks. (The measure used was the max performance in years two through four, as measured by a combination of box score stats and RAPM.)
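A minimal sketch of that kind of two-predictor boosting comparison, run on synthetic data shaped like the description; scikit-learn, the random seed, and all the numbers are stand-ins for the actual pipeline, and the synthetic outcome is built to lean mostly on the model score:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
model_score = rng.normal(size=n)       # stand-in for the trained model's output
benchmark_pct = rng.uniform(0, 1, n)   # stand-in for percent of benchmarks met
# Outcome leans mostly on the model score, echoing the ~80/20 split described
outcome = 0.8 * model_score + 0.4 * benchmark_pct + rng.normal(scale=0.3, size=n)

X = np.column_stack([model_score, benchmark_pct])
gbm = GradientBoostingRegressor(random_state=0).fit(X, outcome)
print(dict(zip(["trained_model", "benchmarks"], gbm.feature_importances_)))
```

The `feature_importances_` attribute plays the role of the relative-importance split; with real data the exact numbers depend heavily on how correlated the two predictors are.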

**Examples from this Prospect Class**

Looking at the benchmarks applied to this year’s top prospects should give a better idea of how they work and what information they can reveal.

There are six major prospects that make all four benchmarks: DeAndre Ayton, Luka Doncic, Wendell Carter, Shake Milton, Josh Okogie and Gary Clark.

Doncic, Ayton, and Carter are well known to anyone even casually following the draft. Milton and Clark are the older prospects on this short list, which takes a little of the shine off making the benchmarks. All of the prospects that meet all four benchmarks also rate reasonably well in the traditional model, with each in its top 30.

On the other side are the three significant candidates that fail to meet any of the benchmarks: Lonnie Walker, Sviatoslav Mykhailiuk, and Hamidou Diallo. The highest-rated prospects to miss all four benchmarks are Walker and Diallo, both rated in the first round by ESPN. Walker is the only one of these prospects to rate in the top 40 via my traditional model.

Age is the big factor not explicitly addressed by the benchmarks. A 23 year old hitting all four benchmarks is still not necessarily a strong prospect, and an 18 year old only hitting two is not necessarily a non-prospect. In the attached link, I added a column that factors the prospect score by age in order to get an approximate overall evaluation.
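As an illustration of how such an age column might work; the constants below are entirely made up, and the actual age factor used is in the linked sheet:

```python
def age_adjusted_score(pct_benchmarks, age, ref_age=19.5, per_year=0.1):
    """Discount the share of benchmarks met for each year a prospect is
    older than a reference age; reward players younger than it.
    Illustrative constants only."""
    return pct_benchmarks * (1 - per_year * (age - ref_age))

# A 23-year-old meeting all four vs an 18-year-old meeting two
print(age_adjusted_score(1.0, 23))  # ~0.65
print(age_adjusted_score(0.5, 18))  # ~0.575
```

With these made-up settings the young partial qualifier lands close to the older player who swept the benchmarks, matching the intuition above.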

Over at Nylon Calculus, Will Schreefer shared some historical men's NCAA On/Off numbers covering 2009 to 2017. The numbers are available to copy via Google Docs, if you are so inclined.

I am, in fact, so inclined, so I copied Will's sheets and did some quick exploratory analysis. Schreefer provides a couple of interesting breakdowns including the offensive rating (ORtg) and defensive rating (DRtg) with each player on the court and then off, as well as the number of possessions with the player on and off the court. There are also columns that have the average Opponent ORtg and DRtg weighted by the possessions the player was in for each game, so if a player plays more against harder opponents or misses a scheduling cream puff, his opponent ORtg and DRtg numbers will look better than the team's overall opponent schedule might indicate.

The first thing I looked at is the year-to-year variation of different measures. To do that, I filtered out any players with fewer than 1,000 possessions on offense in each year (the defensive possession numbers are slightly different, but not by enough to make a significant difference for the filter). Then I indexed each player's seasons in the database as first year, first year plus one, first year plus two, and so on.

For this analysis I concentrated on the first-year to second-year relationships, which gave me a little over 4,000 players who played over 1,000 offensive possessions in both years. I looked at the coefficient of determination (R^2) for the Net On/Off numbers, the On numbers, the Off numbers, and the Opponent numbers.
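The filter-pair-correlate step can be sketched like this; the column names are my own, not the exact headers in Schreefer's sheets:

```python
import numpy as np
import pandas as pd

def year_to_year_r2(df, metric, min_poss=1000):
    """R^2 between each qualifying player's first and second season
    for `metric`, after filtering on offensive possessions."""
    q = df[df["off_poss"] >= min_poss].sort_values(["player", "season"])
    pairs = []
    for _, g in q.groupby("player"):
        vals = g[metric].to_numpy()
        if len(vals) >= 2:                  # need both seasons to qualify
            pairs.append((vals[0], vals[1]))
    y1, y2 = np.array(pairs).T
    return np.corrcoef(y1, y2)[0, 1] ** 2
```

Squaring the Pearson correlation gives the coefficient of determination for this one-predictor case.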

There is only a very weak relationship between any of the Net numbers from year to year. Below is the graph for Net Total plus-minus, which is the plus-minus with the player on the court adjusted for the team's plus-minus with him off, showing a coefficient of determination of .017.

The numbers for Net Offense and Net Defense are pretty much the same. With so little relationship year to year, I would be very hesitant to read much into any potential draft prospect's Net On/Off numbers.

The raw ORtg and DRtg with the player on the court had stronger relationships. The R^2 for ORtg On year to year was .32, and for DRtg On it was .25. The On ORtg scatter plot is shown below:

Notably, the R^2 for ORtg and DRtg while the players in the study were off the floor were lower, .16 and .12 respectively. This is in part because of the smaller sample size when the players meeting my minimum possession count were off the court (an average of ~1,350 possessions for the On sample versus ~800 for the Off sample), but that just highlights the difficulty of getting a decent sample for On/Off comparison in a single year of college ball.

Lastly, opponent ratings are very stable, reflecting that college teams play a pretty stable schedule from year to year. So I calculated Opponent Adjusted Offensive and Defensive Ratings for every player. The offensive calculation simply takes the On ORtg minus the Opponent DRtg (the lower the opponent's defensive rating, the better their defense). For defense, the calculation is the Opponent ORtg minus the player's On DRtg, so a larger number is better. These adjusted numbers are more stable than the simple On ORtg and DRtg, and in a comparison of the entire NCAA population, at least, probably a better indicator than either the Net numbers or the raw On ratings.
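The two adjusted ratings reduce to a couple of subtractions; a quick sketch:

```python
def opponent_adjusted(on_ortg, on_drtg, opp_ortg, opp_drtg):
    """Opponent-adjusted ratings: offense relative to the defenses faced,
    defense relative to the offenses faced. Higher is better for both."""
    adj_off = on_ortg - opp_drtg   # scored on tougher defenses -> bigger number
    adj_def = opp_ortg - on_drtg   # held tougher offenses down -> bigger number
    return adj_off, adj_def

# e.g. lineups scoring 108 vs defenses allowing 102, and allowing 99 vs offenses scoring 104
print(opponent_adjusted(108, 99, 104, 102))  # (6, 5)
```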

There’s a long and probably unresolvable debate in basketball over how much of a player’s development is due to the team environment and how much is intrinsic to the player, and likely would’ve happened in most NBA settings.

I probably lean more toward intrinsic development than most, though I acknowledge that is probably an unprovable point. A big part of this is the belief that there is more inherent variance between the young prospects entering the league than there is between teams. The other factor is the inherent difficulty in changing another person’s behavior; academic and parenting research has often found it difficult to replicate successes or produce anything like a magic one-size-fits-all formula.

In part it is unprovable because, at best, we could maybe estimate an average percent of improvement or isolate certain skills. There will always be individual variance: guys who thrive despite a bad team situation, or unlikely success stories that maybe needed just the right environment to happen.

But there are bigger obstacles to measuring even a broad average contribution of development environment versus intrinsic player development. I have been considering running an analysis of young player performance after changing teams, compared to the expectations set by prior performance. There are obvious selection bias issues with such a study; teams don’t let go of budding stars. D’Angelo Russell and Jahlil Okafor are on the Brooklyn Nets precisely because they underperformed in their first NBA stop.

But it goes further than that. There are tricky questions about how sticky early development, whether good or poor, is to a player’s long-term career. In economics this is sometimes called path dependence: the idea that initial conditions have a hangover effect that extends into the future and may be hard to undo.

With good development practices it’s easy to see path dependence: if a player learns to extend his range to the three-point line, he is unlikely to forget it on his next team; if a player learns team defensive concepts, he should know them elsewhere. But even here, it's possible that a less favorable environment will undo good development. For example, in a selfish offensive environment a player may elect to take more pull-up twos rather than move the ball.

And just to be clear:

Lineup Fit ≠ Development

I think most people know this instinctively, but it can sometimes get confused when people talk about how one team situation may affect a young player versus another. Going from a team with plenty of shooters to one with cramped spacing may well influence a player’s efficiency, but that effect can also bounce around from lineup to lineup on the same team.

In practice, what we’re mostly interested in is poor development, and whether it can be undone in a better environment. A few reasonable questions come to mind.

- Lost time: Is it like language, where some skills are easier to teach at a younger age?
- Shaky foundation: How much do skills build on each other? If a player fails to make improvements at one stop, can he catch up later?
- Bad habits: When, if ever, do bad habits on the court become too ingrained to change?
- Loss of confidence: For some players, pretty clearly, a lack of early success can sap their confidence on the court. Emmanuel Mudiay, who has gone basically all year unable to talk to the media in Denver, is a good example. For how many players is that a primary issue? How much more likely is it to change somewhere else?

So even if there is a significant team component to development, we may not be able to pick up the effect by looking at “second drafts” onto new teams. The player’s development path may have already been permanently altered.

Most NBA teams passed the twenty game mark over the holiday weekend, approximately a quarter of the season. Statistically, however, we are closer to the halfway point, at least in the sense that a team’s average margin of victory (MOV) to date should predict a bit over half of the variation in win percentage over the rest of the season.

In some ways that’s a low bar to jump over, since it also means that MOV to date, based on history, will leave nearly 50% of the typical team’s win percentage variation from here on out unexplained. To give you a sense of the relationship, below is a plot of MOV in the first twenty games against win percentage over the rest of the season for 2010 to 2017.

For those fans whose favorite team has been sub-par to date (hi, Clips fans!), you can take solace in outliers above the line like the 2013-2014 Brooklyn Nets, who came out of the twenty game mark at 6 and 14 with an MOV of -7.9, but still won 61% of their remaining games.

And for fans of teams exceeding expectations, before you get ahead of yourselves (you know who you are), just remember the 2011-2012 Philadelphia 76ers, who came in at 14 and 6 with a plus 11.7 MOV, yet managed to win less than half of their remaining games.

Another interesting thing is that, at this point, each team's winning percentage is almost as good a predictor of rest-of-season performance as MOV. Over the last six seasons, winning percentage after twenty games has an R^2 of .489 with winning percentage over the remaining schedule.

In some ways that’s not surprising; the whole reason we’re interested in MOV is that it’s correlated with winning and stabilizes more quickly than wins and losses. But what we find in this analysis is that by this point the gap between the two is pretty small in terms of predicting the rest of the year.

To go further: a few years ago Benjamin Morris found on his site that, using the MOV and record from the other 81 games in a season, both winning percentage and MOV were statistically valid predictors, and using both improved the prediction. So I wanted to use the MOV and win percentage from just the first twenty games to predict wins over the rest of the season. In an OLS model on the six seasons, as well as various subsets of the data, and in a Partial Least Squares regression, both MOV and Win Pct were statistically significant predictors, and using both very marginally improved the prediction.
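A minimal version of the two-predictor OLS, using plain least squares via NumPy (the actual analysis also ran subsets and PLS, which this sketch omits):

```python
import numpy as np

def fit_rest_of_season(mov20, winpct20, rest_winpct):
    """OLS of rest-of-season win% on first-twenty-game MOV and win%,
    with an intercept; returns [intercept, b_mov, b_winpct]."""
    X = np.column_stack([np.ones_like(mov20), mov20, winpct20])
    beta, *_ = np.linalg.lstsq(X, rest_winpct, rcond=None)
    return beta
```

With real data the two predictors are strongly collinear, which is exactly why the coefficient split is hard to pin down.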

In the combined model, 67% of the prediction came from the MOV and 33% from a team’s win percentage, with some variation in the subsamples. Given how strongly correlated the two predictors are, it is tough to draw too much from the split, though it was encouraging to see the relative stability across the sub-samples. On the other hand, using bootstrap resampling, whether Win Pct was significant at the 95% level depended on which bootstrap method was chosen.

So the most we can probably say from this is that average margin of victory is the stronger indicator of team quality at the twenty game mark, but that Win Pct is ***probably*** something of an indicator as well. Given these two factors, if your favorite team is under-performing in close games, while that will mostly even out, it may not entirely.

The differences between the two models applied to the rest of the season are pretty small for most teams. Teams that have over-performed their MOV to date, like the Boston Celtics and Detroit Pistons, are projected to win just over one more game over the rest of the season than their MOV alone would indicate. The biggest mover by some distance is the Thunder, projected to win about three fewer games from here on out than in the MOV-only model.

Why winning might be a skill is tough to say definitively. Other research offers some interesting hypotheses, such as indications that teams in the lead simply relax and give up part of it even controlling for who’s on the court, or Ben Falk’s finding of better projections when garbage time is excluded from the data. But there is probably need for more research.

Below are the model projections using both the MOV only and combined Win Pct and MOV model along with the differences:

A couple of years ago Evan Zamir built a model to convert Dean Oliver's Four Factors of basketball, effective field goal percentage (eFG%), rebounds, turnovers, and free throw rate, to net point differential. Last year I applied Zamir's formula to regressed early season four factor numbers to derive point differentials for each team.

This year I decided to redo Zamir's analysis using more recent seasons, as Zamir's had been done with seasons from the Tim Duncan and early Garnett Celtics era. One thing I noticed about Zamir's numbers is that there was slightly more weight to the defensive side of the four factors.

By contrast, in my similar analysis, regressing the four factors on both sides of the ball as eight variables against point differential by team over the last four years, I found that the offensive factors explained more of the net point differential than the defensive ones. The breakdown, as shown below, is close to 55/45 in favor of offense.

The raw coefficients look closer than that. For example, the raw offensive eFG% coefficient is only 1.5% larger than the defensive one. But the spread between the best and worst eFG% teams has been much larger on offense than on defense over the last four years, so when we standardize the variables, eFG% shows up as having contributed more to winning and losing on offense.
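That standardization step is just scaling each raw coefficient by its factor's spread between teams; a sketch, with illustrative numbers chosen to echo the rough 55/45 split rather than the fitted values:

```python
import numpy as np

def standardized_effects(coefs, sds):
    """Scale raw regression coefficients by each factor's standard deviation
    so factors with more spread between teams show a larger share of the
    total effect, even when raw coefficients are nearly equal."""
    effects = np.abs(np.asarray(coefs)) * np.asarray(sds)
    return effects / effects.sum()

# Nearly equal raw coefficients, but a wider spread on the offensive side
print(standardized_effects([1.015, 1.0], [2.0, 1.5]))  # ~[0.575, 0.425]
```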

The table below has the model results with the raw coefficients, standard deviation for each factor, and the contribution to variance. In addition there is a column that gives the cumulative percentage by offense and defense.

The second big difference comes from the gap between offensive and defensive rebounds, which is somewhat surprising given the low regard into which offensive rebounding has fallen for most teams.

In this case it is both the raw coefficient and the gap between teams that increase the variation. That is definitely interesting, and casts some question on how we value, or don't value, offensive rebounding, especially from centers who can grab offensive boards without compromising a team's defensive floor balance. But it doesn't tell the whole story: the reason teams have cut down on offensive rebounding is to put more shooters on the floor to increase their eFG%, and to focus on getting back on defense to cut down on their opponent's eFG%, the number one and two factors in winning and losing.

The only case where the defensive factor has more impact than its offensive counterpart is free throw rate. There we probably have Dwight Howard, Andre Drummond, DeAndre Jordan, and every other Hack-a victim to blame.

So we can apply the Four Factors point differential model to this season to date as an out-of-sample test, and the model performs almost as well as on the training data. The image below compares the model to margin of victory via Basketball Reference as of November 15th, with an R^2 of 96%.

The one little dot noticeably higher on the model estimate than current MOV in the middle of the chart? Dwight Howard's Charlotte Hornets, where he's taking 30% of their free throw attempts and hitting only 41%. Otherwise the model explains point differential pretty well.

Lastly, here are a couple of the Added Variable Plots from the regression to get a visual of the difference in effect:

Contrasted with the less tightly aligned plot for ORebs.

And finally, the offensive free throw rate, or FTA/FGA:

The outlier dot on the lower left happens to be the Dwight Howard 2015 Houston Rockets.

Small sample size theater, those first fifteen games or so of the season, is the period of the basketball calendar that gives me the most conflicted feelings. Standout rookie performances are exciting! Players traded in the offseason underperforming is interesting! All of which is counterbalanced by overreactions to minuscule on/off splits and unsustainable shooting streaks.

It’s less fun to play sample size cop on Twitter than you might think. But, there are two areas of expected regression that almost always deserve highlighting about this time of the year, opponent free throw percentage and opponent three point percentage.

Free throw defense is not a thing. Not only is there no correlation year to year, there is very little spread between teams by the end of the year. Last year, opponent free throw percentage had a coefficient of variation (COV) of 1.1%. As of yesterday, that COV was 3.8%, over three times as spread out.

Likewise, three point defense is much less of a thing than it appears at this time of year. The spread between teams at the end of last year was a 3.7% COV, as opposed to a 10.2% COV as of Tuesday. In both cases, by far the most reasonable expectation is that the outliers will regress significantly toward the mean.
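For reference, the COV here is just the standard deviation relative to the mean; a sketch with made-up early-season numbers:

```python
import numpy as np

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    v = np.asarray(values, dtype=float)
    return 100 * v.std(ddof=1) / v.mean()

# Hypothetical opponent FT% for five teams early in a season
print(round(coefficient_of_variation([0.72, 0.78, 0.75, 0.81, 0.70]), 1))  # 5.9
```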

If we apply that expectation to each team, we can get a sense of the degree of noise in current team performances, at least on the defensive side of the ball. The table below has the top ten regression candidates in the downward direction.

The Defensive Efficiency Adjustment is calculated as if each team's opponents had shot at an average rate on both free throws and three point attempts thus far, with some mitigation then applied via a generalized estimate of the opponent offensive rebounds the extra misses would produce.
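A rough sketch of that adjustment; every constant here (league averages, rebound rate, points per extra possession) is an illustrative assumption, not the actual parameters used:

```python
def adjusted_opp_pts(opp_fta, opp_ftp, opp_3pa, opp_3pp, opp_pts,
                     lg_ftp=0.77, lg_3pp=0.36,
                     oreb_rate=0.23, pts_per_oreb=1.05):
    """Re-price opponent FTs and threes at league-average percentages,
    then give back a few points for the offensive rebounds the extra
    missed threes would generate. All constants are illustrative."""
    ft_luck = opp_fta * (opp_ftp - lg_ftp)        # extra points from hot FT shooting
    tp_luck = 3 * opp_3pa * (opp_3pp - lg_3pp)    # extra points from hot 3P shooting
    extra_misses = max(opp_3pa * (opp_3pp - lg_3pp), 0.0)
    giveback = extra_misses * oreb_rate * pts_per_oreb
    return opp_pts - ft_luck - tp_luck + giveback

# Opponents shooting a hot 85% on FTs and 42% on threes so far
print(round(adjusted_opp_pts(20, 0.85, 25, 0.42, 105.0), 1))  # 99.3
```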

Below are the ten teams on the other side of basketball fortune so far. The Cavs may have reason to be a bit less worried than the overall defense numbers might indicate, as they have had the worst bounces in these two noisy measures, with the Phoenix Suns right behind.

To be clear, opponent free throw percentage and opponent three point percentage are not the only noisy measures at this point of the season, on offense or defense. The number of threes or free throws surrendered will also settle out over the season, as will turnovers and rebounds. But those measures have at least somewhat more signal to them at this point. So simply regressing, say, Utah's two point defense to league average, without weighing how good their rim protection has been over the last two years, probably does little to help understand their expected trajectory.

Free throw defense and three point defense behave close enough to randomly for us to pump the brakes a little on the Orlando banner raising and at least some of the Cavs' October panic.

Given that the free agency period is winding down, I decided to check in on the performance of the free agency models I built. Using data from the market over the last three years, I built two models to predict the average annual value (AAV) of the contracts for this year’s free agent crop. One is a regression model; the other is a Bayesian machine learning variation of a random forest model.

The primary factors in both models for predicting AAV were Win Shares, Age, Usage, the cap spike, and playing time. For both models I used the percent of the salary cap in the first contract year as the target variable. The benefit of the regression model is that it gives straightforward coefficients, while the ML model gives the “importance” of each variable. However, the R package I worked with also provides partial dependence plots that give an idea of the shape and direction of each variable’s influence in the context of the model. (Context matters, since the partial dependence is shaped by the other variables included in the model as well as the sample being modeled.)

Below are a couple of the more interesting variables in the ML model.

The age variable, for example, shows little effect on the percent of cap on the player’s contract until he hits twenty-nine, and then it declines quickly.

Win Shares also shows a nonlinear pattern in the model, taking off at around two.

And while minutes played looks relatively linear, the games started variable takes one jump right around 41 starts, then stays flat.

In addition to learning about the free agency market, part of my motivation was to expand my modeling skills and experiment with an ML model. And, of course, I was interested to see which one would get better results out of sample. To measure overall success I used the mean absolute error (MAE), the average size of the error regardless of direction. The ML model has so far slightly outperformed the regression model, with an MAE of 3.4 million dollars for the regression and 3.3 for the ML. But, as it turns out, the error of the two models averaged together is slightly better than either, at 3.2.
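MAE and the benefit of averaging can be shown in a few lines; the contract numbers below are invented to illustrate the effect, which they exaggerate compared to the modest 3.4/3.3 to 3.2 gain described:

```python
import numpy as np

def mae(pred, actual):
    """Mean absolute error, in the same units as the contracts ($M here)."""
    return np.mean(np.abs(np.asarray(pred) - np.asarray(actual)))

actual = np.array([10.0, 20.0, 8.0, 15.0])
ols    = np.array([13.0, 16.0, 9.0, 12.0])   # hypothetical regression output
ml     = np.array([ 7.0, 23.0, 6.5, 18.0])   # hypothetical ML output
blend  = (ols + ml) / 2

print(mae(ols, actual), mae(ml, actual), mae(blend, actual))
```

The blend helps when the two models' errors point in opposite directions, so they partially cancel.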

For perspective, the error from simply guessing that every player gets the average contract is 7.7 million dollars per player, so the models net a decent improvement.

But it does look like there are some systematic differences between the model and this year’s market. To start, the model has so far overestimated the contracts of centers and underestimated the contracts of point guards, on average. Below is the blend of the two models plotted by position.

Whether that is a part of the league’s continuing evolution, or a reflection on this year’s free agent group is tough to say.

When I then looked at the residuals against out-of-sample individual statistics, I found that Usage was still undervalued while age and blocks were overvalued, though 40-year-old Vince Carter’s one-year, $8 million deal seems to be more or less responsible for the age effect.

Lastly, there are the individual outliers. In the cases where the model is much lower than the player’s contract, it’s not clear if it’s a poor projection by the model or an overpay by the team. Last year there were cases like Timofey Mozgov that proved to be a warning of an overpay. However, the models just give a rough baseline of where the market may fall. This year one of the biggest "overpays" via the model was Stephen Curry, who is not only part of the undervalued point guard class, but received a Super Max contract that did not exist in the training data. Two other "overpays" via the model, JJ Redick and Paul Millsap, signed short-term contracts that are probably a bit high on a per-year basis, but that is purposely mitigated by attaching fewer contract years. The last two could be a bit more concerning for the signing teams given the length of contract: Blake Griffin and Jrue Holiday. Both were projected about $9 million lower than their AAV, and both were given five-year deals to stay with teams that had little to no leverage.

The best value contracts (where the model was most over) were Luc Mbah a Moute and Ersan Ilyasova, at around $7 million less than projected. The link to the full list is attached here.

I couldn’t quite wait for Summer League to be officially over to run the numbers and see what, if anything, we can take from the rookies’ performance. But with the Lakers sitting virtually everyone of note for the Vegas championship, I figured it’s over enough. (Note to the NBA: Vegas Summer League is too long; this is why teams start sitting their lottery talent.)

In the link here, I have the full rundown of per-40 one-number performances via Kevin Ferrigan’s Daily RAPM Estimate (DRE) and Alt Win Score (AWS), a metric I use quite often in my draft models, with data from RealGM. But to me, the real focus of Summer League has to be the rookies.

Some of that is informed by a Kevin Pelton ESPN Insider article indicating that Summer League adds a bit of predictive power for rookies but not for second-year players. It’s also plain that we simply have less information on rookies, so new info is relatively more valuable, as is seeing them in a new team setting.

To get a quick estimate out of the SL stats, and maybe provide a bit of perspective, I re-ran my rookie performance draft model, first by substituting the SL stats for the college or European league stats, and then by using the SL stats to update my original rookie model.

First, the Summer League Only (SLO) run. The SLO model matches the buzz coming out of Vegas very well, with Lonzo Ball on top, followed by Dennis Smith Jr. and Jayson Tatum at two and three. Laker fans might want to frame this table, with three Lakers prospects in the top 10.

For reference, I should add that the projections are scaled so that a 5 is roughly equal to average production, or a 0 in plus/minus terms, meaning the SLO model projects every rookie but Ball to be below average next year.

For Celtics fans, SLO is more mixed: Tatum comes out third best, Zizic performed respectably, and, uh, Semi Ojeleye was also there.

It is hugely unfair to project Markelle Fultz based on what amounted to about 64 minutes of playing time, though he was actually helped by the regression I applied, since he performed below average in his brief court time.

So then, how much should we actually adjust our expectations, if at all? Going again by the Pelton article, I ran the rookie model with the SL numbers added, weighted at 25% of the pre-NBA numbers. This gives a much more realistic weighting of Summer League's importance.
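Reading "weighted at 25%" as giving the SL stats a quarter of the pre-NBA stats' weight (my interpretation of the setup), the blend for each input stat would be:

```python
def slurm_input(pre_nba_stat, sl_stat, sl_weight=0.25):
    """Blend a Summer League stat into the pre-NBA stat at 25% relative
    weight before re-running the rookie model."""
    return (pre_nba_stat + sl_weight * sl_stat) / (1 + sl_weight)

# A rookie whose SL production doubled his pre-NBA rate moves only modestly
print(slurm_input(10.0, 20.0))  # 12.0
```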

For example, in the tables below, ordered by the SL-Updated Rookie Model (appropriately, perhaps: SLURM), Fultz is still in the top 2. And the most any player has been adjusted up is three tenths, and down is five tenths (again, the model results are scaled so that three tenths is roughly the same as being projected .3 better in a plus/minus model).

And below is the 2nd half of the summer league rookies by SLURM:

Maybe we can hold off on the Kuzma Rookie of the Year ceremony, and wait to bury Fultz or Zach Collins. At least until the second game of preseason.