Retrodicting is a funny word modelers use for testing a model against results that have already happened but fell outside of its training data. The results aren't new, but they're new to the model.
The two draft models I built, the P-AWS model and the % Player model, used the drafts from 2002 to 2011 as the training data. The dependent variable was the max Alternative Win Score (AWS) in the drafted player's third and fourth years, which means the 2012 draft fell outside the training window. So that draft class is available for out-of-sample testing, or retrodicting.
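To make that setup concrete, here is a minimal sketch of how the dependent variable could be built from a season-level stats table. The file name and columns (player_id, draft_year, season, aws) are assumptions for illustration, not the actual data pipeline.

```python
import pandas as pd

# Hypothetical season-level table: one row per player-season.
# Assumed columns: player_id, draft_year, season (calendar year), aws.
seasons = pd.read_csv("player_seasons.csv")

# Seasons since the draft (rookie year = 1).
seasons["yrs_after_draft"] = seasons["season"] - seasons["draft_year"]

# Dependent variable: the player's max AWS over his third and fourth seasons.
max_aws_yr3_4 = (
    seasons[seasons["yrs_after_draft"].isin([3, 4])]
    .groupby("player_id")["aws"]
    .max()
    .rename("max_aws_yr3_4")
)
```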
In truth, during the training phase the models were run multiple times with various draft classes dropped in order to test the robustness of the model specification and guard against over-fitting the data, i.e., coming up with coefficients that only fit the players in the drafts tested. The results from one class are hardly definitive, but looking at a brand new draft class is still a good trial for a model.
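A minimal sketch of that drop-one-class robustness check, assuming a scikit-learn style workflow with placeholder predictor columns (the post doesn't spell out the exact inputs), might look like this:

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneGroupOut

# df: one row per drafted player, 2002-2011 classes. The predictor columns
# ("ncaa_aws", "age") are placeholders; the target is max AWS in years 3-4.
X = df[["ncaa_aws", "age"]].values
y = df["max_aws_yr3_4"].values
groups = df["draft_year"].values

# Drop one draft class at a time, refit, and score the held-out class.
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    fit = LinearRegression().fit(X[train_idx], y[train_idx])
    held_out_class = groups[test_idx][0]
    r2 = r2_score(y[test_idx], fit.predict(X[test_idx]))
    print(f"held-out class {held_out_class}: R^2 = {r2:.2f}")
```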
The first issue is that the 2012 draft class has only completed its second season since being drafted, one less than the minimum for the dependent variable. Because age is a significant factor in the draft model, these younger players may be expected to perform systematically worse than projected. So the retrodiction looked both at the actual 2014 AWS and at an "Aged 2014 AWS" with one more year of the aging curve tacked onto each player's 2014 AWS.
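The Aged 2014 AWS adjustment could be sketched along these lines; the per-age improvement values below are purely illustrative placeholders, since the post doesn't publish its aging curve.

```python
# Illustrative per-age AWS bump for one additional year of development.
# These numbers are placeholders, not the model's actual aging curve.
ONE_YEAR_IMPROVEMENT = {19: 1.4, 20: 1.1, 21: 0.8, 22: 0.5, 23: 0.3, 24: 0.1}

def aged_2014_aws(aws_2014: float, age_2014: int) -> float:
    """Tack one more year of the aging curve onto a player's 2014 AWS."""
    return aws_2014 + ONE_YEAR_IMPROVEMENT.get(age_2014, 0.0)
```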
There are a couple of different ways to look at the retrodiction to judge the performance of the models. One is: did the model recognize that Draymond Green was a better prospect than Harrison Barnes?
Let's see: Harrison Barnes, with a P-AWS of 3.47, ranked 31st in the class, and his 30% odds of becoming a quality starter in the % Player model ranked him 32nd. Green, with a P-AWS of 4.08, ranked 20th, and his 46% in the % Player model ranked him 16th. So, yes, it did, and Green has indeed performed better in actual AWS in 2014. Model validated, right?
Another simple and intuitive, though not necessarily the best, way to look at the performance is a scatterplot and the out-of-sample coefficient of determination. Both models do well there, with the P-AWS model performing about as well as it did on the training data: it has a .394 R2 against the actual AWS and a .47 R2 against the Aged 2014 AWS.
The % Player model performs a bit worse by this measure, with a .34 R2, primarily because the scale of the logit model doesn't capture the separation between Anthony Davis, the scatter dot on the top right above, and the rest of the class. It still beats the best-fit line for the actual draft order, which came in at .2. But Anthony Davis is precisely why R2 is not necessarily the best measure of model success: predicting his outlier performance has an outsized effect on the measure.
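For reference, the R2 of a scatterplot's best-fit line is just the squared Pearson correlation between projection and outcome, so the comparisons above could be reproduced with something like the following. Here class_2012 and its column names are placeholders standing in for the 2012 draftee table.

```python
import numpy as np

def scatter_r2(projection, outcome):
    """R^2 of a scatterplot's best-fit line: the squared Pearson correlation."""
    return np.corrcoef(projection, outcome)[0, 1] ** 2

# Placeholder DataFrame and columns for the 2012 class.
r2_vs_actual = scatter_r2(class_2012["p_aws"], class_2012["actual_aws_2014"])
r2_vs_aged = scatter_r2(class_2012["p_aws"], class_2012["aged_aws_2014"])
r2_draft_order = scatter_r2(class_2012["draft_pick"], class_2012["actual_aws_2014"])
```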
Since a team can only take one player with each pick, I also took a look at how the models perform in terms of model rank compared to the player's actual performance rank within his class. That allows for a more direct comparison with the league's draft order. For instance, both the P-AWS and % Player models ranked Terrence Jones as the 11th best prospect in the draft, while he was taken with the 18th pick by the Houston Rockets. Based on his stats this year, Jones rated a 7.68 AWS, 3rd in the draft class behind Anthony Davis and Andre Drummond. So the deviation for the models is 8 spots and for the actual draft it is 15.
For the class, the average absolute deviation for each model is 9.6 places and the median absolute deviation is 8. For the actual draft, the average absolute deviation is 12.5 and the median is 13 places.
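The rank comparison could be computed along these lines; again, class_2012 and its columns are placeholders, not the actual table.

```python
import pandas as pd  # class_2012 is assumed to be a pd.DataFrame of 2012 draftees

# Rank everyone by actual 2014 AWS (1 = best) and by the model's projection,
# then measure how far each ranking strays from the performance rank.
class_2012["aws_rank"] = class_2012["actual_aws_2014"].rank(ascending=False)
class_2012["model_rank"] = class_2012["p_aws"].rank(ascending=False)

model_dev = (class_2012["model_rank"] - class_2012["aws_rank"]).abs()
draft_dev = (class_2012["draft_pick"] - class_2012["aws_rank"]).abs()

print("model: mean", model_dev.mean(), "median", model_dev.median())
print("draft: mean", draft_dev.mean(), "median", draft_dev.median())
```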
It is probably just as informative, and more interesting, to look at how the models, the drafters, and the actual stats compare on individual players.
The biggest hits for the P-AWS model over the drafters are Kyle O'Quinn, who has been an efficient player in Orlando and was pegged as a late first-rounder by the model; Jared Sullinger, who actually fell due to his back issues; Austin Rivers, who the model didn't like but who went in the lottery and has not played well; and Terrence Jones, mentioned earlier.
The biggest misses are Miles Plumlee, who the model didn't like due to his age, and Terrence Ross, who is outperforming his college stats.