I have an ongoing research question in mind: "How much should the Leafs pay Nazem Kadri?"
There are a few parts to this question:
1 -- How good is Nazem Kadri?
2 -- How do we determine if a player is good?
3 -- How does goodness impact player salary?
4 -- How do GMs set player salary?
I'm really dealing with the fourth question here. How do GMs set player salary?
I recently stumbled across Robert Vollman's big ol' spreadsheet of every stat you can imagine for NHL players. I ran some regressions to determine what actually predicts player salary. This approach would be considered fishing by data purists, but I'm okay with it. I'm focusing on data from 2014/2015. Ideally, I'd replicate the findings with a split sample or across multiple years... but I'm not.
A few considerations:
- Rank. I'm not concerned with the raw metrics themselves. I'm working with their ranks. For example, does the player with the 17th best GAR and 17th best TOI get the 17th best salary? This approach sidesteps some of the underlying issues with linearity and normality in the data.
- Full-time NHL centremen. I've reduced the set to centres who played at least 15 games in 2014/2015.
- Limited ELCs. I've eliminated players who were drafted in 2012, 2013, or 2014 to minimize the impact of ELCs on overall cap hit.
- Cap cost. I'm not actually looking at salary. Specifically, I'm just exploring "Cost of player against salary cap (over 82 games)."
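The filtering and ranking steps above can be sketched roughly as follows. This is only an illustration: the field names (`pos`, `gp`, `draft_year`, `cap_cost`) and the sample rows are made up and will not match the spreadsheet's real column headers. It also assumes that rank 1 means the highest value and that ties share an average rank.

```python
# Hypothetical field names; the real spreadsheet's column headers may differ.
ELC_DRAFT_YEARS = {2012, 2013, 2014}
MIN_GAMES = 15


def eligible_centres(players):
    """Keep centres with at least MIN_GAMES games who were not drafted in 2012-2014."""
    return [
        p for p in players
        if p["pos"] == "C"
        and p["gp"] >= MIN_GAMES
        and p["draft_year"] not in ELC_DRAFT_YEARS
    ]


def rank_descending(values):
    """Return ranks where 1 is the largest value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend the group while the next value ties with the current one.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


# Made-up rows for illustration only (not real player data).
players = [
    {"name": "Player A", "pos": "C", "gp": 82, "draft_year": 2009, "cap_cost": 2.9},
    {"name": "Player B", "pos": "C", "gp": 10, "draft_year": 2008, "cap_cost": 1.0},
    {"name": "Player C", "pos": "D", "gp": 80, "draft_year": 2007, "cap_cost": 5.0},
    {"name": "Player D", "pos": "C", "gp": 70, "draft_year": 2013, "cap_cost": 0.9},
    {"name": "Player E", "pos": "C", "gp": 75, "draft_year": 2005, "cap_cost": 6.5},
]

centres = eligible_centres(players)
cap_ranks = rank_descending([p["cap_cost"] for p in centres])
for player, rank in zip(centres, cap_ranks):
    print(player["name"], rank)
```

The same `rank_descending` helper would be applied to every metric column, so that each player ends up with one rank per stat.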
Raw correlations indicate that a whole lot of stuff correlates with cap cost. The best indicators seem to be related to actual TOI and overall usage. This finding isn't surprising. Better players will be both better paid and play more. The best scoring predictor actually seems to be assists. Let's crunch all this data into a regression model to see which factors survive.
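For anyone wanting to reproduce the raw correlations, here is a minimal sketch. Since everything is already ranked, a plain Pearson correlation between two rank columns is equivalent to a Spearman rank correlation on the underlying stats. The rank lists below are invented for illustration and are not taken from the spreadsheet.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up rank columns for five imaginary centres (illustration only).
rank_cap_cost = [1, 2, 3, 4, 5]
rank_toi = [1, 3, 2, 4, 5]
rank_assists = [2, 1, 3, 5, 4]

for label, column in [("TOI", rank_toi), ("Assists", rank_assists)]:
    print(label, round(pearson(rank_cap_cost, column), 3))
```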
The following measures contribute uniquely and significantly to cap cost:
- Rank.IPP -- Percentage of the team goals scored while the player was on the ice on which the player got a point
- Rank.PS/G -- Point Shares per Game
- Rank.OGIT -- Offensive ice time
- Rank.GVT/60 -- GVT per 60 minutes played
- Rank.FO GAR -- Faceoff GAR
- Rank.Days -- Number of days player counted towards the salary cap
Whoa. I probably couldn't have predicted that these are the metrics that would really count. The model accounts for about 63% of the variance in Rank.Cap cost... which actually isn't too bad.
Here's the regression formula:
Rank.Cap Cost = -24.44 + 0.116 Rank.IPP + 0.416 Rank.PS/G + 0.291 Rank.OGIT - 0.292 Rank.GVT/60 + 0.149 Rank.FO GAR + 0.547 Rank.Days
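To make the formula easier to play with, here it is as a small function. The coefficients are copied from the equation above; the dictionary keys and the example input ranks are placeholders I made up, not any real player's numbers.

```python
# Coefficients copied from the regression formula above.
INTERCEPT = -24.44
COEFFICIENTS = {
    "Rank.IPP": 0.116,
    "Rank.PS/G": 0.416,
    "Rank.OGIT": 0.291,
    "Rank.GVT/60": -0.292,
    "Rank.FO GAR": 0.149,
    "Rank.Days": 0.547,
}


def predict_rank_cap_cost(ranks):
    """Return the predicted Rank.Cap Cost for one player's metric ranks."""
    return INTERCEPT + sum(COEFFICIENTS[name] * ranks[name] for name in COEFFICIENTS)


# Placeholder ranks for an imaginary player (not real data).
example_ranks = {
    "Rank.IPP": 100,
    "Rank.PS/G": 100,
    "Rank.OGIT": 100,
    "Rank.GVT/60": 100,
    "Rank.FO GAR": 100,
    "Rank.Days": 100,
}

print(round(predict_rank_cap_cost(example_ranks), 2))
```

Because the inputs and the output are all ranks, a positive coefficient means that a better rank on that metric goes with a better cap-cost rank, and a negative coefficient means the opposite.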
Notice that the GVT/60 coefficient is actually negative while the Point Shares coefficient is positive! This analysis seems to indicate that GMs actually way undervalue GVT. It's also interesting that faceoffs entered only as a component of GAR, and that drawn penalties don't seem to factor directly into any part of this analysis.
So, how did this model work out for our friend Kadri last year? His Rank.Cap Cost was 83 and the model predicts that it should be 81. Not bad, considering that there is very little actual monetary difference between the 81st best paid NHL centreman and the 83rd.
Now I just have to figure out what Naz's current metrics are for the 2015/2016 season!