Re: What I Don't Understand About Last-Second Program Betting
Posted by: Mathcapper
Date: June 15, 2017 01:43AM
Here's a little more flavor on what goes into Benter's model, from the academic paper and the conference talk I referenced earlier in this thread.
The thing that's always struck me most is how little discussion there is about a horse's actual finishing time. I didn't come across any mention of daily variants, let alone inter-track variants, in either the paper or in any talks he's given that I've been privy to, unless he is in fact using speed/performance figures and is intentionally avoiding discussion of the topic.
Excerpt from Benter’s paper entitled “Computer Based Horse Race Handicapping and Wagering Systems: A Report”:
>>
[i]The overall goal is to estimate each horse's current performance potential. "Current performance potential" being a single overall summary index of a horse's expected performance in a particular race. To construct a model to estimate current performance potential, one must investigate the available data to find those variables or factors which have predictive significance. The profitability of the resulting betting system will be largely determined by the predictive power of the factors chosen.
The odds set by the public betting yield a sophisticated estimate of the horses' win probabilities. In order for a fundamental statistical model to be able to compete effectively, it must rival the public in sophistication and comprehensiveness. Various types of factors can be classified into groups:
Current condition:
- performance in recent races
- time since last race
- recent workout data
- age of horse
Past performance:
- finishing position in past races
- lengths behind winner in past races
- normalized times of past races
Adjustments to past performance:
- strength of competition in past races
- weight carried in past races
- jockey's contribution to past performances
- compensation for bad luck in past races
- compensation for advantageous or disadvantageous post position in past races
Present race situational factors:
- weight to be carried
- today's jockey's ability
- advantages or disadvantages of the assigned post position
Preferences which could influence the horse's performance in today's race:
- distance preference
- surface preference (turf vs dirt)
- condition of surface preference (wet vs dry)
- specific track preference[/i]
>>
Excerpt from Benter’s talk given at the 12th International Conference on Gambling & Risk Taking:
>>
[i]An excellent estimator of present ability is a recency-weighted average of past demonstrated ability. An exponentially recency-weighted average with a 120-day half-life is very close to optimal. So an exponential recency weight, that would be if we did a weighted average of a horse’s past performances, a race occurring yesterday would have a weight of 1, a race occurring 120 days ago would have a weight of .5, 240 days ago would be .25, 360 days ago would be .125, decreasing by half every 120 days. In maybe 18 years of research, we’ve hardly been able to make any improvements on this basic 120-day exponential recency weighting…That’s an excellent estimator.
So, what would a typical measurement-of-ability factor look like? Well, recency-weighted past normalized finishing position is an extremely – if there’s one single variable that you should start your model with, if you’re building a model, I would start with this one…That’s a factor which can stand up alone in almost any jurisdiction around the world as a very good beginning estimate of a horse’s past performance.
Further estimating that, and this is the way we do our model, is that starting off with the recency-weighted past normalized finishing position, you can include a number of other recency-weighted past averages as separate factors. These are recency-weighted past averages of other measures of horse performance, along with recency-weighted averages of the influence on horse performance.
What would some of those be? Well, recency-weighted past normalized finishing position for a start. A second factor in your model would be recency-weighted past race competitive level. This would be sort of like recency-weighted past class of the race. There’s various metrics that can be used to measure the sort of race level – in the U.S. it could be the claiming price of the race or the amount of prize money involved or the class or different things with conditions.
Another factor you could throw in is recency-weighted past post position advantage…And the value of including this factor is that it acts as kind of a corrector to the normalized finishing position…Similar with the jockey advantage. You look at a recency-weighted past average of all of the jockey advantages, the relative jockey skill level that that horse enjoyed in the particular races. So if a horse has achieved a certain average past finishing position and he’s done it all with bad jockeys, that horse is better or probably has a higher ability level than one who’d achieved the same finishing position with relatively good jockeys. Also the recency-weighted past preferences enjoyed. If a horse has run most of its past races at its preferred distance, that means that his ability is probably not as good as what you saw because he was benefited by gee, every race was at his ideal distance. Similarly, a horse that had had the same average performance but had done so at distances that didn’t favor it would be relatively better.
So all of the above factors, along with any others that you can think of add up to form a good recency-weighted average of past demonstrated ability, which in turn becomes a good estimator of today’s ability.
The second term of the equation beyond ability is the various preferences. Well, horses can possess certain preferences. We try to quantify preferences by their effect on performance, and the value of each of these preferences should be applied to today’s races as separate factors. Distance preference for today’s race becomes a second factor. Surface preference, be it turf or dirt. Condition preference – what is the exact expected going of the grounds, is it going to be wet or dry? – that becomes a factor. So all of these go in as factors. This is sort of how you get up to 80 variables, by including all these little adjustments. As I was saying…the recency-weighted average of past preferences experienced should also be included as a separate factor in the model.
The incidental factors – these also need to be quantitatively measured in some way…and then thrown in as a separate variable in the model. So you’d have some variable for the jockey’s ability. This could be – a good one for jockeys is actually recency-weighted past jockey performance. Recency-weighted past jockey normalized finish. Look at all of the jockey’s past starts, calculate this recency-weighted average – and for jockeys we tend to use a half-life of about 1 year. Jockeys don’t change as quickly as horses do. Horses, a 120-day half-life for recency weighting, for jockeys around a year seems to be best.
Incidental factors – also trainers could be considered an incidental factor. The trainer effect, if the horse comes from a good stable it probably helps its performance. So some measure of the overall trainer win percentage can be thrown in. Weight to be carried today is also a factor that you’d use.[/i]
>>
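For anyone who wants to play with the recency weighting described in the talk, here's a minimal Python sketch of an exponentially recency-weighted average. The function names, the input layout, and the sample values are my own assumptions for illustration, not anything from Benter's actual model. In particular, the talk doesn't say how "normalized finishing position" is computed, so the values here are just placeholder numbers.

```python
from datetime import date

def recency_weight(days_ago, half_life_days=120.0):
    """Exponential recency weight: halves every `half_life_days`.

    With the 120-day half-life quoted for horses, a race run today gets
    weight 1.0, one run 120 days ago gets 0.5, 240 days ago 0.25, and so on.
    (A race run yesterday gets about 0.994, which the talk rounds to 1.)
    """
    return 0.5 ** (days_ago / half_life_days)

def recency_weighted_average(observations, today, half_life_days=120.0):
    """Weighted average of past values, weighting recent races more heavily.

    `observations` is a list of (race_date, value) pairs, where `value` is
    whatever per-race measure is being averaged (hypothetical layout).
    Returns None if there are no past races to average.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for race_date, value in observations:
        days_ago = (today - race_date).days
        w = recency_weight(days_ago, half_life_days)
        total_weight += w
        weighted_sum += w * value
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight

# Illustrative only: two past races with made-up performance values.
past_races = [
    (date(2017, 6, 14), 0.8),  # run on the evaluation date, weight 1.0
    (date(2017, 2, 14), 0.2),  # run 120 days earlier, weight 0.5
]
horse_factor = recency_weighted_average(past_races, today=date(2017, 6, 14))
```

The same function covers the jockey case mentioned in the talk by passing a half-life of roughly a year, e.g. `half_life_days=365`, instead of 120.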