Does this mean that this is actually worse than just selecting randomly?
Thanks Michael. So, as a starting point for Michael S's strategy, the DSLGR criterion is more to do with reducing the field and increasing strike rate than producing a profit (i.e. beating the market). The latter comes from applying the other criteria* to the remaining runners and structuring the bet.
* which appear to be few and quite simple
So, having eliminated runners with DSLGR > 180, the thought process could be: how many runners does that leave, and how much of the market do they take?
If it's, say, one runner at 3.5, that's the bet. If it's two runners taking 50% and even money feels like the 'right' bet, then dutch them. If not, are there strong factors to eliminate either (or both)?
If, say, it's five runners taking 80% and a dutch bet equivalent to 1.25 feels 'wrong', are there strong factors to eliminate some (or all) and leave a bet?
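As a rough illustration of where those 'equivalent' prices come from, here is a minimal dutching sketch, assuming decimal odds throughout (the function name and figures are my own, not from the strategy itself):

```python
# Dutching: size the stakes so the profit is the same whichever of the
# dutched runners wins. The 'equivalent odds' is what the combined bet
# effectively pays as a single price.

def dutch(odds, total_stake=100.0):
    """odds: decimal odds of the runners being dutched."""
    implied = [1.0 / o for o in odds]           # each runner's market share
    book = sum(implied)                         # combined share of the market
    stakes = [total_stake * p / book for p in implied]
    equivalent_odds = 1.0 / book                # price of the combined bet
    return stakes, equivalent_odds

# Two runners taking 50% of the market -> even money (2.0 decimal)
print(dutch([4.0, 4.0]))        # ([50.0, 50.0], 2.0)

# Five runners taking 80% of the market -> 1.25, as in the example above
print(dutch([6.25] * 5))        # ([20.0, 20.0, 20.0, 20.0, 20.0], 1.25)
```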
So, as a starting point for Michael S's strategy, the DSLGR criterion is more to do with reducing the field and increasing strike rate than producing a profit (i.e. beating the market).
Yes. Personally that's how I always tend to start because if I can't get a certain strike rate I won't be able to sustain through losing streaks, so I like to get rid of the lower probability horses, despite knowing that there are profits to be made on them if you can ride the swings.
As I understand it, Michael is then using the MC simulations to see which of the strongest are left, and making a decision based on that.
In terms of strong factors for elimination, @balimaar what race conditions are you currently focusing on?
Does this mean that this is actually worse than just selecting randomly?
It's an indicator of a factor's value, or its performance relative to the odds. This means that if you bet on these you will make a small loss long-term (pre-commission). If you are just selecting randomly you are likely to make a far greater loss.
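For reference, a minimal sketch of the A/E (actual over expected) figure being discussed, assuming expected winners are the sum of the market's implied probabilities (the exact definition in use here is my assumption):

```python
# A/E = actual winners / expected winners, where each selection's expected
# contribution is the market's implied probability (1 / decimal odds).

def actual_over_expected(results):
    """results: list of (decimal_odds, won) tuples for each selection."""
    expected = sum(1.0 / odds for odds, _ in results)
    actual = sum(1 for _, won in results if won)
    return actual / expected

# Four bets at 4.0 (implied 25% each) with exactly one winner:
# expected = 1.0, actual = 1, A/E = 1.0 -> break-even before commission.
print(actual_over_expected([(4.0, True), (4.0, False), (4.0, False), (4.0, False)]))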
I don't agree. Selecting randomly should produce an AE of 1 and a break-even (pre-commission).
Therefore I don't see the logic of excluding DSLGR < 180 as this will reduce the AE below 1 and lead to a loss pre-commission.
If you want to maintain a minimum strike rate and get rid of the lower probability horses, then surely using the odds as a filter would be the logical approach, e.g. exclude any with odds above 5/1.
I don't agree. Selecting randomly should produce an AE of 1 and a break-even (pre-commission).
Therefore I don't see the logic of excluding DSLGR < 180 as this will reduce the AE below 1 and lead to a loss pre-commission.
If a very large random sample gives an A/E = 1,
and DSLGR > 180 has an A/E of < 1,
then the sample minus DSLGR > 180 would give an A/E > 1 (ever so slightly), would it not?
This says nothing about strike rate. We don't have the SR for DSLGR > 180, but I suspect it is higher than the strike rate of the large sample, which will be around 10%.
Whatever the SR is, I agree it could definitely be increased by a price filter, but you would have to check the A/E of the reduced sample. The trio of SR, A/E and Chi2 should be measured for each combination of factors and filters.
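A minimal sketch of how that trio might be computed per subset; the chi-squared form here is a simple one-degree-of-freedom goodness-of-fit against the market's expectation, which is my assumption rather than anything specified above:

```python
def sr_ae_chi2(results):
    """results: list of (decimal_odds, won) for each runner in the subset.
    Returns (strike rate, A/E, chi-squared vs the market's expectation)."""
    n = len(results)
    expected = sum(1.0 / odds for odds, _ in results)   # expected winners
    actual = sum(1 for _, won in results if won)        # actual winners
    sr = actual / n
    ae = actual / expected
    # goodness-of-fit with a winners cell and a losers cell
    chi2 = ((actual - expected) ** 2 / expected
            + (actual - expected) ** 2 / (n - expected))
    return sr, ae, chi2
```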
Some statistical software that does logistic regression will find the coefficients for each factor and, over several iterations, will quickly eliminate those that bring nothing to the party in terms of predicting winners (strike rate). You're left with the most significant combination of factors for SR, but you still need to measure the combined A/E.
Models I've seen take these coefficients and apply them to fresh sample data, which hopefully gives similar SR results to the original sample. Then the A/E calculation formulas are set up and an algorithm like Excel Solver is used to maximise the A/E by tweaking the coefficients whilst keeping the SR within certain bounds. It's not guaranteed to find a solution every time, but if it doesn't you try some other factors (or composite factors) until something robust is found.
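A loose sketch of that two-stage idea in Python, on synthetic data with illustrative names throughout; scipy's Nelder-Mead stands in for Excel Solver, and the SR bound is enforced as a penalty:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from scipy.optimize import minimize

# Made-up data: X holds factor values per runner, y marks winners, and
# 'implied' holds the market's implied probabilities (1 / decimal odds).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))                   # three candidate factors
y = rng.random(5000) < 0.10                      # ~10% base strike rate
implied = np.clip(rng.normal(0.10, 0.03, 5000), 0.01, 0.9)

# Stage 1: fit coefficients that predict winners (strike rate).
model = LogisticRegression().fit(X, y)
coefs0 = np.concatenate([model.coef_.ravel(), model.intercept_])

def select(coefs, threshold=0.10):
    """Runners whose model score clears a cut-off."""
    scores = X @ coefs[:-1] + coefs[-1]
    return 1.0 / (1.0 + np.exp(-scores)) > threshold

def neg_ae(coefs, min_sr=0.10):
    """Negative A/E of the selected subset, penalised if SR drops too low."""
    picked = select(coefs)
    if picked.sum() == 0:
        return 0.0
    sr = y[picked].mean()
    ae = y[picked].sum() / implied[picked].sum()
    return -ae + (100.0 if sr < min_sr else 0.0)

# Stage 2: tweak the coefficients to maximise A/E while holding SR.
result = minimize(neg_ae, coefs0, method="Nelder-Mead")
print("best A/E found:", -neg_ae(result.x))
```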
In my experience truly random selection tends to produce an A/E of around 0.98 or 0.99 over a lot of runners (100k or more). However, human 'random' selection is often significantly lower because it's not truly random. So you're right, with truly random selection the A/E is going to be similar to that of DSLGR <= 180.
Then the sample minus DSLGR > 180 would give an A/E > 1 (ever so slightly), would it not?
It would indeed. This point highlights something very important: that negatives are also informative in terms of eliminations.
I think we all got our 'more than', 'less than' and what we're excluding mixed up. I certainly did.
If very large sample A/E = 1.00
and DSLGR <180 SR = 0.99
Then DSLGR >=180 SR = 1.01
Michael S is excluding DSLGR >= 180, so reducing the profitability of the set (slightly) but increasing its strike rate.
Is that right?
Is that right?
No it isn't. For SR read A/E (can't edit posts)
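Reading A/E for SR, the arithmetic behind that split can be checked with a weighted average: the whole sample's A/E is the expected-winners-weighted average of the two subsets' A/Es, so if one side sits below 1 the remainder must sit above it. A quick numeric check, with illustrative figures of my own:

```python
# Split a whole sample at A/E = 1.00 into an excluded band and a remainder.
exp_excl, ae_excl = 200.0, 0.99   # excluded band: expected winners, A/E
exp_rest = 1800.0                 # remainder's expected winners

total_expected = exp_excl + exp_rest
total_actual = total_expected * 1.00            # whole sample at A/E = 1.00
ae_rest = (total_actual - exp_excl * ae_excl) / exp_rest
print(round(ae_rest, 4))                        # 1.0011, ever so slightly above 1
```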
In my experience truly random selection tends to produce an A/E of around 0.98 or 0.99 over a lot of runners (100k or more).
Is the AE for all selections not 1? Therefore a truly random selection would also be 1?
All selections are effectively 1 (0.999). You're right, in theory it should be 1, but the subset never is, although this is most likely caused by too small a subset. If you had at least 100k horses in your subset then I would expect it to be 1, although in practice that would never be the case. The random subsets I took for the 0.98 and 0.99 scores above had around 20k runners each.
I agree it will be unlikely to be exactly 1, but there is as much chance of it being above 1 as there is of it being below 1. I'll do some random analysis just to prove this to myself.
I have taken 10 random samples of 100,000 from a base of 300,000 runners and the AE was 0.99 for 3 of the samples, 1.00 for 5 of the samples and 1.01 for 2 of the samples. The overall average was 1.00. This is what I would expect.
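For anyone wanting to repeat that check, a sketch of the experiment with synthetic runners whose win rates exactly match their implied probabilities (all data is made up; only the sample sizes match the figures above):

```python
import numpy as np

# 300k synthetic runners; each wins with exactly its implied probability,
# i.e. a perfectly efficient market, so samples should average A/E = 1.00.
rng = np.random.default_rng(1)
implied = np.clip(rng.normal(0.10, 0.05, 300_000), 0.01, 0.9)
won = rng.random(300_000) < implied

for i in range(10):
    idx = rng.choice(300_000, size=100_000, replace=False)
    ae = won[idx].sum() / implied[idx].sum()
    print(f"sample {i + 1}: A/E = {ae:.2f}")
```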
In my experience truly random selection tends to produce an A/E of around 0.98 or 0.99 over a lot of runners (100k or more).
I don't see the logic in your comment.