# Poisson Distribution | Horse Racing | Forum

November 21, 2018

Hi folk.

Apologies for the lengthy entry. I can talk for Britain when I get my teeth into something lol.

I’m looking to see if anyone has tried using the Poisson distribution with horse racing and, if so, what the outcomes or pitfalls were. Successes or failures?

I’m looking at different angles to add to Eddie’s small-field spreadsheet.

If you have been following Stuart’s hard work on the forum you will see I have a strong belief that recent form has a huge role to play in predicting the current expectation of the horse. I am working on a few things with regard to this.

One of the other things that interests me is probabilities. I love working out the probability of an event and then seeing if there is value to be had with the odds given.

I use Poisson Distribution on a football system I developed with great results.

I know most of us will have heard of the Poisson distribution, but for those who have not, very simply put: if we know the average number of times an event happens, we can work out the probability of it occurring a given number of times in the future, such as the number of goals in a football match.

Let’s say there is a game coming up involving Liverpool v Cardiff.

If we check the stats we see Liverpool score an average of 2.18 goals at home whilst conceding an average of 1 goal.

We see Cardiff score an average of 1.87 away from home whilst conceding an average of 0.95.

Liverpool average

(Av goals scored by Liverpool + Av. goals Conceded by Cardiff) / 2

= (2.18 + 0.95)/2

= 1.57

Cardiff average

(Av goals scored by Cardiff + Av. goals Conceded by Liverpool ) / 2

= (1.87 + 1.00)/2

= 1.44

Therefore, using Poisson we get:

Liverpool win: 40%

Cardiff win: 35.4%

Draw: 24.21%

The odds we would expect to get would be (1 = home win, x = draw, 2 = away win):

1 – 2.45

x – 4.10

2 – 2.9
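For anyone who wants to reproduce the football numbers, here is a minimal Python sketch of the calculation described above. It assumes the two teams' goal counts are independent Poisson variables and ignores scorelines above 10 goals per side; the function names and the `max_goals` cut-off are my own choices, not part of the original system. The output should come out close to the percentages and odds quoted above, with small differences down to rounding.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average is lam."""
    return exp(-lam) * lam**k / factorial(k)

def match_probabilities(home_avg, away_avg, max_goals=10):
    """Return (home win, draw, away win) from independent Poisson goal counts.

    Scorelines above max_goals are ignored; their combined probability is
    tiny for typical football averages.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_avg) * poisson_pmf(a, away_avg)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away

# Expected goals, averaged the same way as in the post
liverpool_avg = (2.18 + 0.95) / 2
cardiff_avg = (1.87 + 1.00) / 2

home, draw, away = match_probabilities(liverpool_avg, cardiff_avg)
for label, p in (("1", home), ("x", draw), ("2", away)):
    # Fair decimal odds are simply 1 / probability
    print(f"{label}: probability {p:.1%}, fair odds {1 / p:.2f}")
```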

So if Sky Bet have priced a Liverpool win at 2.00, this would not be a value bet, because we have worked out the odds using the Poisson distribution as being higher at 2.45.

However, if Betfair are offering odds on a Liverpool win of 2.6, this would be a good value bet, as the odds on offer are higher than the odds implied by our probability of the event occurring.
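The value check can be written as an expected-profit calculation. This is only a sketch: it takes our own fair price of 2.45 at face value, and `expected_profit` is a hypothetical helper name. A positive figure means the offered odds are bigger than our fair odds.

```python
def expected_profit(our_probability, offered_odds, stake=1.0):
    """Average profit per bet, assuming our probability is correct."""
    win = our_probability * (offered_odds - 1) * stake   # profit when the bet wins
    lose = (1 - our_probability) * stake                 # stake lost otherwise
    return win - lose

our_probability = 1 / 2.45  # our fair price for a Liverpool win, from the post

for bookie, odds in (("Sky Bet", 2.00), ("Betfair", 2.60)):
    ev = expected_profit(our_probability, odds)
    verdict = "value" if ev > 0 else "no value"
    print(f"{bookie} @ {odds:.2f}: expected profit per 1 unit = {ev:+.3f} ({verdict})")
```

At odds equal to the fair price the expected profit is zero, which is why 2.45 is the break-even point in this example.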

I am trying to apply this theory to a horse race and, believe me, it’s no simple task lol.

I have picked two random races today and added a form of Poisson distribution to both.

Very interestingly, in my first race the race order I predict matches exactly the betting odds on offer. Even my worst-scoring horse was priced at 100/1 last night.

So that’s encouraging.

The method I’m using is not “odds making”; that’s a different method. I am applying statistical data to a formula to produce a Poisson distribution and predicted outcomes.

Let’s see how I get on with my first try at this. I have selected a non-handicap and a handicap race. I have also identified the top two predicted horses in each race to see if we can find any winners. Thanks for any ideas.

Leicester 12.35

Horse | Rating | Probability | Predicted Position | Bookie Odds |

Ekayburg | 5.8 | 50% | 1st | 1.9 |

Laughing Luis | 9.9 | 15% | 2nd | 3.75 |

Eyreshill | 10.1 | 13% | | 5.50 |

Early learner | 10.7 | 8% | | 10 |

Burning Bright | 21.5 | -15% | | 100 |

Chelmsford 7.00

Horse | Rating | Probability | Predicted Position | Bookie Odds |

Restive Spirit | 5.4 | 50% | 1st | 5 |

Able Jack | 7.3 | 33% | 2nd | 11 |

Glory awaits | 10.5 | 3% | | 13 |

Braden | 10.9 | 0% | | 4 |

Firmament | 13.1 | -21% | | 5.5 |

Brittanic | 13.4 | -23% | | 9 |

Greatest Journey | 15.3 | -41% | | 4 |

November 21, 2018

These are the results from my first attempt, taking the top 2 horses from 2 random races and introducing a Poisson distribution:

Leicester 2.35

Ekayburg Third

Laughing Luis First 3.75

Chelmsford 7.00

Restive Spirit Lost

Able Jack was a NR

This race looks to have been very competitive with not much between them all.

So we managed 1 first from our 3 runners in the 2 randomly selected races (Able Jack being a non-runner). I now need to look at all-weather and expand the parameters a bit to see how it goes. I will just work away on it.

Betting all 3 would have been a slight profit on the day I think. Not too shabby.
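The profit claim is easy to check at level stakes. The sketch below assumes a 1 unit win bet on each selection, settled at the bookie odds from the tables above, with the non-runner's stake refunded; those settlement assumptions are mine.

```python
# (horse, decimal odds, result) for the two races; "NR" = non-runner
bets = [
    ("Ekayburg", 1.9, "lost"),
    ("Laughing Luis", 3.75, "won"),
    ("Restive Spirit", 5.0, "lost"),
    ("Able Jack", 11.0, "NR"),
]

def level_stakes_profit(bets, stake=1.0):
    """Return (total staked, total returned, profit) for win-only bets."""
    staked = returned = 0.0
    for _horse, odds, result in bets:
        if result == "NR":
            continue  # assumed: non-runner stake is refunded, so it is not counted
        staked += stake
        if result == "won":
            returned += stake * odds  # decimal odds include the stake
    return staked, returned, returned - staked

staked, returned, profit = level_stakes_profit(bets)
print(f"staked {staked:.2f}, returned {returned:.2f}, profit {profit:+.2f}")
# staked 3.00, returned 3.75, profit +0.75
```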

Onwards as they say.

November 11, 2018

Hi Fidelcastro,

Regarding odds for soccer matches, they are made with this Poisson distribution model.

In essence it goes: take the average number of goals per match and work out the probability of each team scoring one goal, two goals, 3, 4 and so on. Then you derive the probabilities for 1, X and 2, and basically all other markets are derived from this.

Even with the most “basic” model, if you do the job well, you will get the bookies’ early odds. If you go even further you will get odds (1/probability) accurate to the 2nd decimal.

……

Regarding horse racing, I haven’t played with it, because I think the odds pretty much match the probability. If you test against previous results you will find that a favorite with odds of about 5.0 actually wins about 1 race in 5, and so on. If you take BSP as your reference and test against it, you will find very small deviation and odds that are very closely correlated with results.

SUMMARY: What I am saying is that I cannot see where the value is here (which I would like you to explain to me), because selection odds match actual results quite accurately if you look at the odds vs result relation. If a coin toss is priced at 2.0 (50%) for each side and over 100 rounds the results come out roughly even between heads and tails, then the odds are OK, and that is exactly what we have with horses. I believe you get the point, because I see you have some maths knowledge regarding probability and perhaps combinatorics. 🙂
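The kind of back-test described here can be sketched as a simple calibration table: group past runners into odds bands, then compare the average implied probability (1/odds) with the actual win rate in each band. The band edges and the sample data are made up purely to show the shape of the input; real BSP results would need to be loaded in their place.

```python
from collections import defaultdict

def calibration_table(results, band_edges=(2.0, 4.0, 8.0, 16.0)):
    """Compare implied and actual win rates per odds band.

    results: iterable of (decimal_odds, won) pairs, won being True/False.
    Returns rows of (band upper edge, runs, mean implied prob, win rate).
    """
    bands = defaultdict(lambda: {"runs": 0, "wins": 0, "implied": 0.0})
    for odds, won in results:
        # First band whose upper edge covers these odds, else an open top band
        upper = next((e for e in band_edges if odds <= e), float("inf"))
        band = bands[upper]
        band["runs"] += 1
        band["wins"] += int(won)
        band["implied"] += 1 / odds
    table = []
    for upper in sorted(bands):
        b = bands[upper]
        table.append((upper, b["runs"], b["implied"] / b["runs"], b["wins"] / b["runs"]))
    return table

# Made-up sample: five runners at 5.0 (one winner), two at 2.0 (one winner)
sample = [(5.0, True), (5.0, False), (5.0, False), (5.0, False), (5.0, False),
          (2.0, True), (2.0, False)]

table = calibration_table(sample)
for upper, runs, implied, actual in table:
    print(f"odds up to {upper}: {runs} runs, implied {implied:.0%}, actual {actual:.0%}")
```

If the implied and actual columns track each other closely across bands on real data, that supports the point that the market odds are already well calibrated.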

Cheers!

November 21, 2018

Thanks jodzares.

You are correct

The bookmakers use Poisson on football matches to work out probable outcomes, then they build their edge into the final figure. That’s why we can use it to our advantage to work out value bets. Poisson and horse racing, in my opinion, cannot be used to find value.

What I’m looking to do is use the prediction side of Poisson to work out how a horse should perform based on the historical data being input. Here is where I am going with this: if we can work out how a horse has performed and the expected outcome given a set of circumstances, we should be able to compare each horse’s predicted outcome.

This is just the first stage. From that we then need to use other data to make an assumption. We could never make selections using solely Poisson. BUT we might be able to say, based on probability, that XYZ horse should perform better than ABC based on historic data. That means we have, say, 3 horses to look at. I’m only using it to identify contenders for a further look. It is ideal to score them and then add them to the spreadsheet for small fields. That way the other set criteria add substance to our selections.

Just to summarise: it can’t be used to make selections, only to show the probable outcome based on previous factual evidence.

It’s like the first stage of a process where we identify, say, the best contenders based on a predicted outcome. Again, it’s with a view to scoring them and adding them to our spreadsheet.

Thanks again

November 21, 2018

Right, my own take on Poisson.

The Poisson distribution is great for football games. That’s why the bookies use it: we have a set of 4 simple facts we can use as historical evidence. Home goals scored, home goals conceded, away goals scored and away goals conceded. The distribution is excellent for predicting all possible outcomes. As I say, I use it on football games with remarkable accuracy.

I have spent some time applying Poisson to horses and find it’s far too complex for my simple brain. My P(X = x) and cumulative P(X > x) are literally bursting my head! Makes me think I need to get a life lol.

Right, new track today, and I will keep it on this thread so we keep the forum manageable.

I’m going back to try very basic historical statistical data with a prediction of possible outcomes. I am posting 2 races today, again hoping to find winners among the TOP TWO horses in each race. Both races are very tight on paper with good-class horses, so this should be interesting.

3.10 Exeter

Handicap Chase Class 3

8 Runners.

Horse | Score | Odds |

Garrane | 20.4 | 3.75 |

Malpie | 20.2 | 9.00 |

The Two Amigos | 19.8 | 3.50 |

Firebird Flyer | 19.7 | 6.00 |

Yanmare | 19.5 | 7.50 |

7.45 Kempton

Class 3 Handicap

Horse | Score | Odds |

Outrage | 21.1 | 6.50 |

Watchable | 20.4 | 6.00 |

Busby | 20.2 | 5.00 |

Exchequer | 19.8 | 6.50 |

Al Asef | 18.2 | 9.50 |

November 21, 2018

Sorry, I should have added the following in case anyone is wondering why I have not scored the whole race.

We know that roughly 85% of all winners come from the top 5 in the betting. That is where I look when making my selections, and I discard any horse below that. The above races have 8 runners but I’m just looking at the top 5 from each.

Great thread, following with interest. Have you considered using a factor that looks at how each horse has performed against the average horse in the race? For example, a form competitive rating: if you had an average level for the current race and a level for each horse, you could score the horse in every race it’s been in and determine a rating for its likely performance against the average horse in this race. Then you can do your distribution on that.

November 21, 2018

Michael,

Thanks for the input. I am trying something similar.

You know what it’s like: we always overcomplicate everything by overthinking when the best answer is sometimes right in front of us.

The Poisson was tying me up too much as it had too many influential factors.

So I have simplified my processes.

Here is my rationale and what I’m trying to achieve.

I know your knowledge is very extensive, so please understand this is not me stating anything you don’t already know. I’m just trying to explain what my thought processes are with it.

My Thought Process

If we have 8 horses in a race, we can PROVE how those horses have previously performed over a certain period with a set of historical data. So for each horse we can work out X, the mean value of all its previous races.

Put 8 horses together in one race and, as well as the individual values, we then have a total race mean value. Let’s call it Y.

So when we obtain an average skill level for the field, we know that some horses have, on average, previously performed better than Y and some will have performed worse.

So if we take, say, speed, the race will have a mean speed across all horses. However, horse 7 or 8 might never have ever reached that speed in their entire history. So to win, not only do they have to beat 7 other horses but they also have to run faster than they have ever done in their entire career.

So in reality the probability factor would be zilch and we can probably discard horses worse than the mean (not always, though) so we can then concentrate on the contenders. We can then work out the probability factor of how well the horse “might” be expected to perform against the mean.

Similarly, we can work out not only how a horse might perform against the mean but also how horse 1 might perform against horse 2, based on probable outcomes.

In theory, the greater a horse’s probability of beating the Y value, the greater its chance of winning.
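As a rough illustration of the X and Y idea, the sketch below works out each horse's mean (X), the field mean (Y), and flags horses that have never reached Y in any run. The speed figures are invented and a single factor is used only to keep the example short; the real scoring in this thread uses more than speed.

```python
from statistics import mean

# Hypothetical past speed figures per horse (higher = faster); not real data
history = {
    "Horse 1": [82, 85, 88],
    "Horse 2": [80, 84, 83],
    "Horse 3": [78, 79, 81],
    "Horse 4": [70, 72, 71],
}

def contenders(history):
    """Return the field mean Y and a per-horse report against it."""
    x = {horse: mean(runs) for horse, runs in history.items()}  # X per horse
    y = mean(list(x.values()))                                  # Y for the field
    report = {}
    for horse, runs in history.items():
        report[horse] = {
            "X": x[horse],
            "above_Y": x[horse] > y,           # better than the average horse
            "ever_reached_Y": max(runs) >= y,  # has at least one run at field level
        }
    return y, report

y, report = contenders(history)
print(f"Field mean Y = {y:.1f}")
for horse, row in report.items():
    print(f"{horse}: X = {row['X']:.1f}, above Y: {row['above_Y']}, "
          f"ever reached Y: {row['ever_reached_Y']}")
```

Keeping "above Y" and "ever reached Y" separate reflects the "not always, though" caveat: a horse can sit below the field mean on average and still have a run on its record that is good enough.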

That’s the theory, and doesn’t it look nice and easy? The reality is somewhat more elusive.

So that’s my mindset.

The secret rests with the actual data we are inputting to establish the skill base. Speed alone would never be enough.

That for me is the key: finding the right data to work out a meaningful probability distribution chart.

Practical Application.

What I am working on is looking at form in the first instance (I do love recent form). I am messing about with some parameters such as lifetime, 2 seasons, 1 season, current season, and last 6 runs down to last 3, etc.

I am also looking at ratings, speed and a number of other factors, or “influencers” as I call them.

I also have a weighting system for key components such as the last 3 runs.

So I work out all the data for each individual horse and assign it a mean score. The horse will perform better or worse than its mean (weather, travel, external factors, etc.), but in terms of probability, if it runs to its true average then its probability value is as close to a true 1 as possible.
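One simple way to weight recent form is a weighted mean in which the last few runs count more than older ones. The actual weighting system used here is not described, so the sketch below is only an assumption: the last 3 runs count double, and both the window and the weight are parameters to experiment with.

```python
def weighted_mean_score(scores, recent_runs=3, recent_weight=2.0):
    """Mean of per-run scores, counting the most recent runs more heavily.

    scores: oldest first, most recent last.
    recent_weight: weight on each of the last `recent_runs` scores; older runs get 1.
    """
    if not scores:
        raise ValueError("need at least one score")
    cutoff = len(scores) - recent_runs
    weights = [recent_weight if i >= cutoff else 1.0 for i in range(len(scores))]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Hypothetical horse that has improved over its last 3 runs
scores = [10, 10, 10, 20, 20, 20]
print(f"plain mean    = {sum(scores) / len(scores):.2f}")
print(f"weighted mean = {weighted_mean_score(scores):.2f}")
```

With `recent_weight=1.0` this reduces to the ordinary mean, which makes it easy to compare weighted and unweighted scores on the same data.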

I then work out the total scores for all entries and work out my Y average. Once I have that, I work out the probability value of each individual horse against the Y mean. The higher the probability relative to the mean, in theory the greater the chance of the horse beating the rest of the field.

I deliberately chose 2 very close races today. In one race the top 4 horses were separated by only 1.5 in the betting odds; that’s how close the race was. The probability factors reflected that.

We did manage 2 seconds from the top 2 horses in each race, so the probability data I’m obtaining appears to roughly mirror the betting odds and I’m not a million miles off. In reality we would probably have avoided betting on these races due to the sheer competitive closeness of all the horses. I just wanted to test a close race where the skill base was high.

As you know, it will take months to get some meaningful data, and at least 100 wins before the results have any true value.

I will just keep messing about and see what develops, and as usual any advice or input would be greatly appreciated.

You could maybe advise if you think I am on the right track or going off on a tangent.

Many thanks

November 21, 2018

Just trying the one race today.

A good quality race with some good class horses.

Very close between the top 3 horses. Let’s see if we can get a first today.

Again, encouragingly, my probability factors match the same order as Betfair betting odds.

Wolverhampton 6.45

Class 2

9 Runners.

Horse | Probability | Betfair Odds |

Quiet endeavour | 0.26 | 3.00 |

Charming | 0.24 | 3.25 |

Deputise | 0.20 | 6.00 |

You never can tell | 0.17 | 7.00 |

She can boogie | 0.13 | 11.00 |
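The table above can be turned into a quick comparison of fair odds (1/probability) against the Betfair price. Treat this as a sketch only: it takes the probability factors as if they were win probabilities, and since they cover just the top 5 of the 9 runners and sum to 1, they will overstate each horse's chance relative to the full field.

```python
# Probability factors and Betfair odds copied from the table above
runners = [
    ("Quiet endeavour", 0.26, 3.00),
    ("Charming", 0.24, 3.25),
    ("Deputise", 0.20, 6.00),
    ("You never can tell", 0.17, 7.00),
    ("She can boogie", 0.13, 11.00),
]

def compare_to_market(runners):
    """Return rows of (horse, fair odds, market odds, market bigger than fair?)."""
    rows = []
    for horse, prob, market_odds in runners:
        fair_odds = 1 / prob
        rows.append((horse, fair_odds, market_odds, market_odds > fair_odds))
    return rows

total = sum(prob for _, prob, _ in runners)
print(f"factors sum to {total:.2f}")

rows = compare_to_market(runners)
for horse, fair, market, bigger in rows:
    flag = "market price is bigger" if bigger else "market price is shorter"
    print(f"{horse}: fair {fair:.2f} vs Betfair {market:.2f} ({flag})")
```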

November 21, 2018

Results from today’s race

Quiet Endeavour Ran like a 3 legged Donkey

Charming Ran like a 2 legged Donkey

Third choice Deputise won the race. (Only a 6% difference between the top three.)

My two were not just beaten, they were Humiliated !!!

As the saying goes – onwards.

November 21, 2018

Two races for today.

I have been selecting races that are all very tight in terms of odds and ability, really just to test the probability against horses with very similar skill.

I always concentrate on the first 5 in the betting, and in today’s second race the first 5 horses include no fewer than 3 course and distance winners as well as 2 distance winners. Indeed, amongst all 8 runners there are no fewer than 4 course and distance winners and 5 distance winners. So a very open, testing race.

The 4pm race has only 2 clear contenders, which is reflected in the high combined probability of 0.66 that one of them wins.

The 6pm race is very close, as I say, between the top 4. Let’s see if we can get any firsts today. I will be very surprised if we can’t get a first in the first race.

Wolves 4 pm

Top 2 Horses | Probability Factor |

Kynance | 0.36 |

Involved | 0.30 |

Wolves 6pm

Top 2 Horses | Probability Factor |

Shamshon | 0.23 |

Landing Night | 0.22 |

Sounds exactly right. Always keep your model as simple as possible to begin with, don’t use more than a handful of factors, find the strongest ones and strip the rest.

This is very important:

However, horse 7 or 8 might never have ever reached that speed in their entire history. So to win, not only do they have to beat 7 other horses but they also have to run faster than they have ever done in their entire career.

November 21, 2018

Thanks John. Let’s see what happens today. We are not a million miles off.

Font 1.10

Horse | Probability Factor |

The Ogle Gogle man | 0.25 |

Findusatgorcombe | 0.22 |

Uttox 12.30

Horse | Probability Factor |

Stop Talking | 0.24 |

Katebird | 0.20 |

Uttox 3.00

Horse | Probability Factor |

Global Domination | 0.21 |

Kalaskadesemilley | 0.19 |

November 21, 2018

Today’s results

2 low odds winners today.

Font 1.10

The Ogle Gogle Man First @ 2.5

Findusatgorcombe Second

Uttox 12.30

Stop Talking First @ 2.2

Katebird Lost

Uttox 3.00

Global Domination Lost

Kalaskadesemilley Second

Interestingly, the top pick that didn’t win had the lowest probability factor of the 3 top picks. I’m going to look at value now, with minimum odds for single or Dutch betting etc. There is no value in low-odds horses.
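For the Dutch betting idea, the standard approach is to split the total stake in proportion to 1/odds so that the return is the same whichever selection wins. The sketch below uses the top two from the Leicester race earlier in the thread (1.9 and 3.75) purely as example prices; the function name and the 10 unit stake are arbitrary.

```python
def dutch_stakes(odds, total_stake=10.0):
    """Split total_stake so every selection returns the same amount if it wins.

    Returns (stakes, payout, profit), where profit applies if any selection wins.
    """
    inverse = [1 / o for o in odds]
    book = sum(inverse)  # below 1.0 means a profit if any selection wins
    stakes = [total_stake * inv / book for inv in inverse]
    payout = total_stake / book  # identical whichever selection wins
    return stakes, payout, payout - total_stake

stakes, payout, profit = dutch_stakes([1.9, 3.75], total_stake=10.0)
for odds, stake in zip([1.9, 3.75], stakes):
    print(f"odds {odds}: stake {stake:.2f}")
print(f"return if either wins: {payout:.2f}, profit: {profit:+.2f}")
```

The sum of 1/odds across the selections gives a quick minimum-odds test: once it reaches 1.0 or more, the Dutch bet can no longer show a profit even when one of the selections wins.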

Onwards
