Back in 2010, I wrote a series of articles about how to build your own racing database. It was pointed out to me recently that I’d forgotten to finish it with part four.
Part four is about where you should get your racing data from.
But the chances are, you haven’t read the first three parts, so this is where you can read them…
Which leads us to… part four…
Where To Get Your Racing Data
You have two basic options for getting racing data. You can either:
a) Buy it
b) Scrape it from other websites
The only official solution is (a).
Racing data is owned by the people who provide it. That means when you visit a website like the Racing Post, they own that data. If you use another website, such as Racing UK, then they will have licensed the data to be able to provide it on their website.
So, when you go and scrape data, you are in fact stealing it.
Of course, if you’re only going to use it for personal use, then almost certainly nobody will know. That’s your choice, but I can’t recommend it.
And… to be honest… it comes with a whole heap of issues!
When a website changes its format, your scraping feeds will stop working. When its code is changed, your scraping feeds will stop working. Unless you’re a developer yourself, it’s going to become very time consuming to keep up with the changes that break your scraped feeds.
Which is when you need to ask yourself the question… How much is my time worth?
Answering that question, you may just find it’s cheaper to buy the data.
But where the heck do you buy racing data from?
The main providers are The Racing Post, Press Association and Timeform.
They all have different prices, and all can be negotiated with.
However, not all racing data is created equal.
Of the three above, I’ve only used data from the Press Association and The Racing Post.
Currently, all the data for the Race Advisor is provided by The Racing Post.
And that’s because, quite simply, the Press Association’s data sucked.
When I was using it, they were simply providing raw data. There were no checks done on it. We had horses with winning times that meant they’d be running over 100mph. There were goings that were completely wrong. To be honest, the processes that we needed to build to check the data (before we could do anything with it) were becoming so big it was unmanageable.
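To give you an idea of the kind of check I mean, here is a minimal sketch in Python. It is not the code we actually ran. The field names (`distance_furlongs`, `winning_time_seconds`, `going`), the 50mph cut-off and the list of going descriptions are all assumptions I’ve made for illustration, so adjust them to match whatever your data provider sends you.

```python
FURLONGS_PER_MILE = 8

# Assumed cut-off: comfortably above any real racehorse, well below 100mph.
MAX_PLAUSIBLE_MPH = 50.0

# Assumed list of acceptable going descriptions; extend it for your own data.
VALID_GOINGS = {
    "heavy", "soft", "good to soft", "good", "good to firm", "firm",
    "standard", "standard to slow", "slow",
}


def average_speed_mph(distance_furlongs, winning_time_seconds):
    """Average speed of the winner over the whole race, in miles per hour."""
    miles = distance_furlongs / FURLONGS_PER_MILE
    hours = winning_time_seconds / 3600
    return miles / hours


def check_result(row):
    """Return a list of problems found in one race result (empty if none)."""
    problems = []
    if row["winning_time_seconds"] <= 0:
        problems.append("non-positive winning time")
    elif average_speed_mph(row["distance_furlongs"],
                           row["winning_time_seconds"]) > MAX_PLAUSIBLE_MPH:
        problems.append("implausible winning time")
    if row["going"].strip().lower() not in VALID_GOINGS:
        problems.append("unrecognised going")
    return problems


# A one-mile race "won" in 36 seconds works out at 100mph, so it gets flagged.
suspect = {"distance_furlongs": 8, "winning_time_seconds": 36, "going": "Goood"}
print(check_result(suspect))
```

Two checks like this are easy. The trouble starts when you need dozens of them, one for every field that can go wrong.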
Which is why we now use the Racing Post.
The only problem… it’s pretty damn expensive.
Waaaaay too expensive to want to use on a personal level.
Where does that leave you?
Well, of course, it still leaves you with the option of scraping data, or you could buy it from http://www.betwise.co.uk/
That’s a site that buys data from the Press Association and has a license to resell it to you. I think they do some cleaning of it first, but I can’t tell you how much. However, I’m sure Colin, the owner, will be more than happy to let you know.
It’s a little bit techy, but after all, it’s a database. There are good guides to get you started, and with a bit of persistence, you’ll be up and running in no time.
There will be a learning curve to interrogating the data. With a database, there’s no nice front-end to allow you to interrogate it easily. Which means you’ll have to learn how to do it.
Once you’ve learnt how to do it, you will never be restrained in what you can ask of the data. You can literally ask it anything, rather than being limited to whatever your system builder has enabled you to ask.
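To show what asking the data a question looks like, here is a small sketch using Python’s built-in `sqlite3` module. The `results` table, its columns and the sample rows are all made up for illustration. They are not the layout of any real provider’s database, so treat this as a flavour of the idea rather than something to copy and paste.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a few made-up race results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (horse TEXT, going TEXT, finishing_position INTEGER)"
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [
            ("Alpha", "Good", 1),
            ("Bravo", "Good", 2),
            ("Alpha", "Soft", 3),
            ("Charlie", "Soft", 1),
            ("Bravo", "Soft", 1),
        ],
    )
    return conn


def wins_by_going(conn):
    """Ask: for each going, how many runs were there and how many were wins?"""
    query = """
        SELECT going,
               COUNT(*) AS runs,
               SUM(CASE WHEN finishing_position = 1 THEN 1 ELSE 0 END) AS wins
        FROM results
        GROUP BY going
        ORDER BY going
    """
    return conn.execute(query).fetchall()


conn = build_demo_db()
for going, runs, wins in wins_by_going(conn):
    print(going, runs, wins)
```

Change the `GROUP BY` to `horse` and you have asked a completely different question with one word. That is the freedom a database gives you.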
Of course, if that’s all too much, you could simply use the suite of Race Advisor software tools instead 😉
The Racing Dossier, for example, gives you access to the most comprehensive set of race ratings available for UK and IRE racing.
And… I’ll give you a FREE strategy to warm you up!
All the best,
Michael and the Race Advisor team