This is the fourth part in the series where we have been looking at creating your own racing database. If you haven’t read the first three parts then you can do so using the links below.
In today’s post we are going to tackle one of the main problems in creating your own database: finding the data!
Unfortunately, without the data all our efforts will have been for nothing. Horse racing data in the UK is very poor compared to countries like the US. It is also very hard to find out where to get it from, so we are going to look at the various options that are available to you.
Your first option is to use one of the various pieces of software that allow you to mass export the data. Some examples would be Proform, Raceform Interactive and Dataform, all of which also provide on-screen handicapping facilities. When you export the data it is put into a CSV file. You can then take these files and import the data into your own database, splitting it into the tables that you set up earlier. There is nothing wrong with using this method and it is without a doubt the simplest. If you do not have the time to export and import the data every day, though, and would rather have something that can be automated, then this solution will not be for you.
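To give a rough idea of what the import step might look like, here is a minimal sketch in Python using the built-in csv and sqlite3 modules. The column names (race_date, course, race_time, horse, position) and the three tables are made up for illustration; your export will have its own columns and you would use the tables you designed earlier, with whichever database you chose.

```python
import csv
import io
import sqlite3

# Hypothetical export layout; real column names depend on your software.
SAMPLE_CSV = """race_date,course,race_time,horse,position
2012-05-01,Ascot,14:30,Example Horse,1
2012-05-01,Ascot,14:30,Another Horse,2
2012-05-08,York,15:05,Example Horse,3
2012-05-08,York,15:05,Another Horse,PU
"""


def create_tables(conn):
    """Create three illustrative tables: races, horses and runs."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS races (
            race_id   INTEGER PRIMARY KEY,
            race_date TEXT NOT NULL,
            course    TEXT NOT NULL,
            race_time TEXT NOT NULL,
            UNIQUE (race_date, course, race_time)
        );
        CREATE TABLE IF NOT EXISTS horses (
            horse_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS runs (
            race_id  INTEGER NOT NULL REFERENCES races(race_id),
            horse_id INTEGER NOT NULL REFERENCES horses(horse_id),
            position INTEGER,
            PRIMARY KEY (race_id, horse_id)
        );
    """)


def import_export(conn, csv_file):
    """Split each CSV row across the races, horses and runs tables."""
    for row in csv.DictReader(csv_file):
        race_key = (row["race_date"], row["course"], row["race_time"])
        # INSERT OR IGNORE makes it safe to import the same file twice.
        conn.execute(
            "INSERT OR IGNORE INTO races (race_date, course, race_time) "
            "VALUES (?, ?, ?)", race_key)
        race_id = conn.execute(
            "SELECT race_id FROM races "
            "WHERE race_date = ? AND course = ? AND race_time = ?",
            race_key).fetchone()[0]

        conn.execute("INSERT OR IGNORE INTO horses (name) VALUES (?)",
                     (row["horse"],))
        horse_id = conn.execute(
            "SELECT horse_id FROM horses WHERE name = ?",
            (row["horse"],)).fetchone()[0]

        # Non-numeric results (for example pulled up) are stored as NULL.
        position = int(row["position"]) if row["position"].isdigit() else None
        conn.execute("INSERT OR REPLACE INTO runs VALUES (?, ?, ?)",
                     (race_id, horse_id, position))
    conn.commit()
```

You would call create_tables once, then call import_export with each exported file opened with open(path, newline=""). In the sample data above, io.StringIO(SAMPLE_CSV) stands in for a real file.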
This brings us to our next solution. All of the software mentioned above is based on a database; what you see when you use it is the user interface that allows you to interact with that database. This means that you can go in through the backend and straight into the database. According to Dataform this is not possible with their database, and I have not tried so I cannot verify it. If this is your preferred method, I would suggest choosing a software provider that has built their database on the same database system as yours, if possible. Proform is built on MSSQL and Raceform is built, I think, on Oracle (to be honest I am not 100% sure as I haven’t used it for quite a while; if you know then get in touch). By going through the backend of this software and accessing the database directly, you can automate the exporting of the data every day into your own database. As far as I know Proform is the only software that currently updates its database automatically if you leave the software running; with the others you would still need to log in and download the daily updates manually.
The third solution would be to use the Smartform database. This is provided by Betwise and is raw data split into MySQL tables. The structure is a basic four-table setup and there is no user interface outside of what MySQL (or third parties) offer. A script updates this database every day, so it is already automated and requires no manual update. All you would need to do is transfer the data into your own database each day.
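As a rough sketch of what that daily transfer could look like, the Python below copies only the rows that are newer than anything already in your own database. To keep it runnable without any extra software it uses the built-in sqlite3 module for both sides. In practice the source would be the MySQL database, which needs a separate MySQL driver, uses a cursor rather than calling execute on the connection, and typically uses %s rather than ? as the placeholder. The results table and its columns are invented for the example and are not the real Smartform structure.

```python
import sqlite3

# Hypothetical table used on both sides; not the real Smartform schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS results (
    meeting_date TEXT NOT NULL,   -- ISO dates (YYYY-MM-DD) sort correctly as text
    course       TEXT NOT NULL,
    horse        TEXT NOT NULL,
    position     INTEGER
)
"""
COLUMNS = "meeting_date, course, horse, position"


def copy_new_rows(source, target):
    """Copy rows dated after the latest date already held in target.

    Returns the number of rows copied.
    """
    target.execute(SCHEMA)
    last = target.execute(
        "SELECT MAX(meeting_date) FROM results").fetchone()[0]

    if last is None:
        # Empty target: take everything the source has.
        rows = source.execute(
            f"SELECT {COLUMNS} FROM results").fetchall()
    else:
        rows = source.execute(
            f"SELECT {COLUMNS} FROM results WHERE meeting_date > ?",
            (last,)).fetchall()

    target.executemany(
        f"INSERT INTO results ({COLUMNS}) VALUES (?, ?, ?, ?)", rows)
    target.commit()
    return len(rows)
```

This assumes that a whole day of data arrives in one go. If rows for a date can turn up after you have already copied that date, you would need to compare on a unique key instead of the date alone.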
Our final solution is to go to the Press Association. The Press Association is the official data provider for horse racing in the UK and Ireland. This is where the Racing Post, Sporting Life and all other providers of information get their racing data. They can provide your data in almost any way you would like: historical data, daily data, live streaming data in-running and so on. As you would expect this is by far the most expensive option, but do not assume that the data is any more accurate or consistent for it. In fact most of the software providers have checks in place to make sure that the data they provide to their users is accurate. The Press Association just provides raw data with no checks in place, and it quite often makes mistakes, which means that you need to put the checks in yourself.
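What those checks look like will depend on the feed you receive, but as an illustration here is a small Python sketch that runs a few sanity checks over one race before it goes into the database. The record layout (a dictionary with a date and a list of runners) and the particular checks are my own assumptions for the example, not anything defined by the Press Association.

```python
from datetime import datetime


def check_race(race):
    """Return a list of problems found in one race record.

    `race` is a hypothetical structure:
    {"date": "YYYY-MM-DD", "runners": [{"horse": str, "position": int or None}]}
    """
    problems = []

    # The date must be a real calendar date in the expected format.
    try:
        datetime.strptime(race["date"], "%Y-%m-%d")
    except ValueError:
        problems.append(f"bad date: {race['date']!r}")

    runners = race["runners"]
    seen = set()
    for runner in runners:
        name = (runner.get("horse") or "").strip()
        if not name:
            problems.append("runner with no horse name")
        elif name.lower() in seen:
            # The same horse cannot run twice in one race.
            problems.append(f"duplicate runner: {name}")
        seen.add(name.lower())

        # A finishing position must lie between 1 and the field size.
        position = runner.get("position")
        if position is not None and not (1 <= position <= len(runners)):
            problems.append(
                f"impossible position {position} for {name or 'unknown'}")

    return problems
```

An empty list means the race passed. Anything else can be logged and the race held back until you have looked at it, rather than letting a bad row slip into your tables.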
These are your main options for getting the data into your database, and the one you choose will depend very much on your budget and the level of automation you require.
Thanks to Eric, one of the Race Advisor readers, we have some more information on the Raceform database. It is built in FoxPro, not Oracle. Interestingly, FoxPro was what William Benter, one of the biggest HK syndicate owners, originally built his database in. Unfortunately it seems that the table structure, discussed in lesson 3, is less than good, which will make working with it directly very difficult, and in some situations you have to use their interface in order to populate a table and get the data into it. The export function is good, if slow, but is limited to predefined fields. Again, the query function is good but can only be used in the way prescribed by Raceform, and there are issues with the distance and time formats, which need to be manipulated before they can be used for anything other than display.
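I do not know exactly how Raceform stores distances and times, so treat the following Python as a sketch only. It assumes display-style strings such as "1m2f110y" for distance and "1:32.45" for time, and turns them into plain numbers (yards and seconds) that you can sort and calculate with. The conversions themselves are standard: a mile is 8 furlongs or 1760 yards, and a furlong is 220 yards.

```python
import re

# Assumed display format: optional miles, furlongs and yards, e.g. "1m2f110y".
_DISTANCE = re.compile(r"^(?:(\d+)m)?(?:(\d+)f)?(?:(\d+)y)?$")

YARDS_PER_MILE = 1760
YARDS_PER_FURLONG = 220


def distance_to_yards(text):
    """Convert a display distance such as '1m2f110y' to a number of yards."""
    match = _DISTANCE.match(text.replace(" ", "").lower())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised distance: {text!r}")
    miles, furlongs, yards = (int(g) if g else 0 for g in match.groups())
    return miles * YARDS_PER_MILE + furlongs * YARDS_PER_FURLONG + yards


def time_to_seconds(text):
    """Convert a display time such as '1:32.45' (or '58.2') to seconds."""
    minutes, _, seconds = text.strip().rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)
```

With these assumptions distance_to_yards("1m2f110y") gives 2310 and distance_to_yards("5f") gives 1100. Storing the converted values in their own numeric columns means you only have to do the conversion once, at import time.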
I have been asked by a reader to include another piece of software called Computer Form Book. This has been around for a long time but I do not know whether the interface has been updated; from the look of the screenshots it hasn’t been for a while. There is not much that I can tell you about the data from this product as I haven’t seriously used it, but from what I have been told you can export it all into a CSV file, which you can then use however you wish.