I have had a lot of requests from readers who are interested in building their own database and are not sure how to go about it. In response to these requests I am going to write a series of articles outlining the process that I used to design and build my database. This is by no means the best way or the only way. I have made a lot of mistakes over the years and am still learning now.
Part 1 – What Do You Want To Use Your Database For?
Building your own database is a long and slow process. Please do not expect anything to happen overnight. In fact, do not expect anything to happen within a few months unless you either have a lot of money to spend, or have a lot of time and are an excellent relational database programmer.
With the warnings over, it is time to start planning your racing database. As with any database, the more time you spend planning it, the fewer problems you will run into later and the less likely it is that you will find you cannot do something you want to do with it.
The first stage in planning our database is to determine what we want to use it for. This may seem like an obvious question, but it is a very serious consideration and one that you should spend a considerable amount of time thinking about.
It is very easy to have in mind only what you want to do with it now, based on an idea that you cannot complete with the software you are currently using. I am sure, though, that you do not want to spend hours of time and possibly a lot of money developing something that, 6 months or a year after finishing, is not capable of doing what you want it to do purely because you didn’t put the time in at the beginning.
How you want to use your database is going to determine how you design it, what software you use to build it, how long it will take to build and how much it is going to cost to build. Below are some ideas of possible uses for your database:
- Basic system building using AND/IF/OR statements on horses' race data
- Calculating impact values
- Complex system building using AND/IF/OR statements on horses' historical data
- Race trend analysis requiring in-depth historical queries into race types and their winners
- Complex system building using advanced statistical querying on horses' historical data
- Complex system building using advanced statistical querying on both race and horse historical data
- Being able to plug in advanced data mining technology and other technology, e.g. artificial intelligence
- Creating ratings
- Ability to connect to automated betting tools
Above you will find some of my suggestions for what you may wish to use your database for. Of course you are not limited to wanting to use it for just one of the above processes; you may wish to use it for multiple ones or even all of them. I am sure that you will also have other ideas of how you want to use the data once you have received it into your database.
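As a rough illustration of the first idea in the list, a basic system is just a set of conditions combined with AND/OR and applied to each runner. The sketch below is in Python. The field names (`going`, `days_since_run`, `last_finish`), the sample runners and the rule itself are all made up for illustration and are not a recommended system.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    # Hypothetical fields; your own data will have different columns.
    name: str
    going: str            # ground description, e.g. "Good" or "Soft"
    days_since_run: int   # days since the horse last raced
    last_finish: int      # finishing position last time out


def qualifies(r: Runner) -> bool:
    """Example rule: (1st OR 2nd last time out) AND ran within 30 days AND going is Good."""
    placed_last_time = r.last_finish == 1 or r.last_finish == 2
    return placed_last_time and r.days_since_run <= 30 and r.going == "Good"


# Invented sample data, only here to show the rule being applied.
runners = [
    Runner("Horse A", "Good", 14, 1),
    Runner("Horse B", "Soft", 10, 2),
    Runner("Horse C", "Good", 45, 1),
    Runner("Horse D", "Good", 21, 2),
]

selections = [r.name for r in runners if qualifies(r)]
print(selections)  # ['Horse A', 'Horse D']
```

Even a rule this simple tells you which fields the database must hold for every runner, which is exactly the kind of detail worth noting at the planning stage.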
Consideration must also be given to which elements of the data you want to access. For example, separating races by their given class is quite simple, but do you also want to be able to separate them by classification and race conditions? If so, in how much detail?
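One way to think this through while planning is to list which attributes of a race you would want to filter on separately. The Python sketch below is only a planning aid. Every field name and sample race in it is an assumption made for illustration, not a required schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RaceDescription:
    # All fields are hypothetical examples of detail you might want to query on.
    race_class: int                 # the given class of the race, e.g. 4
    is_handicap: bool               # classification: handicap or not
    min_age: Optional[int] = None   # race condition: youngest age allowed
    max_age: Optional[int] = None   # race condition: oldest age allowed
    maidens_only: bool = False      # race condition: restricted to non-winners


def matches(race: RaceDescription, race_class: int, handicap: bool) -> bool:
    """Coarse query: class plus handicap status only."""
    return race.race_class == race_class and race.is_handicap == handicap


# Invented sample races.
races = [
    RaceDescription(4, True, min_age=3),
    RaceDescription(4, False, min_age=2, max_age=2, maidens_only=True),
    RaceDescription(5, True, min_age=4),
]

# Coarse separation by class and classification.
class4_handicaps = [r for r in races if matches(r, 4, True)]
# Finer separation that also uses a race condition.
class4_maidens = [r for r in races if r.race_class == 4 and r.maidens_only]
```

If the finer query matters to you, the conditions have to be stored as separate fields from the start, which is much harder to add once the data is already loaded.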
It is important to write everything down. You can always cross out anything that you wish to change, but having a visual representation of what you want to achieve is very important. If you do not write it down then you will probably forget some important parts, and these will cost you unnecessarily later in the process when it is more complicated to add them. As well as financial cost, time is a consideration in horse racing databases. With such a vast amount of data available, processing can take a long time and a mistake can waste months. For example, one of my ratings averages around 24 hours of calculation per month of historical data. With 10 years (120 months) of data this is 2,880 hours of calculation, which equals 120 days or about 4 months of constant processing. A mistake in the script can cause serious delays, especially if the error is only found towards the end of the processing.
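It is worth repeating that arithmetic with your own figures before you start a long processing run. A small Python sketch, using the numbers above and assuming a 30-day month for the final conversion:

```python
HOURS_PER_MONTH_OF_DATA = 24   # figure quoted above for one rating
YEARS_OF_DATA = 10

months_of_data = YEARS_OF_DATA * 12                      # 120 months
total_hours = months_of_data * HOURS_PER_MONTH_OF_DATA   # 2880 hours
total_days = total_hours / 24                            # 120 days
total_months = total_days / 30                           # assumption: 30-day months

print(f"{total_hours} hours = {total_days:.0f} days = about {total_months:.0f} months")
# 2880 hours = 120 days = about 4 months
```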
I shall leave the first part here so that you can plan what it is you actually wish to build your database for.
My ratings service, The Racing Dossier, provides high-quality ratings for every UK and IRE race each day.