Gambling Maths

**mathare** · 10th April 2006

(This is as good a place as any for this I guess...)

We're all gamblers here but how many of us understand and appreciate the maths behind what we're doing? I'm not talking about being able to work out the profits on a £5 e/w yankee in the blink of an eye, I'm thinking more about the maths behind systems and so on.

This post may get a bit daunting in places but I will try and keep it as simple as I can.

Introduction

Suppose there is a football match this weekend between Win2Win Wanderers and RacingPost Rovers. You know that Wanderers score an average of 3 goals in a game and Rovers score 1 goal in an average game. So you just run to the bookie and lump on 3-1 since that's the obvious winning scoreline, yeah? No. At least not necessarily. It may be that the bookies have priced it up all wrong and the odds for a Wanderers 3-1 win offers great value. But do you know what the odds should be?

We can use mathematical and statistical modelling here to help us...

The Normal Distribution

So far we know that on average Wanderers score 3 goals a game and Rovers score 1. But they are only averages. It would be highly unlikely that Wanderers scored 3 times in every game they played. If they did I would certainly suspect something dodgy is happening. Similarly Rovers are extremely unlikely to find the net exactly once a game.

We need to consider how the number of goals scored by each side in previous games is distributed about this average. As I said before, Wanderers will not have scored exactly 3 goals in each and every game. There will be games when they have failed to score, or just got 1 or 2 goals to their name. There will be games when they have scored exactly 3 goals though. There will also be games when they have scored 4, 5 or 6 times, perhaps even more. The further away the scoreline from the average, the less often we expect it to occur. We can expect 3 goals most often, then 2 or 4 goals with roughly equal frequencies. 1 goal or 5 goals will be seen even less frequently, with 0 or 6 occurring rarely. We expect the number of goals scored to follow what is called the Normal Distribution.

It is called the Normal Distribution because it is just that - the standard distribution that most things follow. It is usually described as a bell-shaped curve. It peaks at the average and then tapers off to the extremes away from that average. It is symmetrical around the average.

Have a look at the attached picture...

Note that for the example of Wanderers' goals the values are discrete. This means that the number of goals scored in any game is an integer value and we can't have 2.82 goals in a game. Wanderers score goals in whole number increments. So while the average may not be an integer value, the number of goals scored in a game will be. Why am I telling you this? It means that were we to plot out the number of goals scored by Wanderers in all their games so far the distribution would look lumpier than the Normal Distribution I have posted. So we are only modelling the goals with the Normal Distribution: it is an approximation rather than an accurate representation.

Standard Deviation

When using the Normal Distribution we can look at how spread out data is from the average. This is measured using the standard deviation.

Take the example of Win2Win Wanderers averaging 3 goals a game. I said earlier that it is extremely unlikely that each game in the sample of games we are looking at would have exactly 3 goals, but let's suppose for a few seconds that suddenly that is the case. Here the standard deviation would be 0 as there is no deviation from the average; each value in our sample is exactly on the average.

Now suppose we had 60 games in our sample and that 17 had 2 goals, 26 had 3 goals and 17 had 4 goals giving us a very crude Normal Distribution. Note that the average here is still 3 as (17*2)+(26*3)+(17*4) = 180 and 180/60 = 3.

We now have a distribution of values around the average. 26 of the games finished with the average number of goals but 17 games were 1 goal under and 17 games were 1 goal over. If we were to compute the standard deviation here we'd see it comes out as 0.76.

Let us take an even more extreme case: 60 games again but 30 games where Wanderers fail to score and 30 games where they score 6. Again the average is 3 as (30*0)+(30*6) = 180 and 180/60 = 3. But now the standard deviation is 3.03.

So you can see - the more the data is spread out around the average the greater the standard deviation is.

Exactly how you calculate the standard deviation is outside the scope of this post (but may be the subject of another future post if the demand is there for it). All you really need to know is that it is a measure of how spread out around the average your data is. And that Excel will calculate it for you in seconds :)
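If you don't have Excel to hand, here is a minimal Python sketch that checks the two figures quoted above. It assumes the sample standard deviation (the one that divides by n - 1, which is what Excel's STDEV gives); the variable names are just for illustration.

```python
import statistics

# Goals per game for the two 60-game samples described above.
tight_sample = [2] * 17 + [3] * 26 + [4] * 17   # mostly 3s, some 2s and 4s
wide_sample = [0] * 30 + [6] * 30               # half blanks, half six-goal games

for name, goals in (("tight", tight_sample), ("wide", wide_sample)):
    average = statistics.mean(goals)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    spread = statistics.stdev(goals)
    print(f"{name}: average {average:.2f}, standard deviation {spread:.2f}")
```

Both samples average 3 goals, but the second is far more spread out, which is exactly what the standard deviation picks up.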

NB For this measure to be meaningful you should have a sample size of at least 30. The bigger the sample the more accurate and meaningful the standard deviation is.

Confidence levels

OK, so we've computed the standard deviation - what use is it? It can be used in conjunction with the Normal Distribution to work out what are known as confidence levels. Eh, what's that all about then?

Go back and look at the bell-shaped curve of the Normal Distribution. It's peaked around the average value. In fact about 68% of the area under that curve is within +/- 1 standard deviation of the average. That means if we had an average of 3 and a standard deviation of 1 then 68% of our data would lie in the range 2 to 4 i.e. one standard deviation either side of the average.

Obviously this means roughly a third of our data lies outside this range. Look again at that Normal Distribution curve and see how low it is at the extremes. In fact 95% of our data is within 2 standard deviations of the average, i.e. 1 to 5, and 99.7% of our data is within 3 standard deviations of the average i.e. 0 to 6 using an average of 3 and a standard deviation of 1.

OK, so what does this mean and what are these confidence levels? For a known average and standard deviation we can be 68% confident that the value will lie within one standard deviation of the average. Suppose we go back to Win2Win Wanderers and their 3 goals a game. Suppose also that we have a standard deviation of 0.5. For Wanderers' next game we can then be 68% sure that they will score between 2.5 and 3.5 goals. Similarly we can be 95% confident they will score 2 to 4 goals and 99.7% confident they will score 1.5 to 4.5 times.
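For anyone who wants to see where those percentages come from, here is a small Python sketch. It uses the standard result that the share of a Normal Distribution within k standard deviations of the average is erf(k / sqrt(2)); the function name is my own.

```python
import math

def share_within(k):
    """Fraction of a Normal Distribution within k standard deviations of the average."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The three values come out at roughly 68%, 95% and 99.7%, which are the confidence levels used throughout this post.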

How does this maths apply to our gambling?

There are limitations to using the normal distribution when it comes to gambling, unfortunately. Take another look at that Normal Distribution picture. It's symmetrical about the average so for every value above the average there is a corresponding value equally below the average. This is not true of gambling.

When you place a 1pt bet it either wins or it loses. If it wins you make a profit that depends on the odds. But when it loses you lose the 1pt stake. Your losses on each bet are fixed but the profit is variable. If we were to plot the frequency with which we recorded each possible profit or loss on our bets we would not get a curve that looked like the Normal Distribution curve. That means we can't use standard deviations, doesn't it? Not necessarily...

Standard deviations can be computed for distributions that aren't Normal, but the confidence levels above then become a poor approximation. What we can do instead is change our measure. Instead of assessing each bet we group them into, say, batches of 10. So taking each set of 10 bets at a time we work out our total profit/loss on those 10 bets. Were we to plot the frequency of these figures we would get a much better approximation to the Normal Distribution.

So what? If you have over 300 bets and group them into 10s you have at least 30 values in your new data sample of profit/loss per 10 bets. If you then compute the standard deviation for these values you can work out confidence levels for your profit per 10 bets. You'll then know that 68% of the time your profit over 10 bets will lie within 1 standard deviation of the average, 95% of the time it is within 2 standard deviations and 99.7% of the time it is within 3 standard deviations of the average.
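The batching idea can be sketched in a few lines of Python. Treat this as a rough illustration rather than a finished tool: the function names are my own, the betting record at the bottom is invented purely to show the calls, and the ranges assume the batch totals are close enough to Normal for the confidence levels to apply.

```python
import statistics

def batch_totals(profits, batch_size=10):
    """Total profit/loss for each consecutive full batch; a trailing part-batch is dropped."""
    full = len(profits) - len(profits) % batch_size
    return [sum(profits[i:i + batch_size]) for i in range(0, full, batch_size)]

def confidence_ranges(totals):
    """Ranges of average +/- 1, 2 and 3 sample standard deviations."""
    average = statistics.mean(totals)
    spread = statistics.stdev(totals)
    return {k: (average - k * spread, average + k * spread) for k in (1, 2, 3)}

# Invented record of 300 one-point bets: -1 for a loser, +3 for a winner.
profits = ([-1.0] * 7 + [3.0] * 3) * 15 + ([-1.0] * 8 + [3.0] * 2) * 15
totals = batch_totals(profits)
print(len(totals), "batches")
for k, (low, high) in confidence_ranges(totals).items():
    print(f"+/- {k} standard deviation(s): {low:.2f} to {high:.2f} points per 10 bets")
```

With 300 bets you get the 30 batch totals mentioned above, which is the minimum sample size suggested earlier for the standard deviation to be meaningful.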

Why would I want to do any of this?

You may never want to, it depends on what level you take your gambling to. Suppose someone told you they had analysed their system and for every 100 bets at 1pt they expected a profit of 15 points with a standard deviation of 6 points. You ignore the standard deviation bit because you don't understand what it means but think 15 points profit on 100pts invested is OK and you'll follow the system. 100 bets later you are a couple of points down but remain convinced to try the same system for another 100 bets. At the end of that you are now 5 points down overall and getting all grumpy. You are 35 points away from where you think you should be. You expected to be about 30 points up but are in fact 5 points down. You complain to the guy who runs the system and get all mad at him. Are you right to be angry?

Let's go back and look at that standard deviation again, just in case it was important. After 100 bets you were 2 points down when you should have been 15 points up. Should you? 99.7% of the time your profit over this period will be between -3 points and 33 points i.e. +/- 3 standard deviations from the average. OK, in this example you were maybe slightly unlucky to be near the worst case. Surely it'll get better from here though. Another 100 bets goes by and you lose a further 3 points, again an extreme case perhaps. But it happens, and the system is still performing as well as it ever has done. The system isn't broken, what you are experiencing is statistical fluctuations within acceptable ranges for the data.

So you see, that standard deviation could be important. Someone else comes along with a system that profits to the tune of 10 points per 100 points invested and has a standard deviation of 2 points. This system is a third less profitable than the previous one isn't it? This makes 10 points every 100 bets and the other made 15 points. Ahh, but here you can expect to make 10 points +/- 6 points (3 standard deviations) every hundred bets and remain within the realms of acceptable statistical fluctuations. That's 4 points to 16 points every hundred bets. This is a lower risk system as you should make at least a small profit every 100 bets. With the other system you could make a loss.
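The comparison between the two systems is simple arithmetic, sketched here in Python. The function name is just for illustration, and the ranges are the average plus or minus 3 standard deviations per 100 bets, assuming profit per 100 bets is roughly Normal.

```python
def three_sd_range(average, sd):
    """Range covering average +/- 3 standard deviations."""
    return average - 3 * sd, average + 3 * sd

print("first system: ", three_sd_range(15, 6))   # 15 points per 100 bets, sd of 6
print("second system:", three_sd_range(10, 2))   # 10 points per 100 bets, sd of 2
```

The first range dips below zero and the second doesn't, which is the whole point: the less profitable system is also the one that should almost never show a losing hundred.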

And don't be fooled into thinking that things won't carry on the same for hundreds of bets. With the first system you could, in principle, make -3 points each and every hundred bets and be 30 points in the hole after 1000 bets, and each individual hundred would still be within the advertised range (though ten worst-case runs in a row would be extraordinarily unlikely). But followers of the second system would have at least 40 points over the same period. OK, going to the other extreme you could be looking at 330 points over 1000 bets on the first system and only 160 on the second system but would you face that risk?

Conclusion

There is a good chance that many of you will never need to worry about Normal Distributions, standard deviations or any of that sort of stuff. But it does you no harm to know about it and hopefully understand some of it. There is a lot of maths at work and many tricks and tools that can be used to help you analyse your gambling and systems. Anyone turning pro would probably benefit from understanding this sort of stuff to balance steady growth systems (low standard deviation) with more high risk systems (high standard deviation).

By the way, I reckon the odds on a 3-1 scoreline in the mythical game between Win2Win Wanderers and RacingPost Rovers should be a touch over 11/1 (in fact anything over 12.13 in decimal odds would be value) given the goal averages stated. Why this is so, along with a discussion on the Poisson Distribution plus dependent and independent events, is the subject of a follow-up post if there is enough interest.
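As a taster, here is one way to arrive at a figure like 12.13 in Python. It is only a sketch under two assumptions: each side's goals follow a Poisson distribution with the stated average, and the two sides' goal counts are independent of each other. The function names are my own.

```python
import math

def poisson_pmf(k, average):
    """Probability of exactly k events when the average number of events is `average`."""
    return math.exp(-average) * average ** k / math.factorial(k)

def fair_decimal_odds(home_goals, away_goals, home_average, away_average):
    """Fair decimal odds for an exact score, treating the two teams' goals as independent."""
    probability = poisson_pmf(home_goals, home_average) * poisson_pmf(away_goals, away_average)
    return 1 / probability

# Wanderers average 3 goals a game, Rovers average 1; price up the 3-1 scoreline.
print(round(fair_decimal_odds(3, 1, 3, 1), 2))
```

Anything the bookie offers above that fair price would be value on these assumptions.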

**silax** · 10th April 2006

i don't think you can apply it to football :)

**mathare** · 10th April 2006

Originally Posted by silax

i don't think you can apply it to football :)

I think you can, if you are selective in the way you do it.

Plus it's much easier to use football examples for what I was trying to show than it is to try a racing example due to the number of variables involved in racing

**silax** · 10th April 2006

nice short answer
but really you have 2 teams with 11 players in each could be completely different players to what your stats are based on they could all be at different fitness confidence ability levels a top player could get injured sent off whatever after 5 mins.
someone could take a wild shot score against all the odds the tactics change all the time.

**silax** · 10th April 2006

damn this means i have to read it all now
back in 10 minutes

**silax** · 10th April 2006

way over my head

**clotty** · 10th April 2006

Originally Posted by silax

way over my head

It is late and I had to read some of that stuff a second time round, but all in all I found it a great post. Some new things there I hadn't even thought about and the rest was a good refresher.

Must read for anyone who wants to take their gambling to a serious level, in my opinion.

**John** · 10th April 2006

Originally Posted by silax

nice short answer
but really you have 2 teams with 11 players in each could be completely different players to what your stats are based on they could all be at different fitness confidence ability levels a top player could get injured sent off whatever after 5 mins.

True silax, but what you have to remember is Mat is looking at the raw stats here, and over a long period of time, we can make an accurate prediction that two teams perform consistently in the same manner, over such a period of say, 1 season. I think with this kind of data you would apply it to the same teams week in week out, so that you can ensure consistency is maintained throughout.

A good post Mat, and an insight into something I'd not really thought about before. It would be interesting to find out how much attention bookmakers and exchanges pay to this sort of mathematical modelling when forming odds for Win and Correct Score markets.

Thanks for taking the time to post this and share the information with us.

**silax** · 10th April 2006

i'll read it tomorrow

**wb** · 11th April 2006

Interesting stuff mat. I was never too good at maths at school, but this is probably because I was not applying it to anything interesting (like trying to make money :)) I am trying my best to learn better maths now that I am laying quite a bit. It is nice to get different angles on all gambling matters. That was well explained. A good post

**mathare** · 11th April 2006

Cheers guys.

The players will change game to game Silax, yes. But as John says if we average out over a long period (several seasons) then this variation in playing personnel should be negligible. After all the strength of opposition changes each game, as does the importance of the game, morale at the club and hundreds of other factors.

What we are doing is looking for a mathematical model as we can never fully analyse every aspect of the game. By monitoring stats such as average goals scored/conceded and the standard deviation on those distributions you can begin to formulate a model you can use to look at correct score betting, for example.

**Win2Win** · 11th April 2006

Vegy got confused with your post Mat right after the word 'Gambling.....' :D

**vegyjones** · 11th April 2006

What is the poisson distribution ? :wink :D

**Onlyforfun** · 11th April 2006

You might like this Mat

http://www.timesonline.co.uk/section/0,,7973,00.html

**mathare** · 11th April 2006

I do like the Fink Tank - yeah. I love all these football prediction systems. I worked one of my own out last summer but it was becoming a bugger to maintain so I ditched it.

I hope to resurrect it next season

**mathare** · 11th April 2006

Originally Posted by vegyjones

What is the poisson distribution ? :wink :D

It's used when people say "there are plenty more fish in the sea"

**clotty** · 7th July 2006

On the topic of maths for gambling purposes, I was wondering if someone could help me out.

I have two main questions, involving determining a probability from ratings. I'll start off with the easier question.

Let's assume I used an ELO type ratings system and had one team with 1400 playing at home and another team playing away with 1600 points (with the starting points total being 1000). How, from this data, could I work out the implied % chance of a home win, away win and draw?

Second question, is this possible to do from one value? I'll use an extremely simplified version of my over/under prediction ratings.

Let's say I add the number of goals scored by a team to the number they concede and divide it by the number of games played. I do this for the away team they are playing, too.

Assume the home team have, on average, 2.37 goals in a game, while the away team have 3.46.

I then average these two figures to produce a very, very rough estimate that, on average, games between these two sides will produce 2.92 goals (to 2 d.p).

Is there a way for me to use this figure of 2.92 to predict the probability of over/under 2.5 goals?

Thanks for any feedback.

**mathare** · 7th July 2006

Originally Posted by clotty

Let's assume I used an ELO type ratings system and had one team with 1400 playing at home and another team playing away with 1600 points (with the starting points total being 1000). How, from this data, could I work out the implied % chance of a home win, away win and draw?

I don't remember the Electric Light Orchestra being big on ratings (unlike Professor Elo :wink) but what is important here is the difference in the teams' ratings really. The match has a rating of -200 assuming no home advantage (it is common to add on around 100 as home advantage). That says the away team is more likely to win but that a draw is not that unlikely. As for how unlikely or otherwise these events are, not sure. I'm trying to work it through in my mind as I type. Your division is a closed system - the total number of points for all teams at the start of the season will be the same at the end. Does that help any? My initial thoughts are you can't use this to work out implied chances of home win etc without further data. But that's just initial reaction at the minute.

The away team has 1600 out of the 20000 points in the division (for the Prem) - is that meaningful? Not sure. They have 1600 of the 3000 points in this match, that might be more useful. But you have three results to consider so three unknowns. Therefore you need at least three equations. How do you rate the chances for the draw? You need some sort of tolerance (e.g. +/-50) for the draw I think. So a match rating between -50 and +50 is most likely to be a draw.

I think what you need is a set of historical match ratings (which is easy enough to produce) and results and then optimise the data based on that. Maybe work out average match rating for each result (H/D/A) and compare that to the rating of this match, knowing from past results what percentage of games end with each result. You can then scale that according to the scaled match odds against the average.
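One rough way to read that suggestion, sketched in Python: group past matches into bands of match rating (home rating minus away rating, no home advantage added) and see what share of each band ended H, D or A. The function name, the band width of 100 and the four-match history are all made up for illustration; a real sample would need far more games per band.

```python
from collections import Counter, defaultdict

def outcome_rates_by_band(matches, band_width=100):
    """matches: iterable of (home_rating, away_rating, result), result being 'H', 'D' or 'A'."""
    counts = defaultdict(Counter)
    for home_rating, away_rating, result in matches:
        match_rating = home_rating - away_rating
        # Floor to the bottom of the band, e.g. -190 falls in the band labelled -200.
        band = (match_rating // band_width) * band_width
        counts[band][result] += 1
    return {
        band: {r: tally[r] / sum(tally.values()) for r in "HDA"}
        for band, tally in counts.items()
    }

# Invented history, purely to show the shape of the data.
history = [(1400, 1600, "A"), (1450, 1640, "D"), (1500, 1690, "A"), (1380, 1570, "H")]
print(outcome_rates_by_band(history))
```

The shares for the band containing your match rating then serve as the implied chances of home win, draw and away win.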

Is there a way for me to use this figure of 2.92 to predict the probability of over/under 2.5 goals?

Yes, use a Poisson distribution to work out the chance of 0, 1 and 2 goals using an average of 2.92 and sum those for your chance of the match going under 2.5 goals. 1 minus that is the over chance. That assumes the distribution of goals follows Poisson but you have to assume something as you are modelling a real system.
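As a minimal sketch of that calculation in Python, using the 2.92 average from the question and assuming total goals in the match follow a Poisson distribution (the function and variable names are my own):

```python
import math

def poisson_pmf(k, average):
    """Probability of exactly k goals when the average is `average`."""
    return math.exp(-average) * average ** k / math.factorial(k)

average_goals = 2.92
# Under 2.5 goals means 0, 1 or 2 goals in the match.
under = sum(poisson_pmf(k, average_goals) for k in range(3))
over = 1 - under
print(f"under 2.5: {under:.3f}, over 2.5: {over:.3f}")
```

On these assumptions the under comes out at about 44% and the over at about 56%.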

**Win2Win** · 7th July 2006

Have you noticed Vegy never appears in thread with numbers in? :)

**clotty** · 7th July 2006

I was thinking of doing what you suggested by recording the % of each outcome when teams meet each other with certain set ELO ratings, but I didn't think it would be easily doable as you'd need at least dozens of matches for each individual possible rating, to have reliable data and I already have the research and data for that exact same thing, with thousands of games analysed, only with goal supremacy ratings.

As for poisson distribution, I've never heard of it before. What is it exactly and does it require a lot of mathematical mumbo jumbo, or is it quite straightforward to implement?

**mathare** · 8th July 2006

Originally Posted by clotty

As for poisson distribution, I've never heard of it before. What is it exactly and does it require a lot of mathematical mumbo jumbo, or is it quite straightforward to implement?

Look it up on Wikipedia. The actual details are not that important though.

It's a doddle to implement as long as you have Excel. Look up POISSON in the formulae within Excel. You need to supply three parameters if I remember correctly:

=POISSON(x, ave, TRUE/FALSE)

where x is the number of events you want the probability for, ave is the average number of events and the final parameter chooses between exactly x events (FALSE) or up to and including x events (TRUE).

You can use this to work out the probability of 0, 1 or 2 goals (or any number) when you have a known average. You can either do 3 separate calculations and sum them (exactly 0 + exactly 1 + exactly 2) or just do one calculation (up to and including 2).
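If you'd rather not use Excel, the same thing can be sketched in Python. The function below is a loose stand-in I've written to mirror the three parameters above, not Excel's own code, and the 2.92 average is just the figure from the earlier question.

```python
import math

def poisson(x, ave, cumulative):
    """Loose Python counterpart of =POISSON(x, ave, TRUE/FALSE)."""
    def exactly(k):
        return math.exp(-ave) * ave ** k / math.factorial(k)
    if cumulative:
        # TRUE: up to and including x events.
        return sum(exactly(k) for k in range(x + 1))
    # FALSE: exactly x events.
    return exactly(x)

three_sums = poisson(0, 2.92, False) + poisson(1, 2.92, False) + poisson(2, 2.92, False)
one_call = poisson(2, 2.92, True)
print(math.isclose(three_sums, one_call))
```

Either route gives the same chance of 2 goals or fewer, so use whichever you find easier to read.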

**gingertipster** · 2nd May 2007

Originally Posted by mathare

Cheers guys.

The players will change game to game Silax, yes. But as John says if we average out over a long period (several seasons) then this variation in playing personnel should be negligible. After all the strength of opposition changes each game, as does the importance of the game, morale at the club and hundreds of other factors.

What we are doing is looking for a mathematical model as we can never fully analyse every aspect of the game. By monitoring stats such as average goals scored/conceded and the standard deviation on those distributions you can begin to formulate a model you can use to look at correct score betting, for example.

I agree maths is all important in any betting but only in certain ways.

Yes there are hundreds of things to take into account when assessing a good bet in racing or football. But going back so far is surely not the right thing to do. Are Leeds United now as good a side as say Sheffield United? Taking in a 5 year study you might think so.
I believe in football a punter should look at how well the teams are playing now, say the last 7 games. Of course you have to allow for how good their opponents were in those 7 games.
How good the 11 players / squad are in each team, are they confident and playing well at the time. What weaknesses they have as a team and individually. It does not matter if they scored freely last season if they are now struggling. Though the record against each other should be considered.

Ginge

**TheOldhamWhisper** · 3rd May 2007

Shut up Ginge.

**MattR** · 3rd May 2007

Originally Posted by TheOldhamWhisper

Shut up Ginge.

Actually he made some good points...... then ruined it by putting on the same record again!

**paul183195** · 3rd May 2007

**mat_s99** · 5th May 2007

Very good jargon-free explanation Mat

FYI Wiki (take it with a pinch of salt!) has a superb explanation on Poisson distributions....
