View Full Version : Statistical Modelling


gjm112
08-30-2005, 03:52 PM
I wrote a thesis on modelling the NFL using Bayesian linear regression with time-varying parameters to account for strength changes over the course of the season. It has done very well in past seasons, and I am going to try it this season for some real money.

I like the idea of seeing what the data says, but I also lack experience in betting the NFL. The problem I foresee in the NFL is that points aren't scored one at a time. They are scored in jumps, which makes the scoring hard to model. Still, even with this, just looking at the numbers I think I can expect a 56 percent win rate. I think I can improve this if I add common sense.

So, has anyone ever heard of anyone beating sports using a regression model like the one I mentioned? And once I make my predictions against the spread and compare them to the posted line, what are some things to look for to show a profit?

Gjm112

PS I am murdering baseball right now; I am up about 54 units in July and August. I'm not sure if that's luck or a fantastic model, but it seems to be doing something right.

08-30-2005, 04:03 PM
No but I find it fascinating. There's certainly an angle for this type of modeling.

gjm112
08-30-2005, 04:05 PM
Alright, well, I am going to post my picks in a friendly duel with Sygamel throughout the season and see whose system works better. I imagine they are fairly similar in the returns we can see.

08-30-2005, 04:06 PM
Cool. I think you will go a bit deeper with your analysis so I'm going to give you the preseason edge.

gjm112
08-30-2005, 06:30 PM
In most sports I think I have an edge, but football is hard to model using classic statistical methods because of the odd scoring system. It's hard to account for the fact that 7 to 7.5 is a much bigger jump than 4.5 to 5, for example.

Also, the spread is a very difficult thing to beat. If it were easy, lots of people would wake up at noon and bet on sports all day. Even with statistics, it's not a guarantee.

"All models are wrong, some are useful"
- I forget who said this, but it's not my quote.

tech
08-30-2005, 07:19 PM
[ QUOTE ]
It's hard to account for the fact that 7 to 7.5 is a much bigger jump than 4.5 to 5, for example.

[/ QUOTE ]

Using parametric statistics, yes. Using historical percentages and some basic math, not hard at all.
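
A rough sketch of the "historical percentages" idea, in Python. The half point across a number is worth roughly how often games land exactly on that number, so you count final margins. The function names and the toy margins below are made up for illustration; you would feed in real historical favorite margins.

```python
from collections import Counter

def margin_frequencies(margins):
    """Map each final margin (favorite minus underdog) to its historical frequency."""
    counts = Counter(margins)
    n = len(margins)
    return {m: c / n for m, c in counts.items()}

def half_point_value(margins, number):
    """Share of games where the favorite won by exactly `number`.

    Moving a line half a point across `number` turns those games from a
    win into a push, or from a push into a loss, so this frequency is a
    simple measure of how much that half point matters.
    """
    return margin_frequencies(margins).get(number, 0.0)

# Placeholder margins, invented only to show the call; NOT real NFL results.
toy_margins = [3, 7, 7, 10, 3, 14, 7, 1, 5, 3]
print(half_point_value(toy_margins, 7))  # share of toy games landing on 7
print(half_point_value(toy_margins, 5))  # share of toy games landing on 5
```

With real data, comparing the frequency on 7 with the frequency on 5 shows directly why 7 to 7.5 is a bigger jump than 4.5 to 5.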

tech
08-30-2005, 07:21 PM
[ QUOTE ]
what are some things to look for after I make my predictions against the spread and compare them to the posted line to show a profit.

[/ QUOTE ]

The accuracy of your line is what matters. After that, it is just basic math to figure out if your line is far enough from the posted line to show an expected profit.
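
The "basic math" can be sketched like this, assuming the usual -110 pricing on spread bets (risk 110 to win 100). That price is my assumption, not something stated in this thread, and pushes are ignored to keep it short. The function names are just illustrative.

```python
def break_even_rate(american_odds=-110):
    """Win rate needed to break even at negative American odds (e.g. -110)."""
    risk = -american_odds          # amount risked to win 100
    return risk / (risk + 100)

def expected_profit_per_unit(win_prob, american_odds=-110):
    """Expected profit per 1 unit risked at negative American odds, ignoring pushes."""
    payout = 100 / -american_odds  # units won per unit risked
    return win_prob * payout - (1 - win_prob)

print(round(break_even_rate(), 4))                 # 0.5238
print(round(expected_profit_per_unit(0.56), 4))    # 0.0691
```

So at that price a bet only shows an expected profit when your estimated chance of covering is above roughly 52.4 percent, and a true 56 percent rate would return about 7 percent of each unit risked.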

08-31-2005, 11:09 PM
I have heard of this kind of thing being done. As an operations research master's student, I was interested in creating a similar model. I must say, using Bayesian stats is a remarkably good idea. I have a few questions. First, how do you deal with the discrete nature of scoring in baseball and hockey? (The continuous approximation seems appropriate for basketball.) Second, how much does the model change over time? Third, do you do better in the later portions of the season?

crazy canuck
09-02-2005, 12:08 PM
Hello,

I have read about neural network models that predict games.

Here is an example:

http://www.mymait.com/Paper/Paper2/

I've tried to implement it for hockey but didn't have much success.

Anyhow, if you want to work on it, send me a PM.

gjm112
09-02-2005, 02:12 PM
[ QUOTE ]
I have heard of this kind of thing being done. As an operations research master's student, I was interested in creating a similar model. I must say, using Bayesian stats is a remarkably good idea. I have a few questions. First, how do you deal with the discrete nature of scoring in baseball and hockey? (The continuous approximation seems appropriate for basketball.) Second, how much does the model change over time? Third, do you do better in the later portions of the season?

[/ QUOTE ]

Well, I haven't tried it for hockey yet. Baseball is doing remarkably well, as I am up 50 units for July, August, and a few weeks in May.
I am examining point differentials in baseball. So if the home team wins by 5, the data for that game is 5. If the away team wins by 5, the data is -5. The data is discrete, but it fits a normal model very closely. Essentially, the data is normal enough. I have read some papers about this normality assumption in football, and they all say it is OK. (I'll have to look up where those papers are.) Baseball also appears to be normal enough.
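
To show what the normality assumption buys you, here is a small Python sketch that turns a predicted home-minus-away differential into a probability. It is only an illustration: `predicted_margin` and `sigma` are hypothetical inputs that would come from whatever model you fit, and the values in the example are placeholders, not estimates from my data.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a Normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_margin_exceeds(threshold, predicted_margin, sigma):
    """P(home minus away differential > threshold) under the normal model.

    threshold = 0 gives the chance the home team wins outright;
    a nonzero threshold gives the chance of beating a spread.
    """
    return 1.0 - normal_cdf(threshold, predicted_margin, sigma)

# Placeholder inputs: an evenly matched game is a coin flip under this model.
print(prob_margin_exceeds(0.0, predicted_margin=0.0, sigma=4.0))  # 0.5
```

The discreteness gets smoothed over here, which is exactly the approximation being discussed.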

The model changes a lot over time. Zito is a great example of this. It changes because more data is added every day and because of the time variability of the model. It's kind of an in-depth thing, and I don't know how much you know about Bayesian modelling.

Thirdly, after about two weeks there is enough data in baseball that it seems to at least capture the essence of the true team strengths. Later in the season it treats the games from May as almost not having happened, putting heavier weight on the more recent games.
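
To give a feel for why old games fade out, here is a toy Python sketch of a one-parameter version of the idea: a single team strength that follows a random walk, updated game by game (a local-level Kalman filter). This is a simplified stand-in, not my actual regression model, and `obs_var`, `drift_var`, and the prior values are hypothetical settings chosen for illustration.

```python
def local_level_filter(observations, obs_var, drift_var,
                       prior_mean=0.0, prior_var=100.0):
    """Track a slowly drifting strength from noisy game differentials.

    obs_var:   variance of a single game result around the true strength
    drift_var: variance of the strength's change between games;
               0 means the strength is constant and all games count equally
    Returns the posterior mean after each game.
    """
    mean, var = prior_mean, prior_var
    history = []
    for y in observations:
        var += drift_var                 # strength may have moved since last game
        gain = var / (var + obs_var)     # how much to trust the new game
        mean += gain * (y - mean)        # pull the estimate toward the result
        var *= (1.0 - gain)              # uncertainty shrinks after observing
        history.append(mean)
    return history

# Made-up differentials: a team that is +5 early and -5 late in the season.
games = [5.0] * 20 + [-5.0] * 20
static = local_level_filter(games, obs_var=1.0, drift_var=0.0)
drifting = local_level_filter(games, obs_var=1.0, drift_var=1.0)
print(static[-1], drifting[-1])
```

With no drift the final estimate sits near the season average, while with drift it tracks the recent games and the early ones carry almost no weight.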

I hope this answered your question. Feel free to ask more questions.

gjm112