Running regressions in Excel?


gergery
04-19-2005, 07:19 PM
I have a poker database. I’d like to run some regressions in Excel.

For example, I have 3 columns: Winrate/100, VPIP, PFR

How much of Winrate is explained by VPIP and PFR?

--g

olavfo
04-20-2005, 12:01 AM
I'm not an Excel expert, but multiple regression (regression with more than one explanatory variable) is what you want to do. A quick Google search gave me this (http://www.econ.ucdavis.edu/faculty/cameron/excel/exmreg.html) page, which has some examples.

olavfo

Ass Master
04-20-2005, 07:19 AM
If you're going to do this type of regression, then you will want to use a more elaborate model than

bb/100 = a + b * VPIP + c * PFR + epsilon

since such a model clearly cannot account for, e.g., the fact that being too loose or too tight is detrimental to your win rate. The first model you'd probably want to look at would be something more along the lines of

bb/100 = a + b1 * VPIP + b2 * VPIP^2 + c1 * PFR + c2 * PFR^2 + d * VPIP * PFR + epsilon.
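
A rough sketch of how the quadratic model above could be fitted by ordinary least squares, in plain Python. Everything in it is an assumption made for illustration: the helper names (design_row, solve, fit_ols), the coefficient values, and the player data, which are synthetic and noise-free so that the fit can be checked against the coefficients that generated them. A real database would of course give noisy results.

```python
# Sketch: fit bb/100 = a + b1*VPIP + b2*VPIP^2 + c1*PFR + c2*PFR^2 + d*VPIP*PFR
# by ordinary least squares. All numbers below are made up for illustration.

def design_row(vpip, pfr):
    """One row of the design matrix: [1, VPIP, VPIP^2, PFR, PFR^2, VPIP*PFR]."""
    return [1.0, vpip, vpip ** 2, pfr, pfr ** 2, vpip * pfr]


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(rows, y):
    """Least squares via the normal equations (X^T X) beta = X^T y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical coefficients (a, b1, b2, c1, c2, d), used only to generate
# noise-free synthetic data so the fit can be checked against them.
true_beta = [-10.0, 0.8, -0.02, 0.5, -0.03, 0.01]

# Synthetic players on a grid of VPIP and PFR values (both in percent).
players = [(v, p) for v in (20.0, 35.0, 50.0) for p in (2.0, 10.0, 18.0)]
rows = [design_row(v, p) for v, p in players]
winrates = [sum(b * f for b, f in zip(true_beta, row)) for row in rows]

beta = fit_ols(rows, winrates)
print([round(b, 4) for b in beta])
```

The normal equations are fine for a small example like this one. With real data, a statistics package is the safer choice, since it handles numerical issues and also reports standard errors.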

Excel comes with a stats add-in (the Analysis ToolPak) that includes a regression tool. However, IMO Excel is an inferior tool for statistical analysis. For one thing, operating on large datasets is very clumsy with Excel's spreadsheet interface. A far superior alternative, if you are willing to invest the time, is R (http://www.r-project.org/).

Ass Master
04-20-2005, 10:34 AM
Another thing I should mention is that you will probably want to do a weighted regression, weighted by the number of hands (I am assuming each data point corresponds to the results of a single player). Basically, data points with more hands should be weighted more heavily than those with fewer hands, because a win rate measured over few hands is much noisier. I know that R comes with weighted regression capabilities, whereas I do not know whether the Excel stats add-in does.
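
To make the weighting idea concrete, here is a minimal sketch of weighted least squares, cut down to a single predictor (WR = a + b * VPIP) so that the closed-form solution fits in a few lines. The function name weighted_linear_fit and all of the data are made up. Using the hand count directly as the weight assumes that per-hand variance is roughly the same for every player.

```python
# Sketch: weighted least squares for WR = a + b * VPIP, weighting each
# player by number of hands. All data below are made up for illustration.

def weighted_linear_fit(x, y, w):
    """Return (a, b) minimising sum(w_i * (y_i - a - b * x_i) ** 2)."""
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return y_mean - b * x_mean, b


vpip = [10.0, 20.0, 30.0, 40.0, 50.0]
winrate = [5.0, 4.0, 3.0, 2.0, 20.0]   # last player is far off the trend...
hands = [1000, 1000, 1000, 1000, 10]   # ...but has played only 10 hands

a_w, b_w = weighted_linear_fit(vpip, winrate, hands)
a_u, b_u = weighted_linear_fit(vpip, winrate, [1] * len(vpip))
print(f"weighted slope:   {b_w:.3f}")
print(f"unweighted slope: {b_u:.3f}")
```

In this toy data the four high-volume players lie exactly on WR = 6 - 0.1 * VPIP. The unweighted fit lets the 10-hand player flip the sign of the slope, while the weighted fit stays close to -0.1.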

gergery
04-20-2005, 09:37 PM
What does squaring the variables do to help in regressions?

Thanks for the "R" site.

-g

tlnini
04-21-2005, 12:28 AM
[ QUOTE ]
What does squaring the variables do to help in regressions?


[/ QUOTE ]

It's to examine the non-linear effect of your independent variables. In other words, you can see how unusually high or low values of VPIP or PFR in your sample affect your winrate. For multivariate regression, make sure you look at the adjusted R-squared, which penalizes adding extra variables.
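
For reference, a small sketch of the two quantities involved: plain R-squared (the share of winrate variance the model explains) and the adjusted version, using the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1). The function names and the example numbers (R^2 = 0.30, 200 players) are hypothetical.

```python
def r_squared(actual, predicted):
    """Fraction of the variance in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalise R-squared for the number of predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical numbers: the same R^2 = 0.30 from 200 players.
print(adjusted_r_squared(0.30, 200, 2))  # VPIP and PFR only
print(adjusted_r_squared(0.30, 200, 5))  # with squares and the interaction
```

For the same raw R-squared, the model with more terms gets the lower adjusted value, so the extra terms only look worthwhile if they raise R-squared enough to offset the penalty.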

Ass Master
04-21-2005, 05:12 AM
It makes it possible to model a relationship between the explanatory variables and, in this case, the win rate that is not simply linear. This is necessary in order for the model's predictions to fit in with various known facts, e.g. that playing too loose or too tight is detrimental to your win rate.

For example, say we try to model win rate (WR) as a function of VPIP only. First consider the linear model:

WR = a + b * VPIP + epsilon

There are 3 possibilities for the fitted value of b:

(1) b = 0. In this case the model thinks win rate is independent of VPIP.
(2) b > 0. In this case the model thinks the looser you play, the higher your win rate, with a VPIP of 100 giving the highest win rate.
(3) b < 0. In this case the model thinks that in order to maximize your win rate, you should never play any hands.

However if we expand the model to

WR = a + b * VPIP + c * VPIP^2 + epsilon

then with any reasonable data set you should find a fitted value of c < 0, which means that the predicted win rate as a function of VPIP will look like a parabola that opens downwards and has a maximum point corresponding to the optimal VPIP.
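
The location of that maximum follows from setting the derivative of the fitted curve to zero. A short worked version, where the values b = 0.8 and c = -0.02 are hypothetical and chosen only to make the arithmetic easy:

```latex
\frac{d\,\mathrm{WR}}{d\,\mathrm{VPIP}} = b + 2c \cdot \mathrm{VPIP} = 0
\quad\Longrightarrow\quad
\mathrm{VPIP}^{*} = -\frac{b}{2c},
\qquad
\frac{d^{2}\,\mathrm{WR}}{d\,\mathrm{VPIP}^{2}} = 2c < 0 .

% Hypothetical fitted values b = 0.8, c = -0.02:
\mathrm{VPIP}^{*} = -\frac{0.8}{2 \cdot (-0.02)} = 20
```

Since the second derivative 2c is negative, the stationary point is a maximum. If the fit came back with c > 0 instead, the same formula would give a minimum, which would be a sign that the data or the model needs another look.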