Some Analyses of Poker Data - What Accounts for Our Winnings?


shermn27
12-04-2005, 04:58 PM
Well, we don't have a statistics forum, so I figured this was the next best place to post something like this. Note to the reader - This post is long and contains some statistical terminology. Please don't get caught up in that and understand the take home message in the summary and conclusion section.

Over my short playing career (over 20,000 hands at various micro limit games) I have compiled a small data set that I believe contains some valuable information. When thinking about what it takes to be a winning poker player, I can't help but wonder what actually accounts for the money won. Of course the "tight/aggressive gets the money" theory has been professed countless times. But to my knowledge, no one has come out and shown how much of their profits come from how they play.

Because PokerTracker keeps so many stats, I exported a data set to determine which factors account for a significant amount of money won, and how much these factors (specifically tightness and aggressiveness) account for.

Method
I exported from PT the stats on every player with whom I have played over 50 hands (including myself) of full ring games (8 or more at the table) at micro limit Texas Hold'em ($.50/1.00--$3/6). As I have not been playing long, this data set included a total of 709 players.

I included many of the statistics kept by PokerTracker, including: VPIP, PFR, WSD, VPIPSB, FSB, FBB, ASB, AFTotal, AF-Flop, AF-Turn, AF-River, Cold CallPFR, $ Won, Limp/RR, When Folds-No Fold (WFNF), When Folds-PreFlop (WFPF), When Folds-Flop (WFF), When Folds-Turn (WFT), When Folds-River (WFR), Fold to River Bet (FRB), C/R, W$SD with Bet/Raise, and W$SD with Call. I also coded my data for tightness and aggressiveness. The rules for these were defined as follows:
Tight = VPIP < 20.00
Semi-Loose = 20.00 < VPIP < 30.00
Loose = VPIP > 30.00
Aggressive PreF = PFR > 5.00
Passive PreF = PFR < 5.00
Aggressive PostF = AF > 1.50
Passive PostF = AF < 1.50
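
For anyone who wants to apply the same coding to their own export, here is a minimal Python sketch of the rules above. The function names are made up for illustration, and the rules as written don't say where a value sitting exactly on a cutoff goes, so the boundary handling below is an assumption (marked in the comments).

```python
def tightness(vpip):
    """Label a player from VPIP (in percent) using the cutoffs in the post."""
    if vpip < 20.0:
        return "Tight"
    if vpip <= 30.0:  # assumption: exactly 20.00 or 30.00 counts as Semi-Loose
        return "Semi-Loose"
    return "Loose"


def aggression(pfr, af):
    """Return a preflop/postflop code such as 'A/P' from PFR (percent) and AF."""
    # assumption: a value exactly on the cutoff (PFR 5.00, AF 1.50) counts as Passive
    pre = "A" if pfr > 5.0 else "P"
    post = "A" if af > 1.5 else "P"
    return f"{pre}/{post}"


def classify(vpip, pfr, af):
    """Combine both codes, e.g. ('Tight', 'A/A')."""
    return tightness(vpip), aggression(pfr, af)


print(classify(18.5, 8.0, 2.1))  # ('Tight', 'A/A')
```

The 3 tightness labels times the 4 aggression codes give the 12 cells used in the ANOVA further down.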

Results
Because I was interested in knowing what percentage of $ won is controlled by these factors, I conducted a multiple stepwise regression predicting money won using all of these factors. The results of this analysis revealed that 10 variables were significantly predictive of $ won. Those variables and their partial r-squares are listed below.
W$SDBR - .08
W$SDBC - .06
WFR - .05
AFRiver - .04
AFTot - .02
WFNF - .01
VPIP - .01
FRB - .01

Overall, the model R-square was .27. In other words, these factors explained 27% of the variance in $ won. First, this analysis implies that only about 25% of one's winnings are accounted for by these figures - thus roughly 75% must be accounted for by something else. My guess is that much of it is error, as in any given poker hand it is not always better to fold at the river, or on the turn. It just depends on the situation. This analysis looks across all situations.
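
To show what the stepwise procedure is doing, here is a rough pure-Python sketch of forward selection, where each step adds the variable that raises R-square the most and reports that gain. This is not the stats package routine used for the numbers above: real stepwise procedures normally use an F-test to decide entry and removal, while the `min_gain` cutoff here is a made-up stand-in, and the data is synthetic (random numbers, not poker stats).

```python
import random


def center(values):
    """Subtract the mean so dot products behave like covariances."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forward_stepwise(columns, y, min_gain=0.01):
    """Greedy forward selection. Returns [(name, R-square gain), ...].

    columns: dict of predictor name -> list of values; y: list of outcomes.
    min_gain is a hypothetical entry threshold, not a significance test.
    """
    resid = center(y)
    sst = dot(resid, resid)  # total sum of squares of the outcome
    pool = {name: center(col) for name, col in columns.items()}
    chosen = []
    while pool:
        # R-square gained by adding each remaining predictor to the model
        gains = {n: dot(x, resid) ** 2 / (dot(x, x) * sst)
                 for n, x in pool.items() if dot(x, x) > 1e-12}
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        x = pool.pop(best)
        chosen.append((best, gains[best]))
        # take the chosen predictor's share out of the outcome ...
        b = dot(x, resid) / dot(x, x)
        resid = [r - b * xi for r, xi in zip(resid, x)]
        # ... and out of every predictor still waiting in the pool
        pool = {n: [zi - dot(z, x) / dot(x, x) * xi for zi, xi in zip(z, x)]
                for n, z in pool.items()}
    return chosen


# Synthetic demo: one strong predictor, one weak one, one pure-noise column.
random.seed(0)
n = 500
strong = [random.gauss(0, 1) for _ in range(n)]
weak = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]
y = [2 * s + w + random.gauss(0, 1) for s, w in zip(strong, weak)]

steps = forward_stepwise({"strong": strong, "weak": weak, "noise": noise}, y)
for name, gain in steps:
    print(f"{name}: R-square gain {gain:.2f}")
print(f"model R-square: {sum(g for _, g in steps):.2f}")
```

Because each gain is measured after the earlier variables have been accounted for, the gains add up to the model R-square, which is why the partial r-squares listed above sum to roughly the .27 total.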

Secondly, this analysis also appears to reveal the importance of river play, aggression, and preflop play. The first 4 factors loading all have to do with river play, as do factors 6 & 8, WFNF and FRB. Additionally, the aggression factor Total was a significant predictor of money won as was VPIP.

Because I feared many of these variables were related to one another (such as WF-River and FRB) I then conducted a factor analysis on all of the aforementioned variables except for $ won. A factor analysis attempts to group similar variables into one factor. This analysis revealed that there are 7 factors:
Factor 1 - Folding. Variables loading on this factor are WFPF, VPIP, WFNF, Tightness, VPIPSB, CCPF, WFT, WFR, WFF.
These variables all appear to be related to folding both before and after the flop.

Factor 2 - Early Aggression (PFR, Aggression, ASB, AF-Flop, AF-Total). Each of these variables has to do with aggression on the early betting rounds.

Factor 3 - Late Aggression (AF-Total, AF-Turn, AF-River, W$SDB/R).

Factor 4 - Went to Showdown (WSD). This was the only variable loading on this factor.

Factor 5 - Fold to River Bet (FRB). Again, this was the only variable loading here.

Factor 6 - Folding in the Blinds/Blind Defense (FSB, FBB).
The appearance of these two together is not surprising and shows their importance.

Factor 7 - Miscellaneous (W$SDC, C/R). Not sure why these two are paired together, but they both loaded here.

From this analysis, it appears that there are seven factors in poker. (Sort of. I realize that I certainly don't have every possible variable in poker included.) In an effort to determine how much $ won these seven factors account for, I created new variables consisting of the means of the variables loading on each of these 7 factors. These 7 new variables (factor1--factor7) were then used in a multiple stepwise regression to predict $ won. The factors predicting $ won and their partial r-squares are as follows:
Late Round Aggression - .09
Miscellaneous - .06
Folding - .01
Early Round Aggression - .01
Went to Showdown - <.01

Overall, the model r-square was equal to .18. Thus 18% of the amount won can be accounted for by these factors.
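
The factor variables described above (the mean of the stats loading on each factor) could be built along these lines. This is only a sketch: the dictionary keys and the sample player's numbers are invented for illustration, and only two of the seven factors are shown.

```python
from statistics import mean

# Variables the post lists as loading on two of the seven factors.
FACTORS = {
    "late_aggression": ["AF-Total", "AF-Turn", "AF-River", "W$SDB/R"],
    "blind_defense": ["FSB", "FBB"],
}


def factor_scores(player, factors=FACTORS):
    """Average a player's stats over the variables loading on each factor."""
    return {name: mean(player[var] for var in variables)
            for name, variables in factors.items()}


# Hypothetical player row (values made up, not taken from the data set).
player = {"AF-Total": 2.0, "AF-Turn": 1.0, "AF-River": 3.0,
          "W$SDB/R": 60.0, "FSB": 80.0, "FBB": 60.0}

print(factor_scores(player))  # {'late_aggression': 16.5, 'blind_defense': 70.0}
```

Note how the percentage stat (W$SDB/R) swamps the aggression factors in a raw mean. Converting each stat to a z-score before averaging is a common way around that, though whether that was done here isn't stated.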

Many Hold'em experts (including our own Ed Miller and David Sklansky) profess that money is made after the flop and not Preflop. To test this hypothesis, multiple stepwise regressions were conducted for each different round of play. The model R-square is presented for each round of play below:
Pre-Flop - .03
Flop - .03
Turn - .04
River - .09

Note that river play appears to be the most predictive of $ won.

Finally, the overarching reason for doing this study was to examine tightness and aggressiveness. To do this, a 3 (Tightness: Tight, Semi-Loose, Loose) x 4 (Aggressiveness: P/P, P/A, A/P, A/A) between-subjects Analysis of Variance (ANOVA) was conducted on money won. It revealed a main effect for Aggression such that Passive/Passives (M = -8.62, SD = 58.39), Passive/Aggressives (M = 8.51, SD = 49.97), and Aggressive/Passives (M = -11.77, SD = 67.36) were making less than Aggressive/Aggressives (M = 30.34, SD = 103.41). (Note: contrast analyses on Aggressiveness do reveal that A/As are making more than any of the other three groups, and that the other three do not differ significantly from one another.) There was no main effect for tightness, although Tight players had a mean of 8.45, Semi-Loose a mean of 6.46, and Loose a mean of -4.73. Finally, there was no interaction between tightness and aggressiveness.

This ANOVA means that aggressiveness is a factor in amount of money won, while tightness does not appear to have as big of an effect.
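
For those curious what a "main effect" test boils down to, here is a small sketch of the F statistic for a single grouping factor (a one-way ANOVA). The analysis above was a two-way 3 x 4 design, so this only covers the simpler one-factor case, and the winnings in the example are toy numbers, not values from the data set.

```python
from statistics import mean


def one_way_f(groups):
    """F statistic for a one-way ANOVA.

    groups: dict of group label -> list of observations (e.g. $ won).
    """
    all_values = [v for g in groups.values() for v in g]
    grand = mean(all_values)
    # spread of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # spread of the players around their own group's mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy numbers only, three players per group.
toy = {"P/P": [1.0, 2.0, 3.0], "A/P": [2.0, 3.0, 4.0], "A/A": [5.0, 6.0, 7.0]}
print(one_way_f(toy))  # 13.0
```

A large F means the group means are far apart relative to how much players vary within each group. With SDs as large as the ones reported above, the group means have to differ a lot before that happens.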

Summary & Conclusion
So what does all this mean? If you got caught up in the statistical mumbo jumbo - forget it and read this.

These analyses essentially mean that Sklansky and Co. are right. (Not that we had any doubts, it's just that no one has yet examined the data that I know of.) Tight/Aggressive gets the money. But Semi-Loose/Aggressives and even Loose/Aggressives can get some money too. Being aggressive in poker appears to help make you a winner. Of course this makes sense, because by being aggressive we give ourselves two ways to win (they fold, or they call with a worse hand).

Additionally, we have found out that post flop play (especially River Play) is important. Much more important than Pre-flop play in terms of winning money.

Finally, some may criticize that only a small portion of $ won can be predicted by these analyses (25%). First of all, realize that this is not necessarily 25 out of 100 percent, because I think that no set of predictors could account for all 100% of winnings given the randomness and luck factors of poker. So the fact that 25% can be predicted is pretty huge. On top of that, think about how much 25% is in terms of your winnings. I argue that it is quite a significant portion.

Though I am reviewing this post for mistakes, please recognize that, due to its length and my anticipation of replies, I am posting it rather hastily. I have tried intentionally to leave much of the statistical mumbo jumbo out and to present the results in a matter-of-fact manner. However, I would be more than happy to respond to those with questions about the analysis. I hope this is interesting to some and appreciate critical and supportive feedback. Thanks.

Shermn27

WhiteWolf
12-04-2005, 05:18 PM
I can't comment on the analysis, but it seems to me (especially with a small sample size) that it would be impossible to separate cause from effect here. Much of the correlation can come from just running good. If I'm catching good cards, I'm going to be more aggressive than normal with my betting, and I will also be winning more than normal. As a result, the statistical model will show the aggression/money won correlation without accounting for the fact that both of those values are a result of merely catching a good run of cards.

shermn27
12-04-2005, 05:31 PM
An interesting point. However, this analysis is not just of my own play; it is on data for 709 players who have each played over 50 hands. Thus, it should account for good runs and bad runs of cards. Additionally, of course someone is more aggressive when hitting cards. But somehow, over lots of hands, some people are more aggressive than others. Why? Either because they are aggressive in nature, or because they are better at poker and know when they have the best of it more often than their opponents, and are thus aggressive in situations where their opponent would not be. As a side note, this analysis is not intended to cite blind aggression as a way to win at poker. It is rather intelligent aggression. Missing opportunities to bet and raise with the best of it, or when your opponent will readily lay down a better hand, is -EV and in the long run -$.

Thanks for reading my post and for the comments.

benkath1
12-04-2005, 11:54 PM
I think this is a good idea. But I think your sample size is way too small. 50 hands just isn't enough. Do this in a year when you have players with over 1000 hands and see how the results compare.

shermn27
12-05-2005, 12:56 AM
I think the same. I would much rather obtain a data set from someone who has many more hands and run a similar set of analyses.

shutupndeal
12-05-2005, 01:51 AM
Just keep on doing what you're doing. You have a very interesting experiment going and it would seem very worthwhile to follow up on.
Shermn is making totally significant points as well. There are times when you will play according to how you're running, like being more aggressive when cards have been coming, and then there's the days where they have you pretty gun-shy because you just got your aces cracked the last 2 times they were dealt. (And you thought you were lucky to have them dealt twice within an hour too, right?)
To further comment, I am kind of waiting to see the loose-aggressive player (in the right game) make up a lot of ground soon.
It's often been a staple of mine, when I get to a table that is way too tight, to re-raise a few guys that I catch who like to limp with A's and K's and raise with AK.
It normally only costs me one small bet to play like this, BUT you would be surprised how they start to "lose composure" when you crack 'em, say, twice. Now they are gunning for you and the game has taken on a fresh loose-aggressive style, while you, after laying the bait, play the lion in the tall grass and revert to tighter play.
Remember too that when the game changes like this, it is NOW CORRECT to make a lot of calls that you otherwise couldn't have made before according to pot odds and such, so you will still be there to grab another nice hand with an ace-baby or a nice straight+flush draw. And remember to pop it if you have the position not to see a third bet when you make your flop; you now build the pot, yet you have the option if they check around to you on a miss.
OK, lemme shut up, but nice work, this should be good!