Posted by shermn27 on 12-04-2005, 04:58 PM
Some Analyses of Poker Data - What Accounts for Our Winnings?

Well, we don't have a statistics forum, so I figured this was the next best place to post something like this. Note to the reader - This post is long and contains some statistical terminology. Please don't get caught up in that; the take-home message is in the Summary & Conclusion section.

Over my short playing career (over 20,000 hands at various micro limit games) I have compiled a small data set that I believe contains some valuable information. When thinking about what it takes to be a winning poker player, I can't help but wonder what accounts for the money won. Of course, the "tight/aggressive gets the money" theory has been professed countless times. But in reality, and to my knowledge, no one has come out and shown how much of their profits comes from how they play.

Because PokerTracker keeps so many stats, I exported a data set to determine which factors account for a significant amount of money won, and how much of it these factors (specifically tightness and aggressiveness) account for.

Method
I exported from PT the stats on every player with whom I have played over 50 hands (including myself) of full ring game (8 or more at the table) at micro limit Texas Hold'em ($.50/1.00--$3/6). As I have not been playing long, this data set included a total of 709 players.

I included many of the statistics kept by PokerTracker, including: VPIP, PFR, WSD, VPIPSB, FSB, FBB, ASB, AFTotal, AF-Flop, AF-Turn, AF-River, Cold CallPFR, $ Won, Limp/RR, When Folds-No Fold (WFNF), When Folds-PreFlop (WFPF), When Folds-Flop (WFF), When Folds-Turn (WFT), When Folds-River (WFR), Fold to River Bet (FRB), C/R, W$SD with Bet/Raise, and W$SD with Call. I also coded my data for tightness and aggressiveness. The rules for these were defined as follows:
Tight = VPIP < 20.00
Semi-Loose = 20.00 < VPIP < 30.00
Loose = VPIP > 30.00
Aggressive PreF = PFR > 5.00
Passive PreF = PFR < 5.00
Aggressive PostF = AF > 1.50
Passive PostF = AF < 1.50
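
For anyone who wants to apply the same coding to their own export, the rules above could be sketched in Python roughly like this. The function name is made up for illustration, and the handling of values that land exactly on a cutoff (20.00, 30.00, 5.00, 1.50) is an assumption, since the rules as written leave those cases open.

```python
def classify_player(vpip: float, pfr: float, af: float) -> tuple[str, str]:
    """Return (tightness, aggression) labels for one player.

    Assumption: a value exactly on a cutoff goes to the looser / more
    aggressive group; the rules in the post do not say which side it falls on.
    """
    if vpip < 20.0:
        tightness = "Tight"
    elif vpip < 30.0:
        tightness = "Semi-Loose"
    else:
        tightness = "Loose"

    # First letter is pre-flop (PFR), second is post-flop (AF).
    preflop = "A" if pfr >= 5.0 else "P"
    postflop = "A" if af >= 1.5 else "P"
    return tightness, f"{preflop}/{postflop}"
```

Under these assumptions, a player with VPIP 18.5, PFR 7.2 and AF 2.1 would be coded as Tight and A/A.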

Results
Because I was interested in knowing what percentage of $ won is controlled by these factors, I conducted a multiple stepwise regression predicting money won using all of these factors. The results of this analysis revealed that 10 variables were significantly predictive of $ won. Those variables and their partial r-squares are listed below.
W$SDBR - .08
W$SDBC - .06
WFR - .05
AFRiver - .04
AFTot - .02
WFNF - .01
VPIP - .01
FRB - .01

Overall, the model R-square was .27. In other words, these factors explained 27% of the variance in $ won. First, this analysis implies that only about 27% of the variation in players' winnings is accounted for by these figures - thus roughly 73% must be accounted for by something else. My guess is that much of it is error, as in any given poker hand it is not always better to fold at the river, or on the turn. It just depends on the situation. This analysis looks across all situations.
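
To make "partial r-square" a little more concrete, here is a rough pure-Python sketch of forward stepwise selection, where each step adds the predictor that raises the model R-square the most and records that gain. This is not the routine from the stats package used for the numbers above: the function names are hypothetical, and the stopping rule (a minimum R-square gain of .01) is an assumed stand-in for the significance test a real stepwise procedure would normally use.

```python
def center(values):
    """Subtract the mean so an intercept is handled implicitly."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualize(v, u):
    """Remove from v the part that is linearly explained by u."""
    coef = dot(v, u) / dot(u, u)
    return [x - coef * y for x, y in zip(v, u)]


def forward_stepwise(predictors, outcome, min_gain=0.01):
    """Return [(name, partial_r2), ...] in order of entry.

    predictors: dict mapping a stat name to a list of values (one per player).
    min_gain:   assumed entry threshold on the R-square gain (illustrative).
    """
    y_res = center(outcome)
    total_ss = dot(y_res, y_res)
    remaining = {name: center(col) for name, col in predictors.items()}
    steps = []
    while remaining:
        # R-square gain from adding each candidate to the current model.
        gains = {
            name: dot(col, y_res) ** 2 / (dot(col, col) * total_ss)
            for name, col in remaining.items()
            if dot(col, col) > 1e-12  # skip predictors already fully explained
        }
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        chosen = remaining.pop(best)
        # Partial out the chosen predictor from the outcome and the rest.
        y_res = residualize(y_res, chosen)
        remaining = {n: residualize(c, chosen) for n, c in remaining.items()}
        steps.append((best, gains[best]))
    return steps
```

The gains returned by a function like this add up to the model R-square, which is why the partial r-squares listed above sum (up to rounding) to the overall figure.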

Secondly, this analysis also appears to reveal the importance of river play, aggression, and preflop play. The first 4 predictors all have to do with river play, as do predictors 6 & 8, WFNF and FRB. Additionally, total aggression factor (AFTot) was a significant predictor of money won, as was VPIP.

Because I feared many of these variables were related to one another (such as WF-River and FRB), I then conducted a factor analysis on all of the aforementioned variables except for $ won. A factor analysis attempts to group similar variables into one factor. This analysis revealed that there are 7 factors:
Factor 1 - Folding. Variables loading on this factor are WFPF, VPIP, WFNF, Tightness, VPIPSB, CCPF, WFT, WFR, WFF.
These variables all appear to be related to folding both before and after the flop.

Factor 2 - Early Aggression (PFR, Aggression, ASB, AF-Flop, AF-Total). Each of these variables has to do with aggression on the early betting rounds.

Factor 3 - Late Aggression (AF-Total, AF-Turn, AF-River, W$SDB/R).

Factor 4 - Went to Showdown (WSD). This was the only variable loading on this factor.

Factor 5 - Fold to River Bet (FRB). Again, this was the only variable loading here.

Factor 6 - Folding in the Blinds/Blind Defense (FSB, FBB).
The appearance of these two together is not surprising and shows their importance.

Factor 7 - Miscellaneous (W$SDC, C/R). Not sure why these two are paired together, but they both loaded here.

From this analysis, it appears that there are seven factors in poker. (Sort of; I realize that I certainly don't have every possible variable in poker included.) In an effort to determine how much of $ won these seven factors account for, I created new variables consisting of the means of the variables loading on each of these 7 factors. These 7 new variables (factor1--factor7) were then used in a multiple stepwise regression to predict $ won. The factors predicting $ won and their partial r-squares are as follows:
Late Round Aggression - .09
Miscellaneous - .06
Folding - .01
Early Round Aggression - .01
Went to Showdown - <.01

Overall, the model r-square was equal to .18. Thus 18% of the variance in the amount won can be accounted for by these factors.
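
The composite variables used in this regression (the mean of the variables loading on each factor) could be built along these lines. This is only a sketch: the dictionary keys and function name are hypothetical, and only two of the seven groupings are shown, using the variables listed for Factors 6 and 7.

```python
from statistics import mean

# Two of the seven groupings from the factor analysis, as listed in the post.
# The key names are made up for illustration.
FACTOR_VARIABLES = {
    "blind_defense": ["FSB", "FBB"],
    "miscellaneous": ["W$SDC", "C/R"],
}


def composite_scores(player: dict[str, float]) -> dict[str, float]:
    """Average the variables loading on each factor for one player."""
    return {
        factor: mean(player[name] for name in names)
        for factor, names in FACTOR_VARIABLES.items()
    }


# Hypothetical stat line for a single player (not taken from the data set).
example_player = {"FSB": 80.0, "FBB": 60.0, "W$SDC": 45.0, "C/R": 5.0}
scores = composite_scores(example_player)
```

One design point to keep in mind: a plain mean mixes stats on different scales (percentages next to ratios such as AF), so standardizing each variable before averaging may be worth considering. The description above does not say whether that was done.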

Many Hold'em experts (including our own Ed Miller and David Sklansky) profess that money is made after the flop and not pre-flop. To test this hypothesis, multiple stepwise regressions were conducted separately for each round of play, using only the variables tied to that round. The model R-square for each round of play is presented below:
Pre-Flop - .03
Flop - .03
Turn - .04
River - .09

Note that river play appears to be the most predictive of $ won.

Finally, the overarching reason for doing this study was to examine tightness and aggressiveness. To do this, a 3 (Tightness: Tight, Semi-Loose, Loose) x 4 (Aggressiveness: P/P, P/A, A/P, A/A, where the first letter is pre-flop and the second is post-flop) between-subjects Analysis of Variance (ANOVA) was conducted on money won. It revealed a main effect for Aggression such that Passive/Passives (M = -8.62, SD = 58.39), Passive/Aggressives (M = 8.51, SD = 49.97), and Aggressive/Passives (M = -11.77, SD = 67.36) were making less than Aggressive/Aggressives (M = 30.34, SD = 103.41). (Note: contrast analyses on Aggressiveness do reveal that A/A players are making more than any of the other three groups, and that the other three do not differ significantly from one another.) There was no main effect for tightness, although Tight players had a mean of 8.45, Semi-Loose a mean of 6.46, and Loose a mean of -4.73. Finally, there was no interaction between tightness and aggressiveness.

This ANOVA means that aggressiveness is a factor in amount of money won, while tightness does not appear to have as big of an effect.
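
For readers who haven't seen an ANOVA before, the idea behind a "main effect" can be sketched with a simplified one-way version: compare how much the group means differ from each other against how much players differ within each group. This is a simplification of the 3 x 4 design described above (it ignores tightness and the interaction), and the function name is hypothetical.

```python
def one_way_anova_f(groups: list[list[float]]) -> float:
    """F statistic for a one-way ANOVA: between-group vs. within-group variance."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual players around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

Each inner list would hold the $ won for the players in one aggression group (P/P, P/A, A/P, A/A). A large F means the groups differ by more than the within-group spread would suggest; turning F into a p-value requires the F distribution, which is left out here.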

Summary & Conclusion
So what does all this mean? If you are one who got caught up in the statistical mumbo jumbo - forget it and read this.

These analyses essentially mean that Sklansky and Co. are right. (Not that we had any doubts; it's just that no one has yet examined the data that I know of.) Tight/Aggressive gets the money. But Semi-Loose/Aggressives and even Loose/Aggressives can get some money too. Being aggressive in poker appears to help make you a winner. Of course this makes sense, because by being aggressive we give ourselves two ways to win (they fold, or they call with a worse hand).

Additionally, we have found out that post flop play (especially River Play) is important. Much more important than Pre-flop play in terms of winning money.

Finally, some may criticize that only a small portion of $ won can be predicted by these analyses (about 27%). First of all, realize that this is not necessarily 27 out of 100 percent, because I think that no set of predictors could account for all 100% of winnings given the randomness and luck factors of poker. So the fact that about 27% can be predicted is pretty huge. On top of that, think about how much 27% is in terms of your winnings. I argue that it is quite a significant portion.

Though I am reviewing this post for mistakes, please recognize that, due to its length and my anticipation for replies, I am posting it rather hastily. I have intentionally tried to leave much of the statistical mumbo jumbo out and to present the results in a matter-of-fact manner. However, I would be more than responsive to those with questions about the analysis. I hope this is interesting to some, and I appreciate critical and supportive feedback. Thanks.

Shermn27