Some post-facto all-in analysis [Archive] - Two Plus Two Older Archives


eastbay

09-07-2004, 06:39 PM

Reader beware. I'm kind of playing with numbers here and I'm not sure what it means yet, if anything...

I've been a significant winning player at $55 for quite a few months now, but I always seem to fumble when I try to move up to $109. It feels like my luck goes south, the river does me wrong, the bankroll beating I take gets uncomfortable, and I slink back into the $55 games to rebuild bankroll and courage for a while.

But I got to the point where I wondered whether I really was getting outdrawn more often, or whether that was just an illusion. It's easy to shrug off an outdraw when you've been dominating a table and it only costs you 1/4 of your stack. It hurts a lot more to fight all the way, only to get outdrawn to place 4th... again. I wanted to see some hard numbers.

So I wrote some code to go through hand histories and find all the hands where I got all my money in preflop, either by pushing or calling. Then I went through and kept some running tallies:

chipEV (the usual expected value)
chipAV (actual value - what I made or lost on the hand)

The difference between these two figures is a luck indicator (at least for the preflop all-ins.)
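A minimal sketch of how the two tallies could be kept, for anyone who wants to try this on their own hands. This is not the actual code behind the numbers in this post: the `AllIn` record and its field names are made up for illustration, and it assumes a single opponent, no side pots, and no split pots.

```python
from dataclasses import dataclass


@dataclass
class AllIn:
    """One preflop all-in (hypothetical record layout)."""
    equity: float    # hero's chance to win once the cards are face up
    pot: float       # total chips hero collects on a win (own chips included)
    invested: float  # chips hero put into that pot
    won: bool        # what actually happened


def chip_ev(hand: AllIn) -> float:
    # Expected net chips: equity share of the pot minus what hero put in.
    return hand.equity * hand.pot - hand.invested


def chip_av(hand: AllIn) -> float:
    # Actual net chips: the whole pot or nothing, minus what hero put in.
    return (hand.pot if hand.won else 0.0) - hand.invested


# Made-up hands, only to show the bookkeeping.
hands = [
    AllIn(equity=0.75, pot=2000, invested=1000, won=False),
    AllIn(equity=0.25, pot=2000, invested=1000, won=True),
    AllIn(equity=0.75, pot=4000, invested=2000, won=True),
]

total_ev = sum(chip_ev(h) for h in hands)
total_av = sum(chip_av(h) for h in hands)
luck = total_av - total_ev  # positive means running hot, negative means cold

print(f"chipEV={total_ev:.0f} chipAV={total_av:.0f} luck={luck:+.0f}")
```

The equity itself would have to come from a hand-vs-hand equity calculator, which is left out here.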

For the 109s I've played this month, I found:

total preflop all-in chipEV: 46k
total preflop all-in chipAV: 29k

So I am indeed running cold, only squeezing out about 63% of the chips I "deserved." This is over 45 all-ins spread over 22 tournaments. Definitely short-term, but also definitely long enough to cause a frighteningly precipitous bankroll drop. So I guess it wasn't totally my imagination at work that I was taking more than my share of beats.
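The arithmetic behind the 63% figure, as a quick check. The totals are the rounded ones quoted above (46k and 29k); the variable names are only illustrative.

```python
total_chip_ev = 46_000  # rounded total from the post
total_chip_av = 29_000  # rounded total from the post
all_ins = 45

realized_fraction = total_chip_av / total_chip_ev   # share of "deserved" chips actually won
shortfall = total_chip_ev - total_chip_av           # chips below expectation
shortfall_per_all_in = shortfall / all_ins

print(f"realized {realized_fraction:.0%}, "
      f"short {shortfall} chips, "
      f"about {shortfall_per_all_in:.0f} per all-in")
```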

(As a sanity check, I ran the EV vs AV numbers on a month when I remember running better than average, and got EV/AV 400/370. So it seems about right.)

Edit: Here's a running graph of the "hot" $55s vs. the "cold" $109s: http://rwa.homelinux.net/109vs55.pdf

But once I had this in place, some other numbers were easy to take a look at. Numbers which might indicate my skill relative to the field in the $55s vs the $109s. For example, the avg chipEV/all-in, or the avg edge when the chips went in. Comparing some numbers from a bunch of $55s I played in a previous month:

chipEV/all-in
$109: ~1100
$55: ~1500

I think that's an interesting number which may be a good indicator of the difference in quality of field between $55 and $109. In the $55s, I'm getting bigger edges on bigger pots. Again, maybe the $109 data is too sparse to really say that yet.

Average edge per confrontation:

$109: .53
$55: .47

This is also kind of interesting. I think the .47 isn't necessarily bad. Since I'm pushing all-in much more often than calling, I'm going to get called by hands that often have me beat, but more often than that I'm going to pick up the pot uncontested.
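To make the fold-equity point concrete, here is a rough sketch with made-up numbers (not taken from my hand histories): a 1000-chip push over 100/200 blinds, where the BB is the only possible caller, folds 70% of the time, and I have .47 equity when called. The function name and parameters are just for illustration.

```python
def push_ev(p_fold, equity_when_called, win_amount, lose_amount, uncontested_gain):
    """Expected net chips from an all-in push against one possible caller.

    All inputs are assumptions supplied by the user, not measured values.
    """
    ev_called = (equity_when_called * win_amount
                 - (1 - equity_when_called) * lose_amount)
    return p_fold * uncontested_gain + (1 - p_fold) * ev_called


# Hypothetical spot: stack 1000, blinds 100/200, BB covers and is the only caller.
ev = push_ev(
    p_fold=0.70,
    equity_when_called=0.47,
    win_amount=1100,       # caller's 1000 plus the dead 100 small blind
    lose_amount=1000,      # whole stack
    uncontested_gain=300,  # both blinds
)
print(f"EV of the push: {ev:+.1f} chips")
```

With these assumptions the push is clearly profitable even though the called portion alone is slightly negative, which is the sense in which a .47 edge when called isn't necessarily bad.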

But it looks like I may be playing scared in the $109s, playing too tightly and waiting too long for big hands, given that I have an edge on average even when called.

Anyway, that's enough for now but I thought some of this might be interesting for discussion.

Edit2: If somebody has a raw mbox file of a bunch of hand histories, I could run it through my processor to compare numbers. That might be interesting.

eastbay

PrayingMantis

09-07-2004, 07:00 PM

Your ideas of how to analyze this game are very interesting, as always.

I have many different thoughts, but I'd just like to make sure I understand one thing:

[ QUOTE ]
chipEV (the usual expected value)
chipAV (actual value - what I made or lost on the hand)

The difference between these two figures is a luck indicator (at least for the preflop all-ins.)

For the 109s I've played this month, I found:

total preflop all-in chipEV: 46k
total preflop all-in chipAV: 29k

So I am indeed running cold, only squeezing out about 63% of the chips I "deserved." This is over 45 all-ins spread over 22 tournaments. Definitely short-term, but also definitely long enough to cause a frighteningly precipitous bankroll drop. So I guess it wasn't totally my imagination at work that I was taking more than my share of beats.

(As a sanity check, I ran the EV vs AV numbers on a month when I remember running better than average, and got EV/AV 400/370. So it seems about right.)

[/ QUOTE ]

When you ran bad, in those 22 tournaments, you had EV/AV of 46/29.

But when you ran *better than average*, as you say, you had EV/AV of 400/370 (I assume that it's from a much bigger sample).

How come 400/370 is the ratio when you run *better* than average? I'd think "average" should mean 400/400 (EV=AV) in this case, shouldn't it? I.e., earning exactly what you expect to earn, with no "bad" or "good" luck.

According to this, "better than average" should mean EV<AV.

Can you make it clear?

LinusKS

09-07-2004, 07:00 PM

I think it's very interesting, and I'd be indebted to you if you'd give me some hints on how I can apply those formulas to my own hands.

I've always wanted a way to put some hard numbers on how luck affects my play.

eastbay

09-07-2004, 07:11 PM

[ QUOTE ]

According to this, "better than average" should mean EV<AV.

Can you make it clear?

[/ QUOTE ]

Oops. My screwup. I meant AV/EV was 400/370, not EV/AV.

And yes, the sample was much bigger. See the attached graph for more detail.

eastbay

eastbay

09-07-2004, 07:25 PM

[ QUOTE ]
I think it's very interesting, and I'd be indebted to you if you'd give me some hints on how I can apply those formulas to my own hands.

[/ QUOTE ]

Do you mean the theory of it, or the mechanics of applying it in practice? I'll let you respond before giving much more detail.

eastbay

Gramps

09-07-2004, 07:39 PM

The bad beat D.A. would like to prosecute, but it is his belief that the facts are too complex for a jury of average people to comprehend your post and find you guilty beyond a reasonable doubt...

...you are free to go as you please...

eastbay

09-07-2004, 07:46 PM

[ QUOTE ]
The bad beat D.A. would like to prosecute, but it is his belief that the facts are too complex for a jury of average people to comprehend your post and find you guilty beyond a reasonable doubt...

...you are free to go as you please...

[/ QUOTE ]

lol. Yeah, it does start out as the most pedantic bad beat post in the history of 2+2... sort of...

However, it does atone for itself by getting slightly more interesting when I get to comparing some stats between play at $55 and $109.

eastbay

eastbay

09-07-2004, 07:49 PM

[ QUOTE ]
For example, the avg chipEV/all-in, or the avg edge when the chips went in. Comparing some numbers from a bunch of $55s I played in a previous month:

chipEV/all-in
$109: ~1100
$55: ~1500

[/ QUOTE ]

It occurs to me that chipEV/all-in probably isn't the best number here.

I think a more interesting number would be %stackEV/all-in.

Hmm...
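One way %stackEV/all-in could be computed, as a sketch. I'm assuming here that it means chipEV divided by my stack before the hand; the numbers below are invented just to show that the ordering can flip once you scale by stack size.

```python
def stack_ev_fraction(chip_ev, stack_before):
    # chipEV expressed as a fraction of the stack held before the hand.
    return chip_ev / stack_before


def mean_stack_ev(all_ins):
    # all_ins: list of (chip_ev, stack_before) pairs.
    return sum(stack_ev_fraction(ev, stack) for ev, stack in all_ins) / len(all_ins)


# Invented data: bigger raw chipEV with deep stacks, smaller with short stacks.
deep_stack_all_ins = [(1500, 5000), (1500, 6000)]
short_stack_all_ins = [(1100, 2000), (1100, 2200)]

deep = mean_stack_ev(deep_stack_all_ins)
short = mean_stack_ev(short_stack_all_ins)
print(f"deep: {deep:.1%} of stack per all-in, short: {short:.1%}")
```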

eastbay

Gramps

09-07-2004, 07:53 PM

True, I think it's a valid post, I just couldn't resist.

ethan

09-07-2004, 09:03 PM

The chipEV/all-in is skewed a bit by the fact that you've been running better in the $55 tournaments, so you'll have spent a larger percentage of your time with a lot of chips in shorthanded situations at the end.

eastbay

09-07-2004, 09:07 PM

[ QUOTE ]
The chipEV/all-in is skewed a bit by the fact that you've been running better in the $55 tournaments, so you'll have spent a larger percentage of your time with a lot of chips in shorthanded situations at the end.

[/ QUOTE ]

Exactly. Which is why scaling it by stack size makes sense.

eastbay

stupidsucker

09-07-2004, 11:50 PM

As always a thought provoking post.

I would love for poker tracker to have this as an option.

When running bad it's always nice to see concrete proof that it wasn't your own fault. It really puts the long run into perspective. I want to make the dive back into the 50s shortly, but they sure did kill me the last time I tried.

ethan

09-08-2004, 02:43 AM

[ QUOTE ]
[ QUOTE ]
The chipEV/all-in is skewed a bit by the fact that you've been running better in the $55 tournaments, so you'll have spent a larger percentage of your time with a lot of chips in shorthanded situations at the end.

[/ QUOTE ]

Exactly. Which is why scaling it by stack size makes sense.

eastbay

[/ QUOTE ]

Yup. Although those results may be skewed by the same phenomenon, since you're spending more time in the $55s with higher blinds and thus a much smaller stack in terms of BB (so more of your all-ins will be to steal blinds). You're also likely to spend more time near the end with a stack that's significantly above or below the mean, and your opposition will be better. I'm not certain how significant the impact of any of these might be, but I could see them skewing results.

It might also be worth considering what sort of hand you're going to get all-in with earlier in the tournament versus what you could have later on. This might be part of the .53/.47 discrepancy.

KJ o

09-08-2004, 05:50 AM

[ QUOTE ]
I would love for poker tracker to have this as an option.

[/ QUOTE ]

Absolutely. There has been some talk about developing an alternative to PT geared only to tournaments (or perhaps just OTT?). While I can see the merits of that, I still think PT has too much good stuff to be replaced, and I see no particular reason why stuff like this couldn't be added to it.

It would also be interesting to see how I do when my opponent is all-in. Do I make them double up too often, for instance?

PrayingMantis

09-08-2004, 06:27 AM

OK,

One thought I had in regard to this is that (like others have said) it is a great way to measure the "luck" factor, or in other words, the "true" skill of a player (let's not get into the question of whether it should be EV/AV or stack-size-related EV/AV).

I'd think that it's about time someone came up with a way to say how skillful someone is, without regard to ROI. In this sense, ROI could mean something like your AV, while EXPECTED ROI would be something like EV.

For instance - with your method, or something similar, it is possible to see what is the correlation between your EV/AV in a specific tourney, and your ROI there. Then it will be easy to come out with a kind of expected ROI factor: Let's now call this EROI, just for the sake of simplicity.

Now, we will be able to compare the skills of two players, without regard to their real ROI. Say one player is playing the $55 for 400 games, with ROI of 19%, and another playing the same sample with an ROI of 35%. However, if we will look at the EV/AV ratio, we will find out that player A was running extra cold, while player B was running extra hot. The EROI of player A will be higher than of player B.

In the very long run, EROI should basically be very close to ROI for each specific player. However, in the shorter run, which is usually the case, EROI will tell us much more about the skill of a player, and also about the profitability of a game, without the need for a really big sample.

Am I making sense?

eastbay

09-08-2004, 11:10 AM

[ QUOTE ]
OK,

One thought I had in regard to this is that (like others have said) it is a great way to measure the "luck" factor, or in other words, the "true" skill of a player (let's not get into the question of whether it should be EV/AV or stack-size-related EV/AV).

I'd think that it's about time someone came up with a way to say how skillful someone is, without regard to ROI. In this sense, ROI could mean something like your AV, while EXPECTED ROI would be something like EV.

For instance - with your method, or something similar, it is possible to see what is the correlation between your EV/AV in a specific tourney, and your ROI there. Then it will be easy to come out with a kind of expected ROI factor: Let's now call this EROI, just for the sake of simplicity.

Now, we will be able to compare the skills of two players, without regard to their real ROI. Say one player is playing the $55 for 400 games, with ROI of 19%, and another playing the same sample with an ROI of 35%. However, if we will look at the EV/AV ratio, we will find out that player A was running extra cold, while player B was running extra hot. The EROI of player A will be higher than of player B.

In the very long run, EROI should basically be very close to ROI for each specific player. However, in the shorter run, which is usually the case, EROI will tell us much more about the skill of a player, and also about the profitability of a game, without the need for a really big sample.

Am I making sense?

[/ QUOTE ]

Yes, but I think you'd need a much more sophisticated way of calculating EROI than simply comparing all-in EV vs AV given the cards after they're shown.

For example, let's say that in 10 tournaments in a row I get KK in the SB on the bubble, move in, and get shown AA in the BB. The "EV" as I've calculated it here would be very low. But of course I was unlucky as hell to run into AA in the first place.

There are many facets to getting lucky in a SnG, and the luck of the all-in showdown is only one of them.

eastbay

PrayingMantis

09-08-2004, 11:53 AM

[ QUOTE ]
Yes, but I think you'd need a much more sophisticated way of calculating EROI than simply comparing all-in EV vs AV given the cards after they're shown.

For example, let's say that in 10 tournaments in a row I get KK in the SB on the bubble, move in, and get shown AA in the BB. The "EV" as I've calculated it here would be very low. But of course I was unlucky as hell to run into AA in the first place.

There are many facets to getting lucky in a SnG, and the luck of the all-in showdown is only one of them.

[/ QUOTE ]

I agree of course, and it's a good point. But I'm pretty sure it's possible to find some kind of a way to express a relation between AV, EV, stack size, probability of facing a better (or worse) hand (and some other factors), for when you push or call, and to find out how it changes (for a specific player), in relation to real ROI.

The equation might be very complicated (and it will always be arbitrary in some way), but I believe you are on the right track to coming up with one... ;)

I'm saying all this because I still find it quite amazing (and it's true for ring games too) that we don't have any method to judge how skillful a player is other than how much money he/she actually makes. Making money is the purpose of the game (for us, at least), so it's a great criterion, and the most important one in real life, but I'd think that in a game so heavily influenced by luck, some other ways of judging skill should be invented, if only to improve one's ability to actually make more money.

poboys

09-08-2004, 12:16 PM

Great post.

I have been thinking/experimenting with ways to measure the quality of my all-in decisions independent of ROI. I like your EV versus AV calculations, and am going to spend some more time thinking about it.

I have developed a few very simple metrics that give me a general idea of how I'm running. Now, I admit that they have some significant limitations (in the sense that they are counts) but they're useful for me. I'm working on modifying them to a weighted formula. Thoughts would be appreciated.

let:
* W = a win,
* L = a loss,
* F = you are a favorite
* D = you are a dog
* c(x) = number of times event X happens in a session.
* c(x|y) = number of times event X happens, given that Y happened.

Here are the metrics I track:
DF = c(F) / c(D)
BBF = c(L|F) / c(W|F)
SOF = c(W|D) / c(L|D)
LF = c(W|D) / c(L|F)

DF (decision factor) - how many times you go all-in as the favorite for each time you go all-in as the dog, a rough measure of how good your all-in decisions are.

BBF (bad-beat factor)- how often do you lose when you should have won

SOF (suck-out-factor)- how often do you suck out

LF (luck factor) - are you getting more than your fair share of luck.

In cold runs, I have had a BBF of over 2.5, which is insane. Like I said before, these numbers generally give me a feel for what's going on, and help me at least get some sleep at night.
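For anyone who wants to play with these, the four ratios can be sketched from a plain list of all-in results. This is not my Perl, just an illustration: the function names and the (favorite, won) tuple layout are made up, and a ratio with a zero denominator is returned as None.

```python
from collections import Counter


def ratio(num, den):
    # Undefined ratios (zero denominator) come back as None.
    return num / den if den else None


def all_in_metrics(results):
    """results: list of (was_favorite, won) booleans, one per all-in."""
    c = Counter(results)
    w_f, l_f = c[(True, True)], c[(True, False)]    # c(W|F), c(L|F)
    w_d, l_d = c[(False, True)], c[(False, False)]  # c(W|D), c(L|D)
    return {
        "DF": ratio(w_f + l_f, w_d + l_d),  # c(F) / c(D)
        "BBF": ratio(l_f, w_f),             # c(L|F) / c(W|F)
        "SOF": ratio(w_d, l_d),             # c(W|D) / c(L|D)
        "LF": ratio(w_d, l_f),              # c(W|D) / c(L|F)
    }


# Invented session: 7 all-ins as favorite (4 won), 3 as dog (1 won).
session = ([(True, True)] * 4 + [(True, False)] * 3
           + [(False, True)] * 1 + [(False, False)] * 2)
metrics = all_in_metrics(session)
print(metrics)
```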

eastbay

09-09-2004, 04:24 AM

[ QUOTE ]
Great post.

I have been thinking/experimenting with ways to measure the quality of my all-in decisions independent of ROI. I like your EV versus AV calculations, and am going to spend some more time thinking about it.

I have developed a few very simple metrics that give me a general idea of how I'm running. Now, I admit that they have some significant limitations (in the sense that they are counts) but they're useful for me. I'm working on modifying them to a weighted formula. Thoughts would be appreciated.

let:
* W = a win,
* L = a loss,
* F = you are a favorite
* D = you are a dog
* c(x) = number of times event X happens in a session.
* c(x|y) = number of times event X happens, given that Y happened.

Here are the metrics I track:
DF = c(F) / c(D)
BBF = c(L|F) / c(W|F)
SOF = c(W|D) / c(L|D)
LF = c(W|D) / c(L|F)

DF (decision factor) - how many times you go all-in as the favorite for each time you go all-in as the dog, a rough measure of how good your all-in decisions are.

BBF (bad-beat factor)- how often do you lose when you should have won

SOF (suck-out-factor)- how often do you suck out

LF (luck factor) - are you getting more than your fair share of luck.

In cold runs, I have had a BBF of over 2.5, which is insane. Like I said before, these numbers generally give me a feel for what's going on, and help me at least get some sleep at night.

[/ QUOTE ]

I like it, but I think it misses an important SnG luck factor that I've mentioned in an earlier post to Mantis.

I was trying to think up another metric along these lines: probability of getting "caught" while preflop all-in vs. actual incidences of getting caught.

You could define getting "caught" as any situation where you're -EV once the cards are turned up.

Say, for example, we're 4-handed and I push JJ from the cutoff with my 1000 chips over the 100/200 blinds. Clearly this is a winning play no matter what. But sometimes you will be shown QQ,KK,AA. This is unlucky.

You could compute a mean probability of getting caught in a -EV situation, assuming any player who could catch you will call (this number is unrealistic for a pure bluff, but anything else like devising calling standards according to context is probably too complicated). Then you could compare that to the actual probability as computed by times you get caught vs. times you don't.

I like that so much I may even implement it. I think it's a huge part of the luck factor in SnGs.
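A rough sketch of the "expected" side of that comparison for the JJ example. The assumptions are mine and are simplifications: "caught" is narrowed to running into a higher pocket pair, every such hand calls, and the opponents' hands are treated as independent, which is only approximately true. The function names are illustrative.

```python
from math import comb

RANKS = "23456789TJQKA"


def p_higher_pair(hero_rank):
    """Chance that one specific opponent holds a pair above hero's pocket pair."""
    higher_ranks = len(RANKS) - 1 - RANKS.index(hero_rank)
    combos = higher_ranks * comb(4, 2)   # 6 combos per higher pair
    return combos / comb(50, 2)          # 50 unseen cards after hero's two


def p_caught(hero_rank, opponents):
    # Approximation: treats each opponent's hand as independent of the others.
    p = p_higher_pair(hero_rank)
    return 1 - (1 - p) ** opponents


expected = p_caught("J", opponents=3)  # 4-handed push from the cutoff
print(f"expected chance of running into QQ+ with JJ: {expected:.2%}")

# Comparison against a log of pushes (counts here are invented).
pushes, times_caught = 100, 9
observed = times_caught / pushes
print(f"observed: {observed:.2%}")
```

Extending "caught" to every hand that makes the push -EV once the cards are shown would need an equity calculation per matchup rather than a simple combo count.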

eastbay

poboys

09-09-2004, 01:48 PM

[ QUOTE ]

I like it, but I think it misses an important SnG luck factor that I've mentioned in an earlier post to Mantis.

I was trying to think up another metric along these lines: probability of getting "caught" while preflop all-in vs. actual incidences of getting caught.

[/ QUOTE ]

Interesting point. I definitely agree that there is some luck factor in going all-in with KK when the BB calls and turns over AA. But where do you draw the line on luck? It's also unlucky to go all-in with 88 when your opponent turns over 99.

[ QUOTE ]
You could compute a mean probability of getting caught in a -EV situation, assuming any player who could catch you will call (this number is unrealistic for a pure bluff, but anything else like devising calling standards according to context is probably too complicated). Then you could compare that to the actual probability as computed by times you get caught vs. times you don't.

[/ QUOTE ]

You could compute those probabilities. The problem is that you'd either have to compute the prob of getting caught versus all possible two-card hands or the prob of getting caught considering only the calling standards of the villain. If you choose the former, then I'd bet that this form of luck is not material (in other words, discounting the -EV when you are put against AA or KK with QQ has got to be so close to the EV given any two cards versus QQ that it doesn't matter).

Another option is to count the number of times you had a +EV hand and went all-in against a better +EV hand. This only makes sense late in the game.

[ QUOTE ]

I like that so much I may even implement it. I think it's a huge part of the luck factor in SnGs.

[/ QUOTE ]
I'm very interested to see the results of these calculations, if you are willing to share.

I have some Perl that will review party-poker hand histories and calculate the simple factors I discussed (along with telling you what your odds were, etc.). I'd be interested to see what your numbers were on a small sample.

eastbay

09-09-2004, 02:14 PM

[ QUOTE ]

You could compute those probabilities. The problem is that you'd either have to compute the prob of getting caught versus all possible two-card hands

[/ QUOTE ]

Right. (as opposed to what?)

[ QUOTE ]

or the prob of getting caught considering only the calling standards of the villain.

[/ QUOTE ]

The calling standards of the villain are assumed to be perfect.

I guess I don't really follow your objection.

[ QUOTE ]

I have some Perl that will review party-poker hand histories and calculate the simple factors I discussed (along with telling you what your odds were, etc.). I'd be interested to see what your numbers were on a small sample.

[/ QUOTE ]

PM me for an email address.

eastbay