[0,1] game: Methods

Aisthesis · #1 06-17-2004, 11:16 AM

After one of the supposedly "simple" exercises left me unable to figure out exactly where my attempt at solving the problem by logic went wrong, it seems like a good idea to consider the various methods of solving these things. We've basically seen three of them (including Jerrod's "primer material" link):

1) Partial derivatives
2) Indifference equations (elaborated by Jerrod in the link)
3) The logic of pot odds

Applied correctly, all 3 of these methods have to lead to the same solution. The logic of pot odds (method 3) has the big advantage that it is typically much simpler, so there is far less room for clerical errors, as long as the logic is applied correctly every step of the way (which I must not have been doing). Jerrod, who I'm assuming is the person here most versed in this type of problem, apparently solved David's #4 primarily by the logic method.

So, anyhow, while for more complicated [0,1] games some kind of combined approach may be necessary (as Jerrod also seems to do on occasion), I'd like to explore a (simple) game or 2 trying all 3 methods in the hope that that might illuminate how they fit together.

What seems to me to be the case is that there are at least certain peculiarities in games where both players have put money in the pot and have the option of folding. These peculiarities are what I think is giving me some problems with making the correct logical steps all the way through.

I'll start off with a seemingly quite simple game, which I'll call game #5:

A and B both ante $1. B is first to act and can check or raise one additional dollar. A only has the option of calling or folding. He can't raise.

I'll try to solve this by each of the 3 methods in subsequent posts. Hopefully they'll all give the same solution.

First, however, I think I'll try proving a "lemma" for games of this kind.

well · #2 06-17-2004, 12:13 PM

1)

Player A:
[0,a] : fold
[a,1] : call
Player B:
[0,w] : raise
[w,x] : check
[x,1] : raise

EV for player B, with w<=a<=x:

EVb=-w+3*w*a-w^2-a+a*x+x-x^2

Partial derivatives equal zero for {w,a,x} = {0.1,0.4,0.7}, EVb = 0.1
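As a quick sanity check on these numbers, the posted EVb expression can be evaluated with exact fractions at w = 0.1, a = 0.4, x = 0.7. This is only a sketch: the function names `ev_b` and `gradient` are made up for illustration, and the partial derivatives are taken by hand from the formula as posted.

```python
from fractions import Fraction as F

def ev_b(w, a, x):
    # EVb exactly as posted above (valid for w <= a <= x)
    return -w + 3*w*a - w**2 - a + a*x + x - x**2

def gradient(w, a, x):
    # Hand-derived partial derivatives of ev_b with respect to w, a and x
    d_w = -1 + 3*a - 2*w
    d_a = 3*w - 1 + x
    d_x = a + 1 - 2*x
    return d_w, d_a, d_x

w, a, x = F(1, 10), F(4, 10), F(7, 10)
print(gradient(w, a, x))   # all three partials are zero at this point
print(ev_b(w, a, x))       # 1/10
```

Using `Fraction` rather than floats is just a way to confirm that the decimals are exact values and not rounded ones.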

Aisthesis · #3 06-17-2004, 12:31 PM

Let's say we have a [0,1] game with players A and B. There is a current pot of P. At the current stage of the game, B has the option of checking or raising by $1. If B checks, the hands will be shown down. If B raises, A will have the option of folding or calling, and then the game will be over. B further knows that A will call his raise on [a,1] and fold on [0,a]. What is the optimal value b for which he will make a value-raise on [b,1] and check [0,b]?

Lemma 1: b = 1/2 + a/2
That is, independent of the size of the pot, B will raise the top half of the hands that A will call.

Proof by indifference equations:
Value of checking at b: If A has [0,b], then B wins P; otherwise the value is 0, hence: b*P

Value of raising at b:
A has [0,a]: a*P
A has [a,b]: (b-a)(P+1)=b*P + b - a*P - a
A has [b,1]: -(1-b)=b-1
Total value of raising is: bP + 2*b - a - 1

Now, b is the point where B is indifferent to checking or raising, hence we have:
b*P = b*P + 2*b - a - 1
a+1 = 2*b
1/2 + a/2 = b
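The indifference argument above can be checked numerically. The sketch below uses hypothetical helper names (`value_check`, `value_raise`) and an arbitrary call threshold a = 2/5; it only confirms that the two values agree at b = 1/2 + a/2 for a few pot sizes.

```python
from fractions import Fraction as F

def value_check(b, P):
    # B wins the pot P whenever A holds [0, b]
    return b * P

def value_raise(b, a, P):
    fold = a * P                   # A folds [0, a], B wins P
    call_win = (b - a) * (P + 1)   # A calls with [a, b], B wins P + 1
    call_lose = -(1 - b)           # A calls with [b, 1], B loses the extra bet
    return fold + call_win + call_lose

a = F(2, 5)             # arbitrary call threshold, chosen for illustration
b = F(1, 2) + a / 2     # threshold claimed by Lemma 1
for P in (F(1), F(2), F(10)):
    # the difference is 0 for every pot size
    print(P, value_raise(b, a, P) - value_check(b, P))
```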

Proof by differentiating over the variable b, set as the value above which B raises:
What is EVb in this situation?
1) A has [0,a] and B has [a,1]: B always wins P, and the probability is independent of the variable b, so this term is a constant that will disappear on differentiation

2) A has [a,b] and B has [b,1]: (b-a)*(1-b)*(P+1) = (P+1)*(b - b^2 - a + a*b) = b*(P+1) - (P+1)*b^2 - (P+1)*a + (P+1)*a*b

3) All other ranges have 0 EV.

So, we just need to differentiate equation 2) over the variable b and solve for a 0 value:

0 = (P+1) - 2*b*(P+1) + a*(P+1)
0 = 1 - 2*b + a
2*b = 1 + a
b = 1/2 + a/2

I'm assuming (please correct me, someone, if I'm wrong!) that this will also apply on ranges for A's hand that are smaller than [0,1], for example, [c,d] where 0<c and d<1. Regardless of pot size, if B knows that A will call his raise at a with c<=a<=d, then B's raise-threshold will be midway between a and d--i.e., b = d/2 + a/2
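Here is a rough check of that assumption, under the added assumption (not stated above) that A's hand is uniformly distributed on [c,d]. The function name `raise_minus_check` and the sample numbers are made up for illustration.

```python
from fractions import Fraction as F

def raise_minus_check(b, a, c, d, P):
    # ASSUMPTION: A's hand is uniform on [c, d]; B holds b with a <= b <= d.
    width = d - c
    check = P * (b - c) / width
    # A folds [c, a], calls and loses with [a, b], calls and wins with [b, d]
    raise_value = (P * (a - c) + (P + 1) * (b - a) - (d - b)) / width
    return raise_value - check

c, d, a = F(1, 5), F(4, 5), F(1, 2)   # sample values with c <= a <= d
b = d / 2 + a / 2
for P in (F(1), F(2), F(7)):
    # 0 each time: B is indifferent at the midpoint of [a, d]
    print(P, raise_minus_check(b, a, c, d, P))
```

The pot terms cancel and the difference reduces to (2*b - a - d)/(d - c), which is why the pot size drops out here as well.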

Ok, that's a first contribution toward a "logical solution" on games like this. I'm sure this is trivial to many, but on some of these problems, something that seems logical to me obviously isn't.

Aisthesis · #4 06-17-2004, 12:38 PM

Whew! I'm glad we got the same values on this one!!!

I note that the decimal numbers you mention are not rounded at all but exact values (if my calculation late last night was correct--and I'm assuming it was, since it's the same as yours).

The strange part is that it's not going to agree with my initial results by method 3). So, hopefully, this will get things going toward figuring out where my logic has been off!

I'll probably go through the indifference equations first, just to make sure they agree with the presumably correct results through partial derivatives.

Aisthesis · #5 06-17-2004, 05:51 PM

This lemma has nothing to do with the various methods issues but is meant just to seal up another point regarding the status of optimal solutions. Again, I'm sure it's trivial to many.

The going definition for optimal strategies seems to be: alpha and beta are optimal strategies for A and B respectively iff (def.):

1) B cannot improve his EV by changing strategies as long as A continues to play according to alpha, and

2) A cannot improve his EV by changing strategies as long as B continues to play according to beta.

Lemma 2: If (alpha, beta) and (alpha', beta') are distinct pairs of optimal strategies for A and B respectively, then EVa(alpha,beta) = EVa(alpha',beta')

Proof: If they are unequal, then one is greater than the other. I'll just show that it leads to a contradiction if EVa(alpha,beta) > EVa(alpha',beta'). This is enough since just substituting strategies would prove the same thing the other way around.

Suppose EVa(alpha,beta) > EVa(alpha',beta').

Since beta is an optimal strategy against alpha (and EVb = -EVa), we know EVa(alpha,beta') >= EVa(alpha,beta). But since alpha' is an optimal strategy against beta', we similarly know that EVa(alpha',beta') >= EVa(alpha, beta'). Due to transitivity, this implies EVa(alpha',beta') >= EVa(alpha,beta), contrary to supposition.

This means basically that, while there may be a number of co-optimal solutions for both players, the same EV will always result from optimal strategies of both players (even though there may be many optimal strategies).
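A toy illustration of Lemma 2, not a proof: the payoff matrix below is entirely made up, and the search is restricted to pure strategies. An entry counts as optimal for both players when A cannot raise it by switching rows and B cannot lower it by switching columns.

```python
from itertools import product

# Hypothetical zero-sum game: PAYOFF[i][j] is EVa when A plays row i and B plays column j.
PAYOFF = [
    [2, 1, 1],
    [3, 1, 1],
    [0, 0, 0],
]

def pure_equilibria(payoff):
    n_rows, n_cols = len(payoff), len(payoff[0])
    found = []
    for i, j in product(range(n_rows), range(n_cols)):
        v = payoff[i][j]
        a_cannot_improve = all(payoff[k][j] <= v for k in range(n_rows))
        b_cannot_improve = all(payoff[i][k] >= v for k in range(n_cols))
        if a_cannot_improve and b_cannot_improve:
            found.append((i, j, v))
    return found

equilibria = pure_equilibria(PAYOFF)
print(equilibria)   # several (row, column) pairs, all with the same EVa
```

In this example there are four co-optimal pairs, and every one of them yields the same value for A.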

Simple enough, but I just wanted to make it explicit.

Aisthesis · #6 06-17-2004, 10:08 PM

Actually, going through this, I realized that the indifference equations (at least in this case) are really just a more formalized version of the logic of pot odds. So, I'll make a few comments on their relationship as I go along.

I'll take as the critical values a, x, and y where:

B bluff-raises [0,x]
B value-raises [y,1]
A calls a raise [a,1]

Obviously x <= a <= y

1) A wants to make B indifferent to raising or checking at x.

Value (to B) of checking at x:
A has [0,x]: B wins 2 for a value of 2*x.
Otherwise 0

Value of raising at x:
A has [0,a]: B wins 2, hence 2*a
A has [a,1]: B loses 1, hence -(1-a) = a-1
Total value of raising: 3*a - 1

For B to be indifferent to raising or checking A will choose a such that these values are equal, hence
Ind. eq. 1) 2*x = 3*a - 1

Actually, I think this is the tricky equation to come up with in terms of the "logic of pot odds."

2) B wants to make A indifferent to calling or folding at a:

Value (to A) of calling at a:
B has [0,x]: A wins 3, hence 3*x
B has [y,1]: A loses 1, hence y-1
Total value of calling at a is thus 3*x + y - 1

The value of folding at a is simply 0

So, the indifference equation is 0 = 3*x + y - 1 or

2) 3*x = 1 - y

Basically, this just says that since A is getting 3:1 on the call, B must bluff-raise once for every 3 value-raises.

3) A wants to make B indifferent to checking or raising at y.

Value of checking at y:
A has [0,y]: B wins 2, hence 2*y. Otherwise 0

Value of raising at y:
A has [0,a]: B wins 2, hence 2*a
A has [a,y]: B wins 3, hence 3*y - 3*a
A has [y,1]: B loses 1, hence y - 1
Total value of raising at y: 4*y - a - 1

Setting the 2 values equal, we get 2*y = 4*y - a - 1 or
3) a + 1 = 2*y

This equation is basically saying that B will raise the top half of A's call hands, as in lemma 1.

We now have 3 equations:
1) 2*x = 3*a - 1
2) 3*x = 1 - y
3) 1/2 + a/2 = y

Using the substitution from 3), equation 2) yields
3*x = 1 - 1/2 - a/2
6*x = 1 - a, but
6*x = 9*a - 3 according to 1), so
1 - a = 9*a - 3
4 = 10*a
2/5 = a, as in the solution by partial derivatives.
2*x = 6/5 - 1 = 1/5
x = 1/10
y = 7/10
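The substitution above can be cross-checked by solving the three indifference equations as a linear system with exact fractions. This is a minimal sketch; `solve_linear` is a hand-rolled Gauss-Jordan helper written for this post, not a library function.

```python
from fractions import Fraction as F

def solve_linear(matrix, rhs):
    # Gauss-Jordan elimination on an exact-fraction augmented matrix
    n = len(rhs)
    rows = [[F(v) for v in row] + [F(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Unknowns in the order (x, a, y)
coeffs = [
    [2, -3, 0],   # 1) 2*x - 3*a = -1
    [3,  0, 1],   # 2) 3*x + y   =  1
    [0, -1, 2],   # 3) -a + 2*y  =  1
]
x, a, y = solve_linear(coeffs, [-1, 1, 1])
print(x, a, y)    # 1/10 2/5 7/10
```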

Ok, same result this time as with the partial derivatives. And I think my mistake in the more complex situation where A can raise was due to incorrect thinking on equation 1).

It's interesting to note that in this case B raises a little less than in the case where A can raise--in contrast to David's #3 and #4, where the raising hands for B are the same.

Aisthesis · #7 06-18-2004, 07:51 AM

The logic of pot odds was actually the way I came up with raise-values that were correct if A is allowed to raise but incorrect in this case.

I'm going to divide this into steps, and step 1 is the one that's wrong:
1) Since B is getting 2:1 on the bluff-raise, A must call 2/3 of the time to prevent excessive bluffing, hence on [1/3,1]

2) B raises the top half of the hands on which A will call (correct principle), hence [2/3,1] (wrong result because A's call threshold is incorrect).

3) B bluffs once for every 3 value raises (also correct principle), hence [0,1/9].

The problem is step 1), where it's obviously not quite as simple as that. The indifference equation relating B's bluff-raises to A's call threshold was
2*x = 3*a - 1
2*x + 1 = 3*a
(2/3)*x + 1/3 = a

So, this value is going to be higher than 1/3. I'm still not quite seeing the relationship of a and x directly in terms of the logic of pot odds here--although one could say that A must call on 2/3 of his hands that are greater than x: a = x + (1/3)*(1-x).

I'm just going to conclude that one has to be careful on this step. For me, at the moment, it's probably best to stick to the indifference equation here.
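To make the gap in step 1) concrete, here is a small comparison of the naive pot-odds values (a = 1/3, x = 1/9) with the values from the indifference equations (a = 2/5, x = 1/10). The helper names are hypothetical; the two value formulas are the ones used for indifference equation 1).

```python
from fractions import Fraction as F

def bluff_value(a):
    # B raises at the bluff threshold: wins 2 when A folds [0, a], loses 1 when A calls [a, 1]
    return 2 * a - (1 - a)

def check_value(x):
    # B checks hand x and wins the 2 in the pot when A holds [0, x]
    return 2 * x

naive = {"a": F(1, 3), "x": F(1, 9)}
exact = {"a": F(2, 5), "x": F(1, 10)}
for name, s in (("naive", naive), ("exact", exact)):
    print(name, bluff_value(s["a"]), check_value(s["x"]))
```

With the naive numbers the bluff is worth 0 while checking the same hand is worth 2/9, so B would rather check; with the exact numbers both are worth 1/5. The exact values also satisfy a = x + (1/3)*(1-x).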

Aisthesis · #8 06-18-2004, 11:15 AM

Ok, I think I've finally found the bug in my previous solution to the same game when A can raise, too:

Let's take a, b, c, x, y and z as the following:

B: bluff-raises [0,x]
value-raises [y,1]
calls a raise [z,y]

A: bluff-raises [0,a]
value-raises [b,1]
calls a raise [c,1]

We know x < a < b < y and a < z < b, and x < c < y given these definitions. The relationship between c and z isn't clear to me in advance, but it shouldn't matter in the indifference equations.

First, A wants to make B indifferent to raising or checking at x (bluff-raise).

Value to B of raising at x is the same as before:
A has [0,c] means B wins 2, hence 2*c
A has [c,1] means B loses 1, hence c - 1
Total value is 3*c - 1

But there's a big difference when B checks at x.
Now, B never wins on those hands because A's bluff threshold is a > x and B's call threshold is z > x.

Hence c = 1/3
That already has to mean that y = 2/3 and x = 1/9, since c alone determines the values of x and y. B's bluff- and value-raises are what will make A indifferent to calling or folding at c.
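For what it's worth, the chain c -> y -> x can be spelled out with exact fractions. This is just a sketch of the arithmetic under the principles already used in this thread (raise the top half of A's calling hands, one bluff-raise per three value-raises); the variable names follow the definitions above.

```python
from fractions import Fraction as F

c = F(1, 3)            # from 3*c - 1 = 0: a bluff-raise at x is worth 0, the same as checking
y = F(1, 2) + c / 2    # B value-raises the top half of A's calling hands
x = (1 - y) / 3        # one bluff-raise for every three value-raises

# A's value of calling at c: wins 3 against [0, x], loses 1 against [y, 1]
call_value_at_c = 3 * x - (1 - y)
print(c, y, x, call_value_at_c)   # 1/3 2/3 1/9 0
```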

So, basically, the few hands that B could win in game #5 by checking will now make it better for B to bluff more often. It also explains why this didn't matter in David's #3 and #4 because in that case, B has the option of folding these bad hands for an EV of 0! On #5, B already has his money in the pot, so if B checks and A doesn't have the option of bluff-raising, then B will actually win a little bit, hence making the check option slightly more attractive.

Now, the question of hands where B checks [1/9,2/3].

First, B wants to make A indifferent to bluffing at a (>1/9 obviously):

Value (to A) of raising at a:
If B has [1/9,z], then A wins 2, hence 2*z - 2/9
If B has [z,2/3], then A loses 1, hence z - 2/3
Value of raising is: 3*z - 2/3 - 2/9 = 3*z - 8/9

Value of checking at a:
A wins 2 if B has [1/9,a], hence 2*a - 2/9.

So, 2*a - 2/9 = 3*z - 8/9
2*a + 6/9 = 3*z
2*a + 2/3 = 3*z
1) 6*a + 2 = 9*z

Secondly, A wants to make B indifferent to calling a raise at z:

Value to B of calling a raise at z:
If A has [0,a], then B wins 3: 3*a
If A has [b,1], then B loses 1: b - 1
Total value of calling is 3*a + b - 1

Since the value of folding is 0, we have
0 = 3*a + b - 1
2) 1 - b = 3*a

Finally, B wants to make A indifferent to raising or checking at b.

Here, A wins an extra bet when B has [z,b] and loses an extra bet when B has [b,2/3]. So,
b - z = 2/3 - b
2*b = 2/3 + z
3) 6*b = 2 + 3*z

So, we have:
1) 6*a + 2 = 9*z
2) 1 - b = 3*a
3) 6*b = 2 + 3*z

6*a + 2 = 9*z by 1)
6*a = 2 - 2*b from 2)
2 = 9*z - 2 + 2*b
4 = 9*z + 2*b
18*b = 9*z + 6 from 3)
4 - 18*b = 2*b - 6
10 = 20*b
1/2 = b
1/2 = 3*a
1/6 = a
1 + 2 = 9*z
1/3 = z
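The elimination above can be double-checked by plugging the results back into all three equations. A minimal sketch with exact fractions follows; the values x = 1/9 and y = 2/3 are taken from earlier in this post.

```python
from fractions import Fraction as F

b = F(10, 20)          # from 10 = 20*b
a = (1 - b) / 3        # equation 2): 1 - b = 3*a
z = (6 * b - 2) / 3    # equation 3): 6*b = 2 + 3*z

residuals = (
    6 * a + 2 - 9 * z,     # equation 1)
    1 - b - 3 * a,         # equation 2)
    6 * b - 2 - 3 * z,     # equation 3)
)
x, y = F(1, 9), F(2, 3)    # B's raising thresholds found above
print(a, b, z, residuals)
print(x < a < z < b < y)   # ordering assumed when setting up the ranges
```

All three residuals come out as zero, and the thresholds respect x < a < z < b < y.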

Ok, finally the solution from the indifference equations is identical to the one from partial derivatives. I could have taken a few shortcuts on the second part but wanted to spell it out explicitly to make sure everything was right.

By the way, I also worked out A's raising strategy (and B's calling strategy) in general given that B has checked on the range [x,y]. I won't write out the explicit equation because it will also depend on pot odds in any specific instance. But the interesting fact is that the raising strategy is completely independent of x!!!! It's only y that is of any concern at all here--although in the general case, x will set a minimum value for A's bluff-raise (which might create some complications in games where we get to this situation only after several actions by the various players).