PDA

View Full Version : An interesting proposition


Gator
12-03-2003, 03:12 PM
Here’s the proposition. You have 1,000 identical slips of paper. You write a distinct number on each piece of paper. There is no bound. The only limitations are that each number be unique and that a person of high school intelligence can quickly determine that one number is higher then another (i.e. 1.5345 is permissable as is 9,999,999.02 as is - 53 -- but i-squared is not). These 1,000 slips of paper are then mxed up in a hat. You randomly select a number and read it out loud. After reading a number, I either tell you to stop or go ahead and read the next number. I win if I stop you after you've read the highest number (before reading another number). I lose if you read the highest number and I tell you to go to the next number. I am only allowed to stop you one time. What percentage of the time will I win this proposition using an optimal strategy (and what is that strategy). There are no tricks (i.e. reading body language, marking the cards, etc.).

bigpooch
12-03-2003, 04:08 PM
See the envelope paradox: it is quite remarkable that the
probability is close to 1/e of being successful (actually,
the real probability differs from 1/e by less than 1/1000!).
Here, ! denotes factorial.

The standard answer is this: if there are N pieces of paper,
go through N/e of them and remember the maximum number seen.
Then pick the next piece of paper that has a higher number
(obviously, you may lose if the first 1/e fraction of them
includes the highest). The remarkable result is that even
as N approaches infinity, the probability of success
approaches 1/e or about 0.36788. This is not intuitively
obvious!

A much more interesting question is this: under what
conditions would you believe that the above strategy will
not do nearly as well as an alternative strategy? Also,
suppose that the numbers need not be unique. There are some
very clear examples especially when N is very large!

thylacine
12-05-2003, 07:32 PM
So can you do much better than 1/e regardless of the number writers strategy?

bigpooch
12-05-2003, 11:00 PM
Not exactly. The optimal strategy with no information about
the numbers that are seen gives the best probability of
being successful. If the writer is trying his best to make
it difficult for the picker, there is no question that the
aforementioned strategy will maximize the chances of picking
the correct number.

Look at some extreme cases when N = 1 billion. Suppose the
uniqueness restraint is lifted and after going through the
first 10 million slips of paper, you only find one digit
integers from 0 to 9 inclusive with seemingly uniform
distribution. It seems extremely likely that the largest
number will be 9!

Suppose N = 1 billion and the numbers are all unique but
after going through the first 10 million of them, it seems
that every number is an integer strictly between 1 and
1 billion and statistically resembles picking (without
replacement) an integer between 1 and 1 billion inclusive
(fortunately, the number 1 billion had not been picked yet).
It would seem very likely that the highest number will be
1 billion.

More generally, suppose N = 1 billion and after examining
the numbers of the first 10 million, it seems that these
real numbers come from a distribution function. If that
distribtution function is any well-known one, it seems
likely that the numbers do indeed come from that specific
distribution function so there would be a very good idea of
what the largest number could be.

Even if there were some noise introduced: for example, if
there were only a handful of ridiculously large or small
numbers among the other numbers which seem to come from
some probability distribution function, would you think that
you would have to sample 1000000000/e numbers?

Gator
12-08-2003, 04:46 PM
Big - You post the original answer as being greater then 30% (in the case of 1,000 numbers). I always understood the answer to be >25% (I'm not doubting you - your answer is more precise). My solution was based on the concept that I only care about two numbers (the highest and the second highest - call them X and Y). I let you read half the numbers (500 in the case of 1,000 - 1/2 billion in the case of a billion). I hope that Y is in the first half. Then, once you read half the numbers, I stop you as soon as you read a higher number. The chance of X being in the first half and Y in the second is 25% and I win every time that happens. There are additional ways to win that put my chances at greater then 25%. Two questions - one is, based on 1,000 numbers, how many should I let you read (I always thought exactly half, but you equation suggests a different number) and second - am I correct that the more numbers you have (i.e. a billion vs. 1,000) the correct answer asymtomically -- sorry about spelling -- approaches 25%?

thylacine
12-08-2003, 05:15 PM
There is a lot going on here. You suggest the following strategy for the Picker (P) when the Writer (W) chooses N distinct numbers. (I'll assume distinct.) For k about N/e, the picker looks at the first k numbers and then stop as soon as there is a number bigger than those first k. (P certanly loses if the biggest was in the first k.) Let's call this 'Basic Strategy' with N and k.

I figure P's probability of success is

f(N,k):=(1/C(N,k))\sum_{i=1}^{n-k} C(n-i-1,k-1)/i

and I believe it is best to choose k about N/e, so that probability of success is about 1/e for big N.

Now suppose that as the numbers are pulled from the hat, that P is not told the actual numbers, but is only told where each new number slots into the ordering of the numbers drawn so far, (e.g. 'the 10th number drawn is the 7th biggest of the first 10 numbers drawn). Then I could easily believe that the above strategy is the best possible.

But if P is told the actual numbers, then he may be able to do better than Basic Strategy. Let's suppose that W selects from some probability distribution of sets of N numbers, and that P knows this probability distribution. Then maybe P has a lot more information and can do better. It could be that after looking at, say, the first m numbers, that P can figure the conditional probability distribution of N-tuples containing those numbers, and can take advantage of that.

In the simplest case N=2, If using Basic Strategy then P could go with k=0 or k=1, since f(2,0)=f(2,1)=1/2, so P might as well flip a coin. But it could be that if the first number drawn is, say, x, then P knows that in the conditional probability distribution of pairs of numbers including x has the other number in the pair being bigger than s say 53% of the time, and so P could go for the 2nd number. W can only thwart this if he can come up with a probability distribution where for every x, the conditional probability that a pair containing x, has x being the smaller, is exactly 50%. BUT THERE IS NO PROBABILITY DISTRIBUTION!! (Do you believe me?) ON THE OTHER HAND, AT LEAST W CAN MAKE P's SUCCESS AS CLOSE TO 50% AS DESIRED, JUST NOT EXACTLY 50% !! (Do you believe me?)

Now consider N=4. For Basic Strategy use k=1. In other words look at the first number drawn, and then stop at the next number that is bigger. This Basic Strategy has success rate f(4,1)=11/24. But now P will always see at least 2 numbers, and it may be that after seeing, say, x then y, that in some cases even when y>x, that P may modify Basic Strategy by not stopping at y, in cases where the conditional probability that y is the biggest of the 4 numbers, given that the 4 numbers include x and y, is less than 1/2.

Can W come up with any probability distribution such that P can have success no better than 11/24?

Can W even come up with any probability distributions such that P can have success no better than something arbitrarily close to 11/24, similarly to the N=2 case? Or is it instead that P can always have some success rate S>11/24, regardless of W's chosen probability distribution?

I will start a related thread 'Another interesting proposition' on a related problem.

thylacine
12-08-2003, 05:26 PM
See my post above. If you look at the first k out of the N numbers, your success rate is


f(N,k):=(1/C(N,k))\sum_{i=1}^{n-k} C(n-i-1,k-1)/i

For given N you want to choose the k that maximises this.

When N is big, then setting p=k/N this is about

-p ln(p) (natural log)

which is maximised when p=1/e, so k is about N/e.

Actually your 25% is a lower bound for

-0.5 ln(0.5) =approx 0.34657359

but this is still smaller than

-(1/e) ln(1/e) = (1/e) =approx 0.36789441.

My post above asks if a different type of strategy can do even better!

bigpooch
12-08-2003, 07:02 PM
Just noticing a typo for 1/e is about 0.367879441.

bigpooch
12-09-2003, 03:42 AM
I am actually not sure what the optimal slips of paper to
read through but it can only one of 367 or 368. I recall
reading about this problem in some probability book but the
continuous approximation you can find buried somewhere in
one of my posts concerning the "envelope paradox". I do
think that intuitively, the number for 1000 slips of paper
is 367.

As the number of slips of paper approaches infinity, the
probability for the optimal strategy of being successful
approaches 1/e or about 0.367879441.

thylacine
12-09-2003, 11:04 AM
I gave a formula for chance of success in this thread (12/08/03 03:15 PM). It was

f(N,k):=(1/C(N,k))\sum_{i=1}^{n-k} C(n-i-1,k-1)/i

and best k is about N/e, presumably rounding up or down (rather than some slightly further number), but I don't know how you decide which without calcuating f(N,k) for both possible k. So for N=1000 I guess you would just try k=367 and 368 in the formula. It wouldn't make much difference though.