PDA

View Full Version : birthdays and probabilities


hazeelnut
02-18-2004, 02:46 PM
Hi there

How many people do you need for it to be at least 50% chance of exactly three of them having the same birthday?

What i really need to know is how you calculate this, not just an answer.

Homer, if you read this, i never thanked you for helping me out last time. I will do that now. Thanks. Hope you can help me again.

bigpooch
02-18-2004, 05:56 PM
How many people do you need for it to be at least 50% chance
of exactly three of them having the same birthday?


You really do mean "exactly", right? Sounds like a homework
problem! /images/graemlins/smile.gif

Even though this year is a leap year, suppose there are only
365 different birthdays (those that were born on Feb. 29
usually choose another day to celebrate) and each occurs
with the same probability 1/365 (also, assume independence
of events).

Let C(m,n) denote the number of combinations of m objects
chosen n at a time = m!/ ( n!.(m-n)! ) where k! = 1x2x...xk.

For a specific date, this is just one example of the
binomial distribution. If there are N people, there are
C(N,3) ways of choosing three of them and the probability of
these 3 having the specific birthday and the other (N-3) not
having this specific birthday is just

(1/365)^3 x (364/365)^(N-3).

Multiplying this by the possible ways of choosing three of
these individuals gives the probability

C(N,3)x364^(N-3)/(365^N). Let this probability be denoted
by q.

Note that C(N,3)= Nx(N-1)x(N-2)/6 and the number x =
364^(N-3)/365^N can be calculated by using logarithms
since most calculators will cause an overflow (just take
ln x = (N-3)ln(364)-Nln(365) etc.).

Now, from the general inclusion-exclusion formula, if B(1)
is the event of exactly three birthdays on Jan. 1, and B(2)
is the event of exactly three birthdays on Jan. 2, etc.,
then

P = P(B(1) or B(2) or ...B(365)) =
(SUM over i)P(B(i)) - (SUM i<j)P(B(i) AND B(j))
+(SUM i<j<k)P(B(i) AND B(j) AND B(k)) - ...

The first term is just 365xq. The other terms are just
specific instances of the multinomial distribution.

P(B(i) AND B(j)) =
C(N,3)C(N-3,3) x (1/365)^6 x (363/365)^(N-6) and summing
over i<j gives C(365,2) combinations so the second term is
just C(365,2) x C(N,3) x C(N-3,3) x 363^(N-6)/365^N.

The third term is C(365,3) x C(N,3) x C(N-3,3) x C(N-6,3) x
362^(N-9)/365^N.

etc.

Since P is determined from an alternating series with
decreasing terms (from some point on!), one can actually
just use a calculator to find N.

If you were to compute the first four terms for N=89, you
can see that P>0.500018.

Note that one can compute this combinatorially instead and
divide by 365^N, but you would have to use log10 on your
calculator!

irchans
02-20-2004, 10:04 AM
Bigpooch,

I tried to reproduce your numbers, but my results did not quite match yours. Here is the program I wrote in mathematica and the answer given by mathematica.

Program:

<font class="small">Code:</font><hr /><pre>
c = Binomial;

exa1[n_] := c[n,3]*(1/365)^3* (364./365)^(n-3);
exa2[n_] := c[n,3]*c[n-3,3]*(1/365)^6* (364./365)^(n-6);
exa3[n_] := c[n,3]*c[n-3,3]*c[n-6,3]*(1/365)^9*(364./365)^(n-9);
exa4[n_] := c[n,3]*c[n-3,3]*c[n-6,3]*c[n-9,3]*(1/365)^12*(364./365)^(n-12);

Print[exa1[89]*365 - c[365,2]*exa2[89] + c[365,3]*exa3[89] - c[365,4]*exa4[89] ]
</pre><hr />

Output:

0.463949

Here are the intermediate values:

exa1[89] = 0.001844568289232847
exa2[89] = 3.9141367966673345 * 10^-6
exa3[89] = 7.456887004811742 * 10^-9
exa4[89] = 1.270322065528938 * 10^-11


Do you see a mistake?

Also, there may be another valid way to interpret the question. Your method answers this question, "What is the minimum number of random birthdays so that there is a greater than 50% chance of one or more days having exactly three occurences?" In your notation, B(1) or B(2) or ... or B(365).

Another interpretation of the question is, "What is the minimum number of random birthdays so that there is a greater than 50% chance of exactly one day having exactly three occurences?" In your notation,

B(1) or
(not(B(1)) and B(2)) or
(not(B(1)) and not(B(2)) and B(3)) or ...
(not(B(1)) and not(B(2)) and ... not(B(364)) and B(365)).


Cheers,
Irchans

bigpooch
02-20-2004, 10:49 AM
irchans:

You made exactly the same mistake I committed when making a
first attempt to crunch out the numbers on a calculator!


Program:

Code:


c = Binomial;


exa1[n_] := c[n,3]*(1/365)^3* (364./365)^(n-3);

exa2[n_] := c[n,3]*c[n-3,3]*(1/365)^6* (364./365)^(n-6);

The last factor should be (363./365)^(n-6)


exa3[n_] := c[n,3]*c[n-3,3]*c[n-6,3]*(1/365)^9*(364./365)^(n-9);

Again, last factor should be (362/365)^(n-9)


exa4[n_] := c[n,3]*c[n-3,3]*c[n-6,3]*c[n-9,3]*(1/365)^12*(364./365)^(n-12);

...(361/365)^(n-12)


Print[exa1[89]*365 - c[365,2]*exa2[89] + c[365,3]*exa3[89] - c[365,4]*exa4[89] ]



Here are the first five terms according to my trusty
calculator! :

+0.673267425
-0.206935637
+0.038572886
-0.004886074
+0.000446727

= approx. 0.5004653

The first term checks with your calculation.


You did make an important point: the question could be
interpreted a little differently, and the more complicated
question is the last one you stated!

Cheers,

bigpooch

JTrout
02-20-2004, 11:41 AM
So, how many people do you need?!

I tell you, every time I come to this forum and see how smart my competition is, it scares the hell out of me! /images/graemlins/grin.gif

irchans
02-20-2004, 03:55 PM