Two Plus Two Older Archives  

Go Back   Two Plus Two Older Archives > General Gambling > Probability
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools Display Modes
  #1  
Old 02-23-2005, 12:04 AM
gaming_mouse gaming_mouse is offline
Senior Member
 
Join Date: Oct 2004
Location: my hero is sfer
Posts: 2,480
Default More Complex Birthday Problem

Inspired by this post in OOT.

We ask random people for their birthday, and we want to know how long it will be before we get a representative for every day of the year. Obviously the minimum required is 365 people, but what is the distribution of the waiting time till we get a person for every bday?

I was thinking about it like this... number every day in the year 1,2,... 365. So the problem can be reduced to generating random numbers between 1 and 365. After asking N people their birthday (or generating N random numbers), the chance that a particular birthday (say Jan 1) has not yet been found is:

(364/365)^N

Letting E1 = event that birthday 1 is not yet found
E2 = same for bday 2
etc

The chance that we have not completed the full bday spectrum after N should be:

P(E1 or E2 or E3... or E365)

We should then be able to use inclusion-exlcusion. However, there must be an error in my thinking b/c I get

365*(364/365)^N - (365 choose 2)*(363/365)^N

for the first 2 terms, which gives you a negative number for N=400. What's wrong with this approach? What a better way to do it?

gm
Reply With Quote
  #2  
Old 02-23-2005, 01:29 AM
reubenf reubenf is offline
Member
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 85
Default Re: More Complex Birthday Problem

NM [img]/images/graemlins/frown.gif[/img]
Reply With Quote
  #3  
Old 02-23-2005, 03:35 AM
pzhon pzhon is offline
Member
 
Join Date: Mar 2004
Posts: 66
Default Re: More Complex Birthday Problem

This situation is called the coupon collector problem.

Your approach is ok, though you need many more terms to get a good approximations when N is small. It's like e^-x = 1-x+x^2/2-x^3/6+x^4/24-... for x=10. If you stop after a couple of terms, your approximation will be horrible. You need to continue until the terms are smaller than your error tolerance.

The way I usually think of the problem is that there is an independent series of 365 waits for the next new birthday. Each of these has a geometric distribution (based at 1), so you want a convolution of a geometric distributions of mean 365/i for i=1 to 365. You may be able to get a good approximation by using a normal approximation for most of the terms, then convolving the last few together. However, you can also compute each term directly by inclusion-exclusion.

The probability for N=400 is extremely low. You do not want to trust the normal approximation that far from the mean (2365). I get a probability of 1.470387 x 10^-117 for the probability of collecting all 365 birthdays on the 400th person, and 1.771290 x 10^-117 for the probability of collecting all 365 birthdays on or before the 400th person.
Reply With Quote
  #4  
Old 02-23-2005, 01:56 PM
gaming_mouse gaming_mouse is offline
Senior Member
 
Join Date: Oct 2004
Location: my hero is sfer
Posts: 2,480
Default Re: More Complex Birthday Problem

[ QUOTE ]
NM [img]/images/graemlins/frown.gif[/img]

[/ QUOTE ]

What does NM mean?
Reply With Quote
  #5  
Old 02-23-2005, 02:04 PM
gaming_mouse gaming_mouse is offline
Senior Member
 
Join Date: Oct 2004
Location: my hero is sfer
Posts: 2,480
Default Re: More Complex Birthday Problem

[ QUOTE ]
Your approach is ok, though you need many more terms to get a good approximations when N is small.

[/ QUOTE ]

pzhon,

thanks. the one thing i was also curious about is.... Will this approach yield an answer of 0 whenever N <= 365?
Reply With Quote
  #6  
Old 02-23-2005, 02:32 PM
reubenf reubenf is offline
Member
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 85
Default Re: More Complex Birthday Problem

"Nevermind" -- I posted some dumb stuff at first [img]/images/graemlins/laugh.gif[/img]
Reply With Quote
  #7  
Old 02-23-2005, 02:33 PM
gaming_mouse gaming_mouse is offline
Senior Member
 
Join Date: Oct 2004
Location: my hero is sfer
Posts: 2,480
Default Re: More Complex Birthday Problem

[ QUOTE ]
Each of these has a geometric distribution (based at 1), so you want a convolution of a geometric distributions of mean 365/i for i=1 to 365.

[/ QUOTE ]

Can you explain this more? Maybe work it out a little. It's been a while since I did any convolutions...
Reply With Quote
  #8  
Old 02-24-2005, 08:14 AM
pzhon pzhon is offline
Member
 
Join Date: Mar 2004
Posts: 66
Default Re: More Complex Birthday Problem

[ QUOTE ]
Will this approach yield an answer of 0 whenever N <= 365?

[/ QUOTE ]
I think what you wrote was the probability not all birthdays would be covered by the nth person, and inclusion-exclusion correctly computes that the probability is 1 when N<365.

I think you are vulnerable to round-off errors if you try inclusion-exclusion directly, since you may get some very large terms in relation to the final answer. If you use a program that handles exact arithmetic, such as Mathematica, you should be ok.

Probability of getting all 365 birthdays among 400 people=

52483783422874597804160994139395052025933928631726 725852252
99768854194834360516158398330245464957400255876549 462098238
47228687271798613101893209981804672821136797825938 362553538
25049977758566210242062865782721137615697095321391 577013114
25676752994679811065943416002065242556195356480054 219759684
31210746116923468802817449397369613633099595400370 320464218
24872536041413773310765315008498940023780391424078 265768602
19987070046017307154967050540477087738245139006423 901519499
01389454812286268745961138241066130684236022645943 939185388
77700816709566362914353680068270916089700797886807 635786185
61899033640283175798640933254274115346564829172120 638441654
17041313354684823539241894869119994267587969671204 469096824
96880975959627142155839914533457990127206730956608 829624310
50981276975114189260121962841396152335583744120632 158434885
632

/

29630252322857330632860424262750594471896783621785 973743878
41241915001644078374133983027032394826074040462250 328944831
38752795141245056701059379755101369221623724578878 905323513
47169340119880781481602513373705371474348886379318 578519619
00275516891935071175600989306565258776254981567031 087846506
12907337087388163058618770889105501750377059626426 176323012
93726956610244222002778589678328490750986439279558 940328110
07993934014055469891109772226591317066138091345357 944630363
15195939166023536301499226617746210919458226962382 015628472
57533834343265983699483897086207754231171856141729 326450570
70588325353207133182276562122241667612147364191719 951173014
73621509459492954942517135016421871450666811790994 892406807
25957806076959867245679135355421611586685398429027 388875114
27694051624908740292708935377520917848398986895760 820807587
07346219272122014985846478290749386512588048286793 800129024
47174999221097998732335487837019627477275207638740 539550781
25
Reply With Quote
  #9  
Old 02-24-2005, 08:32 AM
pzhon pzhon is offline
Member
 
Join Date: Mar 2004
Posts: 66
Default Re: More Complex Birthday Problem

[ QUOTE ]
[ QUOTE ]
Each of these has a geometric distribution (based at 1), so you want a convolution of a geometric distributions of mean 365/i for i=1 to 365.

[/ QUOTE ]

Can you explain this more? Maybe work it out a little. It's been a while since I did any convolutions...

[/ QUOTE ]

A discrete convolution is what happens if you multiply two polynomials or power series.

A generating function for the waiting time between the first birthday and the second new birthday is

f_2 = 364/365 x + (364/365)(1/365) x^2 + (364/365)(1/365)^2 x^3 + ...

The coefficient of x^n is the probability that the wait is n.

A generating function for the waiting time between the second new birthday and the third new birthday is

f_3 = 363/365 x + (363/365)(2/365) x^2 + ... + (363/365)(2/365)^(i-1) x^i + ...

To get a generating function from the first birthday until the third, multiply f_2 by f_3.

f_n = x (366-n)/365 / (1 - (n-1) x / 365) (as a power series)

I'm not sure this helps much for computing any individual probability. I used it, but I still had to do some work to avoid losing numerical precision or relying on slow exact arithmetic. However, recognizing that the total waiting time is a sum of independent random variables lets you compute the mean and standard deviation easily from the means and standard deviations of the time between the n-1st and nth new birthday for n=1,...,365.
Reply With Quote
  #10  
Old 02-24-2005, 09:53 AM
rivermeelmo rivermeelmo is offline
Junior Member
 
Join Date: Feb 2005
Posts: 2
Default Re: More Complex Birthday Problem

isn't this basically a pvnp type deal
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -4. The time now is 07:46 PM.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.