Two Plus Two Older Archives  

  #11  
Old 10-26-2005, 10:26 PM
BruceZ BruceZ is offline
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Riddle -- Probability of Expectation

[ QUOTE ]
[ QUOTE ]
[ QUOTE ]
If one of the letters is an H it really doesn't affect the chance of an E showing up because if it wasn't an H it would still be something other than E.

[/ QUOTE ]
But it does affect it. If the first letter is an "h," then there are only 35 chances to get an "e." If it is not an "h," then there is a 1/25 chance instead of a 1/26 chance that it is an "e."

[/ QUOTE ]

Doing a binomial distribution calculation or taking an event's likelihood to the power of the number of trials takes this into account.

In this example, the binomial probability that an event with probability 1/26 does not occur in 36 trials is 0.24366872185316. This is the same as (25/26)^36.

I am sorry I don't know how to explain it any better.

[/ QUOTE ]

Of course Aaron and alThor are correct. This step is incorrect:

[ QUOTE ]
The chance of all 3 letters occurring in the same 36-character line is

0.756331278146836^3 = 0.432649477099284

[/ QUOTE ]

Raising this probability to the 3rd power would only be correct if the events of each letter occurring in 36 trials were independent of each other, and they are not. The selection of each of the 36 letters is independent, but you are confusing that with the overall selection of each letter in 36 trials, and these do not form independent Bernoulli trials, so the binomial distribution does not apply. If one of these letters occurs, then the probability that one of the other letters occurs decreases. You want to multiply P(h occurs)*P(e occurs | h occurs)*P(s occurs | h and e occur). This is not the same as [P(h occurs)]^3 since these events are not independent.

You also cannot multiply the probability that h,e, and s occur by the probability that u or w occur, as you do at the end of the calculation.
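
To put a number on the dependence described above, here is a minimal sketch that compares the cubed single-letter probability with the exact probability that h, e, and s all appear in one 36-letter line. The exact value comes from inclusion-exclusion on the "letter is missing" events; the helper name `p_missing` is made up for this illustration and is not from the thread.

```python
from fractions import Fraction

LINE_LENGTH = 36   # letters per line
ALPHABET = 26      # equally likely letters

def p_missing(k: int) -> Fraction:
    """Probability that k specific letters are all absent from one line."""
    return Fraction(ALPHABET - k, ALPHABET) ** LINE_LENGTH

# Probability that one particular letter shows up at least once.
p_one = 1 - p_missing(1)

# Treating the three letters as independent (the step criticized above).
independent_guess = p_one ** 3

# Exact: 1 - P(h missing OR e missing OR s missing) via inclusion-exclusion.
exact_all_three = 1 - 3 * p_missing(1) + 3 * p_missing(2) - p_missing(3)

print(float(p_one))              # about 0.7563
print(float(independent_guess))  # about 0.4326
print(float(exact_all_three))    # about 0.4250
```

The independent guess comes out larger than the exact value, which matches the remark that seeing one of the letters lowers the chance of seeing the others.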
  #12  
Old 10-26-2005, 11:10 PM
Guest
 
Posts: n/a
Default Re: Riddle -- Probability of Expectation

Thanks for your explanation. I understand the errors in my calculations now.

Regards,

Baker
  #13  
Old 10-27-2005, 07:03 AM
BruceZ BruceZ is offline
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Riddle -- Probability of Expectation

[ QUOTE ]
I was given a sheet of riddles for extra credit in my statistics class and got all but one of them right.

Here's the one that totally stumped me and pissed me off.

In The Art of Shakespeare's Sonnets (Harvard Univ. Press, 1997), author Helen Vendler noted that each of the 14 lines of Sonnet 20 (one of 154 sonnets written by Shakespeare) includes the letters of the word "hues" and/or the letters of the word "hews."

Suppose that 154 monkeys sitting at 154 keyboards pounded out one sonnet apiece, each consisting of 14 lines and 36 alphabet letters each, with each letter equally likely. What is the probability that in at least one of the sonnets, every line includes the letters of the word "hues" and/or the letters of the word "hews?"


What say youse?

Prof. will not be giving the answer until the end of the semester.

[/ QUOTE ]


This problem can be solved very quickly and exactly by the inclusion-exclusion principle. As we have seen, the only difficult part of the problem is to compute the probability that the letters of "hues" or "hews" do not occur in a 36 letter line. If we call this probability P, then our final answer will be

P(at least 1 sonnet has "hues" or "hews" in every line) = 1 - [1 - (1 - P)^14]^154.

That is, 1 - P is the probability that it does occur in a 36 character line. (1 - P)^14 is the probability that it occurs in all 14 lines of a sonnet. 1 - (1 - P)^14 is the probability that it does not occur in every line of a sonnet. [1 - (1 - P)^14]^154 is the probability that it does not occur in every line of any of the 154 sonnets. 1 - [1 - (1 - P)^14]^154 is the probability that it does occur in every line of at least 1 of 154 sonnets.

We are assuming here that every line contains exactly 36 letters, since otherwise we would need to have been given the probability distribution of the length of a line. We were told that each of the 36 letters can be 1 of 26 equally probable letters, so we are ignoring spaces.

Now we simply need to find P, the probability that neither the letters of "hues" nor the letters of "hews" occur in a 36 letter line. Note that this condition will be satisfied if there is no h, or if there is no e, or if there is no s, or if there is both no u AND no w. We simply want the probability of the union of these four conditions. We can compute the probability of this union by adding their probabilities, and then using the inclusion-exclusion principle to adjust for the overcounting as follows:

P = P(no "h-u-e-s" AND no "h-e-w-s") =

P(no h) + P(no e) + P(no s) + P(no u AND no w) -

P(no h AND no e) - P(no h AND no s) - P(no e AND no s) - P[no h AND (no u AND no w)] - P[no e AND (no u AND no w)]- P[no s AND (no u AND no w)] +

P(no h AND no e AND no s) + P[no h AND no e AND (no u AND no w)] + P[no h AND no s AND (no u AND no w)] + P[no e AND no s AND (no u AND no w)] -

P[no h AND no e AND no s AND (no u AND no w)]


This is much simpler than it looks since we can combine many probabilities that are equal. We can rewrite the above as:

P = P(no "h-u-e-s" AND no "h-e-w-s") =

3*P(no h) + P(no u AND no w) -

3*P(no h AND no e) - 3*P[no h AND (no u AND no w)] +

P(no h AND no e AND no s) + 3*P[no h AND no e AND (no u AND no w)] -

P[no h AND no e AND no s AND (no u AND no w)]


These terms are all easy to compute as:

P = P(no "h-u-e-s" AND no "h-e-w-s") =

3*(25/26)^36 + (24/26)^36 -

3*(24/26)^36 - 3*(23/26)^36 +

(23/26)^36 + 3*(22/26)^36 -

(21/26)^36

=~ 60.16%
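
As a cross-check on the arithmetic above, the sketch below applies inclusion-exclusion mechanically to the four "missing" events and compares the result with the collapsed closed form. The names `bad_events`, `p_all_absent`, and `p_union` are invented for this illustration; exact fractions are used so that rounding is not an issue.

```python
from fractions import Fraction
from itertools import combinations

LINE_LENGTH = 36
ALPHABET = 26

# Each event is "all of these letters are absent from the line".
bad_events = [{"h"}, {"e"}, {"s"}, {"u", "w"}]

def p_all_absent(letters: set) -> Fraction:
    """Probability that every letter in `letters` is absent from one line."""
    return Fraction(ALPHABET - len(letters), ALPHABET) ** LINE_LENGTH

def p_union(events) -> Fraction:
    """Probability that at least one event occurs, by inclusion-exclusion."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = 1 if r % 2 == 1 else -1
        for combo in combinations(events, r):
            # The intersection of "absent" events is "the union of letters is absent".
            total += sign * p_all_absent(set().union(*combo))
    return total

P = p_union(bad_events)

# The collapsed expression from the post, in floating point.
closed_form = (3 * (25 / 26) ** 36 + (24 / 26) ** 36
               - 3 * (24 / 26) ** 36 - 3 * (23 / 26) ** 36
               + (23 / 26) ** 36 + 3 * (22 / 26) ** 36
               - (21 / 26) ** 36)

print(f"P     = {float(P):.4%}")      # about 60.16%
print(f"1 - P = {float(1 - P):.4%}")  # about 39.84%
```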

Note that using the independence approximation, Baker computed 1 minus this number as 40.84%, so it was off by about 1% which is reasonable, and it was slightly larger than the exact value, which we expected.

Substituting this value of P into our above equation produces the final answer:

1 - [1 - (1 - P)^14]^154 =~ 0.039%.
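
The last substitution can be reproduced in a few lines. This is only a sketch: the function name `p_at_least_one_sonnet` is hypothetical, and the use of `log1p`/`expm1` is a numerical-stability choice made here (the per-sonnet success probability is tiny), not something the post prescribes.

```python
import math

def p_at_least_one_sonnet(p_line_fails: float, lines: int = 14, sonnets: int = 154) -> float:
    """1 - [1 - (1 - P)^lines]^sonnets, where P is the chance a line fails."""
    p_sonnet_ok = (1.0 - p_line_fails) ** lines
    if p_sonnet_ok >= 1.0:
        return 1.0  # every sonnet succeeds; also avoids log1p(-1)
    # 1 - (1 - x)^n computed as -expm1(n * log1p(-x)) to keep precision for tiny x.
    return -math.expm1(sonnets * math.log1p(-p_sonnet_ok))

# P from the inclusion-exclusion expression in the post.
P = (3 * (25 / 26) ** 36 + (24 / 26) ** 36
     - 3 * (24 / 26) ** 36 - 3 * (23 / 26) ** 36
     + (23 / 26) ** 36 + 3 * (22 / 26) ** 36
     - (21 / 26) ** 36)

answer = p_at_least_one_sonnet(P)
print(f"{answer:.4%}")  # about 0.039%
```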
  #14  
Old 10-27-2005, 01:09 PM
EverettKings EverettKings is offline
Member
 
Join Date: Jun 2004
Location: Williamsburg, VA
Posts: 86
Default Re: Riddle -- Probability of Expectation

Check if this works....

There are (36 choose 4) ways to pick 4 letters in a line. The chance that a given set of 4 letters is some scrambled form of "hues" is (4*3*2*1)/(26^4). Same for "hews" so the chance that a given set of 4 letters matches either is 2*(4*3*2*1)/(26^4) (let's call this number P for simplicity). So the chance that a given set doesn't match either is 1-P. The chance that a line contains no instances of these words is (1-P)^(36 choose 4). (This is based on the assumption that knowing that one 4 letter combination does not contain "hues" or "hews" does not affect the chances of another combination containing them).

The chance that a line DOES contain at least one instance is:
(1 - (1-P)^(36 choose 4)).
So the chance that 14 lines contain one of the words is:
(1 - (1-P)^(36 choose 4))^14.
The chance that a monkey sonnet fails at least one line is therefore:
1 - (1 - (1-P)^(36 choose 4))^14
And the chance that one in 154 monkeys succeeds is:
1 - (1 - (1 - (1-P)^(36 choose 4))^14)^154
Note that the bold section here, (1-P)^(36 choose 4), is (hopefully) equal to the P that BruceZ used, if that makes the formula easier to read.

There's a nonzero chance that some of my work is flawed but I *think* this works.

Everett
  #15  
Old 10-27-2005, 03:35 PM
AaronBrown AaronBrown is offline
Senior Member
 
Join Date: May 2005
Location: New York
Posts: 505
Default Re: Riddle -- Probability of Expectation

Sorry, this doesn't work.

It's true that the chance four randomly picked letters will be some permutation of "hues" is (4*3*2*1)/(26^4). But it's not true that the chance it will be a permutation of either "hues" or "hews" will be twice that. If that were true, then C(26,4)*(4*3*2*1)/(26^4) would have to equal 1. In fact it equals about 0.785. The problem is that some of the random 4 letter combinations have some letters the same.

You have a similar problem with your next step. If I know that the first four letters of the line are "hues" then the probability that the first three letters plus the fifth contain "hues" is 1/26, not 24/26^4.
  #16  
Old 10-27-2005, 05:41 PM
Guest
 
Posts: n/a
Default Re: Riddle -- Probability of Expectation

Hmm. I get .000214...
  #17  
Old 10-27-2005, 07:10 PM
Vex Vex is offline
Junior Member
 
Join Date: Oct 2004
Posts: 18
Default Re: Riddle -- Probability of Expectation

[ QUOTE ]
Hmm. I get .000214...

[/ QUOTE ]

This is a challenging problem indeed. I thought I had worked out that the chance of a single line containing the requisite combinations of letters was about 18%, but when I tried to verify by simulation I found out that the actual value is likely to be a little more than twice that.

Back to the drawing board.

Over all, I think that the chances of one of 154 monkey-sonnets qualifying is in the neighborhood of 0.04%. I just need to figure out what's wrong with my math.

My simulation could be invalid as well; I whipped it up in a hurry and didn't double-check too thoroughly. With 8 lines of code, there is an above zero probability of a bug being in there... :)
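
The simulation code itself is not shown in the thread. Purely as an illustration of what such a check might look like, here is a small Monte Carlo sketch for a single 36-letter line; the function names, trial count, and seed are all assumptions made for this example.

```python
import random
import string

def line_qualifies(line) -> bool:
    """True if the line contains h, e, s and at least one of u or w."""
    letters = set(line)
    return {"h", "e", "s"} <= letters and ("u" in letters or "w" in letters)

def estimate_line_probability(trials: int = 20000, length: int = 36, seed: int = 2005) -> float:
    """Estimate the chance that a random line of `length` letters qualifies."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        line = rng.choices(string.ascii_lowercase, k=length)
        hits += line_qualifies(line)
    return hits / trials

estimate = estimate_line_probability()
print(f"estimated single-line probability: {estimate:.3f}")
```

With this many trials the estimate should land close to 0.40, a little more than twice 18%.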
  #18  
Old 10-27-2005, 07:19 PM
BruceZ BruceZ is offline
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Riddle -- Probability of Expectation

[ QUOTE ]
It's true that the chance four randomly picked letters will be some permutation of "hues" is (4*3*2*1)/(26^4). But it's not true that the chance it will be a permutation of either "hues" or "hews" will be twice that.

[/ QUOTE ]

It is precisely twice that because these are mutually exclusive.


[ QUOTE ]
If that were true, then C(26,4)*(4*3*2*1)/(26^4) would have to equal 1. In fact it equals about 0.785. The problem is that some of the random 4 letter combinations have some letters the same.

[/ QUOTE ]

This is the sum of the probabilities of the C(26,4) letter combinations with 4 distinct letters. It is not 1 because this is less than the total number 26^4 of letter combinations, some of which have repeat letters.
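
Both points can be checked by brute force, since there are only 26^4 = 456,976 four-letter strings. The sketch below counts the strings that are a permutation of "hues" or of "hews", and evaluates the sum over the C(26,4) sets of distinct letters; the variable names are invented for this illustration.

```python
import math
import string
from itertools import product

total = 26 ** 4

# Count 4-letter strings that are a rearrangement of "hues" or of "hews".
targets = {tuple(sorted("hues")), tuple(sorted("hews"))}
matches = sum(
    tuple(sorted(word)) in targets
    for word in product(string.ascii_lowercase, repeat=4)
)

# Sum of the probabilities of all C(26,4) sets of 4 distinct letters.
sum_over_distinct_sets = math.comb(26, 4) * math.factorial(4) / total

# Probability that 4 random letters are all different.
p_all_distinct = 26 * 25 * 24 * 23 / total

print(matches)                 # 48, i.e. 2 * 4!
print(sum_over_distinct_sets)  # about 0.785
print(p_all_distinct)          # same value
```

The count of 48 is exactly 2*(4*3*2*1) because no string can be a rearrangement of both words, and the sum falls short of 1 by the share of strings with a repeated letter.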


I agree that Everett's method does not work though. From Everett's post:

[ QUOTE ]
There are (36 choose 4) ways to pick 4 letters in a line. The chance that a given set of 4 letters is some scrambled form of "hues" is (4*3*2*1)/(26^4). Same for "hews" so the chance that a given set of 4 letters matches either is 2*(4*3*2*1)/(26^4) (let's call this number P for simplicity). So the chance that a given set doesn't match either is 1-P. The chance that a line contains no instances of these words is (1-P)^(36 choose 4). (This is based on the assumption that knowing that one 4 letter combination does not contain "hues" or "hews" does not affect the chances of another combination containing them).

[/ QUOTE ]

This last assumption is the problem. The C(36,4) sets of 4 letters are certainly not independent of each other with respect to containing "hues" or "hews". They would be independent only if they did not overlap. Since they share the same letters, the probability of one set containing these letters is strongly dependent on other sets containing these letters. Aaron has given an example of two sets which share 3 of the 4 letters. If one is "hues" and the other is "hue_", then the probability that the second is also "hues" has jumped from (1/26)^4 to 1/26.


[ QUOTE ]
And the chance that one in 154 monkeys succeeds is:
1 - (1 - (1 - (1-P)^(36 choose 4))^14)^154
Note that the bold section here, (1-P)^(36 choose 4), is (hopefully) equal to the P that BruceZ used, if that makes the formula easier to read.

[/ QUOTE ]

I compute your value in bold to be [1 - 2*(4*3*2*1)/(26^4)]^C(36,4) =~ 0.2%, while my P is about 60.16%.
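
For anyone who wants to reproduce the comparison, here is a short sketch that evaluates the bold expression next to the inclusion-exclusion value of P. The variable names are made up for this example.

```python
import math

# Chance that a given set of 4 positions spells a permutation of "hues" or "hews".
p_match = 2 * math.factorial(4) / 26 ** 4

# Number of ways to choose 4 positions out of 36.
n_sets = math.comb(36, 4)

# The bold expression, which treats all C(36,4) sets as independent.
independent_sets_value = (1 - p_match) ** n_sets

# P from inclusion-exclusion on the missing-letter events.
P = (3 * (25 / 26) ** 36 + (24 / 26) ** 36
     - 3 * (24 / 26) ** 36 - 3 * (23 / 26) ** 36
     + (23 / 26) ** 36 + 3 * (22 / 26) ** 36
     - (21 / 26) ** 36)

print(n_sets)                            # 58905
print(f"{independent_sets_value:.2%}")   # about 0.2%
print(f"{P:.2%}")                        # about 60.16%
```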
  #19  
Old 10-27-2005, 07:24 PM
BruceZ BruceZ is offline
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Riddle -- Probability of Expectation

[ QUOTE ]
Hmm. I get .000214...

[/ QUOTE ]

You get that for what? The final answer? What method are you using? This is about half of the correct final answer, which would correspond to just one of the hues/hews variations.
  #20  
Old 10-27-2005, 07:30 PM
BruceZ BruceZ is offline
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Riddle -- Probability of Expectation

[ QUOTE ]
This is a challenging problem indeed. I thought I had worked out that the chance of a single line containing the requisite combinations of letters was about 18%, but when I tried to verify by simulation I found out that the actual value is likely to be a little more than twice that.

[/ QUOTE ]

The simulation is correct. I showed in post #13 above that this value is about 39.84%. It is the value that I refer to as 1 - P. I derived P by inclusion-exclusion, which gives a fairly simple expression.


[ QUOTE ]
Over all, I think that the chances of one of 154 monkey-sonnets qualifying is in the neighborhood of 0.04%.

[/ QUOTE ]

Right, I got 0.039%.