Baseball Question


RocketManJames
10-12-2005, 07:44 PM
To the baseball and math buffs out there, I have a question.

Take two baseball teams, A and B. Have them play an N game series.

What information would we need so that we could compute the number of games required such that we were X% confident that the winner of the series was truly the better team?

Is this question answerable? And, what other information would we need?

-RMJ

Siegmund
10-12-2005, 09:54 PM
You'll need to be more specific about your notions of "confidence" and "better" before it can be answered - if you pin those down narrowly enough it will become answerable.

You can, for instance, specify that you want a series long enough that a team that wins 52% or more of its games has at least a 95% chance of winning the series. (Bad news: that takes a couple thousand games. Make it 55% and it's only a couple hundred.)
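Figures like these can be checked with an exact binomial sum. What follows is only a minimal sketch, assuming independent games with a constant per-game win probability p and an odd series length; the function names are made up for illustration, and terms are summed in log space so that series of a few thousand games do not overflow.

```python
from math import exp, lgamma, log

def series_win_prob(p, n):
    """Exact chance that a team with per-game win probability p wins
    more than half of n independent games (n odd, so no tied series)."""
    need = n // 2 + 1
    total = 0.0
    for k in range(need, n + 1):
        # log of C(n, k) * p^k * (1-p)^(n-k)
        log_term = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                    + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_term)
    return total

def shortest_series(p, target=0.95):
    """Smallest odd n with series_win_prob(p, n) >= target.
    Assumes p > 0.5; otherwise the search never ends."""
    if p <= 0.5:
        raise ValueError("p must exceed 0.5")
    n = 1
    while series_win_prob(p, n) < target:
        n += 2
    return n
```

Calling series_win_prob(0.52, 2001) or series_win_prob(0.55, 269) shows how close those lengths come to a 95% chance; shortest_series does the same search automatically, though it is slow for p near 0.5.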

You can get a somewhat more efficient use of the players' time by ending a series when one team has a certain margin of a lead, akin to the way tennis and volleyball games have to be won by at least a 2-point lead; but this would require you to either accept the possibility of a series running all winter, or to sanction the concept of a tied series if no conclusive leader emerged after some modest number of games.
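As a rough illustration of the win-by-a-margin idea, here is a sketch that treats the series as a random walk that stops when either team gets m games ahead (the classic gambler's ruin setup). It assumes independent games with constant win probability p, and the function names are invented for this example.

```python
def margin_win_prob(p, m):
    """Chance that the team with per-game win probability p is the
    first to get m games ahead."""
    r = (1 - p) / p
    return 1 / (1 + r ** m)

def margin_expected_games(p, m):
    """Average number of games until one team leads by m."""
    if p == 0.5:
        return float(m * m)  # symmetric walk: m^2 games on average
    win = margin_win_prob(p, m)
    # Wald's identity: expected final lead = (2p - 1) * expected games
    return m * (2 * win - 1) / (2 * p - 1)

def margin_needed(p, target=0.95):
    """Smallest lead m that gives the better team at least `target`
    chance of winning. Assumes p > 0.5."""
    if p <= 0.5:
        raise ValueError("p must exceed 0.5")
    m = 1
    while margin_win_prob(p, m) < target:
        m += 1
    return m
```

Under these assumptions a 55% team needs a 15-game lead for a 95% chance, and such a series lasts roughly 136 games on average, though any individual series can run far longer.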

Or you can accept the fact that the outcome of a single series, or even a single season, has a significant degree of randomness, and spend countless hours discussing the players' strengths and weaknesses to try to determine who is "really" better - which, as far as I can tell, is something serious baseball fans enjoy even more than watching a game. Be a shame to spoil their fun, wouldn't it?

10-13-2005, 01:34 AM
Isn't that a simple answer?

wins / games = x

so that's the confidence % = x.

AaronBrown
10-13-2005, 11:21 AM
Siegmund gave you an excellent answer. I'll try a different tack.

When you have a question that you’re not sure makes sense, it’s often a good idea to find a similar question you can answer simply and precisely. Then you can consider whether it has answered your original question or not, and if not, why not. You’ll either get your answer or refine your thinking about your question.

Let’s reverse the question and say we know team A is better than team B and, in fact, has probability p of winning each game. That probability is constant and games are independent, so there is no home field advantage, nor difference among starting pitchers, nor is one team getting better or worse, nor does play change when a team is ahead or behind in games. These are strong assumptions, and not likely to be true in practice. Slight deviations could make a significant difference to your answer.

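If team A and team B play N games, the number of games won by team A has a binomial distribution with mean N*p and variance N*p*(1-p). Team A wins the series if it wins more than N/2 games, so its expected margin over that threshold is N*(p - 0.5), while the standard deviation of its win total is sqrt(N*p*(1-p)). Requiring the margin to equal a fixed number of standard deviations and solving for N shows that the number of games we need is proportional to p*(1-p)/(p - 0.5)^2 (take N odd so the series cannot end tied). We multiply this by a number determined by the confidence level we want; it is approximately the square of the inverse Normal function of our confidence, [Normsinv(c)]^2 in Excel.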

If p = 0.51, the first term is about 2,499 games. p = 0.55 gives about 99 games, p = 0.6 gives about 24 games, p = 0.7 gives about 5 games. That number of games gives you about 84% confidence. For 90% confidence you multiply by 1.64, for 95% you multiply by 2.71, 99% by 5.41, 99.9% by 9.55.

Therefore, if you think p = 0.7, you’ll need about 15 games for 95% confidence that the winning team is the better team.
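The arithmetic above can be reproduced with a few lines of Python. This is a sketch of the Normal approximation only, under the same assumptions (constant p, independent games); the function names are invented here, and NormalDist().inv_cdf from the standard library stands in for Excel's Normsinv.

```python
from statistics import NormalDist

def base_games(p):
    """The first term: p*(1-p)/(p - 0.5)^2, about 84% confidence."""
    return p * (1 - p) / (p - 0.5) ** 2

def confidence_multiplier(c):
    """Square of the inverse standard Normal CDF at confidence c."""
    z = NormalDist().inv_cdf(c)
    return z * z

def games_needed(p, c):
    """Approximate series length for confidence c (not yet rounded)."""
    return confidence_multiplier(c) * base_games(p)

if __name__ == "__main__":
    for p in (0.51, 0.55, 0.6, 0.7):
        print(p, round(base_games(p), 2))
    for c in (0.90, 0.95, 0.99, 0.999):
        print(c, round(confidence_multiplier(c), 2))
```

For p = 0.7 and 95% confidence, games_needed returns a little over 14, which rounds up to the 15-game series mentioned above.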

AaronBrown
10-13-2005, 11:25 AM
It doesn't work that way. For example, suppose you watched ten hands of heads-up poker and saw one player win seven of the pots. Would you then be 70% confident he was the better player?

Your formula overestimates the confidence we have in a small number of observations. If we see one person win one hand, we're certainly not 100% confident he's better. On the other hand, it underestimates the confidence we have in a large number. If one player wins 51,000 of 100,000 hands, we're a lot more than 51% confident that he's the better player.
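One way to put numbers on this, offered only as an illustrative sketch: ask how often two equally skilled players would produce a result at least that lopsided. This is a one-sided binomial tail, not a direct "confidence he's better" figure, and it assumes independent hands; the function name is made up for the example.

```python
from math import exp, lgamma, log

def chance_if_evenly_matched(wins, hands):
    """Probability that one of two equally skilled players wins at
    least `wins` of `hands` independent hands."""
    total = 0.0
    for k in range(wins, hands + 1):
        # log of C(hands, k) * 0.5^hands, summed in log space
        log_term = (lgamma(hands + 1) - lgamma(k + 1)
                    - lgamma(hands - k + 1) + hands * log(0.5))
        total += exp(log_term)
    return total
```

Seven wins in ten hands happens about 17% of the time between equal players, so it is weak evidence, while 51,000 wins in 100,000 hands would almost never happen by luck alone.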