Matt Walker
06-28-2005, 02:32 PM
I realize that this may come across as heresy on these boards, but I think we are all getting way too preoccupied with huge sample sizes. Someone posts "here are my results through 200 SNGs" and everyone says "Come back after 1000 more."
Large sample sizes have just as many problems as small ones. After playing 1000 tournaments, it's unreasonable to use the entire data set to model someone's ROI, because any donkey is a better player after that much practice. Also, in the several months it took to play those extra tourneys, the competition has gotten better too. So the early results and the late results aren't really measuring the same thing. How can you reasonably construct a confidence interval around this, and why is it better than after 200 tourneys?
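For reference, here is a rough sketch of the textbook confidence interval people usually have in mind when they say "come back after 1000 more." It assumes every tournament is an independent draw from one fixed distribution, which is exactly the assumption being questioned above. The per-tournament standard deviation of 1.7 buy-ins is a placeholder I made up for illustration, not a measured figure, and the function name is hypothetical.

```python
import math

def roi_ci_half_width(n_tourneys, sd_per_tourney=1.7, z=1.96):
    """Half-width of a normal-approximation 95% CI for mean ROI.

    sd_per_tourney is the standard deviation of a single tournament's
    result, in buy-ins. The default of 1.7 is an assumed placeholder.
    Treats every tournament as an independent draw from the same
    distribution (constant skill, constant competition).
    """
    return z * sd_per_tourney / math.sqrt(n_tourneys)

for n in (200, 1000):
    print(f"{n} SNGs: ROI known to about +/- {roi_ci_half_width(n):.1%}")
```

Under that assumed SD the interval only narrows by a factor of sqrt(5) going from 200 to 1000 tourneys, and that narrowing is only real if skill and competition stayed constant the whole time.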
Also, I think it would be better to do all these problems using Bayesian statistics. For example, if ZeeJustin says "I think I'm a 25% ROI player" and then plays 500 and has an ROI of 25%, it would be nonsense to simply construct a confidence interval around his data and say "No way, statistics says this data is only accurate to x percent with 95 percent confidence," or "In the last 2 years, your total ROI was 10 percent at this level, so you're just on a heater." Bullshit. We should all try to use some sort of prior assumptions when working these problems, and then the smaller sample sizes will be accurate enough.
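To make the Bayesian idea concrete, here is a minimal sketch using a normal prior on true ROI and a normal approximation for the results. This is just one simple way to do it, not the only one. The prior standard deviation of 0.10 (how sure the player is about the 25% figure) and the per-tournament SD of 1.7 buy-ins are both numbers I assumed for illustration, and posterior_roi is a hypothetical helper name.

```python
import math

def posterior_roi(prior_mean, prior_sd, observed_roi, n_tourneys,
                  sd_per_tourney=1.7):
    """Normal-normal conjugate update for a player's true ROI.

    prior_mean, prior_sd: belief about true ROI before the sample.
    observed_roi: average ROI over the n_tourneys sample.
    sd_per_tourney: assumed SD of one tournament's result in buy-ins
    (1.7 is a placeholder, not a measured value).
    Returns (posterior mean, posterior SD).
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n_tourneys / sd_per_tourney ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * observed_roi)
    return post_mean, math.sqrt(post_var)

# Prior belief of 25% ROI (assumed SD 0.10), then 500 SNGs at 25% ROI.
mean, sd = posterior_roi(0.25, 0.10, 0.25, 500)
print(f"posterior ROI: {mean:.1%} +/- {1.96 * sd:.1%}")
```

When the sample agrees with the prior, the posterior is tighter than either the prior or the data alone, which is the sense in which a smaller sample can be enough. How much tighter depends entirely on how strong a prior you are willing to defend.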
Matt