I think the reason is that we don't have a large sample size for most players in our PT databases. With a small sample, a player's observed results can deviate a long way from their true long-term results, making some losers look like winners and some winners look like losers. Since most players are losers, we will misclassify more long-term losers as winners than the other way around, which inflates the percentage of winning players in our PT databases.
Here's a simplified scenario that illustrates this (warning, math follows): I have 1000 players in the database. 90% of them are long-term losers and 10% are long-term winners. However, I only have 1000 hands on each player. Because of the small sample size, suppose there is a 40% chance I will misclassify a long-term loser as a winner, and a 40% chance I will misclassify a long-term winner as a loser. Here are the results I would get:

Classified as winners = (0.4 x 900 losers) + (0.6 x 100 winners) = 360 + 60 = 420 winners, vs. the true number of 100.

Classified as losers = (0.6 x 900 losers) + (0.4 x 100 winners) = 540 + 40 = 580 losers, vs. the true number of 900.
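
If anyone wants to play with the numbers, here is a rough Python sketch of the same arithmetic. The function name `classify_counts` and its parameters are made up for illustration, and the 40% error rates are the assumed figures from the scenario, not measured values.

```python
def classify_counts(n_players, loser_share, p_loser_as_winner, p_winner_as_loser):
    """Expected number of players that LOOK like winners/losers in a small sample.

    All inputs are assumptions from the simplified scenario, not real data.
    """
    true_losers = round(n_players * loser_share)
    true_winners = n_players - true_losers

    # Apparent winners = misclassified losers + correctly classified winners.
    losers_seen_as_winners = p_loser_as_winner * true_losers
    winners_seen_as_winners = (1 - p_winner_as_loser) * true_winners
    apparent_winners = losers_seen_as_winners + winners_seen_as_winners

    # Everyone else shows up as a loser.
    apparent_losers = n_players - apparent_winners

    # Fraction of the apparent winners who really are long-term winners.
    real_share = winners_seen_as_winners / apparent_winners

    return round(apparent_winners), round(apparent_losers), real_share


winners, losers, real_share = classify_counts(1000, 0.90, 0.40, 0.40)
print(winners, losers)        # 420 580
print(round(real_share, 3))   # 0.143
```

With these assumed inputs, only 60 of the 420 apparent winners (about 14%) are actually long-term winners. Lowering the two error rates, which is what a bigger sample of hands would do, pulls the apparent counts back toward the true 100/900 split.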