Help me complete my degree in Bracketology


CrazyEyez
03-05-2005, 03:11 PM
I have a revolutionary theory for filling out NCAA brackets that I've been thinking of trying for years. I'm finally going to put it into action this year, but I need help filling in a hole in the system.

I'm going to look at matchups based on seeds only, and choose the winner using probabilities equal to historical win percentages. For example, we all know the first round W/L rates: a 1 seed wins 100% of the time vs a 16, a 2 wins 95% vs a 15, etc. So my spreadsheet will choose 2 as the winner vs 15 95% of the time. I continue this throughout all rounds. The 2nd round probabilities will be specific to historical 2nd round matchups.

Problem: Beginning with the third round, there are matchups that could occur on my spreadsheet, but have never happened in reality. For example, a 9 and a 12 have both made it to the third round, but never at the same time. So I don't have a win% to use as probability to determine the winner. What's the best system I could use to derive a win probability for unprecedented matchups? I have a few ideas but I want to get other opinions.

tech
03-05-2005, 03:23 PM
Before I answer, I want to make sure I understand. I am confused on your system. How do you actually fill out the final bracket? Your simulation results over a large number of trials are merely a function of the probabilities that you use. So you don't even really have to have a spreadsheet to see the results. What am I missing?

CrazyEyez
03-05-2005, 03:33 PM
Rd 1: 2 vs 15
I put 95 red marbles in a bag, 5 blue marbles. I draw one marble. If it's red, I put the 2 seed as the winner, if it's blue the 15 seed wins. I figured out a way to do this via Excel. I could spit out 20 sheets and theoretically I'd be the statistical favorite to win all my pools, get my own infomercial, sell my system, and become a multimillionaire. But in later rounds, I don't know how many marbles to use because it's never happened.
Make sense?
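
For readers who want to try the marble draw outside Excel, here is a minimal Python sketch of the same idea. The function name pick_winner and the fixed random seed are assumptions made for illustration; only the 95% figure comes from the post above.

```python
import random

def pick_winner(favorite, underdog, favorite_win_pct, rng=random):
    """Draw one 'marble': the favorite wins with probability favorite_win_pct."""
    return favorite if rng.random() < favorite_win_pct else underdog

# 2 vs 15, using the 95% first-round figure quoted in the thread.
rng = random.Random(2005)  # seeded only so the sketch is repeatable
draws = [pick_winner(2, 15, 0.95, rng) for _ in range(10_000)]
share_for_2 = draws.count(2) / len(draws)
print(f"2 seed picked in {share_for_2:.1%} of draws")
```

Over many draws the 2 seed's share settles near 95%, but any single bracket uses just one draw per game.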

ttleistdci
03-05-2005, 03:44 PM
[ QUOTE ]
Rd 1: 2 vs 15
I put 95 red marbles in a bag, 5 blue marbles. I draw one marble. If it's red, I put the 2 seed as the winner, if it's blue the 15 seed wins. I figured out a way to do this via Excel. I could spit out 20 sheets and theoretically I'd be the statistical favorite to win all my pools, get my own infomercial, sell my system, and become a multimillionaire. But in later rounds, I don't know how many marbles to use because it's never happened.
Make sense?

[/ QUOTE ]

Maybe I'm missing something, but...

Your theory is good if you're betting individual games. But in the NCAA bracket, outcomes depend on each other. So if you have Louisville making it into the Elite 8, but they actually lose the 1st round, the rest of your mathematical picks in that particular bracket don't mean squat.

Again, I could be missing something.

CrazyEyez
03-05-2005, 04:06 PM
[ QUOTE ]
[ QUOTE ]
Rd 1: 2 vs 15
I put 95 red marbles in a bag, 5 blue marbles. I draw one marble. If it's red, I put the 2 seed as the winner, if it's blue the 15 seed wins. I figured out a way to do this via Excel. I could spit out 20 sheets and theoretically I'd be the statistical favorite to win all my pools, get my own infomercial, sell my system, and become a multimillionaire. But in later rounds, I don't know how many marbles to use because it's never happened.
Make sense?

[/ QUOTE ]

Maybe I'm missing something, but...

Your theory is good if you're betting individual games. But in the NCAA bracket, outcomes depend on each other. So if you have Louisville making it into the Elite 8, but they actually lose the 1st round, the rest of your mathematical picks in that particular bracket don't mean squat.

Again, I could be missing something.

[/ QUOTE ]
I'm not sure if I understand you, but maybe this clears it up:
After I choose the first round, I look at the matchups for the second round. If I have 1 vs 8, I look up the historical results for all 1 vs 8 matchups in the 2nd round. That percentage is different than, say, 1 vs 9. So I then base that selection on the appropriate percentage.
I may be doing a lousy job of explaining it, but the people I've shown it to in person think it makes sense. Maybe I should post a link to my sample Excel file.
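
The round-by-round procedure described in this post could be sketched roughly as follows. This is not the poster's Excel file. simulate_region and demo_win_prob are hypothetical names, and the lookup is a placeholder: only the two first-round figures (100% and 95%) come from the thread, and every other matchup falls back to a coin flip purely so the sketch runs.

```python
import random

# Standard bracket order of seeds within one 16-team region.
REGION_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

def simulate_region(win_prob, rng=random):
    """Play one region round by round and return the winners of each round.

    win_prob(round_no, better, worse) is a hypothetical lookup that returns the
    historical chance that the better (lower-numbered) seed wins that matchup
    in that specific round.
    """
    field = list(REGION_ORDER)
    round_no = 1
    picks = []
    while len(field) > 1:
        winners = []
        # Adjacent entries in the field meet each other in this round.
        for a, b in zip(field[0::2], field[1::2]):
            better, worse = min(a, b), max(a, b)
            p = win_prob(round_no, better, worse)
            winners.append(better if rng.random() < p else worse)
        picks.append(winners)
        field = winners
        round_no += 1
    return picks

# Placeholder data: only these two first-round figures are from the thread.
FIRST_ROUND = {(1, 16): 1.00, (2, 15): 0.95}

def demo_win_prob(round_no, better, worse):
    """Coin flip for anything without a figure; NOT real historical data."""
    if round_no == 1:
        return FIRST_ROUND.get((better, worse), 0.5)
    return 0.5

picks = simulate_region(demo_win_prob, random.Random(1))
for round_no, winners in enumerate(picks, start=1):
    print(f"Round {round_no} winners: {winners}")
```

Swapping demo_win_prob for a real table keyed by round and seed pair gives the "1 vs 8 is looked up separately from 1 vs 9" behaviour the post describes.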

micacka
03-05-2005, 07:12 PM
I've also been thinking of trying this system, but don't have the coding/mathematics experience to pull it off. Also, do you think it would be wise to weight the actual teams to change the probabilities, like a lower seeded team with upset potential, or a comparatively weak 1 seed? You might get more accurate results.

Good luck with this, definitely post your results here, it'd be interesting stuff.

WarDekar
03-05-2005, 08:06 PM
I don't think this is going to be much better than "guessing" yourself, or just picking based on seed. Sometimes seeds vary for reasons such as geographical location and the location of first round games. You can't just look at the results of 4 vs. 7 seeds without accounting for how they got there and why they were seeded what they were. Plus, obviously, these teams are completely independent of any past tournament teams.

tech
03-05-2005, 08:11 PM
[ QUOTE ]
Make sense?

[/ QUOTE ]

Yeah, it does. It means that your whole tournament bracket is based on one random draw. The fact that you weighted it doesn't keep it from being random.

tech
03-05-2005, 08:13 PM
[ QUOTE ]
Problem: Beginning with the third round, there are matchups that could occur on my spreadsheet, but have never happened in reality. For example, a 9 and a 12 have both made it to the third round, but never at the same time. So I don't have a win% to use as probability to determine the winner. What's the best system I could use to derive a win probability for unprecedented matchups? I have a few ideas but I want to get other opinions.

[/ QUOTE ]

You could look at similar seeds, if available. Like if you need 9/12, you could use 8/11 or 8/12. Not sure it matters though, for the reasons in my other post.
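
One way to make the "use a similar matchup" idea concrete is a nearest-neighbour lookup. This is a sketch under assumptions: nearest_known is a hypothetical helper, the distance measure (sum of seed differences) is an arbitrary choice, and the percentages in the example table are placeholders rather than real tournament records.

```python
def nearest_known(better, worse, known):
    """Return (pair, pct) for the recorded matchup closest to (better, worse).

    known maps (better_seed, worse_seed) -> win pct of the better seed for one
    round. Ties are broken by taking the smaller pair.
    """
    if not known:
        return None
    pair = min(
        known,
        key=lambda k: (abs(k[0] - better) + abs(k[1] - worse), k),
    )
    return pair, known[pair]

# Placeholder third-round table; these numbers are made up for illustration.
known_round3 = {(8, 11): 0.60, (8, 12): 0.70, (1, 4): 0.65}

substitute = nearest_known(9, 12, known_round3)
print(substitute)
```

With this table a 9/12 matchup borrows the 8/12 entry, since it differs by only one seed line.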

03-05-2005, 08:37 PM
If you're talking about an office pool, or some other pool where 90% of the people don't know what they are doing, I suggest using the Sagarin ratings. There have been a few years where they have been quite good in hinting at upsets. Weber State over North Carolina a few years ago comes to mind.

CrazyEyez
03-05-2005, 10:42 PM
[ QUOTE ]
[ QUOTE ]
Make sense?

[/ QUOTE ]

Yeah, it does. It means that your whole tournament bracket is based on one random draw. The fact that you weighted it doesn't keep it from being random.

[/ QUOTE ]
I don't understand. It's based on 63 random (weighted) draws, not 1.
Also, I don't see how it's any more "random" than other methods. Any choice you make on a particular game is based on some criteria you choose to include. If you have a thought on why the particular criteria I want to use are bad, please share.

CrazyEyez
03-05-2005, 10:55 PM
[ QUOTE ]
I don't think this is going to be much better than "guessing" yourself, or just picking based on seed. Sometimes seeds vary for reasons such as geographical location and the location of first round games. You can't just look at the results of 4 vs. 7 seeds without accounting for how they got there and why they were seeded what they were. Plus, obviously, these teams are completely independent of any past tournament teams.

[/ QUOTE ]
Of course there are many factors that go into winning and losing. But I argue that if you take those other types of criteria into consideration, you are always missing something. That is, you never have equal knowledge of all teams, so your decisions are flawed.

This theory is a way to take personal biases/lack of knowledge out of the equation. Within its own constraints, it would base decisions on "perfect" data. Of course, it's entirely up for debate how good or bad a predictor of results said data is.

I don't expect that this method will spit out a winner every time - more like, I expect if I run off 50 of these things, my average score could be pretty good. Obviously 100 years from now this thing should work better. Sample Size Man would shoot me with only 20 tourneys' worth of data.
Whether or not it's better or worse than traditional "guessing" is precisely what I hope to find out.

CrazyEyez
03-05-2005, 11:01 PM
[ QUOTE ]

You could look at similar seeds, if available. Like if you need 9/12, you could use 8/11 or 8/12. Not sure it matters though, for the reasons in my other post.

[/ QUOTE ]

That was my initial thought. But it just seems so imprecise.
Another thought is this:
Take the 9 vs 12 example. They've both been to rd 3, but never against each other. Take 9's win percentage in rd 3 regardless of opponent, and take 12's win % in rd3 regardless of opponent, and make a ratio. If 9 won 20% of the time, and 12 won 5%, then 9vs12 theoretical win% is 20:5 or 80% in favor of 9.
Although I'm sure I'll encounter a matchup where neither seed has advanced past that round, which would leave me with a ratio of 0:0.
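
The ratio idea, including the 0:0 problem, can be written as a short function. ratio_win_prob is a hypothetical name, and the 20% and 5% inputs are the made-up example figures from the post, not real records.

```python
def ratio_win_prob(pct_a, pct_b):
    """Chance that seed A beats seed B, from each seed's win % in this round
    regardless of opponent. Returns None when neither seed has ever won in
    this round (the 0:0 case), since the ratio is undefined there.
    """
    total = pct_a + pct_b
    if total == 0:
        return None
    return pct_a / total

nine_over_twelve = ratio_win_prob(20, 5)   # the 20:5 example from the post
undefined_case = ratio_win_prob(0, 0)      # needs some other fallback
print(nine_over_twelve, undefined_case)
```

The 20:5 example works out to 80% for the 9 seed, matching the post. The None result marks the matchups that still need another rule.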

tech
03-05-2005, 11:29 PM
[ QUOTE ]
Also, I don't see how it's any more "random" than other methods. Any choice you make on a particular game is based on some criteria you choose to include. If you have a thought on why the particular criteria I want to use are bad, please share.


[/ QUOTE ]

Essentially what you are saying is that no one can handicap the games better than simply the historical records of the seedings. I strongly disagree, as I have seen too much evidence to the contrary. Let's say that an average #13 seed has a 20% chance to win in your system (I made this number up; I don't know what the real % is). A good handicapper will be able to spot which #13 seeds in this year's tournament have a greater than 20% chance to win and which have a less than 20% chance to win.

CrazyEyez
03-06-2005, 12:07 AM
[ QUOTE ]
[ QUOTE ]
Also, I don't see how it's any more "random" than other methods. Any choice you make on a particular game is based on some criteria you choose to include. If you have a thought on why the particular criteria I want to use are bad, please share.


[/ QUOTE ]

Essentially what you are saying is that no one can handicap the games better than simply the historical records of the seedings. I strongly disagree, as I have seen too much evidence to the contrary. Let's say that an average #13 seed has a 20% chance to win in your system (I made this number up; I don't know what the real % is). A good handicapper will be able to spot which #13 seeds in this year's tournament have a greater than 20% chance to win and which have a less than 20% chance to win.

[/ QUOTE ]
Gotcha. That's a valid point.
For me personally, there's no way I follow college hoops enough to make informed decisions on more than a small handful of games.
Anyway, I'll report back in case anyone's curious.

Sluss
03-07-2005, 09:00 AM
[ QUOTE ]
Take the 9 vs 12 example. They've both been to rd 3, but never against each other. Take 9's win percentage in rd 3 regardless of opponent, and take 12's win % in rd3 regardless of opponent, and make a ratio. If 9 won 20% of the time, and 12 won 5%, then 9vs12 theoretical win% is 20:5 or 80% in favor of 9.
Although I'm sure I'll encounter a matchup where neither seed has advanced past that round, which would leave me with a ratio of 0:0.


[/ QUOTE ]

Not that I completely understand the mathematical formula here, but I think if you make it to the point where you have a 14 vs a 9 in the elite eight, you have to put the system to bed.

TomBrooks
04-07-2005, 01:48 AM
I think a problem you may have with this system is that the amount of historical data you're using to make future projections may be too small to be reliable or may not be reliable enough in general.

While it may be possible to predict a 1 seed will beat a 16 seed something like 90% of the time, I wonder if you can find a reliable pattern of something like a 7 vs. 10 seed.
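
To put a rough number on the sample-size worry, one option is a confidence interval around each historical win percentage. The sketch below uses the Wilson score interval, which is a technique swapped in here for illustration and not something anyone in the thread proposed. The win/loss counts are invented.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win percentage."""
    if games == 0:
        return (0.0, 1.0)  # no data: the true rate could be anything
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

# Invented counts: the same 60% win rate seen over 10 games and over 100 games.
small_sample = wilson_interval(6, 10)
large_sample = wilson_interval(60, 100)
print(small_sample, large_sample)
```

The interval from 10 games is far wider than the one from 100, which is the concern with matchups that have only happened a handful of times.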