Evasive 50% words revisited.


Duke
08-28-2004, 09:40 AM
Does everyone remember the old "What would be a good example of a word that half the population knows?" thread? Anyhow, I was thinking about that a little and came up with another question that I think is interesting.

Say two people have pretty good vocabularies. Maybe 150,000 words each. Not a SOWPODS Scrabble expert who has 400k words or whatever in their head (plus a separate vocabulary for their native language if you're talking about a guy like Pakorn), but a pretty decently large vocabulary. Certainly some of the words would be specific to whatever they do with their life, and therefore they wouldn't share those words.

Anyhow, my question is: What percentage of the words that someone knows would, on average, be distinct from words that someone who has an equally large vocabulary knows?

Words that come to mind for various reasons (either I learned them for a weird reason or I saw it and thought: "gee, I bet 99% of the people reading this wouldn't know this word"):

abecedarian
quincunx
hoopoe
toquet

They're weird words, which is part of the point. How many of these sorts of "odd" words would each possess? How many would seem perfectly normal to one person but not to another, assuming the same total vocabulary? What would the most normal word be that one of the people had never heard?

I think that's a more interesting question than what a 50% word would be, since that's just a self-selecting subgroup made up of people with above average vocabularies arguing over how ignorant they think the average person is. The logical conclusion in that would be to construct a test and vote on who the most average person we all know is, and have them take it. I think that generating our own Salieri would be kinda cruel.

~D

RocketManJames
08-29-2004, 04:28 AM
I would guess that two people with equal vocabulary sizes would have an overlap of approximately 90%. So I suppose the answer to your question is that 10% of each vocabulary would be distinct.

I believe that size of vocabulary is negatively correlated with overlap. The larger the vocabulary, the smaller the % overlap.

What was your guess?

-RMJ

Duke
08-29-2004, 04:49 AM
I thought 95-98, but wasn't sure at all.

~D

Michael Davis
08-29-2004, 05:55 AM
I think you are closer, and that the number gets higher as vocabulary increases.

The very best classicists in the world are going to have a difficult time pointing out a new word to colleagues.

Anyways, I have e-mailed someone who I know will give a reasoned response, and I will post it here.

-Michael

Duke
08-29-2004, 07:52 AM
Kick ass.

The reason I thought it to be possibly as high as 5% is that my own vocabulary is filled with many technical terms in various disciplines, and also is artificially grown by at least a few thousand words because of sporadic Scrabbling.

~D

Michael Davis
08-29-2004, 10:10 AM
Duke,

I guess we sort of have to define our terms here. The issue of Scrabbling is really interesting and something I have to consider.

If somebody can recognize what a word means when they see it but could never volunteer this word in a sentence, does it count as part of their vocabulary? As technical vocabulary goes, the words are very easy to read if you know the roots.

The top people just know so many damn words that the sheer volume of their vocabulary means they are going to have to have a lot of words that don't overlap just to fall down to, say, 98%. That's unlikely.
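
To put rough numbers on that, here is a quick sketch of how many non-shared words each overlap guess in this thread implies. It assumes the 150,000-word vocabulary from the opening post and treats overlap as a simple percentage of one person's words; the function name is made up for illustration.

```python
def distinct_words(vocab_size: int, overlap_pct: int) -> int:
    """Words in one person's vocabulary that the other person does not know."""
    return vocab_size * (100 - overlap_pct) // 100


# 150,000 is the figure floated in the opening post: an assumption, not a measurement.
VOCAB_SIZE = 150_000

# The overlap guesses from this thread: 90% and the 95-98% range.
for overlap_pct in (90, 95, 98):
    print(f"{overlap_pct}% overlap: {distinct_words(VOCAB_SIZE, overlap_pct)} distinct words")
```

So even at 98% overlap, each person would still need about 3,000 words the other has never seen.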

-Michael