Friday, May 15, 2009

Who Would Have Known?

TW: The graphs below show the distribution of boys' first names by last letter for 1906, 1956, and 2006. The names have converged on a single ending.



From Andrew Gelman's Stat Modeling blog:
"...The quick story is that 100 years ago, there were about 10 last letters that dominated; 50 years ago, the number of popular last letters declined slightly, to about 6; but now, a single letter stands out: an amazing 36% of baby boys in America have names ending in N.

This is super-cool. As a commenter wrote, there should be some sort of award for finding the largest effect "in plain sight" that nobody has noticed before.

But, beyond pure data-coolness, what does this mean? My story, based on reading Wattenberg's blog, goes as follows:

100 years ago, parents felt very constrained in their choice of names (especially for boys). A small set of very common names (John, William, etc.) dominated. And, beyond that, people would often choose names of male relatives. Little flexibility, a few names being extremely common, resulting in a random (in some sense) distribution of last letters.

Nowadays, parents have a lot of freedom in choosing their names. As a result, there are lots and lots of names that seem acceptable, but the most common names are not so common as they were fifty or a hundred years ago. With so much choice, what do people do? Wattenberg suggests they go with popular soundalikes (for example, Aidan/Jaden/Hayden), which leads to clustering in the last letter. Even so, the pattern with N is so striking, there's gotta be more to say about it.

But I like the paradox: 100 years ago, the distribution of names was more concentrated but the distribution of sounds (as indicated by last letters) was broader. Nowadays, the distribution of names is more diffuse but the distribution of sounds is more concentrated.

Less constraint -> more diffuse distribution of names -> more concentrated distribution of last letters.

This must occur in other aspects of life. For example, consider food. We eat lots more different types of food than we used to, but a single ingredient--corn syrup--makes up more and more of our diet (or so I'm told). Again, lack of constraint (this time for economic reasons) leads to more diversity in some ways and more homogeneity (by choice) in others."
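
The paradox in the quote (names more diffuse, last letters more concentrated) can be made concrete with a concentration measure. What follows is only a rough sketch: the name counts are invented for illustration and are not real birth data, and the Herfindahl index (the sum of squared shares) is one possible choice of measure, not something the quoted post uses. The names `herfindahl`, `last_letter_counts`, `constrained_era`, and `free_era` are all hypothetical.

```python
from collections import Counter


def herfindahl(counts):
    """Sum of squared shares: 1/k for k equally common items, 1.0 if one item takes everything."""
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def last_letter_counts(name_counts):
    """Collapse a {name: count} mapping into a {last letter: count} mapping."""
    letters = Counter()
    for name, count in name_counts.items():
        letters[name[-1].lower()] += count
    return letters


# Invented counts, for illustration only (not real birth data).
# A few dominant names whose endings happen to differ.
constrained_era = {"John": 40, "William": 30, "James": 20, "George": 10}

# Many equally popular names, most of them sound-alikes ending in "n".
free_era = {
    "Aidan": 10, "Jaden": 10, "Hayden": 10, "Mason": 10, "Ethan": 10,
    "Logan": 10, "Noah": 10, "Jacob": 10, "Liam": 10, "Alex": 10,
}

for label, era in [("constrained", constrained_era), ("free", free_era)]:
    name_index = herfindahl(era)
    letter_index = herfindahl(last_letter_counts(era))
    print(f"{label}: names {name_index:.2f}, last letters {letter_index:.2f}")
```

With these made-up counts, the name index falls from 0.30 to 0.10 while the last-letter index rises from 0.30 to 0.40, which is the same direction as the pattern described in the quote. Whether the real data behave this way by this particular measure would need to be checked against actual name counts.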
