Confusing Names

03 Feb 2018

Martin Lee Allen: a whole law firm or just one person's name? I bring you a data-driven answer to this question.

Based on the 1990 U.S. census data, here is a list of names that are roughly equally common as first names and as last names, ranked by total popularity:

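| Name | Total Popularity (%) | Popularity as First (%) | Popularity as Last (%) | Closeness (%) |
|---|---|---|---|---|
| MARTIN | 0.491 | 0.218 | 0.273 | 80 |
| LEE | 0.433 | 0.213 | 0.220 | 97 |
| ALLEN | 0.375 | 0.176 | 0.199 | 88 |
| JORDAN | 0.146 | 0.068 | 0.078 | 87 |
| ARNOLD | 0.128 | 0.072 | 0.056 | 78 |
| AUSTIN | 0.097 | 0.045 | 0.052 | 87 |
| OLIVER | 0.090 | 0.040 | 0.050 | 80 |
| NEAL | 0.076 | 0.037 | 0.039 | 95 |
| LUCAS | 0.069 | 0.031 | 0.038 | 82 |
| BLAKE | 0.065 | 0.037 | 0.028 | 76 |
| MACK | 0.056 | 0.025 | 0.031 | 81 |
| SHERMAN | 0.055 | 0.028 | 0.027 | 96 |
| SIMON | 0.052 | 0.026 | 0.026 | 100 |
| OWEN | 0.051 | 0.026 | 0.025 | 96 |
| DOYLE | 0.049 | 0.022 | 0.027 | 81 |
| CLAY | 0.044 | 0.021 | 0.023 | 91 |
| ROMAN | 0.043 | 0.020 | 0.023 | 87 |
| CAREY | 0.041 | 0.022 | 0.019 | 86 |
| MOSES | 0.039 | 0.020 | 0.019 | 95 |
| HEATH | 0.038 | 0.017 | 0.021 | 81 |
| LARA | 0.033 | 0.016 | 0.017 | 94 |
| HALEY | 0.030 | 0.014 | 0.016 | 88 |
| SOLOMON | 0.029 | 0.013 | 0.016 | 81 |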

Methodology

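I wanted a way to decide whether a name is equally common as a first name and as a last name. I settled on the formula `min(first / last, last / first)`, where `first` and `last` are the name's popularity as a first name and as a last name, and I'm calling the result "closeness." It ranges from 0 to 1 (shown as a percentage in the table), with 1 meaning the name is exactly as common in both roles. This measure probably has at least one other name, if not more.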
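As a quick check of the formula, here is a minimal Python sketch that recomputes closeness for two rows of the table. The function name `closeness` and the handling of zero popularity are my own assumptions; the post only gives the formula itself.

```python
def closeness(first: float, last: float) -> float:
    """Ratio of the smaller popularity to the larger one, between 0 and 1."""
    if first <= 0 or last <= 0:
        # Assumption (not from the post): a name that never appears in one
        # role gets closeness 0 instead of a division-by-zero error.
        return 0.0
    return min(first / last, last / first)


# Popularity figures (percent of population) copied from the table.
print(round(closeness(0.218, 0.273) * 100))  # MARTIN -> 80
print(round(closeness(0.026, 0.026) * 100))  # SIMON  -> 100
```

Because the formula takes the minimum of the two ratios, swapping the arguments gives the same result, so it does not matter which role is more common.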

I wanted to filter out all names with closeness below some threshold `c`. To pick `c`, I first binned the names by closeness. Within each bin, I hid the first and last popularity and guessed whether each name was predominantly a first name or a last name. I could guess accurately in bins with closeness below 0.75 but not in bins above it, so I set `c` to 0.75.

Finally, I ranked all qualifying names by total popularity. The total popularity figure that I'm using is actually just the sum of the name's popularity as a first name and as a last name. This will be a slight overestimate since it doesn't account for people whose first names are the same as their last names ("Martin Martin?"). This is probably not a big deal.
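The filter-then-rank step can be sketched roughly as follows. This is an illustration under assumptions, not the code used for the post: the function `rank_confusing_names`, the dictionary input format, and the `EXAMPLE` row are all hypothetical, and the other three rows are copied from the table.

```python
def closeness(first, last):
    """min(first / last, last / first), or 0 if either popularity is 0."""
    if first <= 0 or last <= 0:
        return 0.0
    return min(first / last, last / first)


def rank_confusing_names(names, c=0.75):
    """Keep names with closeness >= c, sorted by total popularity (descending).

    `names` maps a name to a (first_pct, last_pct) pair.
    """
    rows = []
    for name, (first, last) in names.items():
        score = closeness(first, last)
        if score >= c:
            # Total popularity is simply first + last, as described above.
            rows.append((name, first + last, score))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


sample = {
    "LEE": (0.213, 0.220),
    "MARTIN": (0.218, 0.273),
    "SIMON": (0.026, 0.026),
    "EXAMPLE": (0.300, 0.010),  # made-up row, NOT census data; fails the filter
}

for name, total, score in rank_confusing_names(sample):
    print(f"{name:8s} {total:.3f} {score:.0%}")
```

For a sense of scale on the overestimate: if first and last names were independent (an assumption of mine, not something the data establishes), the share of people named Martin Martin would be about 0.00218 × 0.00273, or roughly 0.0006% of the population, which is below the three-decimal precision of the table.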