Census 2000: how many distinct surnames are there?
Even after various edits and acceptance criteria were applied to the names, a sizable number of unique names remain in the population. Over 6 million last names were identified. Many of these names were either unique (occurred once) or nearly so (occurred 2-4 times), raising questions about their validity. Cursory examination of the data indicates that many of these unique names were probably the person's entire name (first and last, or first, middle initial and last) concatenated into a single continuous string, with some other information. At this time, it is not possible to easily break a fully concatenated name back into its constituent parts. Doing so, however, would have reduced the count of unique names sizably, while only slightly increasing the number of people with more common names. While a relatively large proportion of all names relate to only one person or a few people, a large proportion of the entire population can be identified with a relatively small proportion of all names. The table below illustrates this pattern.
The table below shows the frequency of last names and the number of people who hold them. Seven last names are held by a million or more people. The most common last name reported was SMITH, held by about 2.3 million people, or about 0.9 percent of the population. Six other names have over a million respondents each (JOHNSON, WILLIAMS, BROWN, JONES, MILLER and DAVIS); together with SMITH, they account for about 4 percent of the population, or one in every 25 people. Another 268 last names each occur at least 100,000 times but fewer than 1 million times. These 275 last names, only about 4 in every 100,000 reported last names, together account for 26 percent of the population, or about one in every four people. On the flip side of this distribution, about 65 percent (or 4 million) of all captured last names were held by just one person, and about 80 percent (or 5 million) were held by no more than 4 people.
Last Names by Frequency of Occurrence and Number of People: 2000
| Frequency of surname occurrence | Number of surnames | Cumulative number of surnames | Cumulative proportion of surnames (percent) | Number of people with the surname | Cumulative number of people | Cumulative proportion of people (percent) |
|---|---|---|---|---|---|---|
| 1,000,000+ | 7 | 7 | 0.0 | 10,710,446 | 10,710,446 | 4.0 |
| 100,000-999,999 | 268 | 275 | 0.0 | 60,091,601 | 70,802,047 | 26.2 |
| 10,000-99,999 | 3,012 | 3,287 | 0.1 | 77,657,334 | 148,459,381 | 55.0 |
| 1,000-9,999 | 20,369 | 23,656 | 0.4 | 58,264,607 | 206,723,988 | 76.6 |
| 100-999 | 128,015 | 151,671 | 2.4 | 35,397,085 | 242,121,073 | 89.8 |
| 50-99 | 105,609 | 257,280 | 4.1 | 7,358,924 | 249,479,997 | 92.5 |
| 25-49 | 166,059 | 423,339 | 6.8 | 5,772,510 | 255,252,507 | 94.6 |
| 10-24 | 331,518 | 754,857 | 12.1 | 5,092,320 | 260,344,827 | 96.5 |
| 5-9 | 395,600 | 1,150,457 | 18.4 | 2,568,209 | 262,913,036 | 97.5 |
| 2-4 | 1,056,992 | 2,207,449 | 35.3 | 2,808,085 | 265,721,121 | 98.5 |
| 1 | 4,040,966 | 6,248,415 | 100.0 | 4,040,966 | 269,762,087 | 100.0 |
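
The cumulative columns, and the shares quoted in the text (for example, 275 surnames covering about 26 percent of the population), follow directly from the per-bin counts. The following is a minimal Python sketch of that arithmetic, with the counts transcribed from the table; the names `BINS` and `cumulative_table` are illustrative and not part of any Census Bureau tool.

```python
from itertools import accumulate

# (frequency bin, number of surnames, number of people), transcribed from the table
BINS = [
    ("1,000,000+", 7, 10_710_446),
    ("100,000-999,999", 268, 60_091_601),
    ("10,000-99,999", 3_012, 77_657_334),
    ("1,000-9,999", 20_369, 58_264_607),
    ("100-999", 128_015, 35_397_085),
    ("50-99", 105_609, 7_358_924),
    ("25-49", 166_059, 5_772_510),
    ("10-24", 331_518, 5_092_320),
    ("5-9", 395_600, 2_568_209),
    ("2-4", 1_056_992, 2_808_085),
    ("1", 4_040_966, 4_040_966),
]


def cumulative_table(bins):
    """Return rows of (bin, cum. surnames, pct. surnames, cum. people, pct. people)."""
    cum_names = list(accumulate(n for _, n, _ in bins))
    cum_people = list(accumulate(p for _, _, p in bins))
    total_names, total_people = cum_names[-1], cum_people[-1]
    return [
        (
            label,
            cn,
            round(100 * cn / total_names, 1),   # share of all surnames, percent
            cp,
            round(100 * cp / total_people, 1),  # share of all people, percent
        )
        for (label, _, _), cn, cp in zip(bins, cum_names, cum_people)
    ]


rows = cumulative_table(BINS)
for row in rows:
    print(row)

# Surnames per 100,000 that occur at least 100,000 times (the "4/100,000" in the text)
per_100k = 100_000 * rows[1][1] / rows[-1][1]
print(f"{per_100k:.1f} per 100,000 surnames")
```

Running the sketch reproduces the cumulative counts in the table, which makes it a quick consistency check on the transcribed figures.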