earnest_dad OP t1_isswtbh wrote

Source: babynames library (R package): https://cran.r-project.org/web/packages/babynames/index.html

Note: this package draws data from the US Social Security Administration

Tools used: R

data preprocessing: tidyverse

visualization: ggplot2

Additional notes:

(1) identify "standalone" names by finding top 1000 female names

(2) identify names that are composed of two standalone names combined

(3) identify common "prefix" and "suffix" names by finding the maximum (annual) proportion of names from (2); restrict to instances where log(max proportion) > -8.5

(4) restrict attention to combined names composed of the names from (3)

(5) remove unusual prefixes and suffixes by hand: redditors objected to the inclusion of "eliza-beth" and "elisa-beth", and "ina" and "ora" were also removed by hand
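For anyone who wants to try something similar, here is a rough sketch of steps (2) and (3). It is not the original R code, which isn't posted here; it is plain Python with made-up inputs. The function names (`split_into_two`, `common_parts`), the toy name set, and the proportions are all hypothetical, and it assumes the log in step (3) is the natural log (the default for R's `log()`) and that the maximum proportion is tracked per prefix or suffix.

```python
import math

def split_into_two(name, standalone):
    """Return every (prefix, suffix) split of `name` where both halves
    are in the `standalone` set (step 2). Matching is case-insensitive."""
    lower = name.lower()
    pairs = []
    for i in range(1, len(lower)):
        prefix, suffix = lower[:i], lower[i:]
        if prefix in standalone and suffix in standalone:
            pairs.append((prefix, suffix))
    return pairs

def common_parts(max_prop, threshold=-8.5):
    """Keep the parts whose log(max annual proportion) exceeds the
    threshold (step 3). exp(-8.5) is roughly 0.0002, assuming natural log."""
    return {part for part, p in max_prop.items() if math.log(p) > threshold}

# Toy stand-in for the top-1000 list; NOT the real babynames data.
standalone = {"jo", "sue", "mary", "ann", "eliza", "beth"}

josue = split_into_two("Josue", standalone)      # the "Jo" + "Sue" case
maryann = split_into_two("Maryann", standalone)
linda = split_into_two("Linda", standalone)      # no valid split

# Made-up maximum annual proportions, for illustration only.
kept = common_parts({"mary": 0.05, "ora": 0.0001})
```

With these toy inputs, "Josue" splits into "jo" and "sue", "Linda" has no split, and only "mary" clears the threshold. Steps (4) and (5) would then filter the combined names down to the surviving parts and drop the hand-picked exceptions.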

Note: an earlier draft of this plot did not filter to female names only, and so incidentally included "Josue", a male name that is composed of the common female standalone names "Jo" and "Sue"
