go-names

go-names provides name lists for working with frequent names and nicknames.

Names are not limited to personal names such as first and last names; the package also covers other kinds of names, for example city names.

Common (Frequent Names)

Common provides lists of the most common names, together with their relative frequency.

To get a list of the 10 most frequent first names, use it like this:

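package main

import (
  "fmt"

  "github.com/tilotech/go-names"
)

func main() {
  // Load the built-in preset of common US first names.
  common, err := names.NewCommonPreset("US_FIRST_NAME")
  if err != nil {
    panic(err)
  }

  // Fetch the 10 most frequent names from the preset.
  top10 := common.Top(10)
  fmt.Println(top10)
}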

Common comes with ready-to-use presets, but you can also provide your own list using the NewCommon constructor.

US_FIRST_NAME and US_LAST_NAME Presets

These two presets are based on the publicly available voter registry of North Carolina. They contain the 5,000 most common first/middle names and the 5,000 most common last names, each with its relative frequency. For example, Michael is the most popular name in that source: 1.33% of the entries have it as either their first or middle name.
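
To make the notion of relative frequency concrete, the following sketch computes it from raw counts. The counts, the nameFrequency type and the relativeFrequencies helper are invented for illustration; they are not taken from the North Carolina data or from the go-names API.

```go
package main

import (
	"fmt"
	"sort"
)

// nameFrequency pairs a name with its share of all entries (hypothetical type).
type nameFrequency struct {
	Name      string
	Frequency float64
}

// relativeFrequencies converts raw counts into shares of the total number of
// entries and returns them sorted from most to least frequent.
// Ties are broken alphabetically so the order is deterministic.
func relativeFrequencies(counts map[string]int, total int) []nameFrequency {
	result := make([]nameFrequency, 0, len(counts))
	if total <= 0 {
		return result
	}
	for name, count := range counts {
		result = append(result, nameFrequency{
			Name:      name,
			Frequency: float64(count) / float64(total),
		})
	}
	sort.Slice(result, func(i, j int) bool {
		if result[i].Frequency != result[j].Frequency {
			return result[i].Frequency > result[j].Frequency
		}
		return result[i].Name < result[j].Name
	})
	return result
}

func main() {
	// Made-up counts: how many of 1,000 entries carry each name.
	counts := map[string]int{"michael": 13, "james": 12, "mary": 9}
	for _, nf := range relativeFrequencies(counts, 1000) {
		fmt.Printf("%s: %.2f%%\n", nf.Name, nf.Frequency*100)
	}
}
```

The total is passed separately because the listed names usually cover only part of all entries, so the shares do not have to add up to 100%.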

US_CITY Preset

This preset provides the most common cities in the US, where "common" refers to population size. Accordingly, the frequency of a city is derived from its population relative to the total US population. The list is based on publicly available data.

Canonical (Base Name)

A canonical name is a single representation of a name that is independent of spelling variants and other effects such as aliases. Such a canonical name can then be used, for example, in data matching. Depending on the underlying data, the canonical name may not always seem intuitive.

package main

import (
  "fmt"

  "github.com/tilotech/go-names"
)

func main() {
  // Load the built-in nickname preset.
  canonical, err := names.NewCanonicalPreset("NICKNAMES")
  if err != nil {
    panic(err)
  }

  // Resolve a nickname to its canonical name.
  name := canonical.Of("mickey")
  fmt.Println(name) // prints: michael
}
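
As a sketch of how a canonical name can help in data matching, the example below compares two first names by their canonical form. To stay self-contained it uses a small hard-coded map as a stand-in for canonical.Of. The canonicalOf and sameFirstName helpers, as well as the fallback to the lowercased input for unknown names, are assumptions made for illustration and not documented library behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// nicknames is a tiny hypothetical stand-in for a canonical name list.
// Only the mickey -> michael pair comes from the example above.
var nicknames = map[string]string{
	"mickey": "michael",
}

// canonicalOf trims and lowercases the name, then looks it up.
// Unknown names fall back to the normalized input (an assumption).
func canonicalOf(name string) string {
	key := strings.ToLower(strings.TrimSpace(name))
	if canonical, ok := nicknames[key]; ok {
		return canonical
	}
	return key
}

// sameFirstName reports whether two first names share a canonical name.
func sameFirstName(a, b string) bool {
	return canonicalOf(a) == canonicalOf(b)
}

func main() {
	fmt.Println(sameFirstName("Mickey", "Michael")) // prints: true
	fmt.Println(sameFirstName("Mickey", "Mary"))    // prints: false
}
```

In a real setup the map lookup would be replaced by a call to canonical.Of from the preset loaded above.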

NICKNAMES Preset

The nicknames preset is based on various open source nickname lists, hand-curated to represent the idea of a canonical name as well as possible. However, since a nickname can belong to several names, the canonical name may not make much sense in a few cases. For example, the four entries clement, clem, clementine and clemmie are in the same canonical name group and all resolve to clement. Other examples are even more extreme: lucy resolves to louisa. The preset therefore follows a best-guess approach and is far from perfect.
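
One way to picture a canonical name group is as a list of spellings that all map to a single representative. The sketch below builds such a lookup from the clement group mentioned above. Treating the first entry of each group as the canonical name is an assumption made for illustration, and the buildLookup helper says nothing about how the preset is actually stored.

```go
package main

import "fmt"

// buildLookup maps every member of a group to the group's first entry,
// which this sketch treats as the canonical name (an assumption).
func buildLookup(groups [][]string) map[string]string {
	lookup := make(map[string]string)
	for _, group := range groups {
		if len(group) == 0 {
			continue // skip empty groups
		}
		canonical := group[0]
		for _, member := range group {
			lookup[member] = canonical
		}
	}
	return lookup
}

func main() {
	groups := [][]string{
		{"clement", "clem", "clementine", "clemmie"},
	}
	lookup := buildLookup(groups)
	fmt.Println(lookup["clementine"]) // prints: clement
}
```

Because each group has exactly one representative, clementine ends up as clement even though the two are usually considered different names.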

Nickname sources: