Gender Guesser
gender-guesser is a Python library that guesses a person's gender from their first name. It is a Python port of a Java library and is currently at version 0.4.0. `get_gender()` returns one of six values: 'unknown', 'andy' (androgynous), 'male', 'female', 'mostly_male', or 'mostly_female'. The latest release dates from late 2016, so the library does not appear to be actively maintained.
Warnings
- gotcha Creating multiple `Detector` instances is inefficient as each instance re-reads the data file. Instantiate the detector once and reuse it.
- gotcha By default, the `Detector` is case-sensitive: 'John' may be recognized while 'john' returns 'unknown'. Pass `case_sensitive=False` to the constructor to ignore case.
- gotcha The library differentiates between 'unknown' and 'andy'. 'unknown' means the name was not found in the database. 'andy' means the name was found but has an equal probability of being male or female.
- gotcha When providing a country to `get_gender()`, the country name must be in lowercase with spaces replaced by underscores (e.g., 'great_britain', 'the_netherlands'). Invalid country names will raise an error or return 'unknown'.
- breaking In version 0.3.0, the `unknown_value` initialization option was removed. Additionally, the return values for names not found or equally probable were standardized to 'unknown' and 'andy' respectively.
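The first and third warnings can be illustrated together. The following is a minimal sketch, not part of the library: `get_detector`, `describe`, and `guess_all` are hypothetical helper names. It assumes only the `Detector(case_sensitive=...)` constructor and the `get_gender()` method shown in the Quickstart.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_detector(case_sensitive=True):
    """Build one Detector per setting and hand back the same object afterwards."""
    # Imported lazily so the data file is only read when a detector is first needed.
    from gender_guesser.detector import Detector
    return Detector(case_sensitive=case_sensitive)


def describe(label):
    """Explain a return value (hypothetical helper for illustration)."""
    if label == "unknown":
        return "name not found in the database"
    if label == "andy":
        return "name found, equally likely male or female"
    if label.startswith("mostly_"):
        return "name found, leaning " + label[len("mostly_"):]
    return "name found, " + label


def guess_all(names, detector=None):
    """Look up every name with a single shared detector; returns {name: label}."""
    if detector is None:
        detector = get_detector(case_sensitive=False)
    return {name: detector.get_gender(name) for name in names}
```

Calling `guess_all(["Peter", "Sally"])` repeatedly reuses the cached detector instead of re-reading the data file, and `describe()` keeps the 'unknown' and 'andy' cases apart when reporting results.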
Install
pip install gender-guesser
Imports
- Detector
from gender_guesser.detector import Detector
Quickstart
from gender_guesser.detector import Detector
d = Detector()
# Guess gender for a single name
name1 = "Peter"
gender1 = d.get_gender(name1)
print(f"The gender for {name1} is: {gender1}")
# Guess gender for a name with country preference
name2 = "Andrea"
country = "italy"  # Country names must be lowercase with underscores
gender2 = d.get_gender(name2, country)
print(f"The gender for {name2} in {country} is: {gender2}")
# Example with a name often considered androgynous
name3 = "Pauley"
gender3 = d.get_gender(name3)
print(f"The gender for {name3} is: {gender3}")
# Example with case_sensitive=False
d_insensitive = Detector(case_sensitive=False)
name4 = "sally"
gender4 = d_insensitive.get_gender(name4)
print(f"The gender for {name4} (case-insensitive) is: {gender4}")
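The country rule from the Warnings section can be wrapped in a small helper. This is a minimal sketch under stated assumptions: `normalize_country` and `guess_with_country` are hypothetical names, not library functions. Because the text leaves open whether an invalid country raises an error or returns 'unknown', the helper treats both outcomes the same way by catching a broad `Exception` and falling back to a lookup without a country.

```python
def normalize_country(country):
    """Convert e.g. 'Great Britain' to the 'great_britain' form the library expects."""
    return "_".join(country.strip().lower().split())


def guess_with_country(detector, name, country=None):
    """Try a country-specific lookup first, then fall back to the global one."""
    if country:
        try:
            result = detector.get_gender(name, normalize_country(country))
        except Exception:  # assumption: the exact exception type is not relied on
            result = "unknown"
        if result != "unknown":
            return result
    # No country given, or the country lookup failed or found nothing.
    return detector.get_gender(name)
```

With the detector from the Quickstart, `guess_with_country(d, "Andrea", "Italy")` would pass `"italy"` to `get_gender()`, while an unrecognized country falls through to the plain `d.get_gender("Andrea")` call.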