Human Name Parsing in PHP

by Jonathon on October 31st, 2009

Parsing human names is not exactly easy, but it can be done. Keith Beckman’s nameparse.php is an excellent PHP library for doing this.

Download nameparse.php

nameparse.php can recognize names in “[title]first[middles]last[,][suffix]” and “last,first[middles][,][suffix]” forms, which, when you think about it, cover most if not all well-formed name input formats. nameparse.php handles last names of arbitrary complexity, such as “bin Laden”, “van der Vort”, and “Garcia y Vega”, as well as middle names of arbitrary size and complexity, differentiating between most last names and the first or middle names or initials preceding them.

Examples of names correctly parsed by nameparse.php:

  • Doe, John. A. Kenneth III
  • Velasquez y Garcia, Dr. Juan, Jr.
  • Dr. Juan Q. Xavier de la Vega, Jr.

To use it, simply include() or require() nameparse.php and call parse_name($string) on any name. parse_name() returns an associative array of all name segments found, keyed by “title”, “first”, “middle”, “last”, and “suffix”. Do note that no spelling, capitalization, or punctuation of titles, prefixes, or suffixes is normalized. That is, every token remains as entered: nameparse.php is a semantic parser only. If you want orthographic or other normalization, you’ll have to postprocess the output. However, since the name is now semantically parsed, such postprocessing is simple for applications that require it.

print_r(parse_name('Velasquez y Garcia, Dr. Juan Q. Xavier III'));

yields:

Array
(
    [title] => Dr.
    [first] => Juan
    [middle] => Q. Xavier
    [suffix] => III
    [last] => Velasquez y Garcia
)
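
As a rough illustration of the postprocessing mentioned above, here is a minimal sketch that works on an array shaped like the print_r() output. The helpers normalize_title() and format_sort_name() are hypothetical names invented for this example and are not part of nameparse.php; the title lookup table is an assumption, and the "last, title first middle, suffix" layout is just one possible display form.

```php
<?php
// Hypothetical post-processing helpers; none of these ship with nameparse.php.

// Map a title to a canonical spelling using an assumed lookup table.
// Unknown titles are returned as entered (trimmed only).
function normalize_title(string $title): string
{
    $known = ['dr' => 'Dr.', 'mr' => 'Mr.', 'mrs' => 'Mrs.', 'ms' => 'Ms.'];
    $title = trim($title);
    $key = strtolower(rtrim($title, '.'));
    return $known[$key] ?? $title;
}

// Build "last, title first middle, suffix" from parse_name()-style segments.
function format_sort_name(array $parts): string
{
    // Treat any missing segment as empty.
    $parts += ['title' => '', 'first' => '', 'middle' => '', 'last' => '', 'suffix' => ''];

    $notEmpty = function ($segment) {
        return $segment !== '';
    };

    // Title, first, and middle names are joined with single spaces.
    $given = implode(' ', array_filter(
        [normalize_title($parts['title']), trim($parts['first']), trim($parts['middle'])],
        $notEmpty
    ));

    // Join only the non-empty pieces so no stray commas appear.
    $pieces = array_filter(
        [trim($parts['last']), $given, trim($parts['suffix'])],
        $notEmpty
    );

    return implode(', ', $pieces);
}

// Segments copied from the print_r() output above.
$parsed = [
    'title'  => 'Dr.',
    'first'  => 'Juan',
    'middle' => 'Q. Xavier',
    'suffix' => 'III',
    'last'   => 'Velasquez y Garcia',
];

echo format_sort_name($parsed), "\n";
// prints: Velasquez y Garcia, Dr. Juan Q. Xavier, III
```

In real use you would pass the result of parse_name() instead of the hard-coded array. The sketch deliberately leaves the capitalization of first, middle, and last names alone, since blindly capitalizing each word would mangle names such as “bin Laden” or “de la Vega”.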

5 Comments
  1. Nico

    Hi Jonathon,

    Very nice script! I was starting to write something along these lines when I found your code, and this is just to say excellent job! I'm busy testing how it interacts with the rest of my code, but it should be quite simple.

    Best regards

  2. Jim

    Most excellent. Your logic and methodology make a complex problem seem simple. For my purposes I had to convert the code into JavaScript, which took a couple of hours but was worth it.

    My SQL ‘contacts’ table has the following fields:

    full_name (name as entered by user)
    sort_name (last, title first middle, suffix)
    nick_name (usually just the first name).

    If sort_name or nick_name is null, I populate the UI with values from the parse_name object (while reducing opacity). If the user changes one of the values, the new value is saved to the table and the UI displays the field as normal.

    I’m considering making a jQuery plug-in for this. It’s been that useful.

    Thanks again,
    Jim H.

  3. Jim

    (Oops, bad email on last post. This is the good one.)
    Email me if you'd like me to send you the JS rewrite of your parse_name function.

  4. Jim

    Do you have a repo we can send minor changes to?

  5. Jonathon

    I am not the author of this code, but you can contact the author, Keith Beckman, at http://alphahelical.com/contact.php. Thanks!
