Searching Using Soundex Codes
A companion file is available: Click here to download
Soundex is an indexing system that translates a name into a 4-digit code consisting of one letter and three numbers. The advantage of Soundex is its ability to locate names by the way they sound, rather than by exact spelling. For example, consider the name Maris. This name has a Soundex code of M620. Other variations on this name (such as Mares, Marriss, Mariss, and Mairis) all have the same Soundex code.
- Each Soundex code has exactly four alphanumeric characters (1 letter and 3 numbers)
- The first letter of the name is always the first character of the Soundex code.
- The remaining three digits are defined from the name using the Soundex Key Codes listed below.
- Adjacent letters in the name which have the same Soundex Key code number are assigned a single digit.
- If the name is not long enough to yield four characters,the code is padded with zeros.
|1||B F P V|
|2||C G J K Q S X Z|
|No code||A E H I O U Y W|
The SOUNDEX function
This document presents a VBA function (named SOUNDEX) that converts a text string into a Soundex code. This function was developed by Richard J. Yanco.
The function can be used in a worksheet formula, or called from a VBA procedure. The SOUNDEX function is listed below. Notice that this function calls another function named Category.
Function SOUNDEX(Surname As String) As String ' Developed by Richard J. Yanco ' This function follows the Soundex rules given at ' http://home.utah-inter.net/kinsearch/Soundex.html Dim Result As String, c As String * 1 Dim Location As Integer Surname = UCase(Surname) ' First character must be a letter If Asc(Left(Surname, 1)) < 65 Or Asc(Left(Surname, 1)) > 90 Then SOUNDEX = "" Exit Function Else ' St. is converted to Saint If Left(Surname, 3) = "ST." Then Surname = "SAINT" & Mid(Surname, 4) End If ' Convert to Soundex: letters to their appropriate digit, ' A,E,I,O,U,Y ("slash letters") to slashes ' H,W, and everything else to zero-length string Result = Left(Surname, 1) For Location = 2 To Len(Surname) Result = Result & Category(Mid(Surname, Location, 1)) Next Location ' Remove double letters Location = 2 Do While Location < Len(Result) If Mid(Result, Location, 1) = Mid(Result, Location + 1, 1) Then Result = Left(Result, Location) & Mid(Result, Location + 2) Else Location = Location + 1 End If Loop ' If category of 1st letter equals 2nd character, remove 2nd character If Category(Left(Result, 1)) = Mid(Result, 2, 1) Then Result = Left(Result, 1) & Mid(Result, 3) End If ' Remove slashes For Location = 2 To Len(Result) If Mid(Result, Location, 1) = "/" Then Result = Left(Result, Location - 1) & Mid(Result, Location + 1) End If Next ' Trim or pad with zeroes as necessary Select Case Len(Result) Case 4 SOUNDEX = Result Case Is < 4 SOUNDEX = Result & String(4 - Len(Result), "0") Case Is > 4 SOUNDEX = Left(Result, 4) End Select End If End Function Private Function Category(c) As String ' Returns a Soundex code for a letter Select Case True Case c Like "[AEIOUY]" Category = "/" Case c Like "[BPFV]" Category = "1" Case c Like "[CSKGJQXZ]" Category = "2" Case c Like "[DT]" Category = "3" Case c = "L" Category = "4" Case c Like "[MN]" Category = "5" Case c = "R" Category = "6" Case Else 'This includes H and W, spaces, punctuation, etc. Category = "" End Select End Function
The demo file (linked above) contains a list of more than 4,000 names. You can search for a name in the list, and specify an exact match or an approximate match.
If you choose an approximate match, you'll get a list of names that have the same Soundex code as the name you're searching for.
Search for Tips
Browse Tips by Category
Needs tips? Here are two books, with nothing but tips:
Contains more than 200 useful tips and tricks for Excel 2007 | Other Excel 2007 books | Amazon link: John Walkenbach's Favorite Excel 2007 Tips & Tricks