Fuzzel

Fuzzy String Matching for Lua

Download

What is it?

One bored afternoon, I dusted off my old notes from an algorithms class I took and tried my hand at some dynamic programming. Fuzzel is the result of that.
Fuzzel uses Damerau-Levenshtein Distance to do fuzzy string matching. It also includes pure Levenshtein Distance and Hamming Distance for shits and giggles.

How do I use it?

Simply include it in your file. Note that the source also comes with fuzzel_min.lua, a minified version of Fuzzel that works exactly the same. If clients need to download this code often, or you don't have much bandwidth, consider using the minified version.

Docs or something I guess

In addition to the methods listed here, there are shortened versions of them at the bottom of the source file, for example:

fuzzel.dld
is the same as
fuzzel.DamerauLevenshtienDistance

fuzzel.LevenshtienDistance_extended(string_first, string_second, number_addcost, number_substituecost, number_deletecost)
Calculates the Levenshtein Distance between two strings, using the costs given. "Real" Levenshtein Distance uses values 1,1,1 for costs. Returns number_distance.
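
To make the effect of the three costs concrete, here is a minimal standalone sketch of a weighted Levenshtein distance. It is not Fuzzel's source; the function name weighted_levenshtein is made up for illustration, and only the parameter order follows the signature documented above.

```lua
-- Standalone sketch of a weighted Levenshtein distance (not Fuzzel's source).
-- weighted_levenshtein is a hypothetical name; the argument order mirrors
-- the documented signature: add cost, substitute cost, delete cost.
local function weighted_levenshtein(first, second, addcost, substitutecost, deletecost)
    local n, m = #first, #second
    -- prev[j]: cost of turning the first i-1 characters of `first`
    -- into the first j characters of `second`.
    local prev = {}
    for j = 0, m do
        prev[j] = j * addcost
    end
    for i = 1, n do
        local cur = { [0] = i * deletecost }
        for j = 1, m do
            local same = first:sub(i, i) == second:sub(j, j)
            cur[j] = math.min(
                prev[j] + deletecost,                        -- drop a character of first
                cur[j - 1] + addcost,                        -- add a character of second
                prev[j - 1] + (same and 0 or substitutecost) -- keep or substitute
            )
        end
        prev = cur
    end
    return prev[m]
end

print(weighted_levenshtein("kitten", "sitting", 1, 1, 1)) -- 3
print(weighted_levenshtein("kitten", "sitting", 1, 2, 1)) -- 5
```

With a substitution cost of 2, a substitution is no cheaper than a delete followed by an add, which is why the second result is larger.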

fuzzel.LevenshtienDistance(string_first, string_second)
Calculates the "real" Levenshtein Distance. Returns number_distance.

fuzzel.LevensteinRatio(string_first, string_second)
The Levenshtein Distance divided by the first string's length. Using a ratio is a decent way to determine if a spelling is "close enough". Returns number_ratio.

fuzzel.DamerauLevenshtienDistance_extended(string_first, string_second, number_addcost, number_substituecost, number_deletecost, number_transpositioncost)
Damerau-Levenshtein Distance is almost exactly like Levenshtein Distance, except that two adjacent letters with swapped positions count as only "one" cost (in "real" Damerau-Levenshtein Distance). Returns number.
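
The difference a transposition makes is easiest to see on a small example. The sketch below is standalone and is not Fuzzel's source: it implements optimal string alignment, a common restricted form of Damerau-Levenshtein distance, and whether Fuzzel uses this variant or the unrestricted one is not stated here. The name osa_distance is hypothetical.

```lua
-- Standalone sketch (not Fuzzel's source): optimal string alignment,
-- a restricted form of Damerau-Levenshtein distance.
-- osa_distance is a hypothetical name. Pass nil as transpositioncost
-- to disable transpositions and get plain Levenshtein behaviour.
local function osa_distance(first, second, addcost, substitutecost, deletecost, transpositioncost)
    local n, m = #first, #second
    -- d[i][j]: cost of turning the first i characters of `first`
    -- into the first j characters of `second`.
    local d = {}
    for i = 0, n do
        d[i] = { [0] = i * deletecost }
    end
    for j = 0, m do
        d[0][j] = j * addcost
    end
    for i = 1, n do
        for j = 1, m do
            local a, b = first:sub(i, i), second:sub(j, j)
            local best = math.min(
                d[i - 1][j] + deletecost,
                d[i][j - 1] + addcost,
                d[i - 1][j - 1] + (a == b and 0 or substitutecost)
            )
            -- Two adjacent characters that appear swapped in the other string.
            if transpositioncost and i > 1 and j > 1
                and a == second:sub(j - 1, j - 1)
                and first:sub(i - 1, i - 1) == b then
                best = math.min(best, d[i - 2][j - 2] + transpositioncost)
            end
            d[i][j] = best
        end
    end
    return d[n][m]
end

print(osa_distance("teh", "the", 1, 1, 1, 1))   -- 1: one transposition
print(osa_distance("teh", "the", 1, 1, 1, nil)) -- 2: two substitutions
```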

fuzzel.DamerauLevenshtienDistance(string_first, string_second)
Calculates the "real" Damerau-Levenshtein Distance. Returns number.

fuzzel.DamerauLevenshtienRatio(string_first, string_second)
The Damerau-Levenshtein Distance divided by the first string's length. Returns number.

fuzzel.HammingDistance(string_first, string_second)
Purely the number of substitutions needed to change one string into another. Note that both strings must be the same length. Returns number.

fuzzel.HammingRatio(string_first, string_second)
The Hamming distance divided by the length of the first string. Returns number.
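
As a rough illustration of the two Hamming functions, here is a standalone sketch, not Fuzzel's source. The names hamming_distance and hamming_ratio are hypothetical, and raising an error on strings of different lengths is an assumption; how Fuzzel itself reacts to unequal lengths is not documented here.

```lua
-- Standalone sketch (not Fuzzel's source) of Hamming distance and ratio.
-- hamming_distance and hamming_ratio are hypothetical names.
local function hamming_distance(first, second)
    -- Assumption: unequal lengths are treated as an error.
    assert(#first == #second, "Hamming distance needs strings of equal length")
    local distance = 0
    for i = 1, #first do
        if first:sub(i, i) ~= second:sub(i, i) then
            distance = distance + 1
        end
    end
    return distance
end

-- Distance divided by the length of the first string.
local function hamming_ratio(first, second)
    return hamming_distance(first, second) / #first
end

print(hamming_distance("karolin", "kathrin")) -- 3
print(hamming_ratio("karolin", "kathrin"))    -- 3 differing positions out of 7
```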

fuzzel.FuzzySearchDistance(string_needle, vararg_in)
in may be either a table or a list of arguments. fuzzel.FuzzySearchDistance finds the string that most closely resembles needle, based on Damerau-Levenshtein Distance. Returns string_closest, number_distance.

fuzzel.FuzzySearchRatio(string_needle, vararg_in)
in may be either a table or a list of arguments. Same as above, except it returns the string with the closest Damerau-Levenshtein ratio. Returns string_closest, number_ratio.
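
The "table or list of arguments" calling convention can be sketched as follows. This is a standalone illustration, not Fuzzel's source: closest_match and levenshtein are hypothetical names, plain unit-cost Levenshtein distance stands in for Damerau-Levenshtein to keep the example short, and returning the first candidate on a tie is an assumption.

```lua
-- Compact unit-cost Levenshtein distance, used here only as a stand-in metric.
local function levenshtein(first, second)
    local prev = {}
    for j = 0, #second do
        prev[j] = j
    end
    for i = 1, #first do
        local cur = { [0] = i }
        for j = 1, #second do
            local cost = first:sub(i, i) == second:sub(j, j) and 0 or 1
            cur[j] = math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        end
        prev = cur
    end
    return prev[#second]
end

-- Hypothetical helper: accepts either one table of candidates
-- or the candidates as a list of arguments.
local function closest_match(needle, ...)
    local candidates = ...
    if type(candidates) ~= "table" then
        candidates = { ... }
    end
    local closest, best
    for _, candidate in ipairs(candidates) do
        local distance = levenshtein(needle, candidate)
        -- Strict comparison: on a tie the earlier candidate wins (an assumption).
        if best == nil or distance < best then
            closest, best = candidate, distance
        end
    end
    return closest, best
end

print(closest_match("bannana", { "apple", "banana", "cherry" })) -- banana 1
print(closest_match("bannana", "apple", "banana", "cherry"))     -- banana 1
```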

License

This code is public domain. Feel free to use it in any of your projects and redistribute it however you like. If you're using the minified version, it would be nice (though not required) to leave the link at the top of the file, so people who want to find the original can.