Module fuzzel
A collection of methods for finding the edit distance between two strings.
Functions
| LevenshteinDistance_extended (str1, str2, addcost, subcost, delcost) | Finds edit distance between two strings with custom costs. |
| LevenshteinDistance (str1, str2) | Finds simple Levenshtein distance. |
| LevenshteinRatio (str1, str2) | Finds the edit ratio between two strings. |
| DamerauLevenshteinDistance_extended (str1, str2, addcost, subcost, delcost, trncost) | Finds edit distance between two strings, with custom values. |
| DamerauLevenshteinDistance (str1, str2) | Finds simple Damerau-Levenshtein distance. |
| DamerauLevenshteinRatio (str1, str2) | Finds the edit ratio between two strings. |
| HammingDistance (str1, str2) | Finds the number of substitutions needed to turn one string into another. |
| HammingRatio (str1, str2) | Calculates the Hamming distance between two strings, divided by the length of the first string. |
| FuzzyFindDistance (str, ...) | Finds the closest argument to the first argument, using Damerau-Levenshtein distance. |
| FuzzyFindRatio (str, ...) | Finds the closest argument to the first argument, using Damerau-Levenshtein ratio. |
| FuzzySortDistance (str, ...) | Sorts input strings by distance. |
| FuzzySortRatio (str, ...) | Sorts input strings by ratio. |
| FuzzyAutocompleteDistance (str, ...) | Sorts truncated versions of input strings by distance. |
| FuzzyAutocompleteRatio (str, ...) | Sorts truncated versions of input strings by ratio. |
Fields
| _VERSION | The current version (1.4). |
Functions
- LevenshteinDistance_extended (str1, str2, addcost, subcost, delcost)
-
Finds edit distance between two strings with custom costs.
The Levenshtein distance is the minimum number of additions, deletions, and substitutions needed to turn one string into another. This method allows custom costs for addition, deletion, and substitution.
Parameters:
- str1 string the first string
- str2 string the second string
- addcost number the custom cost to add one character
- subcost number the custom cost to substitute one character for another
- delcost number the custom cost to delete one character
Returns:
-
the distance from the first string to the second (this equals the distance from the second string to the first when addcost and delcost are equal)
Usage:
fuzzel.LevenshteinDistance_extended("juice","moose",1,2,3)
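The weighted distance described above can be sketched with the usual two-row dynamic programming table. This is only an illustration: `weighted_levenshtein` is a hypothetical stand-alone helper written for this page, not fuzzel's own source, and it assumes the costs are non-negative numbers.

```lua
-- Hypothetical stand-alone helper; not fuzzel's implementation.
local function weighted_levenshtein(s, t, addcost, subcost, delcost)
  local n, m = #s, #t
  -- prev[j]: cost of turning the first i-1 characters of s
  -- into the first j characters of t.
  local prev = {}
  for j = 0, m do prev[j] = j * addcost end
  for i = 1, n do
    local cur = { [0] = i * delcost }
    for j = 1, m do
      local sub = prev[j - 1]
      if s:sub(i, i) ~= t:sub(j, j) then sub = sub + subcost end
      cur[j] = math.min(prev[j] + delcost,    -- delete s[i]
                        cur[j - 1] + addcost, -- insert t[j]
                        sub)                  -- substitute, or keep a match
    end
    prev = cur
  end
  return prev[m]
end

print(weighted_levenshtein("juice", "moose", 1, 1, 1)) -- 4
print(weighted_levenshtein("juice", "moose", 1, 2, 3)) -- 8
```

With unit costs the sketch reduces to the plain Levenshtein distance. "juice" and "moose" share only the final "e", so four substitutions are needed, and at a substitution cost of 2 the total becomes 8.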
- LevenshteinDistance (str1, str2)
-
Finds simple Levenshtein distance.
The Levenshtein distance is the minimum number of additions, deletions, and substitutions needed to turn one string into another.
Parameters:
- str1 string the first string
- str2 string the second string
Returns:
-
the distance between the two input strings
Usage:
fuzzel.LevenshteinDistance("Flag","Brag")
- LevenshteinRatio (str1, str2)
-
Finds the edit ratio between two strings.
Parameters:
- str1 string the first string
- str2 string the second string
Returns:
-
the distance between the two strings divided by the length of the first string
Usage:
fuzzel.LevenshteinRatio("bling","bring")
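As a worked instance of the ratio described above (distance divided by the length of the first string), here is a minimal sketch. Both `levenshtein` and `ratio` are hypothetical helpers defined here for illustration; they are not taken from fuzzel.

```lua
-- Hypothetical helpers; not fuzzel's implementation.
local function levenshtein(s, t)
  local prev = {}
  for j = 0, #t do prev[j] = j end
  for i = 1, #s do
    local cur = { [0] = i }
    for j = 1, #t do
      local cost = (s:sub(i, i) == t:sub(j, j)) and 0 or 1
      cur[j] = math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
    end
    prev = cur
  end
  return prev[#t]
end

-- Distance normalised by the length of the FIRST string.
local function ratio(s, t)
  return levenshtein(s, t) / #s
end

print(ratio("bling", "bring")) -- 0.2  (1 edit / 5 characters)
print(ratio("cats", "cat"))    -- 0.25 (1 edit / 4 characters)
```

Because the divisor is the length of the first string, swapping the arguments can change the result even when the distance itself does not: "cat" against "cats" gives 1/3 rather than 1/4.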
- DamerauLevenshteinDistance_extended (str1, str2, addcost, subcost, delcost, trncost)
-
Finds edit distance between two strings, with custom values.
The result is the minimum total cost of the additions, deletions, substitutions, and transpositions needed to turn str1 into str2 under the given weights.
Parameters:
- str1 string the first string
- str2 string the second string
- addcost number the cost of inserting a character
- subcost number the cost of substituting one character for another
- delcost number the cost of removing a character
- trncost number the cost of transposing two adjacent characters
Returns:
-
the edit distance between the two strings
Usage:
fuzzel.DamerauLevenshteinDistance_extended("berry","bury",0,1,1,2)
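To show how a transposition cost enters the calculation, the sketch below implements the optimal string alignment variant of Damerau-Levenshtein distance. Whether fuzzel uses this variant or the unrestricted one is not stated on this page, so treat `osa_distance` as a hypothetical helper written for illustration only.

```lua
-- Hypothetical helper (optimal string alignment variant);
-- not fuzzel's implementation.
local function osa_distance(s, t, addcost, subcost, delcost, trncost)
  local n, m = #s, #t
  local d = {}
  for i = 0, n do d[i] = { [0] = i * delcost } end
  for j = 1, m do d[0][j] = j * addcost end
  for i = 1, n do
    local si = s:sub(i, i)
    for j = 1, m do
      local tj = t:sub(j, j)
      local best = math.min(
        d[i - 1][j] + delcost,                         -- delete s[i]
        d[i][j - 1] + addcost,                         -- insert t[j]
        d[i - 1][j - 1] + (si == tj and 0 or subcost)) -- substitute or keep
      -- Transposition: the last two characters of both prefixes are swapped.
      if i > 1 and j > 1
         and si == t:sub(j - 1, j - 1)
         and s:sub(i - 1, i - 1) == tj then
        best = math.min(best, d[i - 2][j - 2] + trncost)
      end
      d[i][j] = best
    end
  end
  return d[n][m]
end

print(osa_distance("ab", "ba", 1, 1, 1, 1))      -- 1 (one transposition)
print(osa_distance("ab", "ba", 1, 1, 1, 5))      -- 2 (two substitutions are cheaper)
print(osa_distance("berry", "bury", 1, 1, 1, 1)) -- 2
```

Raising `trncost` above twice `subcost` effectively disables transpositions, since two substitutions then always cost less.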
- DamerauLevenshteinDistance (str1, str2)
-
Finds simple Damerau-Levenshtein distance.
Parameters:
- str1 string the first string
- str2 string the second string
Returns:
-
the minimum number of additions, deletions, substitutions, or transpositions to turn str1 into str2
Usage:
fuzzel.DamerauLevenshteinDistance("tree","trap")
- DamerauLevenshteinRatio (str1, str2)
-
Finds the edit ratio between two strings.
Parameters:
- str1 string the first string
- str2 string the second string
Returns:
-
the Damerau-Levenshtein distance divided by the length of the first string.
Usage:
fuzzel.DamerauLevenshteinRatio("pants","hands")
- HammingDistance (str1, str2)
-
Finds the number of substitutions needed to turn one string into another.
Hamming distance can only be calculated on two strings of equal length.
Parameters:
- str1 string the first string
- str2 string the second string
Returns:
-
the edit distance between str1 and str2
Usage:
fuzzel.HammingDistance("one","two")
fuzzel.HammingDistance("two","three") --Will throw an error, since "two" is 3 characters long while "three" is 5 characters long!
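The equal-length requirement follows directly from how the count is made: characters are compared position by position. A minimal sketch, where `hamming` is a hypothetical helper and not fuzzel's own function:

```lua
-- Hypothetical helper; not fuzzel's implementation.
local function hamming(s, t)
  assert(#s == #t, "Hamming distance needs strings of equal length")
  local count = 0
  for i = 1, #s do
    -- Compare the bytes at the same position in both strings.
    if s:byte(i) ~= t:byte(i) then count = count + 1 end
  end
  return count
end

print(hamming("one", "two"))            -- 3
print(pcall(hamming, "two", "three"))   -- false, followed by the error message
```

Wrapping the call in `pcall` is one way to handle mismatched lengths without stopping the program.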
- HammingRatio (str1, str2)
-
Calculates the Hamming distance between two strings, divided by the length of the first string.
Parameters:
- str1 string the first string
- str2 string the second string
Returns:
-
the Hamming distance between the two strings divided by the length of the first string
Usage:
fuzzel.HammingRatio("four","five")
fuzzel.HammingRatio("seven","ten") -- Will throw an error, since "seven" is 5 characters long while "ten" is 3 characters long
- FuzzyFindDistance (str, ...)
-
Finds the closest argument to the first argument.
Finds the closest argument to the first argument using Damerau-Levenshtein distance.
Parameters:
- str string the string to compare to
- ... A 1-indexed array of strings, or a list of strings to compare str against
Usage:
fuzzel.FuzzyFindDistance("tap","tape","strap","tab")
fuzzel.FuzzyFindDistance("tap",{"tape","strap","tab"})
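Both calling conventions shown above (a table, or a plain list of strings) can be handled by checking the type of the first variadic argument. The sketch below illustrates that pattern with a hypothetical `closest` function. It uses plain Levenshtein distance as a stand-in metric, returns both the match and its distance, and resolves ties in favour of the earliest candidate; all three are choices made for this example, not documented fuzzel behaviour.

```lua
-- Hypothetical helpers; not fuzzel's implementation.
local function levenshtein(s, t)
  local prev = {}
  for j = 0, #t do prev[j] = j end
  for i = 1, #s do
    local cur = { [0] = i }
    for j = 1, #t do
      local cost = (s:sub(i, i) == t:sub(j, j)) and 0 or 1
      cur[j] = math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
    end
    prev = cur
  end
  return prev[#t]
end

local function closest(str, ...)
  local candidates = { ... }
  -- Accept either closest(str, {a, b, c}) or closest(str, a, b, c).
  if type(candidates[1]) == "table" then candidates = candidates[1] end
  local best, best_dist
  for _, cand in ipairs(candidates) do
    local d = levenshtein(str, cand)
    if best_dist == nil or d < best_dist then
      best, best_dist = cand, d
    end
  end
  return best, best_dist
end

print(closest("tap", { "strap", "tape", "table" })) -- tape    1
print(closest("tap", "strap", "tape", "table"))     -- tape    1
```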
- FuzzyFindRatio (str, ...)
-
Finds the closest argument to the first argument.
Finds the closest argument to the first argument using Damerau-Levenshtein ratio.
Parameters:
- str string the string to compare to
- ... A 1-indexed array of strings, or a list of strings to compare str against
Usage:
fuzzel.FuzzyFindRatio("light",{"lit","lot","lightbulb"})
fuzzel.FuzzyFindRatio("light","lit","lot","lightbulb")
- FuzzySortDistance (str, ...)
-
Sorts input strings by distance.
Finds the Damerau-Levenshtein distance of each string to the first argument, and sorts them into a table accordingly.
Parameters:
- str string the string to compare each result to
- ... either a 1-indexed table, or a list of strings to sort
Returns:
-
a 1-indexed table of the input strings, in the order of closest-to str to farthest-from str
Usage:
fuzzel.FuzzySortDistance("tub",{"toothpaste","stub","tube"})
fuzzel.FuzzySortDistance("tub","toothpaste","stub","tube")
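Sorting by distance amounts to scoring every candidate once and then ordering by score. The following sketch shows one way to do that; `sort_by_distance` is a hypothetical helper, plain Levenshtein distance stands in for the Damerau-Levenshtein metric, and breaking ties by input position is an assumption of this example.

```lua
-- Hypothetical helpers; not fuzzel's implementation.
local function levenshtein(s, t)
  local prev = {}
  for j = 0, #t do prev[j] = j end
  for i = 1, #s do
    local cur = { [0] = i }
    for j = 1, #t do
      local cost = (s:sub(i, i) == t:sub(j, j)) and 0 or 1
      cur[j] = math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
    end
    prev = cur
  end
  return prev[#t]
end

local function sort_by_distance(str, ...)
  local candidates = { ... }
  if type(candidates[1]) == "table" then candidates = candidates[1] end
  -- Score each candidate once, remembering its input position.
  local scored = {}
  for i, cand in ipairs(candidates) do
    scored[i] = { text = cand, dist = levenshtein(str, cand), pos = i }
  end
  table.sort(scored, function(a, b)
    if a.dist ~= b.dist then return a.dist < b.dist end
    return a.pos < b.pos -- keep input order among equal distances
  end)
  local out = {}
  for i, item in ipairs(scored) do out[i] = item.text end
  return out
end

print(table.concat(sort_by_distance("tub", { "toothpaste", "stub", "tube" }), ", "))
-- stub, tube, toothpaste
```

The explicit position tie-break is there because `table.sort` is not a stable sort, so equal distances would otherwise come out in an unspecified order.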
- FuzzySortRatio (str, ...)
-
Sorts input strings by ratio.
Finds the Damerau-Levenshtein ratio of each string to the first argument, and sorts them into a table accordingly.
Parameters:
- str string the string to compare each result to
- ... either a 1-indexed table, or a list of strings to sort
Returns:
-
a 1-indexed table of the input strings, in the order of closest-to str to farthest-from str
Usage:
fuzzel.FuzzySortRatio("can",{"candle","candie","canister"})
fuzzel.FuzzySortRatio("can","candle","candie","canister")
- FuzzyAutocompleteDistance (str, ...)
-
Sorts truncated versions of input strings by distance.
Truncates each input string, finds the Damerau-Levenshtein distance of each truncated string to the first argument, and sorts them into a table accordingly. Useful for auto-complete functions.
Parameters:
- str string the string to compare each result to
- ... either a 1-indexed table, or a list of strings to sort
Returns:
-
a 1-indexed table of the input strings, in the order of closest-to str to farthest-from str
Usage:
fuzzel.FuzzyAutocompleteDistance("brow",{"brown","brownie","high-brow"})
fuzzel.FuzzyAutocompleteDistance("brow","brown","brownie","high-brow")
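The effect of truncation can be seen in a small sketch. It assumes each candidate is cut to the length of `str` before comparison; this page does not state the truncation length, so that is a guess. `autocomplete_sort` is a hypothetical helper, and plain Levenshtein distance stands in for the Damerau-Levenshtein metric.

```lua
-- Hypothetical helpers; not fuzzel's implementation.
local function levenshtein(s, t)
  local prev = {}
  for j = 0, #t do prev[j] = j end
  for i = 1, #s do
    local cur = { [0] = i }
    for j = 1, #t do
      local cost = (s:sub(i, i) == t:sub(j, j)) and 0 or 1
      cur[j] = math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
    end
    prev = cur
  end
  return prev[#t]
end

local function autocomplete_sort(str, ...)
  local candidates = { ... }
  if type(candidates[1]) == "table" then candidates = candidates[1] end
  local scored = {}
  for i, cand in ipairs(candidates) do
    -- Assumption: compare only the first #str characters of each candidate.
    local prefix = cand:sub(1, #str)
    scored[i] = { text = cand, dist = levenshtein(str, prefix), pos = i }
  end
  table.sort(scored, function(a, b)
    if a.dist ~= b.dist then return a.dist < b.dist end
    return a.pos < b.pos
  end)
  local out = {}
  for i, item in ipairs(scored) do out[i] = item.text end
  return out -- full, untruncated strings in ranked order
end

print(table.concat(autocomplete_sort("brow", { "high-brow", "brownie", "brown" }), ", "))
-- brownie, brown, high-brow
```

Under this assumption "brown" and "brownie" both score 0 against "brow", because only their first four characters are compared, while "high-brow" is compared as "high" and ranks last.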
- FuzzyAutocompleteRatio (str, ...)
-
Sorts truncated versions of input strings by ratio.
Truncates each input string, finds the Damerau-Levenshtein ratio of each truncated string to the first argument, and sorts them into a table accordingly. Useful for auto-complete functions.
Parameters:
- str string the string to compare each result to
- ... either a 1-indexed table, or a list of strings to sort
Returns:
-
a 1-indexed table of the input strings, in the order of closest-to str to farthest-from str
Usage:
fuzzel.FuzzyAutocompleteRatio("egg",{"eggman","excelent","excaliber"})
fuzzel.FuzzyAutocompleteRatio("egg","eggman","excelent","excaliber")