This library implements string similarity metrics.
Structs§
- Specialized hashmap to store user-provided types. It relies on a couple of base assumptions in order to simplify the implementation.
- RowId 🔒
Enums§
Functions§
- bigrams 🔒 Returns an Iterator of char tuples.
- Like optimal string alignment, but substrings can be edited an unlimited number of times, and the triangle inequality holds.
- Like optimal string alignment, but substrings can be edited an unlimited number of times, and the triangle inequality holds.
- Calculates the number of positions in the two sequences where the elements differ. Returns an error if the sequences have different lengths.
- Calculates the Jaro similarity between two sequences. The returned value is between 0.0 and 1.0 (higher value means more similar).
- Like Jaro but gives a boost to sequences that have a common prefix.
- Calculates the minimum number of insertions, deletions, and substitutions required to change one sequence into the other.
- Calculates the number of positions in the two strings where the characters differ. Returns an error if the strings have different lengths.
- Calculates the Jaro similarity between two strings. The returned value is between 0.0 and 1.0 (higher value means more similar).
- Like Jaro but gives a boost to strings that have a common prefix.
- Calculates the minimum number of insertions, deletions, and substitutions required to change one string into the other.
- Calculates a normalized score of the Damerau–Levenshtein algorithm between 0.0 and 1.0 (inclusive), where 1.0 means the strings are the same.
- Calculates a normalized score of the Levenshtein algorithm between 0.0 and 1.0 (inclusive), where 1.0 means the strings are the same.
- Like Levenshtein but allows for adjacent transpositions. Each substring can only be edited once.
- Calculates the Sørensen-Dice similarity coefficient using bigrams. See https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient.
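
To make a few of the descriptions above concrete, here is a minimal, standalone sketch of the Hamming, Levenshtein, and normalized Levenshtein computations, written against the standard library only. The function names ending in `_sketch` are hypothetical and are not this library's API. Returning `None` for mismatched lengths stands in for the error mentioned above, and normalizing by the length of the longer string is an assumption about how the score is scaled.

```rust
// Illustrative sketches only: the `_sketch` names are hypothetical and do not
// correspond to this library's actual functions or signatures.

/// Counts the positions where the characters of `a` and `b` differ.
/// Returns `None` when the strings have different lengths (in chars);
/// this stands in for the error described in the listing.
fn hamming_sketch(a: &str, b: &str) -> Option<usize> {
    let (mut ia, mut ib) = (a.chars(), b.chars());
    let mut count = 0;
    loop {
        match (ia.next(), ib.next()) {
            (Some(x), Some(y)) => {
                if x != y {
                    count += 1;
                }
            }
            (None, None) => return Some(count),
            _ => return None, // one string ended before the other
        }
    }
}

/// Minimum number of insertions, deletions, and substitutions needed to turn
/// `a` into `b`, computed with a single-row dynamic programming table.
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // row[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut row: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut prev_diag = row[0]; // previous row's value at column j
        row[0] = i + 1;
        for (j, &cb) in b_chars.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let next = (prev_diag + cost) // substitution (or match)
                .min(row[j] + 1) // insertion
                .min(row[j + 1] + 1); // deletion
            prev_diag = row[j + 1];
            row[j + 1] = next;
        }
    }
    row[b_chars.len()]
}

/// Score between 0.0 and 1.0 (inclusive), where 1.0 means the strings are the
/// same. Assumption: the distance is divided by the longer string's length.
fn normalized_levenshtein_sketch(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0; // two empty strings are treated as identical
    }
    1.0 - levenshtein_sketch(a, b) as f64 / max_len as f64
}
```

The single-row table keeps memory proportional to the length of the second string rather than to the product of both lengths. All three functions iterate over `char`s, so multi-byte characters count as one position each.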