public class EditDistance extends Object implements SequenceDistanceMeasurer, SequenceDistanceMeasurerDouble
EditDistance is an implementation of Wagner and Fischer's dynamic programming algorithm for computing string edit distance.
Edit distance is the minimum cost to transform one string (or sequence) into another, where the cost of a transformation is the sum of the costs of the edit operations it uses. This edit distance considers three edit operations: inserts, which add a new element to the sequence; deletes, which remove an element from the sequence; and changes, which replace an element with a different element.
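The definition above can be made concrete with a minimal sketch of the dynamic program for two Strings. This is an illustration under stated assumptions, not this class's actual source: the class name EditDistanceSketch and the method editDistance are hypothetical, and the three costs are passed as plain int arguments.

```java
public class EditDistanceSketch {

    // Hypothetical helper (not part of the documented API): minimum total cost
    // of inserts, deletes, and changes needed to transform s1 into s2.
    public static int editDistance(String s1, String s2,
                                   int insertCost, int deleteCost, int changeCost) {
        int n = s1.length();
        int m = s2.length();
        // d[i][j] = cost of transforming the first i chars of s1
        // into the first j chars of s2.
        int[][] d = new int[n + 1][m + 1];
        // Transforming a prefix of s1 into the empty string takes only deletes.
        for (int i = 1; i <= n; i++) {
            d[i][0] = d[i - 1][0] + deleteCost;
        }
        // Transforming the empty string into a prefix of s2 takes only inserts.
        for (int j = 1; j <= m; j++) {
            d[0][j] = d[0][j - 1] + insertCost;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                boolean same = s1.charAt(i - 1) == s2.charAt(j - 1);
                int viaChange = d[i - 1][j - 1] + (same ? 0 : changeCost);
                int viaDelete = d[i - 1][j] + deleteCost;
                int viaInsert = d[i][j - 1] + insertCost;
                d[i][j] = Math.min(viaChange, Math.min(viaDelete, viaInsert));
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        // Unit costs: two changes and one insert.
        System.out.println(editDistance("kitten", "sitting", 1, 1, 1)); // prints 3
        // A change costing more than a delete plus an insert is never used.
        System.out.println(editDistance("kitten", "sitting", 1, 1, 3)); // prints 5
    }
}
```

With a change cost of 3, every mismatch is handled by a delete and an insert instead, which is why the second result depends only on the insert and delete costs.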
The edit distance is parameterized by the costs of the edit operations. Two constructors let you specify the three costs, one for each type of edit operation: one constructor takes integer costs and the other takes double-valued costs. If you specify the costs as integers, then all of the distance and distancef methods from the SequenceDistanceMeasurer and SequenceDistanceMeasurerDouble interfaces are available. If you specify the costs as doubles, then only the distancef methods will function, and the distance methods will throw an UnsupportedOperationException.
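The integer-versus-double behavior described above can be modeled with a small stand-in. This sketch is not the class's source: CostModelSketch and its fields are hypothetical names that only mirror the documented contract, and only the String overloads are shown.

```java
public class CostModelSketch {
    private final double insertCost;
    private final double deleteCost;
    private final double changeCost;
    // Assumption for this sketch: remember which constructor was used.
    private final boolean integerCosts;

    public CostModelSketch(int insertCost, int deleteCost, int changeCost) {
        this.insertCost = insertCost;
        this.deleteCost = deleteCost;
        this.changeCost = changeCost;
        this.integerCosts = true;
    }

    public CostModelSketch(double insertCost, double deleteCost, double changeCost) {
        this.insertCost = insertCost;
        this.deleteCost = deleteCost;
        this.changeCost = changeCost;
        this.integerCosts = false;
    }

    // Always available, whichever constructor was used.
    public double distancef(String s1, String s2) {
        int n = s1.length();
        int m = s2.length();
        double[][] d = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) d[i][0] = d[i - 1][0] + deleteCost;
        for (int j = 1; j <= m; j++) d[0][j] = d[0][j - 1] + insertCost;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                boolean same = s1.charAt(i - 1) == s2.charAt(j - 1);
                double viaChange = d[i - 1][j - 1] + (same ? 0.0 : changeCost);
                double viaDelete = d[i - 1][j] + deleteCost;
                double viaInsert = d[i][j - 1] + insertCost;
                d[i][j] = Math.min(viaChange, Math.min(viaDelete, viaInsert));
            }
        }
        return d[n][m];
    }

    // Only available when the costs were given as integers.
    public int distance(String s1, String s2) {
        if (!integerCosts) {
            throw new UnsupportedOperationException(
                "costs were initialized with double values");
        }
        // Sums of small integer costs are represented exactly as doubles,
        // so the cast does not lose information here.
        return (int) distancef(s1, s2);
    }

    public static void main(String[] args) {
        CostModelSketch ints = new CostModelSketch(1, 1, 1);
        System.out.println(ints.distance("kitten", "sitting"));  // prints 3
        System.out.println(ints.distancef("kitten", "sitting")); // prints 3.0

        CostModelSketch doubles = new CostModelSketch(1.0, 1.0, 1.5);
        System.out.println(doubles.distancef("abc", "abd"));     // prints 1.5
        try {
            doubles.distance("abc", "abd");
        } catch (UnsupportedOperationException e) {
            System.out.println("distance is unavailable with double costs");
        }
    }
}
```

In practice this means code that may receive either kind of instance should call distancef, which is valid in both cases.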
This class supports computing edit distance for Java String objects, arrays of any of the primitive types, arrays of objects, and Lists of objects. It makes no assumptions about the contents of the sequences: they can contain duplicates, some elements can appear in only one of the two sequences, and the two sequences can have different lengths.
A separate class with the same name, in a different package, is available if you need to compute distance specifically between permutations rather than general sequences. That EditDistance class computes distance between permutations of the integers from 0 to N-1.
Runtime: O(n*m), where n and m are the lengths of the two sequences (i.e., Strings or arrays).
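The quadratic bound follows from the standard recurrence for this dynamic program, sketched here in notation that is not from the original documentation: D(i,j) is the cost of transforming the first i elements of the first sequence a into the first j elements of the second sequence b, and c_ins, c_del, and c_chg denote the three operation costs.

```latex
\begin{aligned}
D(0,0) &= 0, \qquad
D(i,0) = i \cdot c_{\mathrm{del}}, \qquad
D(0,j) = j \cdot c_{\mathrm{ins}}, \\
D(i,j) &= \min\bigl\{\,
  D(i-1,j) + c_{\mathrm{del}},\;
  D(i,j-1) + c_{\mathrm{ins}},\;
  D(i-1,j-1) + [a_i \neq b_j]\, c_{\mathrm{chg}}
\,\bigr\}
\end{aligned}
```

Here [a_i ≠ b_j] is 1 when the two elements differ and 0 otherwise. The table has (n+1)(m+1) entries and each one is computed in constant time from three neighbors, which gives the O(n*m) runtime, with D(n,m) as the result.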
Wagner and Fischer's String Edit Distance was introduced in:
R. A. Wagner and M. J. Fischer, "The string-to-string correction problem," Journal of the ACM, vol. 21, no. 1, pp. 168–173, January 1974.
| Constructor | Description |
|---|---|
| EditDistance(double insertCost, double deleteCost, double changeCost) | Constructs an edit distance measure with the specified edit operation costs. |
| EditDistance(int insertCost, int deleteCost, int changeCost) | Constructs an edit distance measure with the specified edit operation costs. |
| Modifier and Type | Method | Description |
|---|---|---|
| int | distance(boolean[] s1, boolean[] s2) | Measures the distance between two arrays. |
| int | distance(byte[] s1, byte[] s2) | Measures the distance between two arrays. |
| int | distance(char[] s1, char[] s2) | Measures the distance between two arrays. |
| int | distance(double[] s1, double[] s2) | Measures the distance between two arrays. |
| int | distance(float[] s1, float[] s2) | Measures the distance between two arrays. |
| int | distance(int[] s1, int[] s2) | Measures the distance between two arrays. |
| <T> int | distance(List<T> s1, List<T> s2) | Measures the distance between two lists of objects. |
| int | distance(long[] s1, long[] s2) | Measures the distance between two arrays. |
| int | distance(Object[] s1, Object[] s2) | Measures the distance between two arrays of objects. |
| int | distance(short[] s1, short[] s2) | Measures the distance between two arrays. |
| int | distance(String s1, String s2) | Measures the distance between two Strings. |
| double | distancef(boolean[] s1, boolean[] s2) | Measures the distance between two arrays. |
| double | distancef(byte[] s1, byte[] s2) | Measures the distance between two arrays. |
| double | distancef(char[] s1, char[] s2) | Measures the distance between two arrays. |
| double | distancef(double[] s1, double[] s2) | Measures the distance between two arrays. |
| double | distancef(float[] s1, float[] s2) | Measures the distance between two arrays. |
| double | distancef(int[] s1, int[] s2) | Measures the distance between two arrays. |
| <T> double | distancef(List<T> s1, List<T> s2) | Measures the distance between two lists of objects. |
| double | distancef(long[] s1, long[] s2) | Measures the distance between two arrays. |
| double | distancef(Object[] s1, Object[] s2) | Measures the distance between two arrays of objects. |
| double | distancef(short[] s1, short[] s2) | Measures the distance between two arrays. |
| double | distancef(String s1, String s2) | Measures the distance between two Strings. |
public EditDistance(double insertCost, double deleteCost, double changeCost)
Parameters:
insertCost - Cost of an insertion operation. Must be non-negative.
deleteCost - Cost of a deletion operation. Must be non-negative.
changeCost - Cost of a change operation. Must be non-negative.

public EditDistance(int insertCost, int deleteCost, int changeCost)
Parameters:
insertCost - Cost of an insertion operation. Must be non-negative.
deleteCost - Cost of a deletion operation. Must be non-negative.
changeCost - Cost of a change operation. Must be non-negative.

public int distance(int[] s1, int[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(long[] s1, long[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(short[] s1, short[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(byte[] s1, byte[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(char[] s1, char[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(boolean[] s1, boolean[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(double[] s1, double[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(float[] s1, float[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(String s1, String s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First String.
s2 - Second String.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public int distance(Object[] s1, Object[] s2)
Specified by: distance in interface SequenceDistanceMeasurer
Parameters:
s1 - First array.
s2 - Second array.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public final <T> int distance(List<T> s1, List<T> s2)
Specified by: distance in interface SequenceDistanceMeasurer
Type Parameters:
T - Type of List elements.
Parameters:
s1 - First list.
s2 - Second list.
Throws:
UnsupportedOperationException - if costs were initialized with double values.

public double distancef(int[] s1, int[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(long[] s1, long[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(short[] s1, short[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(byte[] s1, byte[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(char[] s1, char[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(boolean[] s1, boolean[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(double[] s1, double[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(float[] s1, float[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public double distancef(String s1, String s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First String.
s2 - Second String.

public double distancef(Object[] s1, Object[] s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Parameters:
s1 - First array.
s2 - Second array.

public final <T> double distancef(List<T> s1, List<T> s2)
Specified by: distancef in interface SequenceDistanceMeasurerDouble
Type Parameters:
T - Type of List elements.
Parameters:
s1 - First list.
s2 - Second list.

Copyright © 2005-2020 Vincent A. Cicirello. All rights reserved.