Class NGramFingerprintKeyer
java.lang.Object
com.google.refine.clustering.binning.Keyer
com.google.refine.clustering.binning.FingerprintKeyer
com.google.refine.clustering.binning.NGramFingerprintKeyer
Fingerprint keyer which generates a fingerprint from a sorted list of unique character N-grams after removing all
whitespace, control characters, and punctuation. N-grams are concatenated to form a single output key.
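The keying steps described above can be sketched as follows. This is a minimal illustration, not the OpenRefine implementation: the class `NGramKeySketch` and its static `key(String, int)` helper are hypothetical names, the character classes cover ASCII punctuation and control characters only, and the sketch does not call the inherited `normalize` or `asciify` methods.

```java
import java.util.TreeSet;

class NGramKeySketch {
    // Hypothetical stand-in for the keyer; assumes the N-gram size is passed explicitly.
    static String key(String s, int size) {
        // Drop ASCII punctuation, control characters, and whitespace.
        String cleaned = s.replaceAll("[\\p{Punct}\\p{Cntrl}\\s]", "");
        // A TreeSet keeps the N-grams unique and in sorted order.
        TreeSet<String> grams = new TreeSet<>();
        for (int i = 0; i + size <= cleaned.length(); i++) {
            grams.add(cleaned.substring(i, i + size));
        }
        // Concatenate the sorted N-grams into a single output key.
        return String.join("", grams);
    }

    public static void main(String[] args) {
        System.out.println(key("ab c, ab", 2)); // prints abbcca
        System.out.println(key("banana", 2));   // prints anbana
    }
}
```

In this sketch, inputs that differ only in spacing or punctuation, such as "ab c, ab" and "abcab", produce the same key and so would fall into the same cluster.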
Field Summary
Fields inherited from class com.google.refine.clustering.binning.FingerprintKeyer
DIACRITICS_AND_FRIENDS
Constructor Summary
Constructors
Method Summary
Methods inherited from class com.google.refine.clustering.binning.FingerprintKeyer
asciify, normalize, normalize, stripDiacritics
Constructor Details
NGramFingerprintKeyer
public NGramFingerprintKeyer()
Method Details
key
- Overrides:
key in class FingerprintKeyer
sorted_ngrams
Generate a stream of sorted unique character N-grams from a string
- Parameters:
s - String to generate N-grams from
size - number of characters per N-gram
- Returns:
a stream of sorted unique N-gram Strings
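A stream with this contract could be produced along the following lines. This is a sketch under stated assumptions, not the library source: `SortedNgramsSketch` and `sortedNgrams` are hypothetical names, and returning an empty stream for inputs shorter than `size` is an assumption of the sketch.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class SortedNgramsSketch {
    // Hypothetical helper mirroring the documented parameters (s, size).
    static Stream<String> sortedNgrams(String s, int size) {
        // Slide a window of `size` characters across the string,
        // then sort the substrings and drop duplicates.
        // If s is shorter than size, the index range is empty.
        return IntStream.rangeClosed(0, s.length() - size)
                .mapToObj(i -> s.substring(i, i + size))
                .sorted()
                .distinct();
    }

    public static void main(String[] args) {
        List<String> grams = sortedNgrams("banana", 3).collect(Collectors.toList());
        System.out.println(grams); // prints [ana, ban, nan]
    }
}
```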
ngram_split
Deprecated. 2020-10-17 by tfmorris. Use sorted_ngrams(String, int)