Class DefaultStringNormalizer

java.lang.Object
org.tentackle.common.DefaultStringNormalizer
All Implemented Interfaces:
StringNormalizer

@Service(StringNormalizer.class) public class DefaultStringNormalizer extends Object implements StringNormalizer
The default normalizer (works sufficiently well for most Western languages).
Author:
harald
  • Constructor Details

    • DefaultStringNormalizer

      public DefaultStringNormalizer(char fieldSeparator, char wordSeparator)
      Creates a normalizer.
      Parameters:
      fieldSeparator - separator between text fields, 0 if none
      wordSeparator - separator between words during reduction
    • DefaultStringNormalizer

      public DefaultStringNormalizer()
      Creates a normalizer with a comma as the field separator and a space as the word separator.
  • Method Details

    • unDiacrit

      public String unDiacrit(String str, boolean keepLength)
      Description copied from interface: StringNormalizer
      Converts special Unicode characters (so-called diacritics) to standard ASCII.
      Also supports the special German and Northern European "umlauts".
      Specified by:
      unDiacrit in interface StringNormalizer
      Parameters:
      str - the string to be converted
      keepLength - true if the length should be kept, i.e. Ä is converted to A rather than to AE
      Returns:
      the converted string
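
      The effect of keepLength can be illustrated with a minimal sketch built only on the JDK's java.text.Normalizer. This is not the Tentackle implementation: the class name UnDiacritSketch and the helper expand are hypothetical, and the handling of characters without a canonical decomposition (such as ß or Ø) is simply left out here, so they pass through unchanged.

```java
import java.text.Normalizer;

/** Illustrative sketch only; NOT the code behind DefaultStringNormalizer. */
public class UnDiacritSketch {

    /**
     * Hypothetical helper: two-letter expansion for German umlauts,
     * or null if the character has no expansion in this sketch.
     */
    private static String expand(char c) {
        switch (c) {
            case '\u00C4': return "AE"; // A with diaeresis
            case '\u00D6': return "OE"; // O with diaeresis
            case '\u00DC': return "UE"; // U with diaeresis
            case '\u00E4': return "ae"; // a with diaeresis
            case '\u00F6': return "oe"; // o with diaeresis
            case '\u00FC': return "ue"; // u with diaeresis
            default: return null;
        }
    }

    /**
     * Strips diacritics from a string.
     * With keepLength == false, umlauts are first expanded to two letters;
     * with keepLength == true, every accented letter maps to its base letter.
     */
    public static String unDiacrit(String str, boolean keepLength) {
        if (str == null) {
            return null;
        }
        StringBuilder buf = new StringBuilder(str.length() + 4);
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            String expansion = keepLength ? null : expand(c);
            if (expansion != null) {
                buf.append(expansion);
            } else {
                buf.append(c);
            }
        }
        // Canonical decomposition splits a letter from its combining marks,
        // then the marks (Unicode category M) are removed.
        String decomposed = Normalizer.normalize(buf, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

      In this sketch, "Müller" becomes "Muller" with keepLength set to true and "Mueller" with keepLength set to false. The results of the real unDiacrit may differ for characters the sketch does not cover.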
    • normalize

      public String normalize(String str)
      Normalizes a string (phonetically) for use as PDO.normText.
      Specified by:
      normalize in interface StringNormalizer
      Parameters:
      str - the string to be normalized
      Returns:
      the normalized string