Class UCPFilter

  • Direct Known Subclasses:
    UCPFilter.UCPCharsetFilter

    public abstract class UCPFilter
    extends Object
    A filter for Unicode code points. It also maintains a cache of filters associated with Charset instances.

    For charset-based filters, building the cache may be slow if the JVM is running with an active debugging agent. The JRE implements the acceptability test by throwing and catching an exception, and the agent traps each one. If the charset can encode only a small subset of the Unicode code points, many exceptions are thrown and caught, and performance degrades as the agent intercepts each of them (even if the debugger has not indicated an interest in exceptions).
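The cost described above can be illustrated with a minimal sketch. It assumes the acceptability test is something like CharsetEncoder.canEncode, which may signal an unencodable character internally through a caught exception. The class EncodableScan, the method encodable, and the limit parameter are hypothetical names chosen for illustration; they are not part of UCPFilter.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.BitSet;

// Hypothetical helper, not part of UCPFilter.
class EncodableScan {
    // Records which code points in [0, limit) the charset can encode.
    static BitSet encodable(Charset cs, int limit) {
        CharsetEncoder enc = cs.newEncoder();
        BitSet result = new BitSet(limit);
        for (int ucp = 0; ucp < limit; ucp++) {
            // Lone surrogates are not characters on their own; skip them.
            if (ucp >= Character.MIN_SURROGATE && ucp <= Character.MAX_SURROGATE) {
                continue;
            }
            // Assumption: each unencodable code point may cost one
            // internally thrown and caught exception in this call.
            if (enc.canEncode(new String(Character.toChars(ucp)))) {
                result.set(ucp);
            }
        }
        return result;
    }
}
```

For a narrow charset such as US-ASCII, nearly every probe in a full scan up to Character.MAX_CODE_POINT fails, which is the case where an attached agent would intercept the most exceptions.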

    Since:
    0.6.0
    Version:
    $Id: UCPFilter.java 16154 2012-07-14 16:34:05Z colin $
    Author:
    tlerios@marketcetera.com
    • Field Detail

      • CHAR

        public static final UCPFilter CHAR
        A filter for Unicode characters that can be represented by a single char.
      • DIGIT

        public static final UCPFilter DIGIT
        A filter for Unicode code points that are digits.
      • LETTER

        public static final UCPFilter LETTER
        A filter for Unicode code points that are letters.
      • ALNUM

        public static final UCPFilter ALNUM
        A filter for Unicode code points that are letters or digits.
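As a rough guide to what the four constants accept, the sketch below expresses each description as a predicate over java.lang.Character. This is an assumption about equivalent behavior, not the class's actual implementation; BuiltInFilterSketch and its IntPredicate fields are hypothetical stand-ins.

```java
import java.util.function.IntPredicate;

// Hypothetical stand-ins for the UCPFilter constants.
class BuiltInFilterSketch {
    // Code points representable by a single char (the Basic Multilingual Plane).
    static final IntPredicate CHAR = ucp -> Character.isBmpCodePoint(ucp);
    // Code points that are digits.
    static final IntPredicate DIGIT = ucp -> Character.isDigit(ucp);
    // Code points that are letters.
    static final IntPredicate LETTER = ucp -> Character.isLetter(ucp);
    // Code points that are letters or digits.
    static final IntPredicate ALNUM = ucp -> Character.isLetterOrDigit(ucp);
}
```

Under these assumptions, a supplementary code point such as U+1F600 fails CHAR because it needs a surrogate pair, and digits outside ASCII (for example U+0663) pass DIGIT.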
    • Constructor Detail

      • UCPFilter

        public UCPFilter()
    • Method Detail

      • forCharset

        public static UCPFilter forCharset(Charset cs)
        Returns a filter for Unicode code points that can be encoded by the given charset.
        Parameters:
        cs - The charset.
        Returns:
        The filter.
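One way a per-charset cache like the one this class maintains could be organized is sketched below. The real cache internals are not documented here, so everything in the sketch is an assumption: CharsetFilterCache is a hypothetical class, IntPredicate stands in for UCPFilter, and the filter tests code points lazily instead of precomputing them.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntPredicate;

// Hypothetical cache; not the actual UCPFilter implementation.
class CharsetFilterCache {
    private static final Map<Charset, IntPredicate> CACHE = new ConcurrentHashMap<>();

    // Returns the same filter instance for repeated requests with an equal charset.
    static IntPredicate forCharset(Charset cs) {
        return CACHE.computeIfAbsent(cs, CharsetFilterCache::build);
    }

    private static IntPredicate build(Charset cs) {
        CharsetEncoder enc = cs.newEncoder();
        return ucp -> {
            if (!Character.isValidCodePoint(ucp)) {
                return false;
            }
            // CharsetEncoder is not safe for concurrent use, so guard it.
            synchronized (enc) {
                return enc.canEncode(new String(Character.toChars(ucp)));
            }
        };
    }
}
```

Keying the map on Charset works because Charset defines equals and hashCode in terms of the canonical name.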
      • getDefaultCharset

        public static final UCPFilter getDefaultCharset()
        Returns a filter for Unicode code points that can be encoded by the default JVM charset.
        Returns:
        The filter.
      • getFileSystemCharset

        public static final UCPFilter getFileSystemCharset()
        Returns a filter for Unicode code points that can be encoded by the current system file encoding/charset (as specified in the system property file.encoding).
        Returns:
        The filter.
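The two charsets referred to by getDefaultCharset and getFileSystemCharset could be resolved along the following lines. This is a sketch under assumptions: SystemCharsets is a hypothetical class, and the fallback to the default charset when file.encoding is missing or unusable is an illustrative choice, not documented behavior of UCPFilter.

```java
import java.nio.charset.Charset;

// Hypothetical helper; not part of UCPFilter.
class SystemCharsets {
    // The default charset of this JVM.
    static Charset defaultCharset() {
        return Charset.defaultCharset();
    }

    // The charset named by the file.encoding system property.
    static Charset fileEncodingCharset() {
        String name = System.getProperty("file.encoding");
        if (name == null) {
            return Charset.defaultCharset(); // assumed fallback
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return Charset.defaultCharset(); // assumed fallback
        }
    }
}
```

Either result could then be passed to forCharset to obtain the corresponding filter.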
      • isAcceptable

        public abstract boolean isAcceptable(int ucp)
        Checks whether the given Unicode code point is acceptable to the receiver.
        Parameters:
        ucp - The code point.
        Returns:
        True if the code point is acceptable.
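A per-code-point test such as isAcceptable is typically applied across a whole string. The sketch below shows two common patterns, again using IntPredicate as a hypothetical stand-in for a UCPFilter instance (with a real filter, the predicate would be something like ucp -> filter.isAcceptable(ucp)). AcceptanceSketch and its methods are illustrative names only.

```java
import java.util.function.IntPredicate;

// Hypothetical helper; IntPredicate stands in for UCPFilter.
class AcceptanceSketch {
    // True if every code point in s is accepted by the filter.
    static boolean allAcceptable(String s, IntPredicate filter) {
        return s.codePoints().allMatch(filter);
    }

    // Copy of s with all unacceptable code points removed.
    static String keepAcceptable(String s, IntPredicate filter) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().filter(filter).forEach(sb::appendCodePoint);
        return sb.toString();
    }
}
```

Iterating with codePoints() instead of chars() matters here: a supplementary character is then tested once as a whole code point, not as two separate surrogates.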