org.marketcetera.util.misc
Class UCPFilter

java.lang.Object
  extended by org.marketcetera.util.misc.UCPFilter

public abstract class UCPFilter
extends Object

A filter for Unicode code points. It also maintains a cache of filters associated with Charset instances.

For charset-based filters, building the cache may be slow if the JVM is running with an active debugging agent. The JRE implements the acceptability test by throwing and catching an exception, and the agent traps each one. If the charset can encode only a small subset of the Unicode code points, many exceptions are thrown and caught, and performance degrades as the agent intercepts them repeatedly (even if the debugger has not indicated an interest in exceptions).
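
The cost described above can be pictured with a minimal sketch. It is an assumption about the general approach, not the actual UCPFilter implementation. The hypothetical helper EncodableCodePoints.scan probes every code point once with the JDK's CharsetEncoder.canEncode and records the result in a BitSet, so later lookups never touch the encoder.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.BitSet;

// Hypothetical helper, not part of org.marketcetera.util.misc.
final class EncodableCodePoints {
    private EncodableCodePoints() {}

    // Probes every Unicode code point once and records which ones
    // the given charset can encode.
    static BitSet scan(Charset cs) {
        CharsetEncoder encoder = cs.newEncoder();
        BitSet encodable = new BitSet(Character.MAX_CODE_POINT + 1);
        for (int ucp = Character.MIN_CODE_POINT;
             ucp <= Character.MAX_CODE_POINT; ucp++) {
            // A lone surrogate is not valid text on its own, so skip it.
            if (ucp >= Character.MIN_SURROGATE
                    && ucp <= Character.MAX_SURROGATE) {
                continue;
            }
            String s = new String(Character.toChars(ucp));
            if (encoder.canEncode(s)) {
                encodable.set(ucp);
            }
        }
        return encodable;
    }
}
```

For a narrow charset such as US-ASCII, nearly every probe in this loop fails, which is the case where an attached debugging agent would see the most exception traffic.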

Since:
0.6.0
Version:
$Id: UCPFilter.java 16154 2012-07-14 16:34:05Z colin $
Author:
tlerios@marketcetera.com

Field Summary
static UCPFilter ALNUM
          A filter for Unicode code points that are letters or digits.
static UCPFilter CHAR
          A filter for Unicode characters that can be represented by a single char.
static UCPFilter DIGIT
          A filter for Unicode code points that are digits.
static UCPFilter LETTER
          A filter for Unicode code points that are letters.
static UCPFilter VALID
          A filter for Unicode characters deemed valid by StringUtils.isValid(int).
 
Constructor Summary
UCPFilter()
           
 
Method Summary
static UCPFilter forCharset(Charset cs)
          Returns a filter for Unicode code points that can be encoded by the given charset.
static UCPFilter getDefaultCharset()
          Returns a filter for Unicode code points that can be encoded by the default JVM charset.
static UCPFilter getFileSystemCharset()
          Returns a filter for Unicode code points that can be encoded by the current system file encoding/charset (as specified in the system property file.encoding).
abstract  boolean isAcceptable(int ucp)
          Checks whether the given Unicode code point is acceptable to the receiver.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

VALID

public static final UCPFilter VALID
A filter for Unicode characters deemed valid by StringUtils.isValid(int).


CHAR

public static final UCPFilter CHAR
A filter for Unicode characters that can be represented by a single char.


DIGIT

public static final UCPFilter DIGIT
A filter for Unicode code points that are digits.


LETTER

public static final UCPFilter LETTER
A filter for Unicode code points that are letters.


ALNUM

public static final UCPFilter ALNUM
A filter for Unicode code points that are letters or digits.
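
As a rough guide to what these constants accept, the sketch below pairs four of them with JDK predicates on java.lang.Character. The pairing is an assumption drawn from the descriptions above and has not been checked against the UCPFilter source. FilterSketch and its fields are hypothetical stand-ins, and VALID is omitted because StringUtils.isValid(int) is not described here.

```java
import java.util.function.IntPredicate;

// Hypothetical stand-ins; the real constants are UCPFilter instances.
final class FilterSketch {
    private FilterSketch() {}

    // Assumed equivalents of the documented filters.
    // CHAR: code points that fit in a single char (the BMP).
    static final IntPredicate CHAR = Character::isBmpCodePoint;
    // DIGIT: code points that are digits.
    static final IntPredicate DIGIT = Character::isDigit;
    // LETTER: code points that are letters.
    static final IntPredicate LETTER = Character::isLetter;
    // ALNUM: code points that are letters or digits.
    static final IntPredicate ALNUM = Character::isLetterOrDigit;
}
```

Under these assumptions, FilterSketch.CHAR.test(0x20AC) holds for the euro sign, while a supplementary code point such as 0x1F600 is rejected because it needs a surrogate pair.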

Constructor Detail

UCPFilter

public UCPFilter()

Method Detail

forCharset

public static UCPFilter forCharset(Charset cs)
Returns a filter for Unicode code points that can be encoded by the given charset.

Parameters:
cs - The charset.
Returns:
The filter.
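
The class description mentions a cache of filters keyed by Charset. A minimal sketch of such a cache follows, assuming a ConcurrentHashMap and an IntPredicate in place of UCPFilter. CharsetFilterCache is a hypothetical name, and unlike the documented class this sketch tests each code point lazily instead of precomputing the results.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntPredicate;

// Hypothetical cache; illustrates one filter instance per charset.
final class CharsetFilterCache {
    private static final Map<Charset, IntPredicate> CACHE =
        new ConcurrentHashMap<>();

    private CharsetFilterCache() {}

    // Returns the cached filter for the charset, creating it on first use.
    static IntPredicate forCharset(Charset cs) {
        return CACHE.computeIfAbsent(cs, CharsetFilterCache::build);
    }

    private static IntPredicate build(Charset cs) {
        // A fresh encoder per test, because CharsetEncoder is not thread-safe.
        return ucp -> Character.isValidCodePoint(ucp)
            && cs.newEncoder().canEncode(
                   new String(Character.toChars(ucp)));
    }
}
```

Repeated calls with the same charset return the same filter object, so any work done when the filter is created is paid only once per charset.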

getDefaultCharset

public static final UCPFilter getDefaultCharset()
Returns a filter for Unicode code points that can be encoded by the default JVM charset.

Returns:
The filter.

getFileSystemCharset

public static final UCPFilter getFileSystemCharset()
Returns a filter for Unicode code points that can be encoded by the current system file encoding/charset (as specified in the system property file.encoding).

Returns:
The filter.
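
The two charsets named by getDefaultCharset and getFileSystemCharset can be looked up with standard JDK calls, as in the sketch below. CharsetLookup is a hypothetical helper, and the fallback to the default charset when file.encoding is missing or unrecognized is an assumption, not documented behavior of UCPFilter.

```java
import java.nio.charset.Charset;

// Hypothetical helper showing where the two charsets come from.
final class CharsetLookup {
    private CharsetLookup() {}

    // The default JVM charset.
    static Charset defaultCharset() {
        return Charset.defaultCharset();
    }

    // The charset named by the file.encoding system property.
    static Charset fileEncodingCharset() {
        return resolve(System.getProperty("file.encoding"));
    }

    // Resolves a charset name; falls back to the default charset when the
    // name is null, illegal, or unsupported (assumed fallback policy).
    static Charset resolve(String name) {
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            return Charset.defaultCharset();
        }
    }
}
```

On many JVMs the two lookups yield the same charset, but they can differ when file.encoding is overridden on the command line.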

isAcceptable

public abstract boolean isAcceptable(int ucp)
Checks whether the given Unicode code point is acceptable to the receiver.

Parameters:
ucp - The code point.
Returns:
True if the code point is acceptable.
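
Since isAcceptable is the only abstract method, a custom filter needs to implement just that one test. The sketch below uses a hypothetical stand-in class, CodePointFilter, that mirrors the documented signature so the example compiles without the Marketcetera library. HexDigitFilter and the retain helper are illustrative names and not part of UCPFilter.

```java
// Hypothetical stand-in mirroring the documented isAcceptable(int) method.
abstract class CodePointFilter {
    public abstract boolean isAcceptable(int ucp);

    // Illustrative helper: keeps only the code points the filter accepts.
    static String retain(CodePointFilter filter, String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints()
            .filter(filter::isAcceptable)
            .forEach(out::appendCodePoint);
        return out.toString();
    }
}

// Example subclass: accepts code points that are hexadecimal digits.
final class HexDigitFilter extends CodePointFilter {
    @Override
    public boolean isAcceptable(int ucp) {
        return Character.digit(ucp, 16) >= 0;
    }
}
```

Iterating with codePoints() instead of chars() matters here, because the filter operates on whole code points and a supplementary character spans two char values.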


Copyright © 2012. All Rights Reserved.