Package org.marketcetera.util.test
Class UnicodeData
- java.lang.Object
-
- org.marketcetera.util.test.UnicodeData
-
public final class UnicodeData extends Object
Unicode test data. Some of the strings were obtained by applying an online converter to the results of Google translation.
- Since:
- 0.6.0
- Version:
- $Id$
- Author:
- tlerios@marketcetera.com
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static String | COMBO | A combo string that includes "Hello" in English, "Language" in Norwegian, "HELLO" in Greek, "house" in Arabic, "goodbye" in Japanese, and the G-clef, each successive pair separated by exactly one space.
static char[] | COMBO_CHARS | The combo string, as a character array.
static byte[] | COMBO_NAT | The combo string, in the default encoding.
static int[] | COMBO_UCPS | The combo string, as a Unicode code point array.
static byte[] | COMBO_UTF16BE | The combo string, in UTF-16BE.
static byte[] | COMBO_UTF16LE | The combo string, in UTF-16LE.
static byte[] | COMBO_UTF32BE | The combo string, in UTF-32BE.
static byte[] | COMBO_UTF32LE | The combo string, in UTF-32LE.
static byte[] | COMBO_UTF8 | The combo string, in UTF-8.
static String | G_CLEF_MSC | The musical symbol G-clef.
static char[] | G_CLEF_MSC_CHARS | The G-clef, as a character array.
static byte[] | G_CLEF_MSC_NAT | The G-clef, in the default encoding.
static int[] | G_CLEF_MSC_UCPS | The G-clef, as a Unicode code point array.
static byte[] | G_CLEF_MSC_UTF16BE | The G-clef, in UTF-16BE.
static byte[] | G_CLEF_MSC_UTF16LE | The G-clef, in UTF-16LE.
static byte[] | G_CLEF_MSC_UTF32BE | The G-clef, in UTF-32BE.
static byte[] | G_CLEF_MSC_UTF32LE | The G-clef, in UTF-32LE.
static byte[] | G_CLEF_MSC_UTF8 | The G-clef, in UTF-8.
static String | GOATS_LNB | The Linear B ideograms for she-goat and he-goat (in this order and separated by a space).
static char[] | GOATS_LNB_CHARS | The Linear B goat ideograms, as a character array.
static byte[] | GOATS_LNB_NAT | The Linear B goat ideograms, in the default encoding.
static int[] | GOATS_LNB_UCPS | The Linear B goat ideograms, as a Unicode code point array.
static byte[] | GOATS_LNB_UTF16BE | The Linear B goat ideograms, in UTF-16BE.
static byte[] | GOATS_LNB_UTF16LE | The Linear B goat ideograms, in UTF-16LE.
static byte[] | GOATS_LNB_UTF32BE | The Linear B goat ideograms, in UTF-32BE.
static byte[] | GOATS_LNB_UTF32LE | The Linear B goat ideograms, in UTF-32LE.
static byte[] | GOATS_LNB_UTF8 | The Linear B goat ideograms, in UTF-8.
static String | GOODBYE_JA | "goodbye" (pronounced "sayonara") in Japanese, in the Hiragana writing system.
static char[] | GOODBYE_JA_CHARS | "goodbye" in Japanese, as a character array.
static byte[] | GOODBYE_JA_NAT | "goodbye" in Japanese, in the default encoding.
static int[] | GOODBYE_JA_UCPS | "goodbye" in Japanese, as a Unicode code point array.
static byte[] | GOODBYE_JA_UTF16BE | "goodbye" in Japanese, in UTF-16BE.
static byte[] | GOODBYE_JA_UTF16LE | "goodbye" in Japanese, in UTF-16LE.
static byte[] | GOODBYE_JA_UTF32BE | "goodbye" in Japanese, in UTF-32BE.
static byte[] | GOODBYE_JA_UTF32LE | "goodbye" in Japanese, in UTF-32LE.
static byte[] | GOODBYE_JA_UTF8 | "goodbye" in Japanese, in UTF-8.
static String | HELLO_EN | "Hello" in English.
static char[] | HELLO_EN_CHARS | "Hello" in English, as a character array.
static byte[] | HELLO_EN_NAT | "Hello" in English, in the default encoding.
static int[] | HELLO_EN_UCPS | "Hello" in English, as a Unicode code point array.
static byte[] | HELLO_EN_UTF16BE | "Hello" in English, in UTF-16BE.
static byte[] | HELLO_EN_UTF16LE | "Hello" in English, in UTF-16LE.
static byte[] | HELLO_EN_UTF32BE | "Hello" in English, in UTF-32BE.
static byte[] | HELLO_EN_UTF32LE | "Hello" in English, in UTF-32LE.
static byte[] | HELLO_EN_UTF8 | "Hello" in English, in UTF-8.
static String | HELLO_GR | "HELLO" (pronounced "yassou") in Greek: this is the word "hello" in all uppercase Greek letters (it is, in fact, two Greek words, separated by a space).
static char[] | HELLO_GR_CHARS | "HELLO" in Greek, as a character array.
static byte[] | HELLO_GR_NAT | "HELLO" in Greek, in the default encoding.
static int[] | HELLO_GR_UCPS | "HELLO" in Greek, as a Unicode code point array.
static byte[] | HELLO_GR_UTF16BE | "HELLO" in Greek, in UTF-16BE.
static byte[] | HELLO_GR_UTF16LE | "HELLO" in Greek, in UTF-16LE.
static byte[] | HELLO_GR_UTF32BE | "HELLO" in Greek, in UTF-32BE.
static byte[] | HELLO_GR_UTF32LE | "HELLO" in Greek, in UTF-32LE.
static byte[] | HELLO_GR_UTF8 | "HELLO" in Greek, in UTF-8.
static String | HOUSE_AR | "house" (pronounced "manzil") in Arabic.
static char[] | HOUSE_AR_CHARS | "house" in Arabic, as a character array.
static byte[] | HOUSE_AR_NAT | "house" in Arabic, in the default encoding.
static int[] | HOUSE_AR_UCPS | "house" in Arabic, as a Unicode code point array.
static byte[] | HOUSE_AR_UTF16BE | "house" in Arabic, in UTF-16BE.
static byte[] | HOUSE_AR_UTF16LE | "house" in Arabic, in UTF-16LE.
static byte[] | HOUSE_AR_UTF32BE | "house" in Arabic, in UTF-32BE.
static byte[] | HOUSE_AR_UTF32LE | "house" in Arabic, in UTF-32LE.
static byte[] | HOUSE_AR_UTF8 | "house" in Arabic, in UTF-8.
static String | INVALID | An invalid string, comprising an isolated 16-bit surrogate.
static char[] | INVALID_CHARS | An invalid string, comprising an isolated 16-bit surrogate, as a character array.
static int[] | INVALID_UCPS | A Unicode code point array comprising an isolated surrogate code point.
static byte[] | INVALID_UTF16BE | A byte array comprising an invalid UTF-16BE byte sequence (an isolated 16-bit surrogate).
static byte[] | INVALID_UTF16LE | A byte array comprising an invalid UTF-16LE byte sequence (an isolated 16-bit surrogate).
static byte[] | INVALID_UTF32BE | A byte array comprising an invalid UTF-32BE byte sequence (a 32-bit value outside the valid range for Unicode scalar values).
static byte[] | INVALID_UTF32LE | A byte array comprising an invalid UTF-32LE byte sequence (a 32-bit value outside the valid range for Unicode scalar values).
static byte[] | INVALID_UTF8 | A byte array comprising an invalid UTF-8 byte sequence (the first 3 bytes of a 4-byte sequence).
static String | LANGUAGE_NO | "Language" (pronounced "sprook") in Norwegian: this is the word "language" in Norwegian, with the first letter capitalized.
static char[] | LANGUAGE_NO_CHARS | "Language" in Norwegian, as a character array.
static byte[] | LANGUAGE_NO_NAT | "Language" in Norwegian, in the default encoding.
static int[] | LANGUAGE_NO_UCPS | "Language" in Norwegian, as a Unicode code point array.
static byte[] | LANGUAGE_NO_UTF16BE | "Language" in Norwegian, in UTF-16BE.
static byte[] | LANGUAGE_NO_UTF16LE | "Language" in Norwegian, in UTF-16LE.
static byte[] | LANGUAGE_NO_UTF32BE | "Language" in Norwegian, in UTF-32BE.
static byte[] | LANGUAGE_NO_UTF32LE | "Language" in Norwegian, in UTF-32LE.
static byte[] | LANGUAGE_NO_UTF8 | "Language" in Norwegian, in UTF-8.
static String | SPACE | The space character.
static char[] | SPACE_CHARS | The space character, as a character array.
static byte[] | SPACE_NAT | The space character, in the default encoding.
static int[] | SPACE_UCPS | The space character, as a Unicode code point array.
static byte[] | SPACE_UTF16BE | The space, in UTF-16BE.
static byte[] | SPACE_UTF16LE | The space, in UTF-16LE.
static byte[] | SPACE_UTF32BE | The space, in UTF-32BE.
static byte[] | SPACE_UTF32LE | The space, in UTF-32LE.
static byte[] | SPACE_UTF8 | The space, in UTF-8.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
private | UnicodeData() | Constructor.
-
Method Summary
Modifier and Type | Method | Description
private static byte[] | concat(byte[]... arrays) | Concatenates the given byte arrays and returns the result.
private static char[] | concat(char[]... arrays) | Concatenates the given character arrays and returns the result.
private static int[] | concat(int[]... arrays) | Concatenates the given integer arrays and returns the result.
-
-
-
Field Detail
-
SPACE
public static final String SPACE
The space character.
- See Also:
- Constant Field Values
-
SPACE_CHARS
public static final char[] SPACE_CHARS
The space character, as a character array.
-
SPACE_UCPS
public static final int[] SPACE_UCPS
The space character, as a Unicode code point array.
-
SPACE_NAT
public static final byte[] SPACE_NAT
The space character, in the default encoding.
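The fields with the _NAT suffix depend on the platform's default charset, so their contents can differ between machines. A minimal sketch of how such a value might be derived, assuming it corresponds to what String.getBytes produces with Charset.defaultCharset(); the class and method names here are hypothetical and not part of UnicodeData:

```java
import java.nio.charset.Charset;

final class DefaultEncodingSketch {
    /** Encodes the text with the platform default charset. */
    static byte[] toNative(String text) {
        return text.getBytes(Charset.defaultCharset());
    }

    public static void main(String[] args) {
        byte[] nat = toNative(" ");
        // The charset name and the byte count vary with the platform configuration.
        System.out.println(Charset.defaultCharset() + ": " + nat.length + " byte(s)");
    }
}
```

Because the result is platform-dependent, tests that compare against a _NAT array are only meaningful when the expected bytes are produced on the same JVM configuration.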
-
SPACE_UTF8
public static final byte[] SPACE_UTF8
The space, in UTF-8.
-
SPACE_UTF16BE
public static final byte[] SPACE_UTF16BE
The space, in UTF-16BE.
-
SPACE_UTF16LE
public static final byte[] SPACE_UTF16LE
The space, in UTF-16LE.
-
SPACE_UTF32BE
public static final byte[] SPACE_UTF32BE
The space, in UTF-32BE.
-
SPACE_UTF32LE
public static final byte[] SPACE_UTF32LE
The space, in UTF-32LE.
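To make the difference between the five Unicode encodings concrete, the sketch below encodes the space character (U+0020) in each of them. It is illustrative only: the class name and the utf32 helper are hypothetical, and UTF-32 is produced by hand from code points rather than through a charset lookup.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SpaceEncodingsSketch {
    /** Writes each code point as a 4-byte integer in the given byte order (UTF-32). */
    static byte[] utf32(String text, ByteOrder order) {
        int[] codePoints = text.codePoints().toArray();
        ByteBuffer buffer = ByteBuffer.allocate(4 * codePoints.length).order(order);
        for (int cp : codePoints) {
            buffer.putInt(cp);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        String space = " ";
        System.out.println(Arrays.toString(space.getBytes(StandardCharsets.UTF_8)));    // [32]
        System.out.println(Arrays.toString(space.getBytes(StandardCharsets.UTF_16BE))); // [0, 32]
        System.out.println(Arrays.toString(space.getBytes(StandardCharsets.UTF_16LE))); // [32, 0]
        System.out.println(Arrays.toString(utf32(space, ByteOrder.BIG_ENDIAN)));        // [0, 0, 0, 32]
        System.out.println(Arrays.toString(utf32(space, ByteOrder.LITTLE_ENDIAN)));     // [32, 0, 0, 0]
    }
}
```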
-
HELLO_EN
public static final String HELLO_EN
"Hello" in English.
- See Also:
- Constant Field Values
-
HELLO_EN_CHARS
public static final char[] HELLO_EN_CHARS
"Hello" in English, as a character array.
-
HELLO_EN_UCPS
public static final int[] HELLO_EN_UCPS
"Hello" in English, as a Unicode code point array.
-
HELLO_EN_NAT
public static final byte[] HELLO_EN_NAT
"Hello" in English, in the default encoding.
-
HELLO_EN_UTF8
public static final byte[] HELLO_EN_UTF8
"Hello" in English, in UTF-8.
-
HELLO_EN_UTF16BE
public static final byte[] HELLO_EN_UTF16BE
"Hello" in English, in UTF-16BE.
-
HELLO_EN_UTF16LE
public static final byte[] HELLO_EN_UTF16LE
"Hello" in English, in UTF-16LE.
-
HELLO_EN_UTF32BE
public static final byte[] HELLO_EN_UTF32BE
"Hello" in English, in UTF-32BE.
-
HELLO_EN_UTF32LE
public static final byte[] HELLO_EN_UTF32LE
"Hello" in English, in UTF-32LE.
-
LANGUAGE_NO
public static final String LANGUAGE_NO
"Language" (pronounced "sprook") in Norwegian: this is the word "language" in Norwegian, with the first letter capitalized.
- See Also:
- Constant Field Values
-
LANGUAGE_NO_CHARS
public static final char[] LANGUAGE_NO_CHARS
"Language" in Norwegian, as a character array.
-
LANGUAGE_NO_UCPS
public static final int[] LANGUAGE_NO_UCPS
"Language" in Norwegian, as a Unicode code point array.
-
LANGUAGE_NO_NAT
public static final byte[] LANGUAGE_NO_NAT
"Language" in Norwegian, in the default encoding.
-
LANGUAGE_NO_UTF8
public static final byte[] LANGUAGE_NO_UTF8
"Language" in Norwegian, in UTF-8.
-
LANGUAGE_NO_UTF16BE
public static final byte[] LANGUAGE_NO_UTF16BE
"Language" in Norwegian, in UTF-16BE.
-
LANGUAGE_NO_UTF16LE
public static final byte[] LANGUAGE_NO_UTF16LE
"Language" in Norwegian, in UTF-16LE.
-
LANGUAGE_NO_UTF32BE
public static final byte[] LANGUAGE_NO_UTF32BE
"Language" in Norwegian, in UTF-32BE.
-
LANGUAGE_NO_UTF32LE
public static final byte[] LANGUAGE_NO_UTF32LE
"Language" in Norwegian, in UTF-32LE.
-
HELLO_GR
public static final String HELLO_GR
"HELLO" (pronounced "yassou") in Greek: this is the word "hello" in all uppercase Greek letters (it is, in fact, two Greek words, separated by a space).
- See Also:
- Constant Field Values
-
HELLO_GR_CHARS
public static final char[] HELLO_GR_CHARS
"HELLO" in Greek, as a character array.
-
HELLO_GR_UCPS
public static final int[] HELLO_GR_UCPS
"HELLO" in Greek, as a Unicode code point array.
-
HELLO_GR_NAT
public static final byte[] HELLO_GR_NAT
"HELLO" in Greek, in the default encoding.
-
HELLO_GR_UTF8
public static final byte[] HELLO_GR_UTF8
"HELLO" in Greek, in UTF-8.
-
HELLO_GR_UTF16BE
public static final byte[] HELLO_GR_UTF16BE
"HELLO" in Greek, in UTF-16BE.
-
HELLO_GR_UTF16LE
public static final byte[] HELLO_GR_UTF16LE
"HELLO" in Greek, in UTF-16LE.
-
HELLO_GR_UTF32BE
public static final byte[] HELLO_GR_UTF32BE
"HELLO" in Greek, in UTF-32BE.
-
HELLO_GR_UTF32LE
public static final byte[] HELLO_GR_UTF32LE
"HELLO" in Greek, in UTF-32LE.
-
HOUSE_AR
public static final String HOUSE_AR
"house" (pronounced "manzil") in Arabic.
- See Also:
- Constant Field Values
-
HOUSE_AR_CHARS
public static final char[] HOUSE_AR_CHARS
"house" in Arabic, as a character array.
-
HOUSE_AR_UCPS
public static final int[] HOUSE_AR_UCPS
"house" in Arabic, as a Unicode code point array.
-
HOUSE_AR_NAT
public static final byte[] HOUSE_AR_NAT
"house" in Arabic, in the default encoding.
-
HOUSE_AR_UTF8
public static final byte[] HOUSE_AR_UTF8
"house" in Arabic, in UTF-8.
-
HOUSE_AR_UTF16BE
public static final byte[] HOUSE_AR_UTF16BE
"house" in Arabic, in UTF-16BE.
-
HOUSE_AR_UTF16LE
public static final byte[] HOUSE_AR_UTF16LE
"house" in Arabic, in UTF-16LE.
-
HOUSE_AR_UTF32BE
public static final byte[] HOUSE_AR_UTF32BE
"house" in Arabic, in UTF-32BE.
-
HOUSE_AR_UTF32LE
public static final byte[] HOUSE_AR_UTF32LE
"house" in Arabic, in UTF-32LE.
-
GOODBYE_JA
public static final String GOODBYE_JA
"goodbye" (pronounced "sayonara") in Japanese, in the Hiragana writing system.
- See Also:
- Constant Field Values
-
GOODBYE_JA_CHARS
public static final char[] GOODBYE_JA_CHARS
"goodbye" in Japanese, as a character array.
-
GOODBYE_JA_UCPS
public static final int[] GOODBYE_JA_UCPS
"goodbye" in Japanese, as a Unicode code point array.
-
GOODBYE_JA_NAT
public static final byte[] GOODBYE_JA_NAT
"goodbye" in Japanese, in the default encoding.
-
GOODBYE_JA_UTF8
public static final byte[] GOODBYE_JA_UTF8
"goodbye" in Japanese, in UTF-8.
-
GOODBYE_JA_UTF16BE
public static final byte[] GOODBYE_JA_UTF16BE
"goodbye" in Japanese, in UTF-16BE.
-
GOODBYE_JA_UTF16LE
public static final byte[] GOODBYE_JA_UTF16LE
"goodbye" in Japanese, in UTF-16LE.
-
GOODBYE_JA_UTF32BE
public static final byte[] GOODBYE_JA_UTF32BE
"goodbye" in Japanese, in UTF-32BE.
-
GOODBYE_JA_UTF32LE
public static final byte[] GOODBYE_JA_UTF32LE
"goodbye" in Japanese, in UTF-32LE.
-
GOATS_LNB
public static final String GOATS_LNB
The Linear B ideograms for she-goat and he-goat (in this order and separated by a space).
- See Also:
- Constant Field Values
-
GOATS_LNB_CHARS
public static final char[] GOATS_LNB_CHARS
The Linear B goat ideograms, as a character array.
-
GOATS_LNB_UCPS
public static final int[] GOATS_LNB_UCPS
The Linear B goat ideograms, as a Unicode code point array.
-
GOATS_LNB_NAT
public static final byte[] GOATS_LNB_NAT
The Linear B goat ideograms, in the default encoding.
-
GOATS_LNB_UTF8
public static final byte[] GOATS_LNB_UTF8
The Linear B goat ideograms, in UTF-8.
-
GOATS_LNB_UTF16BE
public static final byte[] GOATS_LNB_UTF16BE
The Linear B goat ideograms, in UTF-16BE.
-
GOATS_LNB_UTF16LE
public static final byte[] GOATS_LNB_UTF16LE
The Linear B goat ideograms, in UTF-16LE.
-
GOATS_LNB_UTF32BE
public static final byte[] GOATS_LNB_UTF32BE
The Linear B goat ideograms, in UTF-32BE.
-
GOATS_LNB_UTF32LE
public static final byte[] GOATS_LNB_UTF32LE
The Linear B goat ideograms, in UTF-32LE.
-
G_CLEF_MSC
public static final String G_CLEF_MSC
The musical symbol G-clef.
- See Also:
- Constant Field Values
-
G_CLEF_MSC_CHARS
public static final char[] G_CLEF_MSC_CHARS
The G-clef, as a character array.
-
G_CLEF_MSC_UCPS
public static final int[] G_CLEF_MSC_UCPS
The G-clef, as a Unicode code point array.
-
G_CLEF_MSC_NAT
public static final byte[] G_CLEF_MSC_NAT
The G-clef, in the default encoding.
-
G_CLEF_MSC_UTF8
public static final byte[] G_CLEF_MSC_UTF8
The G-clef, in UTF-8.
-
G_CLEF_MSC_UTF16BE
public static final byte[] G_CLEF_MSC_UTF16BE
The G-clef, in UTF-16BE.
-
G_CLEF_MSC_UTF16LE
public static final byte[] G_CLEF_MSC_UTF16LE
The G-clef, in UTF-16LE.
-
G_CLEF_MSC_UTF32BE
public static final byte[] G_CLEF_MSC_UTF32BE
The G-clef, in UTF-32BE.
-
G_CLEF_MSC_UTF32LE
public static final byte[] G_CLEF_MSC_UTF32LE
The G-clef, in UTF-32LE.
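The G-clef lies outside the Basic Multilingual Plane, which is why its character array and its code point array have different lengths. The following sketch illustrates this, assuming the symbol is U+1D11E (MUSICAL SYMBOL G CLEF); the class and constant names are hypothetical and the string is built locally rather than read from UnicodeData.

```java
import java.nio.charset.StandardCharsets;

final class GClefSketch {
    // U+1D11E MUSICAL SYMBOL G CLEF, built from its code point.
    static final String G_CLEF = new String(Character.toChars(0x1D11E));

    public static void main(String[] args) {
        // One code point, but two UTF-16 code units (a surrogate pair).
        System.out.println(G_CLEF.codePointCount(0, G_CLEF.length())); // 1
        System.out.println(G_CLEF.length());                            // 2
        System.out.printf("%04X %04X%n", (int) G_CLEF.charAt(0), (int) G_CLEF.charAt(1)); // D834 DD1E

        // UTF-8 needs four bytes for a supplementary code point.
        for (byte b : G_CLEF.getBytes(StandardCharsets.UTF_8)) {
            System.out.printf("%02X ", b & 0xFF); // F0 9D 84 9E
        }
        System.out.println();
    }
}
```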
-
COMBO
public static final String COMBO
A combo string that includes "Hello" in English, "Language" in Norwegian, "HELLO" in Greek, "house" in Arabic, "goodbye" in Japanese, and the G-clef, each successive pair separated by exactly one space.
- See Also:
- Constant Field Values
-
COMBO_CHARS
public static final char[] COMBO_CHARS
The combo string, as a character array.
-
COMBO_UCPS
public static final int[] COMBO_UCPS
The combo string, as a Unicode code point array.
-
COMBO_NAT
public static final byte[] COMBO_NAT
The combo string, in the default encoding.
-
COMBO_UTF8
public static final byte[] COMBO_UTF8
The combo string, in UTF-8.
-
COMBO_UTF16BE
public static final byte[] COMBO_UTF16BE
The combo string, in UTF-16BE.
-
COMBO_UTF16LE
public static final byte[] COMBO_UTF16LE
The combo string, in UTF-16LE.
-
COMBO_UTF32BE
public static final byte[] COMBO_UTF32BE
The combo string, in UTF-32BE.
-
COMBO_UTF32LE
public static final byte[] COMBO_UTF32LE
The combo string, in UTF-32LE.
-
INVALID
public static final String INVALID
An invalid string, comprising an isolated 16-bit surrogate.
- See Also:
- Constant Field Values
-
INVALID_CHARS
public static final char[] INVALID_CHARS
An invalid string, comprising an isolated 16-bit surrogate, as a character array.
-
INVALID_UCPS
public static final int[] INVALID_UCPS
A Unicode code point array comprising an isolated surrogate code point.
-
INVALID_UTF8
public static final byte[] INVALID_UTF8
A byte array comprising an invalid UTF-8 byte sequence (the first 3 bytes of a 4-byte sequence).
-
INVALID_UTF16BE
public static final byte[] INVALID_UTF16BE
A byte array comprising an invalid UTF-16BE byte sequence (an isolated 16-bit surrogate).
-
INVALID_UTF16LE
public static final byte[] INVALID_UTF16LE
A byte array comprising an invalid UTF-16LE byte sequence (an isolated 16-bit surrogate).
-
INVALID_UTF32BE
public static final byte[] INVALID_UTF32BE
A byte array comprising an invalid UTF-32BE byte sequence (a 32-bit value outside the valid range for Unicode scalar values).
-
INVALID_UTF32LE
public static final byte[] INVALID_UTF32LE
A byte array comprising an invalid UTF-32LE byte sequence (a 32-bit value outside the valid range for Unicode scalar values).
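The invalid fields are useful for checking that code under test rejects malformed input instead of silently replacing it. The sketch below shows one way such a check might look with strict JDK coders. The sample values are assumptions chosen to match the descriptions above (a truncated 4-byte UTF-8 sequence and a lone high surrogate) and are not necessarily the values stored in UnicodeData; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class InvalidDataSketch {
    /** Returns true if a strict UTF-8 decoder accepts the bytes. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Returns true if a strict UTF-16BE encoder accepts the string. */
    static boolean isEncodable(String text) {
        try {
            StandardCharsets.UTF_16BE.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed sample: the first 3 bytes of a 4-byte UTF-8 sequence.
        byte[] truncated = {(byte) 0xF0, (byte) 0x9D, (byte) 0x84};
        // Assumed sample: an isolated high surrogate.
        String loneSurrogate = String.valueOf((char) 0xD834);

        System.out.println(isValidUtf8(truncated));               // false
        System.out.println(isEncodable(loneSurrogate));           // false
        // Values above U+10FFFF are not valid code points.
        System.out.println(Character.isValidCodePoint(0x110000)); // false
    }
}
```

Note that String.getBytes and the String(byte[], Charset) constructor substitute replacement characters rather than failing, so a strict coder configured with CodingErrorAction.REPORT is needed to observe the error.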
-
-
Method Detail
-
concat
private static byte[] concat(byte[]... arrays)
Concatenates the given byte arrays and returns the result.
- Parameters:
- arrays - The arrays.
- Returns:
- The concatenated arrays.
-
concat
private static int[] concat(int[]... arrays)
Concatenates the given integer arrays and returns the result.
- Parameters:
- arrays - The arrays.
- Returns:
- The concatenated arrays.
-
concat
private static char[] concat(char[]... arrays)
Concatenates the given character arrays and returns the result.
- Parameters:
- arrays - The arrays.
- Returns:
- The concatenated arrays.
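The concat methods are private, so their implementation is not visible here. As an illustration of what a varargs concatenation of this shape typically does, here is a minimal sketch for the byte[] overload; it is an assumption about the behavior, not the class's actual code, and the class name is hypothetical. The int[] and char[] overloads would presumably follow the same pattern.

```java
import java.util.Arrays;

final class ConcatSketch {
    /** Concatenates the given byte arrays, in order, into a newly allocated array. */
    static byte[] concat(byte[]... arrays) {
        // First pass: compute the total length.
        int total = 0;
        for (byte[] array : arrays) {
            total += array.length;
        }
        // Second pass: copy each array at its running offset.
        byte[] result = new byte[total];
        int offset = 0;
        for (byte[] array : arrays) {
            System.arraycopy(array, 0, result, offset, array.length);
            offset += array.length;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] hello = {0x48, 0x65};
        byte[] space = {0x20};
        System.out.println(Arrays.toString(concat(hello, space, hello))); // [72, 101, 32, 72, 101]
    }
}
```

A helper like this would allow a combined array to be assembled from per-string arrays with a space array between each pair, which matches how the combo string is described.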
-
-