StreamTokenizer keeps track of the position of
the tokens in the input stream, and it can parse hexadecimal numbers and
double numbers with exponents.
The handling of numeric data is also different: a single dot '.' and minus dot '-.' are not treated as numbers.
- Author:
- Werner Randelshofer
-
Field Summary
Fields
Modifier and Type, Field, Description:
- double nval: If the current token is a number, this field contains the value of that number.
- String sval: If the current token is a word token, this field contains a string giving the characters of the word token.
- static final int TT_EOF: A constant indicating that the end of the stream has been read.
- static final int TT_EOL: A constant indicating that the end of the line has been read.
- static final int TT_NUMBER: A constant indicating that a number token has been read.
- static final int TT_WORD: A constant indicating that a word token has been read.
- int ttype: After a call to the nextToken method, this field contains the type of the token just read.
-
Constructor Summary
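Constructors
Constructor, Description:
- StreamPosTokenizer(): Initializes everything except the streams.
- StreamPosTokenizer (parameter r): Creates a tokenizer that parses the given character stream.
- StreamPosTokenizer (parameter r): Creates a tokenizer that parses the given character stream.
-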
Method Summary
Modifier and Type, Method, Description:
- void commentChar(int ch): Specifies that the character argument starts a single-line comment.
- void consumeGreedy(@NonNull String greedyToken): Consumes a substring from the current sval of the StreamPosTokenizer.
- createException(String errorMessage)
- void eolIsSignificant(boolean flag): Determines whether or not ends of line are treated as tokens.
- int getEndPosition(): Returns the end position of the token relative to the position that the stream had when the StreamPosTokenizer was constructed.
- int getStartPosition(): Returns the start position of the token relative to the position that the stream had when the StreamPosTokenizer was constructed.
- int lineno(): Returns the current line number.
- void lowerCaseMode(boolean fl): Determines whether or not word tokens are automatically lowercased.
- int nextChar(): Reads the next character from the input stream, without passing it to the tokenizer.
- int nextToken(): Parses the next token from the input stream of this tokenizer.
- void ordinaryChar(int ch): Specifies that the character argument is "ordinary" in this tokenizer.
- void ordinaryChars(int low, int hi): Specifies that all characters c in the range low <= c <= hi are "ordinary" in this tokenizer.
- void parseExponents(): Enables number parsing of exponents.
- void parseHexNumbers(): Enables number parsing for decimal numbers and for hexadecimal numbers.
- void parseNumbers(): Specifies that numbers should be parsed by this tokenizer.
- void parsePlusAsNumber()
- void pushBack(): Causes the next call to the nextToken method of this tokenizer to return the current value in the ttype field, and not to modify the value in the nval or sval field.
- void pushCharBack(int ch): Unreads a character back into the input stream of the tokenizer.
- void quoteChar(int ch): Specifies that matching pairs of this character delimit string constants in this tokenizer.
- void requireNextToken(int ttype, String errorMessage)
- void resetSyntax(): Resets this tokenizer's syntax table so that all characters are "ordinary." See the ordinaryChar method for more information on a character being ordinary.
- void setReader (parameter r): Sets the reader for the tokenizer.
- void setSlashSlashToken(@NonNull String slashSlash): Sets the slash slash token.
- void setSlashStarTokens(@NonNull String slashStar, @NonNull String starSlash): Sets the slash star and star slash tokens.
- void setStartPosition(int p): Sets the start position of the current token.
- void slashSlashComments(boolean flag): Determines whether or not the tokenizer recognizes C++-style comments.
- void slashStarComments(boolean flag): Determines whether or not the tokenizer recognizes C-style comments.
- String toString(): Returns the string representation of the current stream token.
- void whitespaceChars(int low, int hi): Specifies that all characters c in the range low <= c <= hi are white space characters.
- void wordChars(int low, int hi): Specifies that all characters c in the range low <= c <= hi are word constituents.
-
Field Details
-
ttype
public int ttype
After a call to the nextToken method, this field contains the type of the token just read. For a single character token, its value is the single character, converted to an integer. For a quoted string token (see quoteChar), its value is the quote character. Otherwise, its value is one of the following:
- TT_WORD indicates that the token is a word.
- TT_NUMBER indicates that the token is a number.
- TT_EOL indicates that the end of line has been read. The field can only have this value if the eolIsSignificant method has been called with the argument true.
- TT_EOF indicates that the end of the input stream has been reached.
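The cases above can be sketched as a dispatch on ttype. StreamPosTokenizer is not part of the JDK, so this sketch uses java.io.StreamTokenizer as a stand-in, on the assumption that both classes report token types the same way; the class name TtypeDispatchSketch is hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TtypeDispatchSketch {
    /** Describes every token of the input by inspecting ttype. */
    static List<String> describe(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same ttype conventions).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add("word " + tt.sval);
                } else if (tt.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add("number " + tt.nval);
                } else if (tt.ttype == '"' || tt.ttype == '\'') {
                    // Quoted string: ttype is the quote character, sval the body.
                    out.add("string " + tt.sval);
                } else {
                    // Single character token: ttype is the character itself.
                    out.add("char " + (char) tt.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```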
-
TT_EOF
public static final int TT_EOF
A constant indicating that the end of the stream has been read.
-
TT_EOL
public static final int TT_EOL
A constant indicating that the end of the line has been read.
-
TT_NUMBER
public static final int TT_NUMBER
A constant indicating that a number token has been read.
-
TT_WORD
public static final int TT_WORD
A constant indicating that a word token has been read.
-
sval
If the current token is a word token, this field contains a string giving the characters of the word token. When the current token is a quoted string token, this field contains the body of the string.
The current token is a word when the value of the ttype field is TT_WORD. The current token is a quoted string token when the value of the ttype field is a quote character.
-
nval
public double nval
If the current token is a number, this field contains the value of that number. The current token is a number when the value of the ttype field is TT_NUMBER.
-
-
Constructor Details
-
StreamPosTokenizer
public StreamPosTokenizer()
Initializes everything except the streams.
-
StreamPosTokenizer
Creates a tokenizer that parses the given character stream.
- Parameters:
- r - the reader
-
StreamPosTokenizer
Creates a tokenizer that parses the given character stream.
- Parameters:
- r - the reader
-
-
Method Details
-
setReader
Sets the reader for the tokenizer.
- Parameters:
- r - the reader
-
resetSyntax
public void resetSyntax()
Resets this tokenizer's syntax table so that all characters are "ordinary." See the ordinaryChar method for more information on a character being ordinary.
-
wordChars
public void wordChars(int low, int hi)
Specifies that all characters c in the range low <= c <= hi are word constituents. A word token consists of a word constituent followed by zero or more word constituents or number constituents.
- Parameters:
- low - the low end of the range.
- hi - the high end of the range.
-
whitespaceChars
public void whitespaceChars(int low, int hi)
Specifies that all characters c in the range low <= c <= hi are white space characters. White space characters serve only to separate tokens in the input stream.
- Parameters:
- low - the low end of the range.
- hi - the high end of the range.
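As a rough illustration of how resetSyntax, wordChars and whitespaceChars combine, the sketch below rebuilds a syntax table from scratch so that commas act purely as separators. It uses java.io.StreamTokenizer as a stand-in, assuming StreamPosTokenizer handles these three calls the same way; SyntaxTableSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SyntaxTableSketch {
    /** Splits the input into words, treating blanks and commas as separators. */
    static List<String> splitWords(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same syntax-table semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.resetSyntax();              // every character is now "ordinary"
        tt.wordChars('a', 'z');        // lowercase letters are word constituents
        tt.wordChars('A', 'Z');        // uppercase letters are word constituents
        tt.whitespaceChars(0, ' ');    // control characters and blank separate tokens
        tt.whitespaceChars(',', ',');  // so does the comma
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add(tt.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```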
-
ordinaryChars
public void ordinaryChars(int low, int hi)
Specifies that all characters c in the range low <= c <= hi are "ordinary" in this tokenizer. See the ordinaryChar method for more information on a character being ordinary.
- Parameters:
- low - the low end of the range.
- hi - the high end of the range.
-
ordinaryChar
public void ordinaryChar(int ch)
Specifies that the character argument is "ordinary" in this tokenizer. It removes any special significance the character has as a comment character, word component, string delimiter, white space, or number character. When such a character is encountered by the parser, the parser treats it as a single-character token and sets the ttype field to the character value.
- Parameters:
- ch - the character.
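A small sketch of the effect, using java.io.StreamTokenizer as a stand-in (an assumption: StreamPosTokenizer may differ in details of numeric handling). Making '-' ordinary stops it from being read as the sign of the following number. OrdinaryCharSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class OrdinaryCharSketch {
    /** Tokenizes the input, optionally making '-' an ordinary character. */
    static List<String> tokens(String input, boolean minusIsOrdinary) {
        // Stand-in for StreamPosTokenizer (assumption: same ordinaryChar semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        if (minusIsOrdinary) {
            tt.ordinaryChar('-'); // '-' loses its "numeric" attribute
        }
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(tt.nval));
                } else if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add(tt.sval);
                } else {
                    out.add(String.valueOf((char) tt.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

With the JDK class, "3-2" yields the numbers 3.0 and -2.0 by default, and 3.0, '-', 2.0 once '-' is ordinary.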
-
commentChar
public void commentChar(int ch)
Specifies that the character argument starts a single-line comment. All characters from the comment character to the end of the line are ignored by this stream tokenizer.
- Parameters:
- ch - the character.
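For illustration, the sketch below registers '#' as a comment character and collects the remaining words. It relies on java.io.StreamTokenizer as a stand-in, assuming StreamPosTokenizer skips single-line comments the same way; CommentCharSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommentCharSketch {
    /** Returns the words of the input, ignoring everything after a '#'. */
    static List<String> words(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same commentChar semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.commentChar('#'); // '#' through end of line is skipped
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add(tt.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```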
-
quoteChar
public void quoteChar(int ch)
Specifies that matching pairs of this character delimit string constants in this tokenizer.
When the nextToken method encounters a string constant, the ttype field is set to the string delimiter and the sval field is set to the body of the string.
If a string quote character is encountered, then a string is recognized, consisting of all characters after (but not including) the string quote character, up to (but not including) the next occurrence of that same string quote character, or a line terminator, or end of file. The usual escape sequences such as "\n" and "\t" are recognized and converted to single characters as the string is parsed.
- Parameters:
- ch - the character.
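The following sketch shows a custom quote character and escape handling. It uses java.io.StreamTokenizer as a stand-in and assumes StreamPosTokenizer parses quoted strings identically; the choice of '|' as delimiter and the name QuoteCharSketch are purely illustrative.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class QuoteCharSketch {
    /** Returns the body of a leading |...| string, or null if there is none. */
    static String firstString(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same quoteChar semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.quoteChar('|'); // matching pairs of '|' now delimit strings
        try {
            // For a quoted string, ttype is the delimiter and sval the body.
            return tt.nextToken() == '|' ? tt.sval : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An input containing a backslash followed by `t` between the delimiters comes back with a real tab character in sval.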
-
parseNumbers
public void parseNumbers()
Specifies that numbers should be parsed by this tokenizer. The syntax table of this tokenizer is modified so that each of the twelve characters: 0 1 2 3 4 5 6 7 8 9 . - has the "numeric" attribute.
When the parser encounters a word token that has the format of a double precision floating-point number, it treats the token as a number rather than a word, by setting the ttype field to the value TT_NUMBER and putting the numeric value of the token into the nval field.
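A minimal sketch of number parsing, starting from an empty syntax table so that the effect of parseNumbers is visible. java.io.StreamTokenizer stands in for StreamPosTokenizer here (an assumption; as noted at the top of this page, StreamPosTokenizer treats a lone '.' and '-.' differently). ParseNumbersSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ParseNumbersSketch {
    /** Returns the first token as a number, or NaN if it is not a number. */
    static double firstNumber(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same parseNumbers semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.resetSyntax();            // start with only ordinary characters
        tt.whitespaceChars(0, ' ');  // allow leading blanks
        tt.parseNumbers();           // digits, '.' and '-' become numeric
        try {
            return tt.nextToken() == StreamTokenizer.TT_NUMBER ? tt.nval : Double.NaN;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```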
-
parsePlusAsNumber
public void parsePlusAsNumber() -
parseHexNumbers
public void parseHexNumbers()
Enables number parsing for decimal numbers and for hexadecimal numbers.
-
parseExponents
public void parseExponents()
Enables number parsing of exponents. An exponent appears after the last digit of a number and is introduced by a capital letter 'E' or a small letter 'e'.
-
eolIsSignificant
public void eolIsSignificant(boolean flag)
Determines whether or not ends of line are treated as tokens. If the flag argument is true, this tokenizer treats end of lines as tokens; the nextToken method returns TT_EOL and also sets the ttype field to this value when an end of line is read.
A line is a sequence of characters ending with either a carriage-return character ('\r') or a newline character ('\n'). In addition, a carriage-return character followed immediately by a newline character is treated as a single end-of-line token.
If the flag is false, end-of-line characters are treated as white space and serve only to separate tokens.
- Parameters:
- flag - true indicates that end-of-line characters are separate tokens; false indicates that end-of-line characters are white space.
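The sketch below counts TT_EOL tokens to show that a carriage return followed by a newline is reported once. It uses java.io.StreamTokenizer as a stand-in, assuming StreamPosTokenizer follows the same end-of-line rules; EolSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class EolSketch {
    /** Counts the end-of-line tokens reported for the input. */
    static int countEol(String input, boolean significant) {
        // Stand-in for StreamPosTokenizer (assumption: same end-of-line handling).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.eolIsSignificant(significant);
        int count = 0;
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_EOL) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```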
-
slashStarComments
public void slashStarComments(boolean flag)
Determines whether or not the tokenizer recognizes C-style comments. If the flag argument is true, this stream tokenizer recognizes C-style comments. All text between successive occurrences of /* and */ is discarded.
If the flag argument is false, then C-style comments are not treated specially.
- Parameters:
- flag - true indicates to recognize and ignore C-style comments.
-
slashSlashComments
public void slashSlashComments(boolean flag)
Determines whether or not the tokenizer recognizes C++-style comments. If the flag argument is true, this stream tokenizer recognizes C++-style comments. Any occurrence of two consecutive slash characters ('/') is treated as the beginning of a comment that extends to the end of the line.
If the flag argument is false, then C++-style comments are not treated specially.
- Parameters:
- flag - true indicates to recognize and ignore C++-style comments.
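Both comment styles can be sketched together. The example uses java.io.StreamTokenizer as a stand-in and assumes StreamPosTokenizer discards C-style and C++-style comments in the same way when the default comment tokens are in effect; SlashCommentsSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SlashCommentsSketch {
    /** Returns the words of the input with both comment styles stripped. */
    static List<String> words(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same comment handling).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.slashStarComments(true);   // discard text between /* and */
        tt.slashSlashComments(true);  // discard text from // to end of line
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add(tt.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```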
-
lowerCaseMode
public void lowerCaseMode(boolean fl)
Determines whether or not word tokens are automatically lowercased. If the flag argument is true, then the value in the sval field is lowercased whenever a word token is returned (the ttype field has the value TT_WORD) by the nextToken method of this tokenizer.
If the flag argument is false, then the sval field is not modified.
- Parameters:
- fl - true indicates that all word tokens should be lowercased.
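A short sketch of the effect on sval, again with java.io.StreamTokenizer as a stand-in on the assumption that StreamPosTokenizer lowercases word tokens the same way; LowerCaseSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LowerCaseSketch {
    /** Returns the word tokens of the input, lowercased by the tokenizer. */
    static List<String> lowerWords(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same lowerCaseMode semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        tt.lowerCaseMode(true); // sval of every word token is lowercased
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add(tt.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```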
-
requireNextToken
- Throws:
- ParseException
- IOException
-
createException
-
nextToken
Parses the next token from the input stream of this tokenizer. The type of the next token is returned in the ttype field. Additional information about the token may be in the nval field or the sval field of this tokenizer.
Typical clients of this class first set up the syntax tables and then sit in a loop calling nextToken to parse successive tokens until TT_EOF is returned.
- Returns:
- the value of the ttype field.
- Throws:
- IOException - if an I/O error occurs.
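The typical client loop described above might look like the following sketch, which sums every number token and skips everything else. It uses java.io.StreamTokenizer with its default syntax table as a stand-in, assuming StreamPosTokenizer supports the same loop; TokenLoopSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class TokenLoopSketch {
    /** Adds up all number tokens found in the input. */
    static double sumNumbers(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same nextToken contract).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        double sum = 0;
        try {
            // Loop until the tokenizer reports the end of the stream.
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_NUMBER) {
                    sum += tt.nval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }
}
```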
-
nextChar
Reads the next character from the input stream, without passing it to the tokenizer.
- Returns:
- the next char
- Throws:
- IOException - in case of an IO error
-
pushCharBack
public void pushCharBack(int ch)
Unreads a character back into the input stream of the tokenizer.
- Parameters:
- ch - the character
-
setSlashStarTokens
Sets the slash star and star slash tokens. Due to limitations of this implementation, both tokens must have the same number of characters, and that number must be either 1 or 2.
- Parameters:
- slashStar - token
- starSlash - token
-
setSlashSlashToken
Sets the slash slash token. Due to limitations of this implementation, the token must be either 1 or 2 characters long.
- Parameters:
- slashSlash - token
-
pushBack
public void pushBack()
Causes the next call to the nextToken method of this tokenizer to return the current value in the ttype field, and not to modify the value in the nval or sval field.
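One common use is a one-token lookahead: read a token, inspect it, and push it back so the next read sees it again. The sketch uses java.io.StreamTokenizer as a stand-in, assuming StreamPosTokenizer's pushBack behaves the same; PushBackSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class PushBackSketch {
    /** Reads a word, pushes it back, then reads two more tokens. */
    static List<String> peekThenRead(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same pushBack semantics).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            tt.nextToken();
            out.add(tt.sval);  // first word, seen as lookahead
            tt.pushBack();     // the next call returns the same token
            tt.nextToken();
            out.add(tt.sval);  // first word again; sval is unchanged
            tt.nextToken();
            out.add(tt.sval);  // second word
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```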
-
lineno
public int lineno()
Returns the current line number.
- Returns:
- the current line number of this stream tokenizer.
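A sketch of how lineno can be paired with each token, for example to build error messages. It uses java.io.StreamTokenizer as a stand-in, where line numbers start at 1; that StreamPosTokenizer counts lines the same way is an assumption. LinenoSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LinenoSketch {
    /** Returns each word followed by '@' and the line it was read on. */
    static List<String> wordsWithLines(String input) {
        // Stand-in for StreamPosTokenizer (assumption: same line counting).
        StreamTokenizer tt = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            while (tt.nextToken() != StreamTokenizer.TT_EOF) {
                if (tt.ttype == StreamTokenizer.TT_WORD) {
                    out.add(tt.sval + "@" + tt.lineno());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```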
-
getStartPosition
public int getStartPosition()
Returns the start position of the token relative to the position that the stream had when the StreamPosTokenizer was constructed.
- Returns:
- the start position of the token.
-
setStartPosition
public void setStartPosition(int p)
Sets the start position of the current token.
- Parameters:
- p - the position
-
getEndPosition
public int getEndPosition()
Returns the end position of the token relative to the position that the stream had when the StreamPosTokenizer was constructed.
- Returns:
- the end position of the token.
-
consumeGreedy
Consumes a substring from the current sval of the StreamPosTokenizer.
- Parameters:
- greedyToken - the token to be consumed
-
toString
Returns the string representation of the current stream token.
-