jodd.lagarto
Class LagartoParserEngine

java.lang.Object
  extended by jodd.lagarto.LagartoParserEngine
Direct Known Subclasses:
LagartoDOMBuilder, LagartoParser

public abstract class LagartoParserEngine
extends java.lang.Object

Lagarto HTML/XML parser engine. Usage consists of two steps:

  • initialization with the provided content
  • actual parsing of the content
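
The two steps above can be pictured with a small stand-in class. This is only a sketch: `SketchEngine` and its toy `parse()` body are hypothetical and mirror nothing more than the shape of calling `initialize(CharBuffer)` before parsing. They are not part of Jodd.

```java
import java.nio.CharBuffer;

// Hypothetical stand-in that mirrors the initialize-then-parse shape; not the Jodd class.
class SketchEngine {
    private CharBuffer input;

    // Step 1: hand the content to the engine.
    void initialize(CharBuffer input) {
        this.input = input;
    }

    // Step 2: walk the content. As a toy "parse", this only counts '<' characters.
    int parse() {
        if (input == null) {
            throw new IllegalStateException("initialize() must be called before parse()");
        }
        int tagStarts = 0;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == '<') {
                tagStarts++;
            }
        }
        return tagStarts;
    }
}
```

Calling `parse()` on an uninitialized instance fails fast here, which is one possible way to enforce the ordering of the two steps.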


    Field Summary
    protected  boolean calculateErrorPosition
               
    protected  boolean enableConditionalComments
               
    protected  boolean parseSpecialTagsAsCdata
               
     
    Constructor Summary
    LagartoParserEngine()
               
     
    Method Summary
    protected  boolean acceptTag(java.lang.String tagName)
              Returns true if the given tag should be parsed.
    protected  void error(java.lang.String message)
              Prepares error message and reports it to the visitor.
    protected  void flushText()
              Flushes buffered text and stops buffering.
    protected  void initialize(java.nio.CharBuffer input)
              Initializes parser engine by providing the content.
     boolean isCalculateErrorPosition()
               
     boolean isEnableConditionalComments()
               
     boolean isParseSpecialTagsAsCdata()
               
    protected  Token nextToken()
              Returns the next token from lexer or previously fetched token.
    protected  void parse()
              Main parsing loop that processes lexer tokens from input.
    protected  void parse(TagVisitor visitor)
              Parses provided content.
    protected  void parseAttribute()
              Parses single attribute.
    protected  void parseCCEnd()
              Parses conditional comment end.
    protected  void parseCDATA()
              Parses CDATA.
    protected  void parseCommentOrConditionalComment()
              Parses HTML comments.
    protected  void parseDoctype()
              Parses HTML DOCTYPE directive.
    protected  void parseRevealedCCStart()
              Parses revealed conditional comment start.
    protected  void parseSpecialTag(int state)
              Parses special tags.
    protected  void parseTag(Token tagToken, TagType type)
              Parses a tag starting from "<".
    protected  void parseTagAndAttributes(Token tagToken, java.lang.String tagName, TagType type, int start)
              Parses full tag.
    protected  void parseText(int start, int end)
              Buffers the parsed text.
     void setCalculateErrorPosition(boolean calculateErrorPosition)
              Resolves error position on parsing error.
     void setEnableConditionalComments(boolean enableConditionalComments)
              Enables detection of IE conditional comments.
     void setParseSpecialTagsAsCdata(boolean parseSpecialTagsAsCdata)
              Specifies if special tags should be parsed as CDATA block.
    protected  void skipWhiteSpace()
              Skips all whitespace tokens.
    protected  java.lang.CharSequence text()
              Returns current text.
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Field Detail

    enableConditionalComments

    protected boolean enableConditionalComments

    calculateErrorPosition

    protected boolean calculateErrorPosition

    parseSpecialTagsAsCdata

    protected boolean parseSpecialTagsAsCdata
    Constructor Detail

    LagartoParserEngine

    public LagartoParserEngine()
    Method Detail

    initialize

    protected void initialize(java.nio.CharBuffer input)
    Initializes parser engine by providing the content.


    isEnableConditionalComments

    public boolean isEnableConditionalComments()

    setEnableConditionalComments

    public void setEnableConditionalComments(boolean enableConditionalComments)
    Enables detection of IE conditional comments. If not enabled, downlevel-hidden conditional comments are treated as regular comments, while revealed conditional comments are treated as errors.
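
To make the two forms concrete: a downlevel-hidden conditional comment starts like a regular comment (`<!--[if IE]>`), whereas a downlevel-revealed one does not (`<![if !IE]>`). The sketch below only classifies such start markers. `ConditionalCommentKind` is a hypothetical helper for illustration and says nothing about how the engine detects them internally.

```java
// Hypothetical helper; illustrates the two IE conditional comment forms only.
class ConditionalCommentKind {
    static String classify(String markup) {
        if (markup.startsWith("<!--[if ")) {
            return "downlevel-hidden";   // looks like a comment to other browsers
        }
        if (markup.startsWith("<![if ")) {
            return "downlevel-revealed"; // not a valid comment on its own
        }
        if (markup.startsWith("<!--")) {
            return "comment";
        }
        return "other";
    }
}
```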


    setCalculateErrorPosition

    public void setCalculateErrorPosition(boolean calculateErrorPosition)
    Resolves error position on parsing error. JFlex can track the current line and column, but that adds overhead. When this property is enabled, the position is calculated manually, and only when an error occurs.
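
Computing the position only on demand might look roughly like the sketch below. `ErrorPosition.lineAndColumn` is a hypothetical helper, not a Jodd method, and it assumes 1-based numbering with `\n` as the only line terminator.

```java
// Hypothetical helper: derives a 1-based line and column from a character offset.
class ErrorPosition {
    static int[] lineAndColumn(CharSequence content, int offset) {
        int line = 1;
        int column = 1;
        // Scan only the characters before the offset; this cost is paid only on errors.
        for (int i = 0; i < offset && i < content.length(); i++) {
            if (content.charAt(i) == '\n') {
                line++;
                column = 1;
            } else {
                column++;
            }
        }
        return new int[] {line, column};
    }
}
```

The trade-off is a linear scan per reported error in exchange for no bookkeeping during normal lexing.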


    isCalculateErrorPosition

    public boolean isCalculateErrorPosition()

    setParseSpecialTagsAsCdata

    public void setParseSpecialTagsAsCdata(boolean parseSpecialTagsAsCdata)
    Specifies if special tags should be parsed as CDATA block.


    isParseSpecialTagsAsCdata

    public boolean isParseSpecialTagsAsCdata()

    parse

    protected void parse(TagVisitor visitor)
    Parses provided content.


    parse

    protected void parse()
                  throws java.io.IOException
    Main parsing loop that processes lexer tokens from input.

    Throws:
    java.io.IOException

    flushText

    protected void flushText()
    Flushes buffered text and stops buffering.


    parseText

    protected void parseText(int start,
                             int end)
    Buffers the parsed text. Buffered text will be consumed on the very next flushText().
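
The interplay between buffering and flushing can be sketched as below. `TextBuffer` is a hypothetical class, and the way it extends one span across consecutive ranges is an assumption made for illustration, not a description of the Jodd implementation.

```java
// Hypothetical sketch of range buffering; not the Jodd implementation.
class TextBuffer {
    private final CharSequence input;
    private int start = -1; // -1 means nothing is buffered
    private int end = -1;

    TextBuffer(CharSequence input) {
        this.input = input;
    }

    // Remembers the range; a following range extends the buffered span (assumed behavior).
    void parseText(int start, int end) {
        if (this.start == -1) {
            this.start = start;
        }
        this.end = end;
    }

    // Returns the buffered text once and stops buffering; null if nothing was buffered.
    String flushText() {
        if (start == -1) {
            return null;
        }
        String text = input.subSequence(start, end).toString();
        start = -1;
        end = -1;
        return text;
    }
}
```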


    parseCommentOrConditionalComment

    protected void parseCommentOrConditionalComment()
                                             throws java.io.IOException
    Parses HTML comments. Also detects IE downlevel-hidden conditional comments.

    Throws:
    java.io.IOException

    parseCDATA

    protected void parseCDATA()
                       throws java.io.IOException
    Parses CDATA.

    Throws:
    java.io.IOException

    parseDoctype

    protected void parseDoctype()
                         throws java.io.IOException
    Parses HTML DOCTYPE directive.

    Throws:
    java.io.IOException

    parseRevealedCCStart

    protected void parseRevealedCCStart()
                                 throws java.io.IOException
    Parses revealed conditional comment start. Downlevel-hidden conditional comments are detected in parseCommentOrConditionalComment().

    Throws:
    java.io.IOException

    parseCCEnd

    protected void parseCCEnd()
                       throws java.io.IOException
    Parses conditional comment end.

    Throws:
    java.io.IOException

    parseTag

    protected void parseTag(Token tagToken,
                            TagType type)
                     throws java.io.IOException
    Parses a tag starting from "<".

    Throws:
    java.io.IOException

    acceptTag

    protected boolean acceptTag(java.lang.String tagName)
    Returns true if the given tag should be parsed. Users may override this method to gain more control over what is parsed. This is useful when only a few specific tags have to be parsed (e.g. just title and body).
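
An override along these lines could restrict parsing to title and body. To keep the sketch self-contained, `TagFilterSketch` is a hypothetical stand-in for the engine, and its accept-everything default is an assumption. In real code the override would go in a subclass of one of the Lagarto classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an engine exposing an acceptTag hook; not the Jodd class.
class TagFilterSketch {
    // Assumed default: every tag is accepted.
    protected boolean acceptTag(String tagName) {
        return true;
    }

    // Returns the tag names that pass the acceptTag hook, in order.
    List<String> visit(String... tagNames) {
        List<String> visited = new ArrayList<>();
        for (String name : tagNames) {
            if (acceptTag(name)) {
                visited.add(name);
            }
        }
        return visited;
    }
}

// Subclass that lets only title and body through.
class TitleAndBodyOnly extends TagFilterSketch {
    @Override
    protected boolean acceptTag(String tagName) {
        return "title".equalsIgnoreCase(tagName) || "body".equalsIgnoreCase(tagName);
    }
}
```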


    parseTagAndAttributes

    protected void parseTagAndAttributes(Token tagToken,
                                         java.lang.String tagName,
                                         TagType type,
                                         int start)
                                  throws java.io.IOException
    Parses full tag.

    Throws:
    java.io.IOException

    parseAttribute

    protected void parseAttribute()
                           throws java.io.IOException
    Parses single attribute.

    Throws:
    java.io.IOException

    parseSpecialTag

    protected void parseSpecialTag(int state)
                            throws java.io.IOException
    Parses special tags.

    Throws:
    java.io.IOException

    nextToken

    protected Token nextToken()
                       throws java.io.IOException
    Returns the next token from lexer or previously fetched token.

    Throws:
    java.io.IOException
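
Returning either a fresh token or a previously fetched one is a common one-token lookahead pattern. The sketch below shows that pattern in isolation. `LookaheadTokens` and `stepBack` are hypothetical names, and plain strings stand in for the lexer's `Token` type.

```java
import java.util.Iterator;

// Hypothetical one-token lookahead wrapper; String tokens stand in for the lexer's Token type.
class LookaheadTokens {
    private final Iterator<String> lexer;
    private String fetched; // token fetched earlier but not yet consumed

    LookaheadTokens(Iterator<String> lexer) {
        this.lexer = lexer;
    }

    // Returns the previously fetched token if there is one, otherwise pulls from the lexer.
    String nextToken() {
        if (fetched != null) {
            String token = fetched;
            fetched = null;
            return token;
        }
        return lexer.hasNext() ? lexer.next() : null;
    }

    // Hands a token back so that the next nextToken() call returns it again.
    void stepBack(String token) {
        fetched = token;
    }
}
```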

    skipWhiteSpace

    protected void skipWhiteSpace()
                           throws java.io.IOException
    Skips all whitespace tokens.

    Throws:
    java.io.IOException

    text

    protected java.lang.CharSequence text()
    Returns current text.


    error

    protected void error(java.lang.String message)
    Prepares error message and reports it to the visitor.



    Copyright © 2003-2012 Jodd Team