Parsing options.
Event emitter. The parser defines the following events:
Raised when the parser encounters a start tag.
Tag name.
List of attributes.
Indicates if the tag is self-closing.
Start tag source code location info. Available if location info is enabled via Options.SAXParserOptions.
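As a rough illustration of the start tag handler arguments described above, the sketch below collects link targets. The helper name `onStartTag` and the `links` array are hypothetical, and it assumes each attribute arrives as an object with `name` and `value` string fields. The handler is called directly rather than through a parser so the snippet runs on its own.

```javascript
// Hypothetical helper: collects the href of every <a> start tag it is given.
// Assumption: each attribute is an object with `name` and `value` strings.
const links = [];

function onStartTag(name, attrs, selfClosing) {
  if (name !== 'a') return;
  const href = attrs.find(attr => attr.name === 'href');
  if (href) links.push(href.value);
}

// With a real parser this would presumably be registered as:
//   parser.on('startTag', onStartTag);
// Here the handler is invoked by hand with sample arguments.
onStartTag('a', [{ name: 'href', value: '/docs' }], false);
onStartTag('img', [{ name: 'src', value: 'logo.png' }], true);
onStartTag('a', [{ name: 'class', value: 'anchor' }], false);

console.log(links); // [ '/docs' ]
```

Keeping the handler a plain function makes it easy to test without network or file input.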
Raised when the parser encounters an end tag.
Tag name.
End tag source code location info. Available if location info is enabled via Options.SAXParserOptions.
Raised when the parser encounters a comment.
Comment text.
Comment source code location info. Available if location info is enabled via Options.SAXParserOptions.
Raised when the parser encounters text content.
Text content.
Text content source code location info. Available if location info is enabled via Options.SAXParserOptions.
Raised when the parser encounters a document type declaration.
Document type name.
Document type public identifier.
Document type system identifier.
Document type declaration source code location info. Available if location info is enabled via Options.SAXParserOptions.
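One possible use of the three document type arguments is to rebuild the declaration as text. The function below is a minimal sketch: `formatDoctype` is a hypothetical name, and it assumes missing identifiers arrive as empty strings or `null`, which is not confirmed by the text above.

```javascript
// Hypothetical helper: rebuilds a doctype declaration from its three parts.
// Assumption: absent identifiers are passed as '' or null.
function formatDoctype(name, publicId, systemId) {
  let out = '<!DOCTYPE ' + name;
  if (publicId) {
    out += ' PUBLIC "' + publicId + '"';
  } else if (systemId) {
    out += ' SYSTEM';
  }
  if (systemId) {
    out += ' "' + systemId + '"';
  }
  return out + '>';
}

// With a real parser this would presumably be wired up as:
//   parser.on('doctype', (name, publicId, systemId) => { ... });
console.log(formatDoctype('html', '', '')); // <!DOCTYPE html>
```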
TransformStream events
Stops parsing. Useful if you want the parser to stop consuming CPU time once you've obtained the desired info from the input stream. Stopping doesn't prevent piping, so data will still flow through the parser as usual.
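The behavior described here (work stops, data keeps flowing) can be pictured with a plain Node.js Transform stream. This is only a sketch of the idea, not the parser's actual implementation: the `StoppableScanner` class and its `bytesScanned` counter are hypothetical stand-ins for the real parsing work.

```javascript
const { Transform } = require('stream');

// Hypothetical stand-in for a parser: "scanning" is reduced to counting bytes.
class StoppableScanner extends Transform {
  constructor() {
    super();
    this.stopped = false;
    this.bytesScanned = 0;
  }

  _transform(chunk, encoding, callback) {
    // Skip the expensive work once stopped...
    if (!this.stopped) this.bytesScanned += chunk.length;
    // ...but always forward the chunk unchanged to the readable side.
    callback(null, chunk);
  }

  stop() {
    this.stopped = true;
  }
}

const scanner = new StoppableScanner();
scanner.write('<title>Hi</title>');
scanner.stop();
scanner.write('<p>never scanned</p>');

// Both chunks are still available downstream.
const passedThrough = scanner.read().toString();
console.log(scanner.bytesScanned); // 17
```

Only the first chunk is counted, while the output still contains both chunks, which is why a downstream consumer such as a file writer is unaffected by the stop.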
Streaming SAX-style HTML parser. A transform stream (which means you can pipe through it, see example).
NOTE: This API is available only for Node.js.
const parse5 = require('parse5');
const http = require('http');
const fs = require('fs');

const file = fs.createWriteStream('/home/google.com.html');
const parser = new parse5.SAXParser();

parser.on('text', text => {
    // Handle page text content
    ...
});

http.get('http://google.com', res => {
    // SAXParser is the Transform stream, which means you can pipe
    // through it. So, you can analyze page content and, e.g., save it
    // to the file at the same time:
    res.pipe(parser).pipe(file);
});