Package org.imixs.archive.documents
Class PDFXMLExtractorPlugin
- java.lang.Object
  - org.imixs.workflow.engine.plugins.AbstractPlugin
    - org.imixs.archive.documents.PDFXMLExtractorPlugin
- All Implemented Interfaces:
org.imixs.workflow.Plugin
public class PDFXMLExtractorPlugin extends org.imixs.workflow.engine.plugins.AbstractPlugin

The PDFXMLExtractorPlugin extracts embedded XML files from a PDF document and transforms the content into an Imixs XMLDocument. This data can be added to the current workitem for further processing.

The plugin is based on the Apache PDFBox project. The following Maven dependency needs to be added to the project:

    <dependency>
      <groupId>org.apache.pdfbox</groupId>
      <artifactId>pdfbox</artifactId>
      <scope>compile</scope>
    </dependency>

To activate the plugin, the BPMN event must contain the following item definition:

    <item name="PDFXMLExtractor">
      <filename>*.xml</filename>
      <report>myReport</report>
    </item>

- Version:
- 1.0
- Author:
- rsoika
-
-
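The item definition shown in the class description carries two values, a file name pattern and a report name. The following is a minimal sketch of how such a fragment could be read with the JDK's DOM parser. It is not the plugin's own parsing code; the class `ItemDefinitionSketch` and its methods are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the Imixs API.
final class ItemDefinitionSketch {

    // Returns the trimmed text of the first child element with the given tag, or null if absent.
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    // Parses an <item> fragment and returns { name, filename, report }.
    static String[] parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element item = doc.getDocumentElement();
            return new String[] {
                item.getAttribute("name"),
                childText(item, "filename"),
                childText(item, "report")
            };
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exception handling.
            throw new IllegalStateException("Invalid item definition", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<item name=\"PDFXMLExtractor\">"
                + "<filename>*.xml</filename><report>myReport</report></item>";
        String[] values = parse(xml);
        System.out.println(values[0] + " " + values[1] + " " + values[2]);
    }
}
```

A missing `<filename>` or `<report>` element yields `null` in this sketch; how the plugin itself reacts to an incomplete definition is not stated on this page.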
Field Summary
Fields
Modifier and Type    Field                Description
static String        FILE_PATTERN_PDF
static String        FILE_PATTERN_XML
static String        PARSING_EXCEPTION
static String        PDFXMLEXTRACTOR
static String        PLUGIN_ERROR
static String        REPORT_ERROR
-
Constructor Summary
Constructors
Constructor                Description
PDFXMLExtractorPlugin()
-
Method Summary
Methods
static byte[] getXMLFile(org.imixs.workflow.ItemCollection document, String file_pattern)
    This method searches attached PDF files of a workitem and extracts an embedded XML file.
org.imixs.workflow.ItemCollection run(org.imixs.workflow.ItemCollection document, org.imixs.workflow.ItemCollection event)
    This method parses the content of an attached PDF file and extracts an embedded XML file.
static byte[] streamToByteArray(InputStream ins)
    This method converts an InputStream into a byte array.
-
-
-
Field Detail
-
PDFXMLEXTRACTOR
public static final String PDFXMLEXTRACTOR
- See Also:
- Constant Field Values
-
PARSING_EXCEPTION
public static final String PARSING_EXCEPTION
- See Also:
- Constant Field Values
-
PLUGIN_ERROR
public static final String PLUGIN_ERROR
- See Also:
- Constant Field Values
-
REPORT_ERROR
public static final String REPORT_ERROR
- See Also:
- Constant Field Values
-
FILE_PATTERN_PDF
public static final String FILE_PATTERN_PDF
- See Also:
- Constant Field Values
-
FILE_PATTERN_XML
public static final String FILE_PATTERN_XML
- See Also:
- Constant Field Values
-
-
Method Detail
-
run
public org.imixs.workflow.ItemCollection run(org.imixs.workflow.ItemCollection document, org.imixs.workflow.ItemCollection event) throws org.imixs.workflow.exceptions.PluginException
This method parses the content of an attached PDF file and extracts an embedded XML file. This XML file is then transformed by a given report definition into an Imixs XMLDocument. The content of the XMLDocument is then merged into the current document.
- Throws:
org.imixs.workflow.exceptions.PluginException
-
getXMLFile
public static byte[] getXMLFile(org.imixs.workflow.ItemCollection document, String file_pattern) throws org.imixs.workflow.exceptions.PluginException
This method searches the attached PDF files of a workitem and extracts an embedded XML file. The method returns only the first embedded XML file and does not support multiple XML files embedded in one PDF file.
- Parameters:
document -
filePattern -
- Returns:
- Throws:
org.imixs.workflow.exceptions.PluginException
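The first-match-only behaviour described for getXMLFile can be illustrated with plain file names. The sketch below assumes that the file pattern is a Java regular expression, which this page does not confirm, and it works on a list of names rather than on real PDF attachments. `FirstMatchSketch` and `firstMatch` are hypothetical names, not part of the plugin.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration of "return only the first match"; not the plugin's code.
final class FirstMatchSketch {

    // Returns the first name matching the pattern (assumed to be a regex), in list order.
    static Optional<String> firstMatch(List<String> fileNames, String filePattern) {
        Pattern pattern = Pattern.compile(filePattern);
        for (String name : fileNames) {
            if (pattern.matcher(name).find()) {
                return Optional.of(name); // later matches are ignored
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> embedded = Arrays.asList("invoice.pdf", "a.xml", "b.xml");
        System.out.println(firstMatch(embedded, ".*\\.xml$").orElse("no match"));
    }
}
```

With two matching names in the list, only the earlier one is returned, which mirrors the documented limitation that multiple embedded XML files are not supported.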
-
streamToByteArray
public static byte[] streamToByteArray(InputStream ins) throws IOException
This method converts an InputStream into a byte array.
- Parameters:
ins -
- Returns:
- Throws:
IOException
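A conversion of this kind is commonly written as a buffered copy loop. The following is a minimal sketch using only the JDK; it shows one possible approach and is not taken from the plugin source. The class `StreamSketch`, the buffer size, and the `copyOf` convenience wrapper are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a stream-to-byte-array helper; not the plugin's code.
final class StreamSketch {

    // Reads the stream until end of input and returns all bytes read.
    static byte[] toByteArray(InputStream ins) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // buffer size chosen arbitrarily
        int read;
        while ((read = ins.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    // Usage example: round-trips an in-memory array through an InputStream.
    static byte[] copyOf(byte[] data) {
        try (InputStream ins = new ByteArrayInputStream(data)) {
            return toByteArray(ins);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyOf(new byte[] {1, 2, 3}).length);
    }
}
```

The sketch does not close the stream inside `toByteArray`, leaving that to the caller; whether streamToByteArray closes its argument is not stated on this page.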
-
-