Package org.imixs.archive.documents
Class OCRDocumentAdapter
java.lang.Object
org.imixs.archive.documents.OCRDocumentAdapter
- All Implemented Interfaces:
org.imixs.workflow.Adapter,org.imixs.workflow.SignalAdapter
The OCRDocumentAdapter reacts to a ProcessingEvent to automatically extract the text
content.
The adapter expects the environment setting TIKA_SERVICE_MODE: "MODEL". You can set additional options to be passed to the Tika service:
<tika name="options">X-Tika-PDFocrStrategy=OCR_ONLY</tika>
<tika name="options">X-Tika-PDFOcrImageType=RGB</tika>
<tika name="options">X-Tika-PDFOcrDPI=400</tika>
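The option entries above each carry a KEY=VALUE pair. As a minimal sketch of how such entries could be turned into a header map for a request to the Tika service, the following standalone example may help. It assumes the entries arrive as one plain string; the class name TikaOptionsSketch and the method parseOptions are hypothetical and are not part of the Imixs API, and the adapter's actual parsing may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: hypothetical helper, not the Imixs-Archive implementation.
class TikaOptionsSketch {

    // Matches <tika name="options">...</tika> and captures the tag body.
    private static final Pattern OPTION_TAG =
            Pattern.compile("<tika\\s+name=\"options\">(.*?)</tika>", Pattern.DOTALL);

    // Collects every KEY=VALUE body into an insertion-ordered map.
    static Map<String, String> parseOptions(String config) {
        Map<String, String> headers = new LinkedHashMap<>();
        Matcher matcher = OPTION_TAG.matcher(config);
        while (matcher.find()) {
            String body = matcher.group(1).trim();
            int separator = body.indexOf('=');
            if (separator <= 0) {
                continue; // skip entries that are not of the form KEY=VALUE
            }
            String key = body.substring(0, separator).trim();
            String value = body.substring(separator + 1).trim();
            headers.put(key, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String config =
                "<tika name=\"options\">X-Tika-PDFocrStrategy=OCR_ONLY</tika>\n"
              + "<tika name=\"options\">X-Tika-PDFOcrImageType=RGB</tika>\n"
              + "<tika name=\"options\">X-Tika-PDFOcrDPI=400</tika>";
        // Print each parsed option as a header line.
        parseOptions(config).forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

Each resulting map entry corresponds to one header name and value; a LinkedHashMap is used here only so that the options keep the order in which they were declared.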
- Version:
- 1.0
- Author:
- rsoika
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
- Modifier and Type: org.imixs.workflow.ItemCollection
- Method: execute(org.imixs.workflow.ItemCollection document, org.imixs.workflow.ItemCollection event)
- Description: This method posts a text from an attachment to the Imixs-ML Analyse service endpoint
-
Field Details
-
OCR_ERROR
-
Constructor Details
-
OCRDocumentAdapter
public OCRDocumentAdapter()
-
-
Method Details
-
execute
public org.imixs.workflow.ItemCollection execute(org.imixs.workflow.ItemCollection document, org.imixs.workflow.ItemCollection event) throws org.imixs.workflow.exceptions.AdapterException
This method posts a text from an attachment to the Imixs-ML Analyse service endpoint
- Specified by:
- execute in interface org.imixs.workflow.Adapter
- Throws:
- org.imixs.workflow.exceptions.AdapterException
-