Package org.imixs.archive.documents
Class OCRDocumentService
java.lang.Object
org.imixs.archive.documents.OCRDocumentService
The OCRDocumentService extracts the textual information from document attachments. The CDI bean observes the ProcessingEvent BEFORE_PROCESS and sends each newly attached document to an instance of an Apache Tika Server to obtain the file content.
The service expects a valid REST API endpoint defined by the environment parameter 'TIKA_SERVICE_ENDPONT'. If TIKA_SERVICE_ENDPONT is not set, the service is skipped.
The environment parameter 'TIKA_SERVICE_MODE' must be set to 'auto' to enable the service.
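The two conditions above can be illustrated with a minimal sketch. It is not the service's actual code: the helper class TikaServiceGate and its method are hypothetical, the parameter names are copied verbatim from this description, and the case-insensitive comparison of the mode value is an assumption.

```java
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the Imixs API. */
final class TikaServiceGate {
    // Parameter names copied verbatim from the class description.
    static final String ENV_ENDPOINT = "TIKA_SERVICE_ENDPONT";
    static final String ENV_MODE = "TIKA_SERVICE_MODE";

    private TikaServiceGate() {
    }

    /** Returns true only if an endpoint is configured and the mode is 'auto'. */
    static boolean isEnabled(Map<String, String> env) {
        String endpoint = env.get(ENV_ENDPOINT);
        if (endpoint == null || endpoint.trim().isEmpty()) {
            return false; // no endpoint configured: the service is skipped
        }
        // Assumption: the mode value is compared case-insensitively.
        return "auto".equalsIgnoreCase(env.get(ENV_MODE));
    }

    public static void main(String[] args) {
        // Evaluate the gate against the real process environment.
        System.out.println("enabled: " + isEnabled(System.getenv()));
    }
}
```

Passing the environment as a Map keeps the check testable without changing real environment variables.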
See also the project: https://github.com/imixs/imixs-docker/tree/master/tika
Version: 1.1
Author: rsoika
Field Summary
Fields: DEFAULT_ENCODING, PLUGIN_ERROR
Constructor Summary
Constructors: OCRDocumentService()
Method Summary
Modifier and Type: void
Method: onBeforeProcess(org.imixs.workflow.engine.ProcessingEvent processingEvent)
Description: React on the ProcessingEvent.
Field Details
DEFAULT_ENCODING
PLUGIN_ERROR
Constructor Details
OCRDocumentService
public OCRDocumentService()
Method Details
onBeforeProcess
public void onBeforeProcess(@Observes org.imixs.workflow.engine.ProcessingEvent processingEvent) throws org.imixs.workflow.exceptions.PluginException
React on the ProcessingEvent. This method sends the document content to the Tika server and updates the DMS information.
Throws:
org.imixs.workflow.exceptions.PluginException
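As a rough illustration of "sends the document content to the Tika server", the following sketch builds a plain HTTP request with the JDK's java.net.http client (Java 11+). It is not the implementation of onBeforeProcess: the class TikaRequestSketch, its methods, and the endpoint URL http://localhost:9998/tika are assumptions made for this example. It relies on Apache Tika Server accepting a PUT of the raw file bytes and returning plain text when asked via the Accept header.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch for illustration only; not part of the Imixs API. */
final class TikaRequestSketch {

    private TikaRequestSketch() {
    }

    /** Builds a PUT request carrying the raw file bytes; nothing is sent yet. */
    static HttpRequest buildRequest(String endpoint, byte[] content, String contentType) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", contentType)
                .header("Accept", "text/plain") // ask for the extracted text only
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }

    /** Sends the request and returns the response body as the extracted text. */
    static String extractText(String endpoint, byte[] content, String contentType)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest(endpoint, content, contentType),
                HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        if (response.statusCode() != 200) {
            throw new IOException("Tika server returned HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        // Assumed endpoint; only builds the request, no network access here.
        HttpRequest request = buildRequest("http://localhost:9998/tika",
                "hello".getBytes(StandardCharsets.UTF_8), "text/plain");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Separating buildRequest from extractText lets the request be inspected without a running Tika server. In the real service, the returned text would then be stored with the document's DMS information, which this sketch does not attempt.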