Class OCRDocumentService


  • public class OCRDocumentService
    extends Object
    The OCRDocumentService extracts the textual content of document attachments. The CDI bean observes the ProcessingEvent BEFORE_PROCESS and sends each newly attached document to an Apache Tika Server instance, which returns the file content.

    The service expects a valid REST API endpoint defined by the environment parameter 'TIKA_SERVICE_ENDPONT'. If TIKA_SERVICE_ENDPONT is not set, the service is skipped.

    The environment parameter 'TIKA_SERVICE_MODE' must be set to 'auto' to enable the service.
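
    The two conditions above can be sketched as a small guard. This is an illustrative sketch only, not the service's actual code: the class TikaConfigCheck and the method isServiceEnabled are hypothetical names, and the exact, case-sensitive match on 'auto' is an assumption.

```java
import java.util.Map;

// Hypothetical helper, not part of the Imixs API.
public class TikaConfigCheck {

    // Returns true only if an endpoint is configured and the mode is 'auto'.
    // Assumption: the mode is compared case-sensitively.
    static boolean isServiceEnabled(Map<String, String> env) {
        String endpoint = env.get("TIKA_SERVICE_ENDPONT");
        String mode = env.get("TIKA_SERVICE_MODE");
        if (endpoint == null || endpoint.trim().isEmpty()) {
            return false; // no endpoint defined: the service is skipped
        }
        return "auto".equals(mode);
    }

    public static void main(String[] args) {
        // Evaluate the guard against the real process environment.
        System.out.println("Service enabled: " + isServiceEnabled(System.getenv()));
    }
}
```

    Passing the environment as a Map keeps the check easy to exercise without changing real environment variables.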

    See also the project: https://github.com/imixs/imixs-docker/tree/master/tika

    Version:
    1.1
    Author:
    rsoika
    • Constructor Detail

      • OCRDocumentService

        public OCRDocumentService()
    • Method Detail

      • onBeforeProcess

        public void onBeforeProcess(@Observes
                                    org.imixs.workflow.engine.ProcessingEvent processingEvent)
                             throws org.imixs.workflow.exceptions.PluginException
        Reacts to the ProcessingEvent. This method sends the document content to the Tika server and updates the DMS information.
        Throws:
        org.imixs.workflow.exceptions.PluginException
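
        The request to the Tika server might look roughly like the following. This is a minimal sketch under stated assumptions, not the service's actual implementation: the class TikaClientSketch and its methods are hypothetical, it uses the JDK 11 java.net.http client, and it assumes the endpoint accepts an HTTP PUT of the raw file bytes and returns plain text, as the Apache Tika Server '/tika' resource does.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client sketch, not part of the Imixs API.
public class TikaClientSketch {

    // Builds a PUT request carrying the raw document bytes.
    static HttpRequest buildRequest(String endpoint, byte[] content, String contentType) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Accept", "text/plain")        // ask for plain text back
                .header("Content-Type", contentType)   // media type of the attachment
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }

    // Sends the document and returns the extracted text.
    static String extractText(String endpoint, byte[] content, String contentType)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest(endpoint, content, contentType),
                HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Tika server returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

        In the real service a failure is reported as a PluginException; the sketch throws IOException only to stay free of Imixs dependencies.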