Class OCRDocumentAdapter

java.lang.Object
org.imixs.archive.documents.OCRDocumentAdapter
All Implemented Interfaces:
org.imixs.workflow.Adapter, org.imixs.workflow.SignalAdapter

public class OCRDocumentAdapter extends Object implements org.imixs.workflow.SignalAdapter
The TikaDocumentAdapter reacts to a ProcessingEvent to automatically extract the text content.

The adapter expects the following environment setting: TIKA_SERVICE_MODE: "MODEL". You can set additional options to be passed to the Tika service:

 
        <tika name="options">X-Tika-PDFocrStrategy=OCR_ONLY</tika>
        <tika name="options">X-Tika-PDFOcrImageType=RGB</tika>
        <tika name="options">X-Tika-PDFOcrDPI=400</tika>
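
Each option above follows a name=value pattern, and the X-Tika-* names correspond to request headers understood by a Tika server. The following is a minimal sketch, assuming each option string is split at the first '=' into a header name and a header value before the request is sent. The class TikaOptionsSketch and the method toHeaders are hypothetical names used for illustration only; they are not part of the Imixs API and do not show how the adapter is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Imixs API.
public class TikaOptionsSketch {

    /**
     * Splits option strings of the form "name=value" into an ordered map of
     * header names to header values. Entries without a '=' or without a
     * header name are skipped.
     */
    static Map<String, String> toHeaders(List<String> options) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String option : options) {
            int idx = option.indexOf('=');
            if (idx <= 0) {
                continue; // no '=' found, or nothing before it
            }
            String name = option.substring(0, idx).trim();
            String value = option.substring(idx + 1).trim();
            if (name.isEmpty()) {
                continue; // only whitespace before '='
            }
            headers.put(name, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        // The three option values from the example configuration above.
        Map<String, String> headers = toHeaders(List.of(
                "X-Tika-PDFocrStrategy=OCR_ONLY",
                "X-Tika-PDFOcrImageType=RGB",
                "X-Tika-PDFOcrDPI=400"));
        headers.forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Splitting at the first '=' only keeps any later '=' characters as part of the value, and the LinkedHashMap preserves the order in which the options were declared.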
   
 
Version:
1.0
Author:
rsoika
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
     
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    org.imixs.workflow.ItemCollection
    execute(org.imixs.workflow.ItemCollection document, org.imixs.workflow.ItemCollection event)
    This method posts the text of an attachment to the Imixs-ML Analyse service endpoint.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • OCRDocumentAdapter

      public OCRDocumentAdapter()
  • Method Details

    • execute

      public org.imixs.workflow.ItemCollection execute(org.imixs.workflow.ItemCollection document, org.imixs.workflow.ItemCollection event) throws org.imixs.workflow.exceptions.AdapterException
      This method posts the text of an attachment to the Imixs-ML Analyse service endpoint.
      Specified by:
      execute in interface org.imixs.workflow.Adapter
      Throws:
      org.imixs.workflow.exceptions.AdapterException