| Class | Description |
|---|---|
| DocumentConsumer | Base consumer for documents. |
| EmbedBlocker | A custom extractor that prevents Tika from parsing any embedded documents. |
| EmbeddedDocumentMemoryExtractor | |
| EmbedLinker | A custom extractor that saves all embeds to temporary files and records the new paths. |
| EmbedParser | A custom extractor that is an almost exact copy of Tika's default extractor for embedded documents. |
| EmbedSpawner | |
| Extractor | A reusable class that sets up Tika parsers based on runtime options. |
| UpdatableDigester | |
| UpdatableInputStreamDigester | Copied from Tika to customize digestStream, which is used for ID generation. |

| Enum | Description |
|---|---|
| ExtractionStatus | Status for the extraction result of a file. |
| Extractor.EmbedHandling | |
| Extractor.OutputFormat | |

Copyright © 2019 The International Consortium of Investigative Journalists. All rights reserved.