public interface Identifier
| Modifier and Type | Method and Description |
|---|---|
| String | generate(TikaDocument tikaDocument): Generate an identifier for a root tikaDocument. |
| String | generateForEmbed(EmbeddedTikaDocument document): Generate an identifier for an embedded document. |
| String | hash(TikaDocument tikaDocument): Generate or retrieve (from metadata) a hash digest of the tikaDocument's underlying file data. |
| String | retrieveHash(org.apache.tika.metadata.Metadata metadata): Retrieve a hash digest of the document's underlying file data. |
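
The summary above separates identifiers for documents from hash digests of the underlying file data. The following is a minimal sketch of how an implementation might keep the two apart. It is not the library's implementation. `Doc` and `EmbeddedDoc` are hypothetical stand-ins for `TikaDocument` and `EmbeddedTikaDocument`, `sha256Hex` is a hypothetical helper, and the choice of SHA-256 and of mixing the path or parent ID into the identifier are assumptions made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class IdentifierSketch {

    // Hypothetical stand-in for a root document: a path plus the raw file bytes.
    static final class Doc {
        final String path;
        final byte[] data;
        Doc(String path, byte[] data) { this.path = path; this.data = data; }
    }

    // Hypothetical stand-in for an embedded document: its bytes plus its parent's ID.
    static final class EmbeddedDoc {
        final String parentId;
        final byte[] data;
        EmbeddedDoc(String parentId, byte[] data) { this.parentId = parentId; this.data = data; }
    }

    // Hypothetical helper: hex-encoded SHA-256 over the concatenation of the given parts.
    static String sha256Hex(byte[]... parts) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        for (byte[] part : parts) {
            digest.update(part);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // File hash: depends only on the file's bytes, never on path or parent.
    static String hash(Doc doc) throws Exception {
        return sha256Hex(doc.data);
    }

    // Document identifier (assumed scheme): mixes the path into the digest.
    static String generate(Doc doc) throws Exception {
        return sha256Hex(doc.path.getBytes(StandardCharsets.UTF_8), doc.data);
    }

    // Embedded document identifier (assumed scheme): mixes in the parent's ID.
    static String generateForEmbed(EmbeddedDoc doc) throws Exception {
        return sha256Hex(doc.parentId.getBytes(StandardCharsets.UTF_8), doc.data);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = "same content".getBytes(StandardCharsets.UTF_8);
        Doc a = new Doc("/inbox/a.txt", bytes);
        Doc b = new Doc("/archive/b.txt", bytes);

        System.out.println(hash(a).equals(hash(b)));         // prints "true"
        System.out.println(generate(a).equals(generate(b))); // prints "false"
    }
}
```

In this sketch, two copies of the same file at different paths share a file hash but receive different identifiers.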
String generate(TikaDocument tikaDocument) throws Exception

Parameters: tikaDocument - the tikaDocument to generate an identifier for
Throws: Exception - if there's an exception generating the ID

String generateForEmbed(EmbeddedTikaDocument document) throws Exception

Parameters: document - the embedded document to generate an ID for
Throws: Exception - if there's an error generating the ID

String hash(TikaDocument tikaDocument) throws Exception

Even if the generate(TikaDocument) methods of the implementation generate hash digests, those are
semantically different, as they represent a hash of the tikaDocument rather than of the file. The former might
comprise the relationship of the tikaDocument with its parent, or its position in the path hierarchy, whereas
the latter must not.

Parameters: tikaDocument - the tikaDocument for which to return a file hash digest
Throws: Exception - if there's an error generating the hash

String retrieveHash(org.apache.tika.metadata.Metadata metadata)

Parameters: metadata - the document's metadata

Copyright © 2018. All rights reserved.