A sequencer that processes the binary content of a PDF file, extracts the metadata, and then writes that
metadata to the repository.
This sequencer produces data that corresponds to the following structure:
- pdf:metadata node of type pdf:metadata
  - jcr:mimeType - optional string property for the mime type of the document
  - pdf:pageCount - mandatory long property specifying the number of pages
  - pdf:encrypted - mandatory boolean property specifying whether the document is encrypted
  - pdf:version - mandatory string property for the version of the PDF format
  - pdf:orientation - mandatory string property specifying the orientation of the paper (landscape, portrait, reverse landscape)
  - pdf:author - optional string property for the author of the document
  - pdf:creationDate - optional date property for the creation date of the document
  - pdf:creator - optional string property for the creator of the document
  - pdf:keywords - optional string property for the keywords of the document (comma delimited)
  - pdf:modificationDate - optional date property for the modification date of the document
  - pdf:producer - optional string property for the producer of the document
  - pdf:subject - optional string property for the subject of the document
  - pdf:title - optional string property for the title of the document
  - pdf:xmp - optional child node for the metadata fields from the XMP block
    - xmp:baseURL - optional string property for the base URL
    - xmp:createDate - optional date property for the creation date of this object
    - xmp:creatorTool - optional string property specifying the creator tool used to make this document
    - xmp:identifier - optional multi-valued string property for the identifiers of the object
    - xmp:label - optional string property for the label of the object
    - xmp:metadataDate - optional date property for the creation date of this metadata
    - xmp:modifyDate - optional date property for the modification date of this object
    - xmp:nickname - optional string property for the nickname
    - xmp:rating - optional string property for the rating
  - pdf:page - optional child node for the metadata fields related to an individual page
    - pdf:pageNumber - mandatory long property for the number of this page
    - pdf:attachement - optional child node for the metadata fields related to an attachment
      - pdf:creationDate - optional date property for the creation date of this attachment
      - pdf:modificationDate - optional date property for the modification date of this attachment
      - pdf:subject - optional string property for the subject of this attachment
      - pdf:name - optional string property for the name of this attachment
      - jcr:mimeType - optional string property for the mime type of this attachment
      - jcr:data - optional binary property for the content of this attachment
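
The structure above can be modelled in a few lines of plain Java to make the mandatory/optional distinction concrete. This is only a sketch under stated assumptions: it does not use the JCR API or the sequencer itself, the `PdfMetadataSketch` class, its `Node` type, and the `missingMandatory` helper are hypothetical names introduced for illustration, the property values are invented sample data, and it assumes that attachment nodes sit under pdf:page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the sequencer output; not the JCR API.
final class PdfMetadataSketch {

    // Minimal stand-in for a repository node: a name, properties, and children.
    static final class Node {
        final String name;
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node set(String property, Object value) {
            properties.put(property, value);
            return this;
        }

        Node addChild(String childName) {
            Node child = new Node(childName);
            children.add(child);
            return child;
        }
    }

    // Properties the description marks as mandatory, keyed by node name.
    static final Map<String, List<String>> MANDATORY = Map.of(
            "pdf:metadata", List.of("pdf:pageCount", "pdf:encrypted", "pdf:version", "pdf:orientation"),
            "pdf:page", List.of("pdf:pageNumber"));

    // Walks the tree and reports every mandatory property that is absent.
    static List<String> missingMandatory(Node node) {
        List<String> missing = new ArrayList<>();
        for (String property : MANDATORY.getOrDefault(node.name, List.of())) {
            if (!node.properties.containsKey(property)) {
                missing.add(node.name + "/" + property);
            }
        }
        for (Node child : node.children) {
            missing.addAll(missingMandatory(child));
        }
        return missing;
    }

    // Builds a sample tree; all values are invented for illustration.
    static Node sample() {
        Node metadata = new Node("pdf:metadata")
                .set("pdf:pageCount", 1L)
                .set("pdf:encrypted", false)
                .set("pdf:version", "1.4")
                .set("pdf:orientation", "portrait")
                .set("pdf:title", "Example");
        metadata.addChild("pdf:xmp").set("xmp:creatorTool", "Example Tool");
        // Assumption: attachments are children of the page they belong to.
        Node page = metadata.addChild("pdf:page").set("pdf:pageNumber", 1L);
        page.addChild("pdf:attachement")
                .set("pdf:name", "notes.txt")
                .set("jcr:mimeType", "text/plain");
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println("Missing mandatory properties: " + missingMandatory(sample()));
    }
}
```

A check like `missingMandatory` could be adapted to real repository nodes when verifying sequencer output in a test, since only the properties marked mandatory are guaranteed to be present.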