Package org.dspace.content.packager
Class METSManifest
java.lang.Object
org.dspace.content.packager.METSManifest
Manages the METS manifest document for METS importer classes, such as the package importer org.dspace.content.packager.MetsSubmission and the federated importer org.dspace.app.mets.FederatedMETSImport.
It can parse the METS document, build an internal model, and give the importers access to that model. It also crosswalks all of the descriptive and administrative metadata in the METS manifest into the target DSpace Item, under control of the importer.
It reads the following DSpace Configuration entries:
- Local XML schema (XSD) declarations, in the general format:
mets.xsd.identifier = namespace xsd-URL
e.g. mets.xsd.dc = http://purl.org/dc/elements/1.1/ dc.xsd
Add a separate configuration entry for each schema.
- Crosswalk plugin mappings:
These tell it the name of the crosswalk plugin to invoke for metadata sections with a particular value of MDTYPE (or OTHERMDTYPE). By default, the crosswalk mechanism will look for a plugin with the same name as the metadata type (e.g. "MODS", "DC"). This example line invokes the QDC plugin when MDTYPE="DC":
mets.submission.crosswalk.DC = QDC
The general format is:
mets.submission.crosswalk.mdType = pluginName
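The two kinds of configuration entries can be illustrated with a small plain-Java sketch. It is not the DSpace implementation: the class CrosswalkConfigSketch and its method names are hypothetical, java.util.Properties stands in for the DSpace configuration service, and the key layout mets.<configName>.crosswalk.<mdType> is an assumption inferred from the example line above.

```java
import java.util.Properties;

final class CrosswalkConfigSketch {
    /**
     * Returns the plugin name mapped to mdType, falling back to mdType itself
     * when no mapping is configured (the documented default behaviour).
     * The key layout is an assumption based on "mets.submission.crosswalk.DC".
     */
    static String pluginNameFor(Properties config, String configName, String mdType) {
        String mapped = config.getProperty("mets." + configName + ".crosswalk." + mdType);
        return (mapped == null || mapped.trim().isEmpty()) ? mdType : mapped.trim();
    }

    /** Splits a schema entry of the form "namespace xsd-URL" into {namespace, xsd-URL}. */
    static String[] parseSchemaEntry(String value) {
        String[] parts = value.trim().split("\\s+", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected 'namespace xsd-URL' but got: " + value);
        }
        return parts;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("mets.submission.crosswalk.DC", "QDC");
        config.setProperty("mets.xsd.dc", "http://purl.org/dc/elements/1.1/ dc.xsd");

        System.out.println(pluginNameFor(config, "submission", "DC"));   // mapped plugin
        System.out.println(pluginNameFor(config, "submission", "MODS")); // falls back to mdType
        String[] schema = parseSchemaEntry(config.getProperty("mets.xsd.dc"));
        System.out.println(schema[0] + " -> " + schema[1]);
    }
}
```

With the sample properties, the DC type resolves to the configured QDC plugin, while MODS has no mapping and so resolves to a plugin of the same name.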
- Author:
- Robert Tansley, WeiHua Huang, Rita Lee, Larry Stone
-
Nested Class Summary
Nested Classes (Modifier and Type, Class: Description)
- static interface METSManifest.Mdref: Callback interface to retrieve data streams in mdRef elements.
Field Summary
Fields (Modifier and Type, Field: Description)
- protected List<org.jdom2.Element> bundleFiles
- static final String CONFIG_METS_PREFIX: Prefix of DSpace configuration lines that map METS metadata type to crosswalk plugin names.
- protected static final String CONFIG_XSD_PREFIX: Prefix of configuration lines identifying local XML Schema (XSD) files.
- protected String configName: Name of the packager that created this manifest object, for looking up configuration entries.
- protected List<org.jdom2.Element> contentFiles: <file> elements in "original" file group (bundle).
- protected static final org.jdom2.Namespace dcNS: Dublin Core element namespace.
- protected static final org.jdom2.Namespace dcTermNS: Dublin Core term namespace (for qualified DC).
- protected static String localSchemas
- static final String MANIFEST_FILE: Canonical filename of METS manifest within a package or as a bitstream.
- protected List mdFiles: All mdRef elements in the manifest.
- protected org.jdom2.Element mets: Root element of the current METS manifest.
- static final org.jdom2.Namespace metsNS: METS namespace; includes the "mets" prefix for use in XPaths.
- protected org.jdom2.input.SAXBuilder parser: Builder to use for mdRef streams, inherited from create().
- static final org.jdom2.Namespace xlinkNS: XLink namespace; includes the "xlink" prefix for use in XPaths.
Constructor Summary
Constructors (Modifier, Constructor: Description)
- protected METSManifest(org.jdom2.input.SAXBuilder builder, org.jdom2.Element mets, String configName): Default constructor, only called internally.
Method Summary
Methods (Modifier and Type, Method: Description)
- static METSManifest create(InputStream is, boolean validate, String configName): Create a new manifest object from a serialized METS XML document.
- void crosswalkBitstream(Context context, PackageParameters params, Bitstream bitstream, String fileId, METSManifest.Mdref callback): Crosswalk the metadata associated with a particular file element into the bitstream it corresponds to.
- void crosswalkBundle(Context context, PackageParameters params, Bundle bundle, String fileId, METSManifest.Mdref callback)
- void crosswalkItemDmd(Context context, PackageParameters params, DSpaceObject dso, org.jdom2.Element dmdSec, METSManifest.Mdref callback): Invokes appropriate crosswalks on Item-wide descriptive metadata.
- void crosswalkObjectOtherAdminMD(Context context, PackageParameters params, DSpaceObject dso, METSManifest.Mdref callback): Crosswalk all technical and source metadata sections that belong to the whole object.
- boolean crosswalkObjectSourceMD(Context context, PackageParameters params, DSpaceObject dso, METSManifest.Mdref callback): Just crosswalk the sourceMD sections; used to set the handle and parent of AIP.
- protected void crosswalkXmd(Context context, PackageParameters params, DSpaceObject dso, org.jdom2.Element xmd, METSManifest.Mdref callback, boolean createMissingMetadataFields)
- protected String[] getAmdIDs(): Get an array of all AMDID values for this object.
- List<org.jdom2.Element> getBundleFiles(): Gets all file elements which make up the item's content.
- static String getBundleName(org.jdom2.Element file): Get the DSpace bundle name corresponding to the USE attribute of the file group enclosing this file element.
- static String getBundleName(org.jdom2.Element file, boolean getParent): Get the DSpace bundle name corresponding to the USE attribute of the file group enclosing this file element.
- String[] getChildMetsFilePaths(): Retrieve the file paths for the child objects' METS Manifest files.
- getChildObjDivs(): Get the child object <div>s from the METS Manifest <structMap>.
- List<org.jdom2.Element> getContentFiles()
- protected Object getCrosswalk(String type, Class clazz)
- org.jdom2.Element[] getDmdElements(String dmdList): Gets all dmdSec elements from a space-separated list.
- protected org.jdom2.Element getElementByXPath(String path, boolean nullOk)
- static String getFileName(org.jdom2.Element file): Get the "local" file name of this file or mdRef element.
- org.jdom2.Element[] getItemDmds(): Gets all dmdSec elements containing metadata for the DSpace Item.
- org.jdom2.Element[] getItemRightsMD(): Return rights metadata section(s) relevant to item as a whole.
- InputStream getMdContentAsStream(org.jdom2.Element mdSec, METSManifest.Mdref callback): Return contents of *md element as stream.
- getMdContentMimeType(org.jdom2.Element mdSec): Returns MIME type of metadata content, if available.
- getMdFiles(): Gets list of all mdRef elements in the METS document.
- getMdType(org.jdom2.Element mdSec): Get the metadata type from within a *mdSec element.
- org.jdom2.Element getMets()
- getMetsAsStream(): Return entire METS document as an InputStream.
- getObjID(): Return the OBJID attribute of the METS manifest.
- org.jdom2.Element getObjStructDiv(): Return the <div> which describes this DSpace Object (and its contents) from the <structMap>.
- getOriginalFilePath(org.jdom2.Element file): Get the "original" file element for a derived file.
- getParentOwnerLink(): Return the reference to the Parent Object from the "Parent" <structMap>.
- org.jdom2.Element getPrimaryOrLogoBitstream(): Returns file element corresponding to primary bitstream.
- getProfile(): Gets name of the profile to which this METS document conforms.
- protected static String normalizeBundleName
-
Field Details
-
MANIFEST_FILE
Canonical filename of METS manifest within a package or as a bitstream.
-
CONFIG_METS_PREFIX
Prefix of DSpace configuration lines that map METS metadata type to crosswalk plugin names.
-
CONFIG_XSD_PREFIX
Prefix of configuration lines identifying local XML Schema (XSD) files.
-
dcNS
protected static final org.jdom2.Namespace dcNS
Dublin Core element namespace
-
dcTermNS
protected static final org.jdom2.Namespace dcTermNS
Dublin Core term namespace (for qualified DC)
-
metsNS
public static final org.jdom2.Namespace metsNS
METS namespace; includes the "mets" prefix for use in XPaths
-
xlinkNS
public static final org.jdom2.Namespace xlinkNS
XLink namespace; includes the "xlink" prefix for use in XPaths
-
mets
protected org.jdom2.Element mets
Root element of the current METS manifest.
-
mdFiles
All mdRef elements in the manifest
-
contentFiles
<file> elements in "original" file group (bundle)
-
bundleFiles
-
parser
protected org.jdom2.input.SAXBuilder parser
Builder to use for mdRef streams, inherited from create()
-
configName
Name of the packager that created this manifest object, for looking up configuration entries.
-
localSchemas
-
-
Constructor Details
-
METSManifest
protected METSManifest(org.jdom2.input.SAXBuilder builder, org.jdom2.Element mets, String configName)
Default constructor, only called internally.
- Parameters:
- builder - XML parser (for parsing mdRef'd files and binData)
- mets - parsed METS document
- configName - configuration name
-
-
Method Details
-
create
public static METSManifest create(InputStream is, boolean validate, String configName) throws IOException, MetadataValidationException
Create a new manifest object from a serialized METS XML document. Parses the document read from the input stream, optionally validating it.
- Parameters:
- is - input stream containing serialized XML
- validate - if true, enable XML validation using schemas in document. Also validates any sub-documents.
- configName - config name
- Returns:
- new METSManifest object.
- Throws:
- IOException - if IO error
- MetadataValidationException - if there is any error parsing or validating the METS.
-
getProfile
Gets name of the profile to which this METS document conforms.
- Returns:
- value of the PROFILE attribute of the mets element, or null if none.
-
getObjID
Return the OBJID attribute of the METS manifest. This is where the Handle URI/URN of the object can be found.
- Returns:
- OBJID attribute of METS manifest
-
getBundleFiles
Gets all file elements which make up the item's content.
- Returns:
- a List of Elements.
- Throws:
- MetadataValidationException - if validation error
-
getContentFiles
- Throws:
- MetadataValidationException
-
getMdFiles
Gets list of all mdRef elements in the METS document. Used by the ingester, e.g. to check that all required files are present.
- Returns:
- a List of Elements.
- Throws:
- MetadataValidationException - if validation error
-
getOriginalFilePath
Get the "original" file element for a derived file. Finds the original from which this was derived by matching the GROUPID attribute that binds it to its original. For instance, the file for a thumbnail image would have the same GROUPID as its full-size version.
NOTE: This pattern of relating derived files through the GROUPID attribute is peculiar to the DSpace METS SIP profile, and may not be generally useful with other sorts of METS documents.
- Parameters:
- file - METS file element of derived file
- Returns:
- file path of original or null if none found.
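The GROUPID matching can be sketched with the JDK's built-in DOM parser. This is an illustration only, not the DSpace code (which works on JDOM2 elements): the class GroupIdSketch, the sample manifest, and the choice to return the first other file that shares the GROUPID are all assumptions, as is reading the path from the standard METS FLocat xlink:href attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class GroupIdSketch {
    static final String METS = "http://www.loc.gov/METS/";
    static final String XLINK = "http://www.w3.org/1999/xlink";

    // Hypothetical manifest: thumbnail f2 shares GROUPID g1 with its full-size original f1.
    static final String SAMPLE =
        "<mets xmlns='http://www.loc.gov/METS/' xmlns:xlink='http://www.w3.org/1999/xlink'><fileSec>"
        + "<fileGrp USE='CONTENT'><file ID='f1' GROUPID='g1'>"
        + "<FLocat LOCTYPE='URL' xlink:href='image.png'/></file></fileGrp>"
        + "<fileGrp USE='THUMBNAIL'><file ID='f2' GROUPID='g1'>"
        + "<FLocat LOCTYPE='URL' xlink:href='image.png.jpg'/></file></fileGrp>"
        + "</fileSec></mets>";

    /** Parses XML with namespace awareness; wraps checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse METS", e);
        }
    }

    /** Returns the path of the first other file with the same GROUPID, or null if none. */
    static String originalPath(Document doc, String derivedFileId) {
        NodeList files = doc.getElementsByTagNameNS(METS, "file");
        String groupId = "";
        for (int i = 0; i < files.getLength(); i++) {
            Element f = (Element) files.item(i);
            if (derivedFileId.equals(f.getAttribute("ID"))) {
                groupId = f.getAttribute("GROUPID");
            }
        }
        if (groupId.isEmpty()) {
            return null; // unknown file, or file without a GROUPID
        }
        for (int i = 0; i < files.getLength(); i++) {
            Element f = (Element) files.item(i);
            boolean sameGroup = groupId.equals(f.getAttribute("GROUPID"));
            if (sameGroup && !derivedFileId.equals(f.getAttribute("ID"))) {
                NodeList locs = f.getElementsByTagNameNS(METS, "FLocat");
                if (locs.getLength() > 0) {
                    return ((Element) locs.item(0)).getAttributeNS(XLINK, "href");
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(originalPath(parse(SAMPLE), "f2"));
    }
}
```

The sketch does not decide which member of a group is the original; a real importer would likely restrict the search to the content file group.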
-
normalizeBundleName
-
getBundleName
Get the DSpace bundle name corresponding to the USE attribute of the file group enclosing this file element.
- Parameters:
- file - file element
- Returns:
- DSpace bundle name
- Throws:
- MetadataValidationException - when there is no USE attribute on the enclosing fileGrp.
-
getBundleName
public static String getBundleName(org.jdom2.Element file, boolean getParent) throws MetadataValidationException
Get the DSpace bundle name corresponding to the USE attribute of the file group enclosing this file element.
- Parameters:
- file - file element
- getParent - parent flag
- Returns:
- DSpace bundle name
- Throws:
- MetadataValidationException - when there is no USE attribute on the enclosing fileGrp.
-
getFileName
Get the "local" file name of this file or mdRef element. By "local" we mean the reference to the actual resource containing the data for this file, e.g. a relative path within a Zip or tar archive if the METS is serving as a manifest for that sort of package.
- Parameters:
- file - file element
- Returns:
- "local" file name (i.e. relative to package or content directory) corresponding to this file or mdRef element.
- Throws:
- MetadataValidationException - when there is not enough information to find a resource identifier.
-
getPrimaryOrLogoBitstream
Returns file element corresponding to primary bitstream. There is ONLY a primary bitstream if the first div under the first structMap has an fptr.
- Returns:
- file element of Item's primary bitstream, or null if there is none.
- Throws:
- MetadataValidationException - if validation error
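The rule "first div under the first structMap has an fptr" can be made concrete with a short JDK DOM sketch. It is not the DSpace implementation: the class PrimaryFileSketch and both sample manifests are hypothetical, and for brevity it returns the FILEID referenced by the fptr rather than the file element itself.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class PrimaryFileSketch {
    static final String METS = "http://www.loc.gov/METS/";

    // Hypothetical manifests: the top-level div has its own fptr only in the first one.
    static final String WITH_PRIMARY = "<mets xmlns='http://www.loc.gov/METS/'><structMap><div>"
        + "<fptr FILEID='f1'/><div><fptr FILEID='f2'/></div></div></structMap></mets>";
    static final String WITHOUT_PRIMARY = "<mets xmlns='http://www.loc.gov/METS/'><structMap><div>"
        + "<div><fptr FILEID='f2'/></div></div></structMap></mets>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse METS", e);
        }
    }

    /** Returns the first direct child element of parent with the given METS local name. */
    static Element firstChild(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && METS.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Returns the FILEID of the primary file, or null when the top-level div has no fptr. */
    static String primaryFileId(Document doc) {
        NodeList maps = doc.getElementsByTagNameNS(METS, "structMap");
        if (maps.getLength() == 0) {
            return null;
        }
        Element firstDiv = firstChild((Element) maps.item(0), "div");
        Element fptr = (firstDiv == null) ? null : firstChild(firstDiv, "fptr");
        if (fptr == null || fptr.getAttribute("FILEID").isEmpty()) {
            return null;
        }
        return fptr.getAttribute("FILEID");
    }

    public static void main(String[] args) {
        System.out.println(primaryFileId(parse(WITH_PRIMARY)));
        System.out.println(primaryFileId(parse(WITHOUT_PRIMARY)));
    }
}
```

Only direct children are inspected, so an fptr nested in a lower-level div does not count as the primary bitstream.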
-
getMdType
Get the metadata type from within a *mdSec element.
- Parameters:
- mdSec - mdSec element
- Returns:
- metadata type name.
- Throws:
- MetadataValidationException - if validation error
-
getMdContentMimeType
Returns MIME type of metadata content, if available.
- Parameters:
- mdSec - mdSec element
- Returns:
- MIME type word, or null if none is available.
- Throws:
- MetadataValidationException - if validation error
-
getMdContentAsStream
public InputStream getMdContentAsStream(org.jdom2.Element mdSec, METSManifest.Mdref callback) throws MetadataValidationException, PackageValidationException, IOException, SQLException, AuthorizeException
Return contents of *md element as stream. Gets content, dereferencing mdRef if necessary, or decoding a binData element if necessary.
- Parameters:
- mdSec - mdSec element
- callback - mdref callback
- Returns:
- Stream containing contents of metadata section. Never returns null.
- Throws:
- MetadataValidationException - if METS format does not contain any metadata.
- PackageValidationException - if invalid package
- IOException - if IO error
- SQLException - if database error
- AuthorizeException - if authorization error
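A rough sketch of the two cases named above, decoding binData and dereferencing an mdRef through a callback, using only JDK classes. It is not the DSpace implementation: the class MdContentSketch is hypothetical, a java.util.function.Function stands in for the METSManifest.Mdref callback, binData is assumed to be Base64 text as in the METS schema, and inline xmlData is left out for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class MdContentSketch {
    static final String METS = "http://www.loc.gov/METS/";
    static final String XLINK = "http://www.w3.org/1999/xlink";

    // Hypothetical manifest: d1 wraps Base64 data, d2 points at an external file.
    static final String SAMPLE =
        "<mets xmlns='http://www.loc.gov/METS/' xmlns:xlink='http://www.w3.org/1999/xlink'>"
        + "<dmdSec ID='d1'><mdWrap MDTYPE='OTHER'><binData>aGVsbG8=</binData></mdWrap></dmdSec>"
        + "<dmdSec ID='d2'><mdRef LOCTYPE='URL' MDTYPE='DC' xlink:href='dc.xml'/></dmdSec>"
        + "</mets>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse METS", e);
        }
    }

    /** Returns the dmdSec at the given position in document order. */
    static Element section(Document doc, int index) {
        return (Element) doc.getElementsByTagNameNS(METS, "dmdSec").item(index);
    }

    /** Decodes binData if present, otherwise hands the mdRef href to the resolver. */
    static InputStream contentOf(Element mdSec, Function<String, InputStream> resolver) {
        NodeList bin = mdSec.getElementsByTagNameNS(METS, "binData");
        if (bin.getLength() > 0) {
            String base64 = bin.item(0).getTextContent().replaceAll("\\s+", "");
            return new ByteArrayInputStream(Base64.getDecoder().decode(base64));
        }
        NodeList refs = mdSec.getElementsByTagNameNS(METS, "mdRef");
        if (refs.getLength() > 0) {
            return resolver.apply(((Element) refs.item(0)).getAttributeNS(XLINK, "href"));
        }
        throw new IllegalStateException("metadata section has neither binData nor mdRef");
    }

    /** Reads a stream fully as UTF-8 text. */
    static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // Stand-in resolver: a real callback would open the referenced file in the package.
        Function<String, InputStream> resolver =
            href -> new ByteArrayInputStream(href.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(contentOf(section(doc, 0), resolver)));
        System.out.println(readAll(contentOf(section(doc, 1), resolver)));
    }
}
```

Passing the resolver in keeps the manifest code independent of where the package's files actually live, which is the role the mdref callback plays in the signatures on this page.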
-
getObjStructDiv
Return the <div> which describes this DSpace Object (and its contents) from the <structMap>. In all cases, this is the first <div> in the first <structMap>.
- Returns:
- Element which is the DSpace Object Contents <div>
- Throws:
- MetadataValidationException - if metadata validation error
-
getChildObjDivs
Get the child object <div>s from the METS Manifest <structMap>. These <div>s reference the location of any child objects' METS manifests.
- Returns:
- a List of Elements, each a <div>. May be empty but NOT null.
- Throws:
- MetadataValidationException - if metadata validation error
-
getChildMetsFilePaths
Retrieve the file paths for the child objects' METS Manifest files. These file paths are located in the <mptr> elements where @LOCTYPE=URL.
- Returns:
- an array of Strings, corresponding to relative file paths of the children's METS manifests
- Throws:
- MetadataValidationException - if metadata validation error
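Selecting the mptr references with LOCTYPE=URL can be sketched as follows with the JDK DOM parser. This is an illustration under assumptions, not the DSpace code: the class ChildManifestSketch and the sample manifest are hypothetical, and the sketch scans every mptr in the document instead of only those under the object's child <div>s.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class ChildManifestSketch {
    static final String METS = "http://www.loc.gov/METS/";
    static final String XLINK = "http://www.w3.org/1999/xlink";

    // Hypothetical manifest: one child div carries both a URL and a non-URL pointer.
    static final String SAMPLE =
        "<mets xmlns='http://www.loc.gov/METS/' xmlns:xlink='http://www.w3.org/1999/xlink'>"
        + "<structMap><div><div>"
        + "<mptr LOCTYPE='HANDLE' xlink:href='123456789/2'/>"
        + "<mptr LOCTYPE='URL' xlink:href='item-1/mets.xml'/>"
        + "</div></div></structMap></mets>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse METS", e);
        }
    }

    /** Collects xlink:href of every mptr whose LOCTYPE is URL, in document order. */
    static List<String> childManifestPaths(Document doc) {
        List<String> paths = new ArrayList<>();
        NodeList mptrs = doc.getElementsByTagNameNS(METS, "mptr");
        for (int i = 0; i < mptrs.getLength(); i++) {
            Element mptr = (Element) mptrs.item(i);
            if ("URL".equals(mptr.getAttribute("LOCTYPE"))) {
                paths.add(mptr.getAttributeNS(XLINK, "href"));
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(childManifestPaths(parse(SAMPLE)));
    }
}
```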
-
getParentOwnerLink
Return the reference to the Parent Object from the "Parent" <structMap>. This parent object is the owner of the current object.
- Returns:
- Link to the Parent Object (this is the Handle of that Parent)
- Throws:
- MetadataValidationException - if metadata validation error
-
getElementByXPath
protected org.jdom2.Element getElementByXPath(String path, boolean nullOk) throws MetadataValidationException
- Throws:
- MetadataValidationException
-
getCrosswalk
-
getItemDmds
Gets all dmdSec elements containing metadata for the DSpace Item.
- Returns:
- array of Elements, each a dmdSec. May be empty but NOT null.
- Throws:
- MetadataValidationException - if the METS is missing a reference to item-wide DMDs in the correct place.
-
getDmdElements
Gets all dmdSec elements from a space-separated list.
- Parameters:
- dmdList - space-separated list of DMDIDs
- Returns:
- array of Elements, each a dmdSec. May be empty but NOT null.
- Throws:
- MetadataValidationException - if the METS is missing a reference to item-wide DMDs in the correct place.
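Resolving a space-separated DMDID list to dmdSec elements can be sketched with the JDK DOM parser. It is a simplified stand-in, not the DSpace code: the class DmdListSketch and the sample manifest are hypothetical, it returns a List rather than an array, and it signals an unresolved ID with IllegalArgumentException where the real method declares MetadataValidationException.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class DmdListSketch {
    static final String METS = "http://www.loc.gov/METS/";

    // Hypothetical manifest with two empty descriptive metadata sections.
    static final String SAMPLE = "<mets xmlns='http://www.loc.gov/METS/'>"
        + "<dmdSec ID='d1'/><dmdSec ID='d2'/></mets>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse METS", e);
        }
    }

    /** Resolves each ID in the whitespace-separated list, preserving list order. */
    static List<Element> dmdSections(Document doc, String dmdList) {
        List<Element> result = new ArrayList<>();
        String trimmed = dmdList.trim();
        if (trimmed.isEmpty()) {
            return result; // empty list: empty result, never null
        }
        NodeList sections = doc.getElementsByTagNameNS(METS, "dmdSec");
        for (String id : trimmed.split("\\s+")) {
            Element match = null;
            for (int i = 0; i < sections.getLength(); i++) {
                Element candidate = (Element) sections.item(i);
                if (id.equals(candidate.getAttribute("ID"))) {
                    match = candidate;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("no dmdSec with ID " + id);
            }
            result.add(match);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Element e : dmdSections(parse(SAMPLE), "d2 d1")) {
            System.out.println(e.getAttribute("ID"));
        }
    }
}
```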
-
getItemRightsMD
Return rights metadata section(s) relevant to item as a whole.
- Returns:
- array of rightsMd elements, possibly empty but never null.
- Throws:
- MetadataValidationException - if METS is invalid, e.g. referenced amdSec is missing.
-
crosswalkItemDmd
public void crosswalkItemDmd(Context context, PackageParameters params, DSpaceObject dso, org.jdom2.Element dmdSec, METSManifest.Mdref callback) throws MetadataValidationException, PackageValidationException, CrosswalkException, IOException, SQLException, AuthorizeException
Invokes appropriate crosswalks on Item-wide descriptive metadata.
- Parameters:
- context - context
- callback - mdref callback
- dso - DSpaceObject
- params - package params
- dmdSec - dmdSec element
- Throws:
- MetadataValidationException - if METS error
- CrosswalkException - if crosswalk error
- PackageValidationException - if invalid package
- IOException - if IO error
- SQLException - if database error
- AuthorizeException - if authorization error
-
crosswalkObjectOtherAdminMD
public void crosswalkObjectOtherAdminMD(Context context, PackageParameters params, DSpaceObject dso, METSManifest.Mdref callback) throws MetadataValidationException, PackageValidationException, CrosswalkException, IOException, SQLException, AuthorizeException
Crosswalk all technical and source metadata sections that belong to the whole object.
- Parameters:
- context - context
- callback - mdref callback
- params - package params
- dso - DSpaceObject
- Throws:
- MetadataValidationException - if METS is invalid, e.g. referenced amdSec is missing.
- PackageValidationException - if invalid package
- IOException - if IO error
- SQLException - if database error
- AuthorizeException - if authorization error
- CrosswalkException
-
crosswalkObjectSourceMD
public boolean crosswalkObjectSourceMD(Context context, PackageParameters params, DSpaceObject dso, METSManifest.Mdref callback) throws MetadataValidationException, PackageValidationException, CrosswalkException, IOException, SQLException, AuthorizeException
Crosswalk only the sourceMD sections; used to set the handle and parent of an AIP.
- Parameters:
- context - context
- callback - mdref callback
- params - package params
- dso - DSpaceObject
- Returns:
- true if any metadata section was actually crosswalked, false otherwise
- Throws:
- MetadataValidationException - if METS is invalid, e.g. referenced amdSec is missing.
- PackageValidationException - if invalid package
- IOException - if IO error
- SQLException - if database error
- AuthorizeException - if authorization error
- CrosswalkException - if crosswalk error
-
getAmdIDs
Get an array of all AMDID values for this object.
- Returns:
- array of all AMDID values for this object
- Throws:
- MetadataValidationException - if metadata validation error
-
crosswalkXmd
protected void crosswalkXmd(Context context, PackageParameters params, DSpaceObject dso, org.jdom2.Element xmd, METSManifest.Mdref callback, boolean createMissingMetadataFields) throws MetadataValidationException, PackageValidationException, CrosswalkException, IOException, SQLException, AuthorizeException
-
crosswalkBitstream
public void crosswalkBitstream(Context context, PackageParameters params, Bitstream bitstream, String fileId, METSManifest.Mdref callback) throws MetadataValidationException, PackageValidationException, CrosswalkException, IOException, SQLException, AuthorizeException
Crosswalk the metadata associated with a particular file element into the bitstream it corresponds to.
- Parameters:
- context - a DSpace context.
- params - any PackageParameters which may affect how bitstreams are crosswalked
- bitstream - bitstream target of the crosswalk
- fileId - value of ID attribute in the file element responsible for the contents of that bitstream.
- callback - mdref callback
- Throws:
- MetadataValidationException - if METS is invalid, e.g. referenced amdSec is missing.
- PackageValidationException - if invalid package
- IOException - if IO error
- SQLException - if database error
- AuthorizeException - if authorization error
- CrosswalkException - if crosswalk error
-
crosswalkBundle
public void crosswalkBundle(Context context, PackageParameters params, Bundle bundle, String fileId, METSManifest.Mdref callback) throws MetadataValidationException, PackageValidationException, CrosswalkException, IOException, SQLException, AuthorizeException
-
getMets
public org.jdom2.Element getMets()
- Returns:
- root element of METS document.
-
getMetsAsStream
Return the entire METS document as an InputStream.
- Returns:
- entire METS document as a stream
-