Class AbstractMETSDisseminator

  • All Implemented Interfaces:
    PackageDisseminator
    Direct Known Subclasses:
    DSpaceAIPDisseminator, DSpaceMETSDisseminator

    public abstract class AbstractMETSDisseminator
    extends AbstractPackageDisseminator
    Base class for disseminators of METS (Metadata Encoding and Transmission Standard) packages.
    See http://www.loc.gov/standards/mets/

    This is a generic packager framework intended to be subclassed to create packagers for more specific METS "profiles". METS is an abstract and flexible framework that can encompass many different kinds of metadata and inner package structures.

    Package Parameters:

    • manifestOnly -- if true, generate a standalone XML document of the METS manifest instead of a complete package. Any other metadata (such as licenses) will be encoded inline. Default is false.
    • unauthorized -- this determines what is done when the packager encounters a Bundle or Bitstream it is not authorized to read. By default, it just quits with an AuthorizeException. If this option is present, it must be one of the following values:
      • skip -- simply exclude unreadable content from package.
      • zero -- include unreadable bitstreams as 0-length files; unreadable Bundles will still cause authorize errors.
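The option handling described above can be sketched as follows. This is a minimal illustration, not the DSpace implementation: it uses java.util.Properties as a stand-in for the Properties-style PackageParameters, and the class name PackageOptionsSketch, the enum UnauthorizedPolicy, and the choice to accept only the literal "true" for manifestOnly are all assumptions made for the example.

```java
import java.util.Properties;

/** Sketch: reading the two documented options from a Properties-style list. */
class PackageOptionsSketch {

    /** FAIL models the documented default (quit with an AuthorizeException). */
    enum UnauthorizedPolicy { FAIL, SKIP, ZERO }

    /** Returns the manifestOnly flag; the documented default is false. */
    static boolean manifestOnly(Properties params) {
        // Assumption: only the literal "true" (any case) enables the flag here.
        return Boolean.parseBoolean(params.getProperty("manifestOnly", "false"));
    }

    /** Maps the "unauthorized" option to a policy, rejecting unknown values. */
    static UnauthorizedPolicy unauthorizedPolicy(Properties params) {
        String value = params.getProperty("unauthorized");
        if (value == null) {
            return UnauthorizedPolicy.FAIL; // option absent: default behaviour
        }
        switch (value) {
            case "skip":
                return UnauthorizedPolicy.SKIP; // leave unreadable content out
            case "zero":
                return UnauthorizedPolicy.ZERO; // unreadable bitstreams become 0-length files
            default:
                throw new IllegalArgumentException(
                        "unauthorized must be 'skip' or 'zero', got: " + value);
        }
    }
}
```

Rejecting unknown values early mirrors the statement that, when present, the option must be one of the listed values.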
    Author:
    Larry Stone, Robert Tansley, Tim Donohue
    • Field Detail

      • outputter

        protected static org.jdom.output.XMLOutputter outputter
      • idCounter

        protected int idCounter
      • DEFAULT_MODIFIED_DATE

        protected static final long DEFAULT_MODIFIED_DATE
        Default date/time (in milliseconds since the epoch) to set on Zip entries for DSpace objects that have no Last Modified date. If we do not set our own date/time, it defaults to the current system date/time. This is less than ideal, because it changes the MD5 checksum of the Zip file whenever the Zip is regenerated, even if the compressed files are unchanged. The value is 1036368000 seconds * 1000, i.e. Nov 4, 2002 GMT (the date DSpace 1.0 was released).
        See Also:
        Constant Field Values
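To illustrate why a fixed entry time matters, the sketch below writes the same content into a Zip archive twice with the same fixed timestamp and obtains identical bytes. It uses only the standard java.util.zip API; the class name FixedZipTimestampSketch and its helper methods are hypothetical and are not part of this class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Sketch: a fixed entry timestamp keeps regenerated Zip files byte-identical. */
class FixedZipTimestampSketch {

    // 1036368000 seconds since the epoch, expressed in milliseconds.
    static final long FIXED_MODIFIED_MILLIS = 1036368000L * 1000;

    /** Builds an in-memory Zip with one entry stamped with the fixed time. */
    static byte[] zipOneEntry(String name, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry(name);
            // Without this call the entry would carry the current system time.
            entry.setTime(FIXED_MODIFIED_MILLIS);
            zip.putNextEntry(entry);
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Renders the fixed timestamp as an ISO-8601 instant in UTC. */
    static String fixedDateAsIso() {
        return Instant.ofEpochMilli(FIXED_MODIFIED_MILLIS).toString();
    }
}
```

Here fixedDateAsIso() yields 2002-11-04T00:00:00Z, matching the date given above, and two calls to zipOneEntry with the same arguments return equal byte arrays, so any checksum computed over them is stable.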
      • TEMPLATE_TYPE_SUFFIX

        protected static final String TEMPLATE_TYPE_SUFFIX
        Suffix for Template objects (e.g. Item Templates)
        See Also:
        Constant Field Values
    • Constructor Detail

      • AbstractMETSDisseminator

        public AbstractMETSDisseminator()
    • Method Detail

      • gensym

        protected String gensym​(String prefix)
        Make a new unique ID symbol with specified prefix.
        Parameters:
        prefix - the prefix of the identifier, constrained to XML ID schema
        Returns:
        a new string identifier unique in this session (instance).
      • resetCounter

        protected void resetCounter()
        Resets the unique ID counter used by gensym() method to determine the @ID values of METS tags.
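A per-instance counter of the kind described by gensym() and resetCounter() might look like the sketch below. The underscore separator and the starting value of 1 are assumptions made for illustration; this page does not state the exact ID format. The class name MetsIdCounterSketch is hypothetical.

```java
/** Sketch: prefix-plus-counter IDs that are unique within one instance. */
class MetsIdCounterSketch {

    // Assumption: counting starts at 1; the page does not give the initial value.
    private int idCounter = 1;

    /** Returns a new ID such as "dmd_1"; the prefix must be valid at the start of an XML ID. */
    synchronized String gensym(String prefix) {
        // Assumption: prefix and counter are joined with an underscore.
        return prefix + "_" + idCounter++;
    }

    /** Restarts numbering, e.g. before building the next manifest. */
    synchronized void resetCounter() {
        idCounter = 1;
    }
}
```

Because the counter is shared across prefixes, IDs stay unique within the instance even when different prefixes are mixed in one document.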
      • getMIMEType

        public String getMIMEType​(PackageParameters params)
        Description copied from interface: PackageDisseminator
        Identifies the MIME-type of this package, e.g. "application/zip". Required when sending the package via HTTP, to provide the Content-Type header.
        Parameters:
        params - Package Parameters
        Returns:
        the MIME type (content-type header) of the package to be returned
      • disseminate

        public void disseminate​(Context context,
                                DSpaceObject dso,
                                PackageParameters params,
                                File pkgFile)
                         throws PackageValidationException,
                                CrosswalkException,
                                AuthorizeException,
                                SQLException,
                                IOException
        Export the object (Item, Collection, or Community) as a "package" written to the indicated file. Package is any serialized representation of the item, at the discretion of the implementing class. It does not have to include content bitstreams.

        Use the params parameter list to adjust the way the package is made, e.g. including a "metadataOnly" parameter might make the package a bare manifest in XML instead of a Zip file including manifest and contents.

        Throws an exception if the chosen object is not acceptable or there is a failure creating the package.

        Parameters:
        context - DSpace context.
        dso - DSpace object (item, collection, etc)
        params - Properties-style list of options specific to this packager
        pkgFile - File where export package should be written
        Throws:
        PackageValidationException - if package cannot be created or there is a fatal error in creating it.
        CrosswalkException - if crosswalk error
        AuthorizeException - if authorization error
        SQLException - if database error
        IOException - if IO error
      • setMdType

        protected void setMdType​(edu.harvard.hul.ois.mets.MdWrap mdWrap,
                                 String mdtype)
      • setMdType

        protected void setMdType​(edu.harvard.hul.ois.mets.MdRef mdRef,
                                 String mdtype)
      • makeFileDiv

        protected edu.harvard.hul.ois.mets.Div makeFileDiv​(String fileID,
                                                           String type)
      • makeChildDiv

        protected edu.harvard.hul.ois.mets.Div makeChildDiv​(String type,
                                                            DSpaceObject dso,
                                                            PackageParameters params)
        Create a <div> element with <mptr> which references a child object via its handle (and via a local file name, when recursively disseminating all child objects).
        Parameters:
        type - type attribute value for the <div>
        dso - object for which to create the div
        params - package params
        Returns:
        a new Div with dso as child.
      • getHandleURN

        protected String getHandleURN​(String handle)
      • findOriginalBitstream

        protected Bitstream findOriginalBitstream​(Item item,
                                                  Bitstream derived)
                                           throws SQLException
        For a bitstream that's a thumbnail or extracted text, find the corresponding bitstream it was derived from, in the ORIGINAL bundle.
        Parameters:
        item - the item we're dealing with
        derived - the derived bitstream
        Returns:
        the corresponding original bitstream (or null)
        Throws:
        SQLException - if database error
      • linkLicenseRefsToBitstreams

        protected void linkLicenseRefsToBitstreams​(Context context,
                                                   PackageParameters params,
                                                   DSpaceObject dso,
                                                   edu.harvard.hul.ois.mets.MdRef mdRef)
                                            throws SQLException,
                                                   IOException,
                                                   AuthorizeException
        Clean up our license file reference links, as Deposit Licenses and CC Licenses can be added in two ways (and we only want to add them to the zip package *once*):
          (1) Added as a normal Bitstream (assuming LICENSE and CC_LICENSE bundles will be included in the package)
          (2) Added via a 'rightsMD' crosswalk (as they are rights information/metadata on an Item)

        So, if they are being added by *both*, then we want to just link the rightsMD <mdRef> entry so that it points to the Bitstream location. This implementation is a bit 'hackish', but it's the best we can do, as the Harvard METS API doesn't allow us to go back and crawl an entire METS file to look for these inconsistencies/duplications.

        Parameters:
        context - current DSpace Context
        params - current Packager Parameters
        dso - current DSpace Object
        mdRef - the rightsMD <mdRef> element
        Throws:
        SQLException - if database error
        IOException - if IO error
        AuthorizeException - if authorization error
      • getObjectTypeString

        public String getObjectTypeString​(DSpaceObject dso)
        Build a string which will be used as the "Type" of this object in the METS manifest.

        Default format is "DSpace [Type-as-string]".

        Parameters:
        dso - DSpaceObject to create type-string for
        Returns:
        a string which will represent this object Type in METS
        See Also:
        Constants
      • getParameterHelp

        public String getParameterHelp()
        Returns a user help string which should describe the additional valid command-line options that this packager implementation will accept when using the -o or --option flags with the Packager script.
        Returns:
        a string describing additional command-line options available with this packager
      • makeBitstreamURL

        public String makeBitstreamURL​(Context context,
                                       Bitstream bitstream,
                                       PackageParameters params)
                                throws SQLException
        Get the URL by which the METS manifest refers to a Bitstream member within the same package. In other words, this is generally a relative path link to where the Bitstream file is within the Zipped up package.

        For a manifest-only METS, this is a reference to an HTTP URL where the bitstream should be able to be downloaded from.

        Parameters:
        context - context
        bitstream - the Bitstream
        params - Packager Parameters
        Returns:
        String in URL format naming path to bitstream.
        Throws:
        SQLException - if database error
      • makeMetsHdr

        public abstract edu.harvard.hul.ois.mets.MetsHdr makeMetsHdr​(Context context,
                                                                     DSpaceObject dso,
                                                                     PackageParameters params)
                                                              throws SQLException
        Create metsHdr element - separate so subclasses can override.
        Parameters:
        context - context
        dso - DSpaceObject
        params - packaging params
        Returns:
        Mets header
        Throws:
        SQLException - if database error
      • getProfile

        public abstract String getProfile()
        Returns name of METS profile to which this package conforms, e.g. "DSpace METS DIP Profile 1.0"
        Returns:
        string name of profile.
      • bundleToFileGrp

        public abstract String bundleToFileGrp​(String bname)
        Returns fileGrp's USE attribute value corresponding to a DSpace bundle name.
        Parameters:
        bname - name of DSpace bundle.
        Returns:
        string name of fileGrp
      • getDmdTypes

        public abstract String[] getDmdTypes​(Context context,
                                             DSpaceObject dso,
                                             PackageParameters params)
                                      throws SQLException,
                                             IOException,
                                             AuthorizeException
        Get the types of Item-wide DMD to include in package. Each element of the returned array is a String, which MAY be just a simple name, naming both the Crosswalk Plugin and the METS "MDTYPE", or a colon-separated pair consisting of the METS name followed by a colon and the Crosswalk Plugin name. E.g. the type string "DC:qualifiedDublinCore" tells it to create a METS section with MDTYPE="DC" and use the plugin named "qualifiedDublinCore" to obtain the data.
        Parameters:
        context - context
        dso - DSpaceObject
        params - the PackageParameters passed to the disseminator.
        Returns:
        array of metadata type strings, never null.
        Throws:
        IOException - if IO error
        SQLException - if database error
        AuthorizeException - if authorization error
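The type-string convention described for getDmdTypes() can be sketched as a small parser. The class name MdTypeSpecSketch and its fields are hypothetical; only the "simple name" and "METS name, colon, plugin name" forms come from the description above.

```java
/** Sketch: splitting a metadata type string into METS MDTYPE and plugin name. */
class MdTypeSpecSketch {

    final String metsName;   // value for the METS MDTYPE attribute
    final String pluginName; // name of the crosswalk plugin to invoke

    private MdTypeSpecSketch(String metsName, String pluginName) {
        this.metsName = metsName;
        this.pluginName = pluginName;
    }

    /** Parses "NAME" or "METSNAME:pluginName". */
    static MdTypeSpecSketch parse(String typeSpec) {
        int colon = typeSpec.indexOf(':');
        if (colon < 0) {
            // A simple name serves as both the MDTYPE and the plugin name.
            return new MdTypeSpecSketch(typeSpec, typeSpec);
        }
        return new MdTypeSpecSketch(
                typeSpec.substring(0, colon),
                typeSpec.substring(colon + 1));
    }
}
```

For the example given above, parse("DC:qualifiedDublinCore") gives metsName "DC" and pluginName "qualifiedDublinCore", while a simple name such as "MODS" fills both fields with the same value.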
      • getTechMdTypes

        public abstract String[] getTechMdTypes​(Context context,
                                                DSpaceObject dso,
                                                PackageParameters params)
                                         throws SQLException,
                                                IOException,
                                                AuthorizeException
        Get the type string of the technical metadata to create for each object and each Bitstream in an Item. The type string may be a simple name or colon-separated compound as specified for getDmdTypes() above.
        Parameters:
        context - context
        dso - DSpaceObject
        params - the PackageParameters passed to the disseminator.
        Returns:
        array of metadata type strings, never null.
        Throws:
        IOException - if IO error
        SQLException - if database error
        AuthorizeException - if authorization error
      • getSourceMdTypes

        public abstract String[] getSourceMdTypes​(Context context,
                                                  DSpaceObject dso,
                                                  PackageParameters params)
                                           throws SQLException,
                                                  IOException,
                                                  AuthorizeException
        Get the type string of the source metadata to create for each object and each Bitstream in an Item. The type string may be a simple name or colon-separated compound as specified for getDmdTypes() above.
        Parameters:
        context - context
        dso - DSpaceObject
        params - the PackageParameters passed to the disseminator.
        Returns:
        array of metadata type strings, never null.
        Throws:
        IOException - if IO error
        SQLException - if database error
        AuthorizeException - if authorization error
      • getDigiprovMdTypes

        public abstract String[] getDigiprovMdTypes​(Context context,
                                                    DSpaceObject dso,
                                                    PackageParameters params)
                                             throws SQLException,
                                                    IOException,
                                                    AuthorizeException
        Get the type string of the "digiprov" (digital provenance) metadata to create for each object and each Bitstream in an Item. The type string may be a simple name or colon-separated compound as specified for getDmdTypes() above.
        Parameters:
        context - context
        dso - DSpaceObject
        params - the PackageParameters passed to the disseminator.
        Returns:
        array of metadata type strings, never null.
        Throws:
        IOException - if IO error
        SQLException - if database error
        AuthorizeException - if authorization error
      • getRightsMdTypes

        public abstract String[] getRightsMdTypes​(Context context,
                                                  DSpaceObject dso,
                                                  PackageParameters params)
                                           throws SQLException,
                                                  IOException,
                                                  AuthorizeException
        Get the type string of the "rights" (permission and/or license) metadata to create for each object and each Bitstream in an Item. The type string may be a simple name or colon-separated compound as specified for getDmdTypes() above.
        Parameters:
        context - context
        dso - DSpaceObject
        params - the PackageParameters passed to the disseminator.
        Returns:
        array of metadata type strings, never null.
        Throws:
        IOException - if IO error
        SQLException - if database error
        AuthorizeException - if authorization error
      • addStructMap

        public abstract void addStructMap​(Context context,
                                          DSpaceObject dso,
                                          PackageParameters params,
                                          edu.harvard.hul.ois.mets.Mets mets)
                                   throws SQLException,
                                          IOException,
                                          AuthorizeException,
                                          edu.harvard.hul.ois.mets.helper.MetsException
        Add any additional structMap elements to the METS document, as required by this subclass. A simple default structure map which fulfills the minimal DSpace METS DIP/SIP requirements is already present, so this does not need to do anything.
        Parameters:
        context - context
        dso - DSpaceObject
        mets - the METS document to which to add structMaps
        params - the PackageParameters passed to the disseminator.
        Throws:
        IOException - if IO error
        SQLException - if database error
        AuthorizeException - if authorization error
        edu.harvard.hul.ois.mets.helper.MetsException - if METS error
      • includeBundle

        public abstract boolean includeBundle​(Bundle bundle)
        Parameters:
        bundle - bundle
        Returns:
        true when this bundle should be included as "content" in the package; e.g. the DSpace SIP does not include metadata bundles.