Class ModuleBase

  • All Implemented Interfaces:
    Module
    Direct Known Subclasses:
    BytestreamModule

    public abstract class ModuleBase
    extends Object
    implements Module
    This class is an abstract implementation of the Module interface. It contains all the methods required for a Module, but doesn't do anything by itself. A subclass should provide a functional implementation of parse(InputStream, RepInfo, int) if it is not random access, or parse(RandomAccessFile, RepInfo) if it is random access.
    • Field Detail

      • _app

        protected App _app
        The application object
      • _coverage

        protected String _coverage
        Coverage information
      • _date

        protected Date _date
        Module last modification date
      • _format

        protected String[] _format
        Formats recognized by this Module
      • _init

        protected String _init
        Initialization value.
      • _defaultParams

        protected final List<String> _defaultParams
        List of default parameters.
      • _mimeType

        protected String[] _mimeType
        MIME types supported by this Module
      • _name

        protected String _name
        Module name
      • _note

        protected String _note
        Module note
      • _param

        protected String _param
        Module-specific parameter.
      • _release

        protected String _release
        Module release description
      • _repInfoNote

        protected String _repInfoNote
        RepInfo note
      • _rights

        protected String _rights
        Copyright notice
      • _signature

        protected List<Signature> _signature
        Module Signature list
      • _specification

        protected List<Document> _specification
        Module specification document list
      • _vendor

        protected Agent _vendor
        Module vendor
      • _wellFormedNote

        protected String _wellFormedNote
        Well-formedness criteria
      • _validityNote

        protected String _validityNote
        Validity criteria
      • _isRandomAccess

        protected boolean _isRandomAccess
        Random access flag
      • _nByte

        protected long _nByte
        Byte count of content object
      • _crc32

        protected CRC32 _crc32
        CRC32 calculated on content object
      • _md5

        protected MessageDigest _md5
        MD5 digest calculated on content object
      • _sha1

        protected MessageDigest _sha1
        SHA-1 digest calculated on content object
      • _sha256

        protected MessageDigest _sha256
        SHA-256 digest calculated on content object
      • _checksumFinished

        protected boolean _checksumFinished
        Flag indicating valid checksum information set
      • _verbosity

        protected int _verbosity
        Indicator of how much data to report
      • _countStream

        protected boolean _countStream
        Flag to indicate read routines should count the stream
      • _bigEndian

        protected boolean _bigEndian
        The dominant "endianness" of the Module.
      • _features

        protected List<String> _features
        The list of supported features.
      • _logger

        protected Logger _logger
        Logger for a module class.
    • Constructor Detail

      • ModuleBase

        protected ModuleBase​(String name,
                             String release,
                             int[] date,
                             String[] format,
                             String coverage,
                             String[] mimeType,
                             String wellFormedNote,
                             String validityNote,
                             String repInfoNote,
                             String note,
                             String rights,
                             boolean isRandomAccess)
        Constructors of all subclasses of ModuleBase should call this as a super constructor.
        Parameters:
        name - Name of the module
        release - Release identifier
        date - Last modification date of the module code, in the form of an array of three numbers. date[0] is the year, date[1] the month, and date[2] the day.
        format - Array of format names supported by the module
        coverage - Details as to the specific format versions or variants that are supported by the module
        mimeType - Array of MIME type strings for formats supported by the module
        wellFormedNote - Brief explanation of what constitutes well-formed content
        validityNote - Brief explanation of what constitutes valid content
        repInfoNote - Note pertaining to RepInfo (may be null)
        note - Additional information about the module (may be null)
        rights - Copyright notice for the module
        isRandomAccess - true if the module treats content as random-access data, false if it treats content as stream data
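        As an illustration of the date parameter, the following minimal sketch converts a three-element array of this shape into a java.util.Date. The class and method names are hypothetical, and treating date[1] as a 1-based month is an assumption that this page does not confirm.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

final class ModuleDateSketch {
    /** Converts {year, month, day} to a Date; the month is ASSUMED to be 1-based. */
    static Date toDate(int[] date) {
        if (date == null || date.length != 3) {
            throw new IllegalArgumentException("expected {year, month, day}");
        }
        Calendar calendar = new GregorianCalendar();
        calendar.clear(); // zero the time-of-day fields
        // Calendar months are 0-based, so subtract 1 from the assumed 1-based month.
        calendar.set(date[0], date[1] - 1, date[2]);
        return calendar.getTime();
    }
}
```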
    • Method Detail

      • initFeatures

        public void initFeatures()
        Initializes the feature list. This method puts the following features in the list:
        • edu.harvard.hul.ois.canValidate
        • edu.harvard.hul.ois.canIdentify
      • init

        public void init​(String init)
        Per-instantiation initialization. The default method does nothing but save its parameter.
        Specified by:
        init in interface Module
        Parameters:
        init - Initialization parameter. This is typically obtained from the configuration file.
      • setDefaultParams

        public void setDefaultParams​(List<String> params)
        Set a List of default parameters for the module.
        Specified by:
        setDefaultParams in interface Module
        Parameters:
        params - A List whose elements are Strings. May be empty.
      • applyDefaultParams

        public void applyDefaultParams()
                                throws Exception
        Applies the default parameters. Calling this clears any prior parameters.
        Specified by:
        applyDefaultParams in interface Module
        Throws:
        Exception
      • resetParams

        public void resetParams()
        Reset parameter settings. Returns to a default state without any parameters. The default method clears the saved parameter.
        Specified by:
        resetParams in interface Module
      • param

        public void param​(String param)
        Per-action initialization. May be called multiple times. The default method does nothing but save its parameter.
        Specified by:
        param in interface Module
        Parameters:
        param - Initialization parameter.
      • getApp

        public App getApp()
        Returns the App object.
      • getBase

        public JhoveBase getBase()
        Returns the JHOVE engine object.
      • getNByte

        public long getNByte()
        Returns the value of _nByte. Meaningful only for modules that use a counted InputStream.
      • isBigEndian

        public boolean isBigEndian()
        Returns true if the dominant "endianness" of the module, or the current file being processed, is big-endian, otherwise false. This does not guarantee that all numbers in the module follow the dominant endianness, particularly as formats sometimes incorporate data stored in a previously defined format. For some formats, e.g., TIFF, the endianness depends on the file being processed. Every module must initialize the value of _bigEndian for this function, or else assign its value when parsing a file, to return a meaningful result. For some modules (e.g., ASCII), endianness has no meaning.
      • getCoverage

        public final String getCoverage()
        Return details as to the specific format versions or variants that are supported by this module
        Specified by:
        getCoverage in interface Module
      • getDate

        public final Date getDate()
        Return the last modification date of this Module, as a Java Date object
        Specified by:
        getDate in interface Module
      • getFormat

        public final String[] getFormat()
        Return the array of format names supported by this Module
        Specified by:
        getFormat in interface Module
      • getMimeType

        public final String[] getMimeType()
        Return the array of MIME type strings for formats supported by this Module
        Specified by:
        getMimeType in interface Module
      • getName

        public final String getName()
        Return the module name
        Specified by:
        getName in interface Module
      • getNote

        public final String getNote()
        Return the module note
        Specified by:
        getNote in interface Module
      • getRelease

        public final String getRelease()
        Return the release identifier
        Specified by:
        getRelease in interface Module
      • getRepInfoNote

        public final String getRepInfoNote()
        Return the RepInfo note
        Specified by:
        getRepInfoNote in interface Module
      • getRights

        public final String getRights()
        Return the copyright information string
        Specified by:
        getRights in interface Module
      • getSignature

        public final List<Signature> getSignature()
        Return the List of Signatures recognized by this Module
        Specified by:
        getSignature in interface Module
      • getSpecification

        public final List<Document> getSpecification()
        Returns a list of Document objects (one for each specification document of the format). The specification list is generated by the Module, and specifications cannot be added by callers.
        Specified by:
        getSpecification in interface Module
        See Also:
        Document
      • getVendor

        public final Agent getVendor()
        Return the vendor information
        Specified by:
        getVendor in interface Module
      • getWellFormedNote

        public final String getWellFormedNote()
        Return the string describing well-formedness criteria
        Specified by:
        getWellFormedNote in interface Module
      • getValidityNote

        public final String getValidityNote()
        Return the string describing validity criteria
        Specified by:
        getValidityNote in interface Module
      • isRandomAccess

        public final boolean isRandomAccess()
        Return the random access flag (true if the module operates on random access files, false if it operates on streams)
        Specified by:
        isRandomAccess in interface Module
      • hasFeature

        public boolean hasFeature​(String feature)
        Returns true if the module supports a given named feature, and false if the feature is unsupported or unknown. Feature names are case sensitive. It is recommended that features be named using package nomenclature. The following features are, by default, supported by the modules developed by OIS:
        • edu.harvard.hul.ois.canValidate
        • edu.harvard.hul.ois.canIdentify
        Specified by:
        hasFeature in interface Module
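        The case-sensitive lookup described above can be sketched with a plain list of strings. FeatureListSketch is a hypothetical stand-in, not the ModuleBase implementation; only the two feature names are taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

final class FeatureListSketch {
    private final List<String> features = new ArrayList<>();

    /** Populates the list with the two default feature names quoted on this page. */
    void initFeatures() {
        features.clear();
        features.add("edu.harvard.hul.ois.canValidate");
        features.add("edu.harvard.hul.ois.canIdentify");
    }

    /** Exact, case-sensitive match; null or unknown names yield false. */
    boolean hasFeature(String feature) {
        return feature != null && features.contains(feature);
    }
}
```

        Because the match is exact, a name that differs only in letter case is reported as unsupported.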
      • setApp

        public final void setApp​(App app)
        Pass the associated App object to this Module. The App makes various services available.
        Specified by:
        setApp in interface Module
      • setBase

        public final void setBase​(JhoveBase je)
        Pass the JHOVE engine object to this Module.
        Specified by:
        setBase in interface Module
      • setValidityNote

        public final void setValidityNote​(String validityNote)
        Set the value of the validityNote property, which briefly explains the validity criteria of this Module.
      • setCRC32

        public final void setCRC32​(CRC32 crc32)
        Set the value of the CRC32 calculated for the content object. The checksum-like values (CRC32, MD5, SHA-1, SHA-256 and the byte count) can be set by the caller. Setting any of these creates the assumption that the calculation is already done, and sets the checksumFinished flag to inhibit recalculation.
      • setVerbosity

        public void setVerbosity​(int verbosity)
        Set the degree of verbosity desired from the module. The setting of param can override the verbosity setting. It does not affect whether raw data are reported or not, only which data are reported.
        Specified by:
        setVerbosity in interface Module
        Parameters:
        verbosity - The requested verbosity value. Recognized values are Module.MINIMUM_VERBOSITY and Module.MAXIMUM_VERBOSITY. The interpretation of the value depends on the module, and the module may choose not to use this setting. However, modules should treat MAXIMUM_VERBOSITY as a request for all the data available from the module.
      • setNByte

        public final void setNByte​(long nByte)
        Sets the byte count for the content object, and sets the checksumFinished flag.
      • setMD5

        public final void setMD5​(MessageDigest md5)
        Sets the MD5 calculated digest for the content object, and sets the checksumFinished flag.
      • setSHA1

        public final void setSHA1​(MessageDigest sha1)
        Sets the SHA-1 calculated digest for the content object, and sets the checksumFinished flag.
      • setSHA256

        public final void setSHA256​(MessageDigest sha256)
        Sets the SHA-256 calculated digest for the content object, and sets the checksumFinished flag.
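        A caller that wants to supply its own digests has to build MessageDigest objects first. The sketch below shows one way to do that with the standard java.security API and to render the result as hexadecimal. DigestSketch and its methods are hypothetical helpers, and how the module later consumes the digest object is not described on this page.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestSketch {
    /** Returns a MessageDigest for the algorithm that has already been fed the content. */
    static MessageDigest updated(String algorithm, byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            digest.update(content);
            return digest;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("algorithm not available: " + algorithm, e);
        }
    }

    /** Renders digest bytes as lowercase hexadecimal, two characters per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```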
      • parse

        public int parse​(InputStream stream,
                         RepInfo info,
                         int parseIndex)
                  throws IOException
        Parse the content of a stream digital object and store the results in RepInfo. A given Module will normally override only one of the two parse methods; the default method does nothing.
        Specified by:
        parse in interface Module
        Parameters:
        stream - An InputStream, positioned at its beginning, which is generated from the object to be parsed. If multiple calls to parse are made on the basis of a nonzero value being returned, a new InputStream must be provided each time.
        info - A fresh (on the first call) RepInfo object which will be modified to reflect the results of the parsing. If multiple calls to parse are made on the basis of a nonzero value being returned, the same RepInfo object should be passed with each call.
        parseIndex - Must be 0 in first call to parse. If parse returns a nonzero value, it must be called again with parseIndex equal to that return value.
        Throws:
        IOException
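        The calling contract for parseIndex can be sketched as a loop on the caller's side. StreamParser is a hypothetical stand-in for the stream-based parse method, and a StringBuilder stands in for RepInfo so that the example compiles without the JHOVE classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

final class ParseLoopSketch {
    /** Hypothetical stand-in for parse(InputStream, RepInfo, int). */
    interface StreamParser {
        int parse(InputStream stream, StringBuilder info, int parseIndex) throws IOException;
    }

    /** Calls parse until it returns 0 and reports how many passes were needed. */
    static int parseFully(StreamParser parser, Supplier<InputStream> newStream, StringBuilder info)
            throws IOException {
        int passes = 0;
        int parseIndex = 0; // must be 0 on the first call
        do {
            // A new stream, positioned at its beginning, is opened for every pass.
            try (InputStream stream = newStream.get()) {
                // The same info object is passed on every call.
                parseIndex = parser.parse(stream, info, parseIndex);
            }
            passes++;
        } while (parseIndex != 0);
        return passes;
    }
}
```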
      • parse

        public void parse​(RandomAccessFile file,
                          RepInfo info)
                   throws IOException
        Parse the content of a random access digital object and store the results in RepInfo. A given Module will normally override only one of the two parse methods; the default method does nothing.
        Specified by:
        parse in interface Module
        Parameters:
        file - A RandomAccessFile, positioned at its beginning, which is generated from the object to be parsed
        info - A fresh RepInfo object which will be modified to reflect the results of the parsing
        Throws:
        IOException
      • checkSignatures

        public void checkSignatures​(File file,
                                    InputStream stream,
                                    RepInfo info)
                             throws IOException
        Check if the digital object conforms to this Module's internal signature information. This function checks the file against the list of predefined signatures for the module. If there are no predefined signatures, it calls parse with the arguments passed to it. Override this for modules that check digital signatures in some other way. Any module for which the signature may be located other than at the beginning of the file must override this method.
        Specified by:
        checkSignatures in interface Module
        Parameters:
        file - A File object for the object being parsed
        stream - An InputStream, positioned at its beginning, which is generated from the object to be parsed
        info - A fresh RepInfo object which will be modified to reflect the results of the test
        Throws:
        IOException
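        A check against a signature located at the beginning of the content can be sketched as a comparison of leading bytes. This is an illustration of the general idea only; the helper below is hypothetical and does not use the Signature class or reproduce the module's actual matching rules.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class SignatureCheckSketch {
    /** Returns true if the stream begins with the given magic bytes. */
    static boolean startsWith(InputStream stream, byte[] magic) throws IOException {
        byte[] head = new byte[magic.length];
        int filled = 0;
        // read() may return fewer bytes than requested, so loop until the buffer is full.
        while (filled < head.length) {
            int n = stream.read(head, filled, head.length - filled);
            if (n < 0) {
                return false; // content is shorter than the signature
            }
            filled += n;
        }
        return Arrays.equals(head, magic);
    }
}
```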
      • checkSignatures

        public void checkSignatures​(File file,
                                    RandomAccessFile raf,
                                    RepInfo info)
                             throws IOException
        Check if the digital object conforms to this Module's internal signature information.
        Specified by:
        checkSignatures in interface Module
        Parameters:
        file - A File object representing the object to be parsed
        raf - A RandomAccessFile, positioned at its beginning, which is generated from the object to be parsed
        info - A fresh RepInfo object which will be modified to reflect the results of the test
        Throws:
        IOException
      • initParse

        protected void initParse()
        Initializes the state of the module for parsing. This should be called early in each module's parse() method. If a module overrides it to provide additional functionality, the module's initParse() should call super.initParse().
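        The override pattern described here can be sketched with two small stand-in classes. Both classes are hypothetical, and the fields reset by the base method are assumptions chosen for illustration rather than a description of what ModuleBase.initParse() actually resets.

```java
class BaseModuleSketch {
    protected long nByte;
    protected boolean checksumFinished;

    /** Resets the per-parse state held by the base class (assumed fields). */
    protected void initParse() {
        nByte = 0;
        checksumFinished = false;
    }
}

class ChunkedModuleSketch extends BaseModuleSketch {
    protected int chunkCount;

    /** Resets the subclass state after delegating to the base class. */
    @Override
    protected void initParse() {
        super.initParse(); // keep the base-class reset
        chunkCount = 0;
    }
}
```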
      • initInfo

        protected void initInfo​(RepInfo info)
      • setChecksums

        protected static void setChecksums​(Checksummer ckSummer,
                                           RepInfo info)
        Set the checksum values.
        Parameters:
        ckSummer - Checksummer object
        info - RepInfo object
      • show

        public void show​(OutputHandler handler)
        Generates information about this Module. The format of the output depends on the OutputHandler.
        Specified by:
        show in interface Module
      • getCRC32

        protected String getCRC32()
        Returns the hex string representation of the CRC32 result.
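        A hex rendering of a CRC32 value can be produced with the standard library alone, as in the sketch below. Crc32HexSketch is a hypothetical helper; whether getCRC32() pads the string with leading zeros or uses upper case is not stated on this page, and Long.toHexString does neither.

```java
import java.util.zip.CRC32;

final class Crc32HexSketch {
    /** Computes the CRC32 of the content and returns it as lowercase hex without padding. */
    static String crc32Hex(byte[] content) {
        CRC32 crc32 = new CRC32();
        crc32.update(content, 0, content.length);
        return Long.toHexString(crc32.getValue());
    }
}
```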
      • addIntegerProperty

        public Property addIntegerProperty​(String name,
                                           int value,
                                           String[] labels,
                                           int[] index)
        Returns a Property representing an integer value. If raw output is specified for the module, returns an INTEGER property, and labels and index are unused. Otherwise, returns a STRING property, with the string being the element of labels whose index is the index of value in index.
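        The relationship between value, labels and index can be sketched as a parallel-array lookup. LabelLookupSketch is hypothetical, the sample values are illustrative only, and falling back to the number as text when no entry matches is an assumption; this page does not say what happens in that case.

```java
final class LabelLookupSketch {
    /** Returns the label paired with value, or the number as text if there is no match. */
    static String labelFor(int value, String[] labels, int[] index) {
        for (int i = 0; i < index.length && i < labels.length; i++) {
            if (index[i] == value) {
                return labels[i]; // labels[i] belongs to the value stored at index[i]
            }
        }
        return Integer.toString(value); // fallback behaviour is an assumption
    }
}
```

        With labels {"uncompressed", "LZW", "PackBits"} and index {1, 5, 32773}, the value 5 maps to "LZW".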
      • addIntegerProperty

        public Property addIntegerProperty​(String name,
                                           int value,
                                           String[] labels)
        Returns a Property representing an integer value. If raw output is specified for the module, returns an INTEGER property, and labels is unused. Otherwise, returns a STRING property, with the string being the element of labels whose index is value.
      • readUnsignedByte

        public static int readUnsignedByte​(DataInputStream stream)
                                    throws IOException
        Reads an unsigned byte from a DataInputStream.
        Parameters:
        stream - Stream to read
        Throws:
        IOException
      • readUnsignedByte

        public static int readUnsignedByte​(DataInputStream stream,
                                           ModuleBase counted)
                                    throws IOException
        Reads an unsigned byte from a DataInputStream.
        Parameters:
        stream - Stream to read
        counted - If non-null, module for which value of _nByte shall be incremented appropriately
        Throws:
        IOException
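        The effect of the counted argument can be sketched with a small counter object in place of a module. ByteCounter is a hypothetical stand-in for the _nByte field of ModuleBase; the null check mirrors the "if non-null" wording above.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class CountedReadSketch {
    /** Hypothetical stand-in for a module's _nByte field. */
    static final class ByteCounter {
        long nByte;
    }

    /** Reads one byte as a value in 0..255 and counts it if a counter is supplied. */
    static int readUnsignedByte(DataInputStream stream, ByteCounter counted) throws IOException {
        int value = stream.readUnsignedByte(); // throws EOFException at end of stream
        if (counted != null) {
            counted.nByte++;
        }
        return value;
    }
}
```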
      • readByteBuf

        public static int readByteBuf​(DataInputStream stream,
                                      byte[] buf,
                                      ModuleBase counted)
                               throws IOException
        Reads into a byte buffer from a DataInputStream.
        Parameters:
        stream - Stream to read from
        buf - Byte buffer to fill up
        counted - If non-null, module for which value of _nByte shall be incremented appropriately
        Throws:
        IOException
      • readUnsignedShort

        public static int readUnsignedShort​(DataInputStream stream,
                                            boolean bigEndian)
                                     throws IOException
        Reads two bytes as an unsigned short value from a DataInputStream.
        Parameters:
        stream - The stream to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        Throws:
        IOException
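        The byte-order rule for the bigEndian flag can be sketched as follows. UnsignedShortSketch is a hypothetical, stand-alone helper that follows the description above; it is not the ModuleBase source.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class UnsignedShortSketch {
    /** Combines two bytes into a value in 0..65535 according to the requested byte order. */
    static int readUnsignedShort(DataInputStream stream, boolean bigEndian) throws IOException {
        int first = stream.readUnsignedByte();
        int second = stream.readUnsignedByte();
        // Big-endian: the first byte is the high byte. Little-endian: it is the low byte.
        return bigEndian ? (first << 8) | second : (second << 8) | first;
    }
}
```

        For the byte sequence 0x12 0x34, this yields 0x1234 when bigEndian is true and 0x3412 when it is false.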
      • readUnsignedShort

        public static int readUnsignedShort​(DataInputStream stream,
                                            boolean bigEndian,
                                            ModuleBase counted)
                                     throws IOException
        Reads two bytes as an unsigned short value from a DataInputStream.
        Parameters:
        stream - The stream to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        counted - If non-null, module for which value of _nByte shall be incremented appropriately
        Throws:
        IOException
      • readUnsignedShort

        public static int readUnsignedShort​(RandomAccessFile file,
                                            boolean bigEndian)
                                     throws IOException
        Reads two bytes as an unsigned short value from a RandomAccessFile.
        Parameters:
        file - The file to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        Throws:
        IOException
      • readUnsignedInt

        public static long readUnsignedInt​(DataInputStream stream,
                                           boolean bigEndian)
                                    throws IOException
        Reads four bytes as an unsigned 32-bit value from a DataInputStream.
        Parameters:
        stream - The stream to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        Throws:
        IOException
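        The return type is long because an unsigned 32-bit value can exceed the range of a Java int. The sketch below, using the hypothetical class UnsignedIntSketch, assembles the four bytes in either order.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class UnsignedIntSketch {
    /** Combines four bytes into a value in 0..4294967295 according to the byte order. */
    static long readUnsignedInt(DataInputStream stream, boolean bigEndian) throws IOException {
        long value = 0;
        for (int i = 0; i < 4; i++) {
            long b = stream.readUnsignedByte();
            // Big-endian: shift earlier bytes up. Little-endian: byte i has weight 256^i.
            value = bigEndian ? (value << 8) | b : value | (b << (8 * i));
        }
        return value;
    }
}
```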
      • readUnsignedInt

        public static long readUnsignedInt​(DataInputStream stream,
                                           boolean bigEndian,
                                           ModuleBase counted)
                                    throws IOException
        Reads four bytes as an unsigned 32-bit value from a DataInputStream.
        Parameters:
        stream - The stream to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        counted - If non-null, module for which value of _nByte shall be incremented appropriately
        Throws:
        IOException
      • readUnsignedInt

        public static long readUnsignedInt​(RandomAccessFile file,
                                           boolean bigEndian)
                                    throws IOException
        Reads four bytes as an unsigned 32-bit value from a RandomAccessFile.
        Parameters:
        file - The file to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        Throws:
        IOException
      • readSignedLong

        public static long readSignedLong​(DataInputStream stream,
                                          boolean bigEndian,
                                          ModuleBase counted)
                                   throws IOException
        Reads eight bytes as a signed 64-bit value from a DataInputStream.
        Parameters:
        stream - The stream to read from.
        bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
        counted - If non-null, module for which value of _nByte shall be incremented appropriately
        Throws:
        IOException
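        An equivalent eight-byte read can be sketched with java.nio.ByteBuffer, which handles both byte orders. SignedLongSketch is a hypothetical helper that omits the counted argument; using ByteBuffer is a choice made for this example, not a statement about how ModuleBase is written.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SignedLongSketch {
    /** Reads eight bytes and interprets them as a signed 64-bit value. */
    static long readSignedLong(DataInputStream stream, boolean bigEndian) throws IOException {
        byte[] buf = new byte[8];
        stream.readFully(buf); // throws EOFException if fewer than 8 bytes remain
        ByteOrder order = bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        return ByteBuffer.wrap(buf).order(order).getLong();
    }
}
```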
      • getBufferedDataStream

        public static DataInputStream getBufferedDataStream​(InputStream stream,
                                                            int size)
        A convenience method for getting a buffered DataInputStream from a module's InputStream. If the size specified is 0 or less, the default buffer size is used.
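        The size rule can be sketched with the standard stream wrappers. BufferedDataStreamSketch is a hypothetical helper written from the description above. The guard matters because the BufferedInputStream constructor rejects a size of 0 or less with an IllegalArgumentException.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.InputStream;

final class BufferedDataStreamSketch {
    /** Wraps the stream in a buffer of the given size, or the default size if size <= 0. */
    static DataInputStream getBufferedDataStream(InputStream stream, int size) {
        BufferedInputStream buffered = size <= 0
                ? new BufferedInputStream(stream)        // java.io default buffer size
                : new BufferedInputStream(stream, size); // caller-specified buffer size
        return new DataInputStream(buffered);
    }
}
```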
      • vectorToPropArray

        protected Property[] vectorToPropArray​(Vector vec)
        A utility for converting a Vector of Properties to an array. It can be simpler to build a Vector and then call vectorToPropArray than to allocate an array and drop all the Properties into the correct indices. All the members of the Vector must be of type Property, or a ClassCastException will be thrown.
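        The conversion and its failure mode can be sketched as follows. The nested Property class is a hypothetical placeholder for the JHOVE Property type, so that the example compiles on its own.

```java
import java.util.Vector;

final class VectorToArraySketch {
    /** Hypothetical placeholder for the JHOVE Property class. */
    static final class Property {
        final String name;

        Property(String name) {
            this.name = name;
        }
    }

    /** Copies the vector's elements, in order, into a new Property array. */
    static Property[] vectorToPropArray(Vector<?> vec) {
        Property[] props = new Property[vec.size()];
        for (int i = 0; i < props.length; i++) {
            // The cast throws ClassCastException for any element that is not a Property.
            props[i] = (Property) vec.elementAt(i);
        }
        return props;
    }
}
```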
      • isParamInDefaults

        protected boolean isParamInDefaults​(String paramVal)
      • skipDstreamToEnd

        protected void skipDstreamToEnd​(RepInfo info)