Class ModuleBase

java.lang.Object
edu.harvard.hul.ois.jhove.ModuleBase
All Implemented Interfaces:
Module
Direct Known Subclasses:
BytestreamModule

public abstract class ModuleBase extends Object implements Module
This class is an abstract implementation of the Module interface. It contains all the methods required for a Module, but doesn't do anything by itself. A subclass should provide a functional implementation of parse(InputStream, RepInfo, int) if it is not random access, or parse(RandomAccessFile, RepInfo) if it is random access.
  • Field Details

    • _app

      protected App _app
      The application object
    • _coverage

      protected String _coverage
      Coverage information
    • _date

      protected Date _date
      Module last modification date
    • _format

      protected String[] _format
      Formats recognized by this Module
    • _init

      protected String _init
      Initialization value.
    • _defaultParams

      protected final List<String> _defaultParams
      List of default parameters.
    • _je

      protected JhoveBase _je
      JHOVE engine.
    • _mimeType

      protected String[] _mimeType
      MIME types supported by this Module
    • _name

      protected String _name
      Module name
    • _note

      protected String _note
      Module note
    • _param

      protected String _param
      Module-specific parameter.
    • _release

      protected String _release
      Module release description
    • _repInfoNote

      protected String _repInfoNote
      RepInfo note
    • _rights

      protected String _rights
      Copyright notice
    • _signature

      protected List<Signature> _signature
      Module Signature list
    • _specification

      protected List<Document> _specification
      Module specification document list
    • _vendor

      protected Agent _vendor
      Module vendor
    • _wellFormedNote

      protected String _wellFormedNote
      Well-formedness criteria
    • _validityNote

      protected String _validityNote
      Validity criteria
    • _isRandomAccess

      protected boolean _isRandomAccess
      Random access flag
    • _nByte

      protected long _nByte
      Byte count of content object
    • _crc32

      protected CRC32 _crc32
      CRC32 calculated on content object
    • _md5

      protected MessageDigest _md5
      MD5 digest calculated on content object
    • _sha1

      protected MessageDigest _sha1
      SHA-1 digest calculated on content object
    • _sha256

      protected MessageDigest _sha256
      SHA-256 digest calculated on content object
    • _checksumFinished

      protected boolean _checksumFinished
      Flag indicating valid checksum information set
    • _verbosity

      protected int _verbosity
      Indicator of how much data to report
    • _countStream

      protected boolean _countStream
      Flag to indicate read routines should count the stream
    • _bigEndian

      protected boolean _bigEndian
      The dominant "endianness" of the Module.
    • _features

      protected List<String> _features
      The list of supported features.
    • _logger

      protected Logger _logger
      Logger for a module class.
    • _ckSummer

      protected Checksummer _ckSummer
    • _cstream

      protected ChecksumInputStream _cstream
    • _dstream

      protected DataInputStream _dstream
  • Constructor Details

    • ModuleBase

      protected ModuleBase(String name, String release, int[] date, String[] format, String coverage, String[] mimeType, String wellFormedNote, String validityNote, String repInfoNote, String note, String rights, boolean isRandomAccess)
      Constructors of all subclasses of ModuleBase should call this as a super constructor.
      Parameters:
      name - Name of the module
      release - Release identifier
      date - Last modification date of the module code, in the form of an array of three numbers. date[0] is the year, date[1] the month, and date[2] the day.
      format - Array of format names supported by the module
      coverage - Details as to the specific format versions or variants that are supported by the module
      mimeType - Array of MIME type strings for formats supported by the module
      wellFormedNote - Brief explanation of what constitutes well-formed content
      validityNote - Brief explanation of what constitutes valid content
      repInfoNote - Note pertaining to RepInfo (may be null)
      note - Additional information about the module (may be null)
      rights - Copyright notice for the module
      isRandomAccess - true if the module treats content as random-access data, false if it treats content as stream data
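      The date array maps onto a java.util.Date roughly as follows. This is a minimal sketch, not the constructor's actual body: the class and method names are hypothetical, and it assumes date[1] is a 1-based month (1 = January).

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Hypothetical helper for illustration; not part of JHOVE.
class ModuleDateSketch {
    // Converts {year, month, day} to a Date at midnight in the default time zone.
    // Assumption: the month is 1-based, so 1 is subtracted for Calendar.
    static Date toDate(int[] date) {
        Calendar cal = new GregorianCalendar(date[0], date[1] - 1, date[2]);
        return cal.getTime();
    }

    // Reads the converted Date back as {year, month, day} to check the mapping.
    static int[] roundTrip(int[] date) {
        Calendar cal = new GregorianCalendar();
        cal.setTime(toDate(date));
        return new int[] {
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1,
            cal.get(Calendar.DAY_OF_MONTH)
        };
    }
}
```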
  • Method Details

    • initFeatures

      public void initFeatures()
      Initializes the feature list. This method puts the following features in the list:
      • edu.harvard.hul.ois.canValidate
      • edu.harvard.hul.ois.canIdentify
    • init

      public void init(String init)
      Per-instantiation initialization. The default method does nothing but save its parameter.
      Specified by:
      init in interface Module
      Parameters:
      init - Initialization parameter. This is typically obtained from the configuration file.
    • setDefaultParams

      public void setDefaultParams(List<String> params)
      Set a List of default parameters for the module.
      Specified by:
      setDefaultParams in interface Module
      Parameters:
      params - A List whose elements are Strings. May be empty.
    • applyDefaultParams

      public void applyDefaultParams() throws Exception
      Applies the default parameters. Calling this clears any prior parameters.
      Specified by:
      applyDefaultParams in interface Module
      Throws:
      Exception
    • resetParams

      public void resetParams()
      Reset parameter settings. Returns to a default state without any parameters. The default method clears the saved parameter.
      Specified by:
      resetParams in interface Module
    • param

      public void param(String param)
      Per-action initialization. May be called multiple times. The default method does nothing but save its parameter.
      Specified by:
      param in interface Module
      Parameters:
      param - Initialization parameter.
    • getApp

      public App getApp()
      Returns the App object.
    • getBase

      public JhoveBase getBase()
      Returns the JHOVE engine object.
    • getNByte

      public long getNByte()
      Returns the value of _nByte. Meaningful only for modules that use a counted InputStream.
    • isBigEndian

      public boolean isBigEndian()
      Returns true if the dominant "endianness" of the module, or the current file being processed, is big-endian, otherwise false. This does not guarantee that all numbers in the module follow the dominant endianness, particularly as formats sometimes incorporate data stored in a previously defined format. For some formats, e.g., TIFF, the endianness depends on the file being processed. Every module must initialize the value of _bigEndian for this function, or else assign its value when parsing a file, to return a meaningful result. For some modules (e.g., ASCII), endianness has no meaning.
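      For a format whose byte order depends on the file, a module would assign _bigEndian while parsing. The sketch below uses the TIFF convention, in which the first two header bytes are "II" for little-endian and "MM" for big-endian. The class and method names are hypothetical, and this is not the TIFF module's own code.

```java
// Hypothetical helper for illustration; not part of JHOVE.
class ByteOrderSketch {
    // Returns true for "MM" (big-endian) and false for "II" (little-endian).
    static boolean isBigEndianTiff(byte[] header) {
        if (header.length < 2) {
            throw new IllegalArgumentException("header too short");
        }
        if (header[0] == 'M' && header[1] == 'M') {
            return true;
        }
        if (header[0] == 'I' && header[1] == 'I') {
            return false;
        }
        throw new IllegalArgumentException("not a TIFF byte-order mark");
    }
}
```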
    • getCoverage

      public final String getCoverage()
      Return details as to the specific format versions or variants that are supported by this module
      Specified by:
      getCoverage in interface Module
    • getDate

      public final Date getDate()
      Return the last modification date of this Module, as a Java Date object
      Specified by:
      getDate in interface Module
    • getFormat

      public final String[] getFormat()
      Return the array of format names supported by this Module
      Specified by:
      getFormat in interface Module
    • getMimeType

      public final String[] getMimeType()
      Return the array of MIME type strings for formats supported by this Module
      Specified by:
      getMimeType in interface Module
    • getName

      public final String getName()
      Return the module name
      Specified by:
      getName in interface Module
    • getNote

      public final String getNote()
      Return the module note
      Specified by:
      getNote in interface Module
    • getRelease

      public final String getRelease()
      Return the release identifier
      Specified by:
      getRelease in interface Module
    • getRepInfoNote

      public final String getRepInfoNote()
      Return the RepInfo note
      Specified by:
      getRepInfoNote in interface Module
    • getRights

      public final String getRights()
      Return the copyright information string
      Specified by:
      getRights in interface Module
    • getSignature

      public final List<Signature> getSignature()
      Return the List of Signatures recognized by this Module
      Specified by:
      getSignature in interface Module
    • getSpecification

      public final List<Document> getSpecification()
      Returns a list of Document objects (one for each specification document of the format). The specification list is generated by the Module, and specifications cannot be added by callers.
      Specified by:
      getSpecification in interface Module
    • getVendor

      public final Agent getVendor()
      Return the vendor information
      Specified by:
      getVendor in interface Module
    • getWellFormedNote

      public final String getWellFormedNote()
      Return the string describing well-formedness criteria
      Specified by:
      getWellFormedNote in interface Module
    • getValidityNote

      public final String getValidityNote()
      Return the string describing validity criteria
      Specified by:
      getValidityNote in interface Module
    • isRandomAccess

      public final boolean isRandomAccess()
      Return the random access flag (true if the module operates on random access files, false if it operates on streams)
      Specified by:
      isRandomAccess in interface Module
    • hasFeature

      public boolean hasFeature(String feature)
      Returns true if the module supports a given named feature, and false if the feature is unsupported or unknown. Feature names are case sensitive. It is recommended that features be named using package nomenclature. The following features are, by default, supported by the modules developed by OIS:
      • edu.harvard.hul.ois.canValidate
      • edu.harvard.hul.ois.canIdentify
      Specified by:
      hasFeature in interface Module
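      The case-sensitive lookup described above can be pictured as a plain list membership test. This is a sketch under that assumption; FeatureListSketch is a hypothetical stand-in, not the ModuleBase implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a module's feature list.
class FeatureListSketch {
    private final List<String> features = new ArrayList<>();

    FeatureListSketch() {
        // The two default feature names listed in the documentation.
        features.add("edu.harvard.hul.ois.canValidate");
        features.add("edu.harvard.hul.ois.canIdentify");
    }

    // Exact, case-sensitive match; unknown names (and null) yield false.
    boolean hasFeature(String feature) {
        return features.contains(feature);
    }
}
```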
    • getFeatures

      public List<String> getFeatures()
      Returns the full list of features.
      Specified by:
      getFeatures in interface Module
    • getDefaultParams

      public List<String> getDefaultParams()
      Returns the list of default parameters.
      Specified by:
      getDefaultParams in interface Module
    • setApp

      public final void setApp(App app)
      Pass the associated App object to this Module. The App makes various services available.
      Specified by:
      setApp in interface Module
    • setBase

      public final void setBase(JhoveBase je)
      Pass the JHOVE engine object to this Module.
      Specified by:
      setBase in interface Module
    • setValidityNote

      public final void setValidityNote(String validityNote)
      Set the value of the validityNote property, which briefly explains the validity criteria of this Module.
    • setCRC32

      public final void setCRC32(CRC32 crc32)
      Set the value of the CRC32 calculated for the content object. The checksum-like functions can be set by the caller. Setting any of these creates the assumption that the calculation is already done, and sets the checksumFinished flag to inhibit recalculation.
    • setVerbosity

      public void setVerbosity(int verbosity)
      Set the degree of verbosity desired from the module. The setting of param can override the verbosity setting. It does not affect whether raw data are reported or not, only which data are reported.
      Specified by:
      setVerbosity in interface Module
      Parameters:
      verbosity - The requested verbosity value. Recognized values are Module.MINIMUM_VERBOSITY and Module.MAXIMUM_VERBOSITY. The interpretation of the value depends on the module, and the module may choose not to use this setting. However, modules should treat MAXIMUM_VERBOSITY as a request for all the data available from the module.
    • setNByte

      public final void setNByte(long nByte)
      Sets the byte count for the content object, and sets the checksumFinished flag.
    • setMD5

      public final void setMD5(MessageDigest md5)
      Sets the MD5 calculated digest for the content object, and sets the checksumFinished flag.
    • setSHA1

      public final void setSHA1(MessageDigest sha1)
      Sets the SHA-1 calculated digest for the content object, and sets the checksumFinished flag.
    • setSHA256

      public final void setSHA256(MessageDigest sha256)
      Sets the SHA-256 calculated digest for the content object, and sets the checksumFinished flag.
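      A caller that has already hashed the content could prepare MessageDigest objects along these lines before passing them to the setters. This is only a sketch using the standard java.security API: DigestSketch and its methods are hypothetical, and how the module later consumes the digest is not shown here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of JHOVE.
class DigestSketch {
    // Feeds the content into a fresh MessageDigest ("MD5", "SHA-1" or "SHA-256").
    static MessageDigest digestOf(String algorithm, byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            md.update(content);
            return md;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Finalizes the digest and renders it as lowercase hex.
    // Note that digest() resets the MessageDigest afterwards.
    static String hex(MessageDigest md) {
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```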
    • parse

      public int parse(InputStream stream, RepInfo info, int parseIndex) throws IOException
      Parse the content of a stream digital object and store the results in RepInfo. A given Module will normally override only one of the two parse methods; the default method does nothing.
      Specified by:
      parse in interface Module
      Parameters:
      stream - An InputStream, positioned at its beginning, which is generated from the object to be parsed. If multiple calls to parse are made on the basis of a nonzero value being returned, a new InputStream must be provided each time.
      info - A fresh (on the first call) RepInfo object which will be modified to reflect the results of the parsing. If multiple calls to parse are made on the basis of a nonzero value being returned, the same RepInfo object should be passed with each call.
      parseIndex - Must be 0 in first call to parse. If parse returns a nonzero value, it must be called again with parseIndex equal to that return value.
      Throws:
      IOException
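      The parseIndex contract implies a caller-side loop: start at 0, reopen the stream for every pass, keep the same result object, and stop when 0 is returned. The sketch below illustrates that loop with hypothetical stand-ins (StreamParser for the module, a List<String> in place of RepInfo); it is not the JHOVE engine's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ParseLoopSketch {
    // Hypothetical stand-in for the stream-based parse method.
    interface StreamParser {
        int parse(InputStream stream, List<String> info, int parseIndex) throws IOException;
    }

    // Repeats parse until it returns 0. Each pass gets a new stream positioned
    // at the beginning and the same info object. Returns the number of passes.
    static int parseFully(StreamParser parser, byte[] content, List<String> info)
            throws IOException {
        int parseIndex = 0;
        int passes = 0;
        do {
            try (InputStream stream = new ByteArrayInputStream(content)) {
                parseIndex = parser.parse(stream, info, parseIndex);
            }
            passes++;
        } while (parseIndex != 0);
        return passes;
    }

    // Demo: a toy parser that requests exactly one additional pass.
    static int demo() {
        StreamParser twoPass = (stream, info, index) -> {
            info.add("pass with parseIndex=" + index);
            return index == 0 ? 1 : 0;
        };
        try {
            return parseFully(twoPass, new byte[] {1, 2, 3}, new ArrayList<>());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```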
    • parse

      public void parse(RandomAccessFile file, RepInfo info) throws IOException
      Parse the content of a random access digital object and store the results in RepInfo. A given Module will normally override only one of the two parse methods; the default method does nothing.
      Specified by:
      parse in interface Module
      Parameters:
      file - A RandomAccessFile, positioned at its beginning, which is generated from the object to be parsed
      info - A fresh RepInfo object which will be modified to reflect the results of the parsing
      Throws:
      IOException
    • checkSignatures

      public void checkSignatures(File file, InputStream stream, RepInfo info) throws IOException
      Check if the digital object conforms to this Module's internal signature information. This function checks the file against the list of predefined signatures for the module. If there are no predefined signatures, it calls parse with the arguments passed to it. Override this for modules that check digital signatures in some other way. Any module for which the signature may be located other than at the beginning of the file must override.
      Specified by:
      checkSignatures in interface Module
      Parameters:
      file - A File object for the object being parsed
      stream - An InputStream, positioned at its beginning, which is generated from the object to be parsed
      info - A fresh RepInfo object which will be modified to reflect the results of the test
      Throws:
      IOException
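      For a signature located at the beginning of the file, the check amounts to comparing the leading bytes against the expected values. A minimal sketch of that comparison follows; SignatureSketch is hypothetical and does not use the JHOVE Signature class, and the bytes in the usage note are illustrative.

```java
// Hypothetical helper for illustration; not part of JHOVE.
class SignatureSketch {
    // Returns true if content begins with every byte of signature, in order.
    static boolean startsWith(byte[] content, byte[] signature) {
        if (content.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if (content[i] != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

      For example, startsWith(fileBytes, new byte[] {0x47, 0x49, 0x46}) tests for the ASCII prefix "GIF".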
    • checkSignatures

      public void checkSignatures(File file, RandomAccessFile raf, RepInfo info) throws IOException
      Check if the digital object conforms to this Module's internal signature information.
      Specified by:
      checkSignatures in interface Module
      Parameters:
      file - A File object representing the object to be parsed
      raf - A RandomAccessFile, positioned at its beginning, which is generated from the object to be parsed
      info - A fresh RepInfo object which will be modified to reflect the results of the test
      Throws:
      IOException
    • initParse

      protected void initParse()
      Initializes the state of the module for parsing. This should be called early in each module's parse() method. If a module overrides it to provide additional functionality, the module's initParse() should call super.initParse().
    • initInfo

      protected void initInfo(RepInfo info)
    • calcRAChecksum

      protected void calcRAChecksum(Checksummer ckSummer, RandomAccessFile raf) throws IOException
      Calculates the checksums for a module that uses a random access file.
      Throws:
      IOException
    • setChecksums

      protected static void setChecksums(Checksummer ckSummer, RepInfo info)
      Set the checksum values.
      Parameters:
      ckSummer - Checksummer object
      info - RepInfo object
    • show

      public void show(OutputHandler handler)
      Generates information about this Module. The format of the output depends on the OutputHandler.
      Specified by:
      show in interface Module
    • getCRC32

      protected String getCRC32()
      Returns the hex string representation of the CRC32 result.
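      One way to obtain such a hex string with the standard library is sketched below. Crc32HexSketch is hypothetical, and the formatting choice (Long.toHexString, which yields lowercase digits without zero padding) is an assumption rather than the documented behavior of getCRC32.

```java
import java.util.zip.CRC32;

// Hypothetical helper for illustration; not part of JHOVE.
class Crc32HexSketch {
    // Computes the CRC32 of the content and renders it as a hex string.
    // Assumption: lowercase and unpadded, as produced by Long.toHexString.
    static String crc32Hex(byte[] content) {
        CRC32 crc = new CRC32();
        crc.update(content);
        return Long.toHexString(crc.getValue());
    }
}
```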
    • addIntegerProperty

      public Property addIntegerProperty(String name, int value, String[] labels, int[] index)
      Returns a Property representing an integer value. If raw output is specified for the module, returns an INTEGER property, and labels and index are unused. Otherwise, returns a STRING property, with the string being the element of labels whose index is the index of value in index.
    • addIntegerProperty

      public Property addIntegerProperty(String name, int value, String[] labels)
      Returns a Property representing an integer value. If raw output is specified for the module, returns an INTEGER property, and labels and index are unused. Otherwise, returns a STRING property, with the string being the element of labels whose index is value.
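      The two lookup rules differ only in how the label position is found. The sketch below shows both; it returns a plain Object instead of a Property, LabelLookupSketch is hypothetical, and the fallback to the raw number for unlisted values is an assumption not stated in the documentation.

```java
// Hypothetical helper for illustration; not part of JHOVE.
class LabelLookupSketch {
    // Four-argument form: find the position of value in index and
    // return the label at that same position.
    static Object labelFor(int value, String[] labels, int[] index, boolean raw) {
        if (raw) {
            return Integer.valueOf(value);
        }
        for (int i = 0; i < index.length; i++) {
            if (index[i] == value) {
                return labels[i];
            }
        }
        return Integer.valueOf(value); // assumption: unlisted values stay numeric
    }

    // Three-argument form: value itself is the position in labels.
    static Object labelFor(int value, String[] labels, boolean raw) {
        if (raw || value < 0 || value >= labels.length) {
            return Integer.valueOf(value);
        }
        return labels[value];
    }
}
```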
    • readUnsignedByte

      public static int readUnsignedByte(DataInputStream stream) throws IOException
      Reads an unsigned byte from a DataInputStream.
      Parameters:
      stream - Stream to read
      Throws:
      IOException
    • readUnsignedByte

      public static int readUnsignedByte(DataInputStream stream, ModuleBase counted) throws IOException
      Reads an unsigned byte from a DataInputStream.
      Parameters:
      stream - Stream to read
      counted - If non-null, module for which value of _nByte shall be incremented appropriately
      Throws:
      IOException
    • readUnsignedByte

      public static int readUnsignedByte(RandomAccessFile file) throws IOException
      Reads an unsigned byte from a RandomAccessFile.
      Throws:
      IOException
    • readByteBuf

      public static int readByteBuf(DataInputStream stream, byte[] buf, ModuleBase counted) throws IOException
      Reads into a byte buffer from a DataInputStream.
      Parameters:
      stream - Stream to read from
      buf - Byte buffer to fill up
      counted - If non-null, module for which value of _nByte shall be incremented appropriately
      Throws:
      IOException
    • readUnsignedShort

      public static int readUnsignedShort(DataInputStream stream, boolean bigEndian) throws IOException
      Reads two bytes as an unsigned short value from a DataInputStream.
      Parameters:
      stream - The stream to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      Throws:
      IOException
    • readUnsignedShort

      public static int readUnsignedShort(DataInputStream stream, boolean bigEndian, ModuleBase counted) throws IOException
      Reads two bytes as an unsigned short value from a DataInputStream.
      Parameters:
      stream - The stream to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      counted - If non-null, module for which value of _nByte shall be incremented appropriately
      Throws:
      IOException
    • readUnsignedShort

      public static int readUnsignedShort(RandomAccessFile file, boolean bigEndian) throws IOException
      Reads two bytes as an unsigned short value from a RandomAccessFile.
      Parameters:
      file - The file to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      Throws:
      IOException
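      The effect of the bigEndian flag on a two-byte read can be sketched as follows. UnsignedShortSketch is a hypothetical stand-alone class written for illustration; it mirrors the documented behavior but is not the ModuleBase source.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of JHOVE.
class UnsignedShortSketch {
    // Reads two bytes; with bigEndian the first byte is the high byte,
    // otherwise the first byte is the low byte.
    static int readUnsignedShort(DataInputStream stream, boolean bigEndian)
            throws IOException {
        int first = stream.readUnsignedByte();
        int second = stream.readUnsignedByte();
        return bigEndian ? (first << 8) | second : (second << 8) | first;
    }

    // Convenience wrapper for trying the method on an in-memory byte array.
    static int fromBytes(byte[] bytes, boolean bigEndian) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return readUnsignedShort(in, bigEndian);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      With the bytes 0x12 0x34, the big-endian reading is 0x1234 and the little-endian reading is 0x3412.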
    • readUnsignedInt

      public static long readUnsignedInt(DataInputStream stream, boolean bigEndian) throws IOException
      Reads four bytes as an unsigned 32-bit value from a DataInputStream.
      Parameters:
      stream - The stream to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      Throws:
      IOException
    • readUnsignedInt

      public static long readUnsignedInt(DataInputStream stream, boolean bigEndian, ModuleBase counted) throws IOException
      Reads four bytes as an unsigned 32-bit value from a DataInputStream.
      Parameters:
      stream - The stream to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      counted - If non-null, module for which value of _nByte shall be incremented appropriately
      Throws:
      IOException
    • readUnsignedInt

      public static long readUnsignedInt(RandomAccessFile file, boolean bigEndian) throws IOException
      Reads four bytes as an unsigned 32-bit value from a RandomAccessFile.
      Parameters:
      file - The file to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      Throws:
      IOException
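      These methods return long because an unsigned 32-bit value can exceed the range of a Java int. The sketch below shows the byte assembly together with the optional byte counting of the counted variant. UnsignedIntSketch and ByteCounter are hypothetical; ByteCounter merely stands in for a module's _nByte field.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of JHOVE.
class UnsignedIntSketch {
    // Stand-in for the module whose _nByte would be incremented.
    static final class ByteCounter {
        long nByte;
    }

    // Reads four bytes into a long so that values above Integer.MAX_VALUE
    // stay positive. Adds 4 to the counter when one is supplied.
    static long readUnsignedInt(DataInputStream stream, boolean bigEndian,
                                ByteCounter counted) throws IOException {
        long b0 = stream.readUnsignedByte();
        long b1 = stream.readUnsignedByte();
        long b2 = stream.readUnsignedByte();
        long b3 = stream.readUnsignedByte();
        if (counted != null) {
            counted.nByte += 4;
        }
        return bigEndian
                ? (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
                : (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }

    // Convenience wrapper for trying the method on an in-memory byte array.
    static long fromBytes(byte[] bytes, boolean bigEndian, ByteCounter counted) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return readUnsignedInt(in, bigEndian, counted);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```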
    • readSignedLong

      public static long readSignedLong(DataInputStream stream, boolean bigEndian, ModuleBase counted) throws IOException
      Reads eight bytes as a signed 64-bit value from a DataInputStream.
      Parameters:
      stream - The stream to read from.
      bigEndian - If true, interpret the first byte as the high byte, otherwise interpret the first byte as the low byte.
      counted - If non-null, module for which value of _nByte shall be incremented appropriately
      Throws:
      IOException
    • readUnsignedRational

      public static Rational readUnsignedRational(DataInputStream stream, boolean endian) throws IOException
      Throws:
      IOException
    • readUnsignedRational

      public static Rational readUnsignedRational(DataInputStream stream, boolean endian, ModuleBase counted) throws IOException
      Throws:
      IOException
    • readUnsignedRational

      public static Rational readUnsignedRational(RandomAccessFile file, boolean endian) throws IOException
      Throws:
      IOException
    • readSignedRational

      public static Rational readSignedRational(DataInputStream stream, boolean endian, ModuleBase counted) throws IOException
      Throws:
      IOException
    • readSignedRational

      public static Rational readSignedRational(RandomAccessFile file, boolean endian) throws IOException
      Throws:
      IOException
    • readSignedByte

      public static int readSignedByte(RandomAccessFile file) throws IOException
      Throws:
      IOException
    • readSignedShort

      public static int readSignedShort(RandomAccessFile file, boolean endian) throws IOException
      Throws:
      IOException
    • readSignedInt

      public static int readSignedInt(RandomAccessFile file, boolean endian) throws IOException
      Throws:
      IOException
    • readSignedByte

      public static int readSignedByte(DataInputStream stream) throws IOException
      Throws:
      IOException
    • readSignedByte

      public static int readSignedByte(DataInputStream stream, ModuleBase counted) throws IOException
      Throws:
      IOException
    • readSignedShort

      public static int readSignedShort(DataInputStream stream, boolean endian) throws IOException
      Throws:
      IOException
    • readSignedShort

      public static int readSignedShort(DataInputStream stream, boolean endian, ModuleBase counted) throws IOException
      Throws:
      IOException
    • readSignedInt

      public static int readSignedInt(DataInputStream stream, boolean endian) throws IOException
      Throws:
      IOException
    • readSignedInt

      public static int readSignedInt(DataInputStream stream, boolean endian, ModuleBase counted) throws IOException
      Throws:
      IOException
    • readFloat

      public static float readFloat(RandomAccessFile file, boolean endian) throws IOException
      Throws:
      IOException
    • readFloat

      public static float readFloat(DataInputStream stream, boolean endian, ModuleBase counted) throws IOException
      Throws:
      IOException
    • readDouble

      public static double readDouble(RandomAccessFile file, boolean endian) throws IOException
      Throws:
      IOException
    • readDouble

      public static double readDouble(DataInputStream stream, boolean endian) throws IOException
      Throws:
      IOException
    • readDouble

      public static double readDouble(DataInputStream stream, boolean endian, ModuleBase counted) throws IOException
      Throws:
      IOException
    • skipBytes

      public long skipBytes(DataInputStream stream, long bytesToSkip) throws IOException
      Skip over some bytes. Return number of bytes skipped.
      Throws:
      IOException
    • skipBytes

      public long skipBytes(DataInputStream stream, long bytesToSkip, ModuleBase counted) throws IOException
      Skip over some bytes. Return number of bytes skipped.
      Throws:
      IOException
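      The return value matters because InputStream.skip may skip fewer bytes than requested. A skip loop that reports the number of bytes actually skipped might look like the sketch below. SkipSketch is hypothetical, takes a plain InputStream (a DataInputStream is one), and is not the ModuleBase implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of JHOVE.
class SkipSketch {
    // Skips up to bytesToSkip bytes and returns how many were skipped.
    // Stops early only at end of stream.
    static long skipFully(InputStream stream, long bytesToSkip) throws IOException {
        long total = 0;
        while (total < bytesToSkip) {
            long n = stream.skip(bytesToSkip - total);
            if (n > 0) {
                total += n;
                continue;
            }
            // skip() may return 0 before end of stream; probe with a one-byte read.
            if (stream.read() < 0) {
                break;
            }
            total++;
        }
        return total;
    }

    // Demo on an in-memory stream holding the given number of bytes.
    static long demo(int available, long bytesToSkip) {
        try (InputStream in = new ByteArrayInputStream(new byte[available])) {
            return skipFully(in, bytesToSkip);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```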
    • getBufferedDataStream

      public static DataInputStream getBufferedDataStream(InputStream stream, int size)
      A convenience method for getting a buffered DataInputStream from a module's InputStream. If the size specified is 0 or less, the default buffer size is used.
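      The size rule can be sketched with java.io.BufferedInputStream, whose two-argument constructor rejects a non-positive size; falling back to the one-argument constructor gives the default buffer. BufferedStreamSketch is a hypothetical illustration of that rule, not the method's source.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of JHOVE.
class BufferedStreamSketch {
    // Wraps the stream in a buffer of the requested size, or of the
    // default size when size is 0 or negative.
    static DataInputStream buffered(InputStream stream, int size) {
        BufferedInputStream buf = (size <= 0)
                ? new BufferedInputStream(stream)
                : new BufferedInputStream(stream, size);
        return new DataInputStream(buf);
    }

    // Demo: reads the first byte of an in-memory array through the wrapper.
    static int firstByte(byte[] content, int size) {
        try (DataInputStream in = buffered(new ByteArrayInputStream(content), size)) {
            return in.readUnsignedByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```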
    • vectorToPropArray

      protected Property[] vectorToPropArray(Vector vec)
      A utility for converting a Vector of Properties to an Array. It can be simpler to build a Vector and then call vectorToPropArray than to allocate an array and drop all the Properties into the correct indices. All the members of the Vector must be of type Property, or a ClassCastException will be thrown.
    • setupDataStream

      protected void setupDataStream(InputStream stream, RepInfo info)
    • checksumIfRafNotCopied

      protected void checksumIfRafNotCopied(RepInfo info, RandomAccessFile raf) throws IOException
      Throws:
      IOException
    • isParamInDefaults

      protected boolean isParamInDefaults(String paramVal)
    • skipDstreamToEnd

      protected void skipDstreamToEnd(RepInfo info)