Class WarcRecordData

java.lang.Object
edu.harvard.hul.ois.jhove.module.warc.WarcRecordData

public class WarcRecordData extends Object
Copied from the JHOVE2 WARC module. This class is a wrapper for the information available in a WARC record. Because the WARC reader is not persistent, its data must be copied into a simpler data class that can be persisted instead. Note: some populate methods currently contain no functionality. They are included for backwards compatibility in case the ISO standard changes and extra properties are required.
Author:
nicl
  • Field Details

    • startOffset

      protected Long startOffset
      Start offset of record in input stream.
    • consumed

      protected Long consumed
      Number of bytes consumed validating record.
    • warcVersionStr

      protected String warcVersionStr
      WARC version read from header.
    • warcType

      protected String warcType
      WARC-Type read from header.
    • warcFilename

      protected String warcFilename
      WARC-Filename read from header.
    • warcRecordId

      protected String warcRecordId
      WARC-Record-ID read from header.
    • warcDate

      protected String warcDate
      WARC-Date read from header.
    • contentLength

      protected String contentLength
      Content-Length read from header.
    • contentType

      protected String contentType
      Content-Type read from header.
    • warcTruncated

      protected String warcTruncated
      WARC-Truncated read from header.
    • warcIpAddress

      protected String warcIpAddress
      WARC-IP-Address read from header.
    • warcConcurrentToList

      protected List<String> warcConcurrentToList
      List of WARC-Concurrent-To read from header.
    • warcRefersTo

      protected String warcRefersTo
      WARC-Refers-To read from header.
    • warcTargetUri

      protected String warcTargetUri
      WARC-Target-URI read from header.
    • warcWarcinfoId

      protected String warcWarcinfoId
      WARC-Warcinfo-ID read from header.
    • warcIdentifiedPayloadType

      protected String warcIdentifiedPayloadType
      WARC-Identified-Payload-Type read from header.
    • warcProfile

      protected String warcProfile
      WARC-Profile read from header.
    • warcSegmentNumber

      protected String warcSegmentNumber
      WARC-Segment-Number read from header.
    • warcSegmentOriginId

      protected String warcSegmentOriginId
      WARC-Segment-Origin-ID read from header.
    • warcSegmentTotalLength

      protected String warcSegmentTotalLength
      WARC-Segment-Total-Length read from header.
    • warcBlockDigest

      protected String warcBlockDigest
      Block digest read from header.
    • warcBlockDigestAlgorithm

      protected String warcBlockDigestAlgorithm
      Block digest algorithm read from header.
    • warcBlockDigestEncoding

      protected String warcBlockDigestEncoding
      Block digest encoding auto-detected from digest and algorithm.
    • warcPayloadDigest

      protected String warcPayloadDigest
      Payload digest read from header.
    • warcPayloadDigestAlgorithm

      protected String warcPayloadDigestAlgorithm
      Payload digest algorithm read from header.
    • warcPayloadDigestEncoding

      protected String warcPayloadDigestEncoding
      Payload digest encoding auto-detected from digest and algorithm.
    • computedBlockDigest

      protected String computedBlockDigest
      Computed block digest.
    • computedBlockDigestAlgorithm

      protected String computedBlockDigestAlgorithm
      Computed block digest algorithm.
    • computedBlockDigestEncoding

      protected String computedBlockDigestEncoding
      Computed block digest encoding.
    • computedPayloadDigest

      protected String computedPayloadDigest
      Computed payload digest, if applicable.
    • computedPayloadDigestAlgorithm

      protected String computedPayloadDigestAlgorithm
      Computed payload digest algorithm, if applicable.
    • computedPayloadDigestEncoding

      protected String computedPayloadDigestEncoding
      Computed payload digest encoding, if applicable.
    • recordIdScheme

      protected String recordIdScheme
      URI scheme used by the WARC-Record-ID.
    • bIsNonCompliant

      protected Boolean bIsNonCompliant
      Boolean indicating whether this record is non-compliant.
    • isValidBlockDigest

      protected Boolean isValidBlockDigest
      Boolean indicating whether the block digest is valid or not.
    • isValidPayloadDigest

      protected Boolean isValidPayloadDigest
      Boolean indicating whether the payload digest is valid or not.
    • bHasPayload

      protected Boolean bHasPayload
      Boolean indicating whether this record has a payload.
    • payloadLength

      protected String payloadLength
      Payload length, without payload header (version block/HTTP header).
    • ipVersion

      protected String ipVersion
      IP version of WARC-IP-Address (4 or 6).
    • resultCode

      protected String resultCode
      Result-code read from HTTP header, if present.
    • protocolVersion

      protected String protocolVersion
      Protocol version read from HTTP header, if present.
    • protocolContentType

      protected String protocolContentType
      Content-Type read from HTTP header, if present.
    • protocolServer

      protected String protocolServer
      Server header entry read from HTTP header, if present.
    • protocolUserAgent

      protected String protocolUserAgent
      User-Agent header entry read from HTTP header, if present.
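The digest fields above describe an encoding that is "auto-detected from digest and algorithm" and a validity flag for each digest. The following is a minimal sketch of how such detection and comparison could work, not the module's actual implementation. The class DigestEncodingSketch and all of its methods are hypothetical names; the heuristic (match the digest string's length and character set against the algorithm's digest size) is an assumption. Base32 comparison is left out because the JDK ships no Base32 encoder.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class; none of these names come from the JHOVE module.
class DigestEncodingSketch {

    // Computes the raw digest of the given bytes with a JCA algorithm name such as "SHA-1".
    static byte[] compute(byte[] data, String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
    }

    // Lower-case hexadecimal (base16) rendering of a byte array.
    static String toBase16(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Assumed heuristic: guess the encoding from the expected encoded length
    // for this algorithm's digest size, plus the characters used.
    static String detectEncoding(String digest, String algorithm) {
        int n = compute(new byte[0], algorithm).length; // digest size in bytes
        int len = digest.length();
        if (len == n * 2 && digest.matches("[0-9A-Fa-f]+")) return "base16";
        if (len == (n + 4) / 5 * 8 && digest.matches("[A-Za-z2-7]+=*")) return "base32";
        if (len == (n + 2) / 3 * 4 && digest.matches("[A-Za-z0-9+/]+=*")) return "base64";
        return null; // encoding not recognised
    }

    // A header digest counts as valid when it equals the digest computed
    // over the same bytes, rendered in the detected encoding.
    static boolean isValid(String headerDigest, byte[] block, String algorithm) {
        String encoding = detectEncoding(headerDigest, algorithm);
        byte[] computed = compute(block, algorithm);
        if ("base16".equals(encoding)) {
            return headerDigest.equalsIgnoreCase(toBase16(computed));
        }
        if ("base64".equals(encoding)) {
            return headerDigest.equals(Base64.getEncoder().encodeToString(computed));
        }
        return false; // base32 comparison omitted in this sketch
    }

    public static void main(String[] args) {
        byte[] block = "hello".getBytes(StandardCharsets.UTF_8);
        String headerDigest = toBase16(compute(block, "SHA-1"));
        System.out.println("encoding: " + detectEncoding(headerDigest, "SHA-1"));
        System.out.println("valid:    " + isValid(headerDigest, block, "SHA-1"));
    }
}
```

Checking the character set as well as the length matters for MD5, where a hexadecimal digest and a padded Base32 digest are both 32 characters long.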
  • Constructor Details

    • WarcRecordData

      public WarcRecordData()
      Constructor required by the persistence layer.
    • WarcRecordData

      public WarcRecordData(org.jwat.warc.WarcRecord record)
      Constructs an object using the data in the WarcRecord object.
      Parameters:
      record - parsed WARC record
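
The two constructors reflect the pattern described in the class summary: a no-argument constructor for the persistence layer, and a copying constructor that moves values out of the reader's record into plain fields. The sketch below illustrates that pattern only. ParsedRecord is a hypothetical stand-in for org.jwat.warc.WarcRecord, RecordData is a cut-down stand-in for this class, and dispatching to populate methods by WARC-Type is an assumption about how the populate methods are selected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names and structure are assumptions, not the JHOVE code.
class RecordDataSketch {

    // Stand-in for a parsed record that is tied to a non-persistent reader.
    static class ParsedRecord {
        String type;
        String recordId;
        String targetUri;
        long startOffset;
        List<String> concurrentTo = new ArrayList<>();
    }

    // Plain data holder that keeps no reference to the reader or its record.
    static class RecordData {
        protected Long startOffset;
        protected String warcType;
        protected String warcRecordId;
        protected String warcTargetUri;
        protected List<String> warcConcurrentToList = new ArrayList<>();

        // No-argument constructor, as a persistence layer would typically require.
        public RecordData() {
        }

        // Copies values out of the parsed record so they outlive the reader.
        public RecordData(ParsedRecord record) {
            startOffset = record.startOffset;
            warcType = record.type;
            warcRecordId = record.recordId;
            // Defensive copy: later changes to the record do not affect this object.
            warcConcurrentToList.addAll(record.concurrentTo);
            // Assumed dispatch on WARC-Type to a per-type populate method.
            if ("warcinfo".equals(warcType)) {
                populateWarcinfo(record);
            } else if ("response".equals(warcType)) {
                populateResponse(record);
            }
        }

        // Placeholder with no functionality, kept so type-specific properties can be added later.
        protected void populateWarcinfo(ParsedRecord record) {
        }

        protected void populateResponse(ParsedRecord record) {
            warcTargetUri = record.targetUri;
        }
    }

    public static void main(String[] args) {
        ParsedRecord record = new ParsedRecord();
        record.type = "response";
        record.recordId = "<urn:uuid:1234>";
        record.targetUri = "http://example.org/";
        record.startOffset = 42L;
        record.concurrentTo.add("<urn:uuid:5678>");

        RecordData data = new RecordData(record);
        record.concurrentTo.clear(); // the copy is unaffected
        System.out.println(data.warcType + " " + data.warcTargetUri
                + " " + data.warcConcurrentToList);
    }
}
```

Copying the list rather than sharing it keeps the data object independent of the record once the reader moves on.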