org.fcrepo.server.storage.translation
Class DOTranslationUtility

java.lang.Object
  extended by org.fcrepo.server.storage.translation.DOTranslationUtility
All Implemented Interfaces:
Constants

public abstract class DOTranslationUtility
extends Object
implements Constants

Utility methods for use by digital object serializers and deserializers. This class provides methods for detecting various forms of relative repository URLs, which are URLs that point to the hostname and port of the local repository. Methods will detect these kinds of URLs in datastream location fields and in special cases of inline XML. Methods are available to convert these URLs back and forth between relative URL syntax, Fedora's internal local URL syntax, and absolute URL syntax. This utility class defines different "translation contexts", and the format of these relative URLs will be set as appropriate to the context. Currently defined translation contexts are:
0=Deserialize XML into java object appropriate for in-memory usage
1=Serialize java object to XML appropriate for "public" export (absolute URLs)
2=Serialize java object to XML appropriate for move/migrate to another repository
3=Serialize java object to XML appropriate for internal storage
4=Serialize java object to XML appropriate for a stand alone archive export

The public "normalize*" methods in this class should be called to make the right decisions about what conversions should occur for what contexts. Other utility methods set default values for datastreams and disseminators.

Version:
$Id$
Author:
Sandy Payette

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.fcrepo.common.Constants
Constants.FedoraHome
 
Field Summary
static int AS_IS
          Deserialize or Serialize as is.
static int DESERIALIZE_INSTANCE
          DESERIALIZE_INSTANCE: Deserialize XML into a java object appropriate for in-memory usage.
static Pattern s_getItemPattern
           
static int SERIALIZE_EXPORT_ARCHIVE
          SERIALIZE_EXPORT_ARCHIVE: Serialize digital object to XML in a manner appropriate for creating a stand alone archive of objects from a repository that will NOT be available after objects have been exported.
static int SERIALIZE_EXPORT_MIGRATE
          SERIALIZE_EXPORT_MIGRATE: Serialize digital object to XML in a manner appropriate for migrating or moving objects from one repository to another.
static int SERIALIZE_EXPORT_PUBLIC
          SERIALIZE_EXPORT_PUBLIC: Serialize digital object to XML appropriate for "public" external use.
static int SERIALIZE_STORAGE_INTERNAL
          SERIALIZE_STORAGE_INTERNAL: Serialize java object to XML appropriate for persistent storage in the repository, ensuring that any URLs that are relative to the local repository are stored with the Fedora local URL syntax.
 
Fields inherited from interface org.fcrepo.common.Constants
ACCESS, ACTION, API, ATOM_APIM1_0, ATOM_ZIP1_1, ATOM1_1, AUDIT, AUDIT1_0, BATCH_MODIFY, BATCH_MODIFY1_1, BE_SECURITY, BE_SECURITY1_0, BINDING_SPEC, DATASTREAM, DC, DISSEMINATOR, DS_COMPOSITE_MODEL, DS_COMPOSITE_MODEL1_0, DS_INPUT_SPEC1_0, DS_INPUT_SPEC1_1, ENVIRONMENT, FCFG, FEDORA, FEDORA_APP_CONTEXT_NAME, FEDORA_DEFAULT_APP_CONTEXT, FEDORA_HOME, FEDORA_REPOSITORY_PID, FOXML, FOXML1_0, FOXML1_0_LEGACY, FOXML1_1, HTTP_REQUEST, MANAGEMENT, METHOD_MAP, METS, METS_EXT, METS_EXT1_0, METS_EXT1_0_LEGACY, METS_EXT1_1, MODEL, MULGARA, OAI_DC, OAI_DC2_0, OAI_FRIENDS, OAI_FRIENDS2_0, OAI_IDENTIFIER, OAI_IDENTIFIER2_0, OAI_PMH, OAI_PMH2_0, OAI_PROV, OAI_PROV2_0, OBJ_DATASTREAMS1_0, OBJ_HISTORY1_0, OBJ_ITEMS1_0, OBJ_METHODS1_0, OBJ_PROFILE1_0, OBJ_VALIDATION1_0, OBJECT, OLD_XLINK, PID_LIST1_0, RDF, RDF_XSD, RECOVERY, RELS_EXT, RELS_EXT1_0, RELS_INT1_0, REPO_DESC1_0, RESOURCE, SDEF, SDEF_METHOD_MAP1_0, SDEP, SDEP_METHOD_MAP1_0, SDEP_METHOD_MAP1_1, SERVICE_PROFILE, SOAP, SOAP_ENC, SUBJECT, TYPES, VIEW, WSDL, WSDL_HTTP, WSDL_MIME, XACML_POLICY, XACML_POLICY1_0, XACML1, XACML1_ACTION, XACML1_POLICY, XACML1_RESOURCE, XACML1_SUBJECT, XACML2_POLICY_SCHEMA, XLINK, XML_XSD, XMLNS, XSI
 
Constructor Summary
DOTranslationUtility()
           
 
Method Summary
protected static void appendAuditTrail(DigitalObject obj, PrintWriter writer)
           
protected static void appendXMLStream(InputStream in, PrintWriter writer, String encoding)
          Appends XML to a PrintWriter.
protected static List<AuditRecord> getAuditRecords(InputStream auditTrail)
          Parse an audit:auditTrail and return a list of AuditRecords.
protected static List<AuditRecord> getAuditRecords(Reader auditTrail)
           
protected static List<AuditRecord> getAuditRecords(XMLEventReader reader)
           
protected static String getAuditTrail(DigitalObject obj)
           
static String getStateAttribute(DigitalObject obj)
          Reads the state attribute from a DigitalObject.
static RDFName getTypeAttribute(DigitalObject obj)
           
static String makeAbsoluteURLs(String input)
          Make URLs that are relative to the local Fedora repository ABSOLUTE URLs.
static String makeFedoraLocalURLs(String input)
          Detect all forms of URLs that point to the local Fedora repository and make sure they are encoded in the special Fedora local URL syntax ("http://local.fedora.server/...").
static void normalizeDatastreams(DigitalObject obj, int transContext, String characterEncoding)
           
static Datastream normalizeDSLocationURLs(String PID, Datastream origDS, int transContext)
           
static String normalizeInlineXML(String xml, int transContext)
          Utility method to normalize a chunk of inline XML depending on the translation context.
protected static String oneString(String[] idList)
           
static String readStateAttribute(String rawValue)
          Parse and read the object state value from raw text.
static Datastream setDatastreamDefaults(Datastream ds)
          Check for null values in attributes and set them to empty string so 'null' does not appear in XML attribute values.
static Disseminator setDisseminatorDefaults(Disseminator diss)
          Deprecated. 
protected static void validateAudit(AuditRecord audit)
          The audit record is created by the system, so programmatic validation here is o.k.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DESERIALIZE_INSTANCE

public static final int DESERIALIZE_INSTANCE
DESERIALIZE_INSTANCE: Deserialize XML into a java object appropriate for in-memory usage. This will make the value of relative repository URLs appropriate for instantiations of the digital object in memory. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository are converted to absolute URLs using the currently configured hostname:port of the repository. To do this, the dsLocation is searched for instances of the Fedora local URL string ("http://local.fedora.server"), which is the way Fedora internally keeps track of relative repository URLs. For Managed Content (M) datastreams, the internal identifiers are instantiated as is. Also, certain reserved inline XML datastreams (WSDL and SERVICE_PROFILE) are searched for relative repository URLs and they are made absolute.

See Also:
Constant Field Values

SERIALIZE_EXPORT_PUBLIC

public static final int SERIALIZE_EXPORT_PUBLIC
SERIALIZE_EXPORT_PUBLIC: Serialize digital object to XML appropriate for "public" external use. This context is appropriate when the exporting repository will continue to exist and will continue to support callback URLs for datastream content and disseminations. This gives a "public" export of an object in which all relative repository URLs AND internal identifiers are converted to absolute callback URLs. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository are converted to absolute URLs using the currently configured hostname:port of the repository. For Managed Content (M) datastreams, the internal identifiers in dsLocation are converted to default dissemination URLs so they can serve as callbacks to the repository to obtain the internally managed content. Also, selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) are searched for relative repository URLs and they are made absolute.

See Also:
Constant Field Values

SERIALIZE_EXPORT_MIGRATE

public static final int SERIALIZE_EXPORT_MIGRATE
SERIALIZE_EXPORT_MIGRATE: Serialize digital object to XML in a manner appropriate for migrating or moving objects from one repository to another. This context is appropriate when the local repository will NOT be available after objects have been migrated to a new repository. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository will be expressed with the Fedora local URL syntax (which consists of the string "local.fedora.server" standing in place of the actual "hostname:port"). This enables a new repository to ingest the serialization and maintain the relative nature of the URLs (they will become relative to the *new* repository). Also, for Managed Content (M) datastreams, the internal identifiers in dsLocation are converted to default dissemination URLs. This enables the new repository to call back to the old repository to obtain the content bytestream to be stored in the new repository. Also, within selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) any URLs that are relative to the local repository will also be expressed with the Fedora local URL syntax.

See Also:
Constant Field Values

SERIALIZE_STORAGE_INTERNAL

public static final int SERIALIZE_STORAGE_INTERNAL
SERIALIZE_STORAGE_INTERNAL: Serialize java object to XML appropriate for persistent storage in the repository, ensuring that any URLs that are relative to the local repository are stored with the Fedora local URL syntax. The Fedora local URL syntax consists of the string "local.fedora.server" standing in place of the actual "hostname:port" in the URL. Managed Content (M) datastreams are stored with internal identifiers in dsLocation. Also, within selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) any URLs that are relative to the local repository will also be stored with the Fedora local URL syntax. Note that a view of the storage serialization can be obtained via the getObjectXML method of API-M.

See Also:
Constant Field Values

SERIALIZE_EXPORT_ARCHIVE

public static final int SERIALIZE_EXPORT_ARCHIVE
SERIALIZE_EXPORT_ARCHIVE: Serialize digital object to XML in a manner appropriate for creating a stand alone archive of objects from a repository that will NOT be available after objects have been exported. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository will be expressed with the Fedora local URL syntax (which consists of the string "local.fedora.server" standing in place of the actual "hostname:port"). This enables a new repository to ingest the serialization and maintain the relative nature of the URLs (they will become relative to the *new* repository). Also, for Managed Content (M) datastreams, the internal identifiers in dsLocation are converted to default dissemination URLs, and the contents of the URLs are included inline via base-64 encoding. This enables the new repository to recreate the content bytestream to be stored in the new repository when the original repository is no longer available. Also, within selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) any URLs that are relative to the local repository will also be expressed with the Fedora local URL syntax.

See Also:
Constant Field Values

AS_IS

public static final int AS_IS
Deserialize or Serialize as is. This context doesn't attempt to do any conversion of URLs.

See Also:
Constant Field Values

s_getItemPattern

public static Pattern s_getItemPattern
Constructor Detail

DOTranslationUtility

public DOTranslationUtility()
Method Detail

makeAbsoluteURLs

public static String makeAbsoluteURLs(String input)
Make URLs that are relative to the local Fedora repository ABSOLUTE URLs. First, see if any URLs are expressed in relative URL syntax (beginning with "fedora/get" or "fedora/search") and convert these to the special Fedora local URL syntax ("http://local.fedora.server/..."). Then look for all URLs that contain the special Fedora local URL syntax and replace instances of this string with the actual host:port configured for the repository. This ensures that all forms of relative repository URLs are converted to proper absolute URLs that reference the hostname:port of the local Fedora repository.

Examples:
"http://local.fedora.server/fedora/get/demo:1/DS1" is converted to "http://myrepo.com:8080/fedora/get/demo:1/DS1"
"fedora/get/demo:1/DS1" is converted to "http://myrepo.com:8080/fedora/get/demo:1/DS1"
"http://local.fedora.server/fedora/get/demo:1/sdef:1/getFoo?in=" http://local.fedora.server/fedora/get/demo:2/DC" is converted to "http://myrepo.com:8080/fedora/get/demo:1/sdef:1/getFoo?in=" http://myrepo.com:8080/fedora/get/demo:2/DC"

Parameters:
input - String that may contain relative repository URLs or Fedora local URLs.
Returns:
String with all relative repository URLs and Fedora local URLs converted to absolute URL syntax.
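
The two steps described for makeAbsoluteURLs can be sketched with plain string handling. This is an illustration only, not the Fedora implementation: the class name AbsoluteUrlSketch, the method makeAbsolute, the explicit hostPort parameter (the real method reads host:port from the repository configuration), and the regular expression used to recognize relative URLs are all assumptions made for the example.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; names and matching rules are assumptions. */
final class AbsoluteUrlSketch {

    // The Fedora local URL string named in the documentation.
    static final String LOCAL = "http://local.fedora.server";

    // Assumption: a relative repository URL is "fedora/get" or "fedora/search"
    // that is not preceded by '/' or a word character (so it is not already
    // part of an absolute or local URL).
    private static final Pattern RELATIVE =
            Pattern.compile("(?<![/\\w])fedora/(get|search)");

    static String makeAbsolute(String input, String hostPort) {
        // Step 1: relative URL syntax -> Fedora local URL syntax.
        String local = RELATIVE.matcher(input).replaceAll(LOCAL + "/fedora/$1");
        // Step 2: Fedora local URL syntax -> configured host:port.
        return local.replace(LOCAL, "http://" + hostPort);
    }

    public static void main(String[] args) {
        String hostPort = "myrepo.com:8080"; // example value from the documentation
        System.out.println(makeAbsolute("fedora/get/demo:1/DS1", hostPort));
        System.out.println(makeAbsolute(LOCAL + "/fedora/get/demo:1/DS1", hostPort));
    }
}
```

Routing relative URLs through the local syntax first means the final substitution only has to look for one marker string.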

makeFedoraLocalURLs

public static String makeFedoraLocalURLs(String input)
Detect all forms of URLs that point to the local Fedora repository and make sure they are encoded in the special Fedora local URL syntax ("http://local.fedora.server/..."). First, look for relative URLs that begin with "fedora/get" or "fedora/search" and replace instances of these string patterns with the special Fedora relative URL syntax. Then, look for absolute URLs that have a host:port equal to the host:port currently configured for the Fedora repository and replace host:port with the special string. The special Fedora relative URL string provides a consistent, unique string that can be easily searched for and either converted back to an absolute URL or to a URL relative to the repository.

Examples:
"http://myrepo.com:8080/fedora/get/demo:1/DS1" is converted to "http://local.fedora.server/fedora/get/demo:1/DS1"
"https://myrepo.com:8443/fedora/get/demo:1/sdef:1/getFoo?in=" http://myrepo.com:8080/fedora/get/demo:2/DC" is converted to "http://local.fedora.server/fedora/get/demo:1/sdef:1/getFoo?in=" http://local.fedora.server/fedora/get/demo:2/DC"
"http://myrepo.com:8080/saxon..." (internal service in sDep WSDL) is converted to "http://local.fedora.server/saxon..."

Parameters:
input -
Returns:
String with all forms of relative repository URLs converted to the Fedora local URL syntax.
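
The reverse conversion can be sketched in the same way. Again this is only an illustration under stated assumptions: LocalUrlSketch and makeLocal are hypothetical names, the host and the two ports are passed in explicitly rather than read from the server configuration, and the plain substring replacement is a simplification (it does not check that the port number ends where the match ends).

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; names and matching rules are assumptions. */
final class LocalUrlSketch {

    // The Fedora local URL string named in the documentation.
    static final String LOCAL = "http://local.fedora.server";

    // Assumption: relative URLs are recognized by a leading "fedora/get" or
    // "fedora/search" that is not already part of a longer URL path.
    private static final Pattern RELATIVE =
            Pattern.compile("(?<![/\\w])fedora/(get|search)");

    static String makeLocal(String input, String host, int httpPort, int httpsPort) {
        // Step 1: relative URL syntax -> Fedora local URL syntax.
        String out = RELATIVE.matcher(input).replaceAll(LOCAL + "/fedora/$1");
        // Step 2: absolute URLs on the configured host:port -> local syntax.
        out = out.replace("http://" + host + ":" + httpPort, LOCAL);
        out = out.replace("https://" + host + ":" + httpsPort, LOCAL);
        return out;
    }

    public static void main(String[] args) {
        // Host and ports are the example values used in the documentation.
        System.out.println(makeLocal(
                "http://myrepo.com:8080/fedora/get/demo:1/DS1", "myrepo.com", 8080, 8443));
        System.out.println(makeLocal("fedora/search", "myrepo.com", 8080, 8443));
    }
}
```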

normalizeDSLocationURLs

public static Datastream normalizeDSLocationURLs(String PID,
                                                 Datastream origDS,
                                                 int transContext)

normalizeInlineXML

public static String normalizeInlineXML(String xml,
                                        int transContext)
Utility method to normalize a chunk of inline XML depending on the translation context. This is mainly to deal with certain inline XML datastreams found in Service Deployment objects that may contain a service URL that references the host:port of the local Fedora server. This method will usually only be called to check WSDL and SERVICE_PROFILE inline XML datastreams, but it is of general utility for dealing with any datastreams that may contain URLs that reference the local Fedora server. However, this method should be used sparingly, and only on inline XML datastreams where the impact of the conversions is well understood.

Parameters:
xml - a chunk of XML that is the content of an inline XML datastream
transContext - Integer value indicating the serialization or deserialization context. Valid values are defined as constants in org.fcrepo.server.storage.translation.DOTranslationUtility:
0=DOTranslationUtility.DESERIALIZE_INSTANCE
1=DOTranslationUtility.SERIALIZE_EXPORT_PUBLIC
2=DOTranslationUtility.SERIALIZE_EXPORT_MIGRATE
3=DOTranslationUtility.SERIALIZE_STORAGE_INTERNAL
4=DOTranslationUtility.SERIALIZE_EXPORT_ARCHIVE
Returns:
the inline XML contents with appropriate conversions.
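
The per-context behavior for inline XML, as described in the Field Detail entries, can be summarized as a dispatch: two contexts make URLs absolute, three express them in the Fedora local URL syntax, and AS_IS leaves the text alone. The sketch below is a hypothetical stand-in, not the real method: it uses an enum instead of the int constants, and the two conversions are passed in as functions so the example stays self-contained.

```java
import java.util.function.UnaryOperator;

/** Illustrative sketch only; the enum and method names are assumptions. */
final class InlineXmlSketch {

    // Stand-in for the int constants defined on DOTranslationUtility.
    enum Context {
        DESERIALIZE_INSTANCE, SERIALIZE_EXPORT_PUBLIC, SERIALIZE_EXPORT_MIGRATE,
        SERIALIZE_STORAGE_INTERNAL, SERIALIZE_EXPORT_ARCHIVE, AS_IS
    }

    static String normalize(String xml, Context ctx,
                            UnaryOperator<String> toAbsolute,
                            UnaryOperator<String> toLocal) {
        switch (ctx) {
            case DESERIALIZE_INSTANCE:
            case SERIALIZE_EXPORT_PUBLIC:
                // In-memory use and public export: URLs are made absolute.
                return toAbsolute.apply(xml);
            case SERIALIZE_EXPORT_MIGRATE:
            case SERIALIZE_STORAGE_INTERNAL:
            case SERIALIZE_EXPORT_ARCHIVE:
                // Migration, storage and archive: Fedora local URL syntax.
                return toLocal.apply(xml);
            default:
                // AS_IS: no URL conversion is attempted.
                return xml;
        }
    }

    public static void main(String[] args) {
        // Simplified conversions using the example host:port from the documentation.
        UnaryOperator<String> abs =
                s -> s.replace("http://local.fedora.server", "http://myrepo.com:8080");
        UnaryOperator<String> loc =
                s -> s.replace("http://myrepo.com:8080", "http://local.fedora.server");
        String xml = "<a href=\"http://local.fedora.server/saxon\"/>";
        System.out.println(normalize(xml, Context.SERIALIZE_EXPORT_PUBLIC, abs, loc));
    }
}
```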

setDatastreamDefaults

public static Datastream setDatastreamDefaults(Datastream ds)
                                        throws ObjectIntegrityException
Check for null values in attributes and set them to empty string so 'null' does not appear in XML attribute values. This helps in XML validation of required attributes. If 'null' is the attribute value, then validation would incorrectly consider it a valid non-empty value. Also, we set some other default values here.

Parameters:
ds - The Datastream object to work on.
Returns:
The Datastream object with defaults set.
Throws:
ObjectIntegrityException
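
The reason for replacing null with an empty string can be seen in a few lines. In Java, concatenating a null String produces the literal text "null", which a validator then sees as a non-empty attribute value. The sketch below is a generic illustration: NullAttributeSketch, orEmpty, attr and the attribute name LABEL are hypothetical and do not reflect the fields of the Datastream class.

```java
/** Illustrative sketch only; not the Datastream API. */
final class NullAttributeSketch {

    // Replace a null attribute value with the empty string.
    static String orEmpty(String value) {
        return value == null ? "" : value;
    }

    // Naive attribute serialization by string concatenation.
    static String attr(String name, String value) {
        return name + "=\"" + value + "\"";
    }

    public static void main(String[] args) {
        String label = null;
        // Without defaulting, the text "null" ends up in the attribute value.
        System.out.println(attr("LABEL", label));
        // With defaulting, the attribute is serialized as empty.
        System.out.println(attr("LABEL", orEmpty(label)));
    }
}
```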

appendXMLStream

protected static void appendXMLStream(InputStream in,
                                      PrintWriter writer,
                                      String encoding)
                               throws ObjectIntegrityException,
                                      UnsupportedEncodingException,
                                      StreamIOException
Appends XML to a PrintWriter. Essentially, just appends all text content of the inputStream, trimming any leading and trailing whitespace. It does this in a streaming fashion, with resource consumption entirely comprised of fixed internal buffers.

Parameters:
in - InputStream containing serialized XML.
writer - PrintWriter to write XML content to.
encoding - Character set encoding.
Throws:
ObjectIntegrityException
UnsupportedEncodingException
StreamIOException
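
A streaming copy that trims leading and trailing whitespace might look like the sketch below. It is a simplified illustration, not the Fedora code: StreamTrimSketch and appendTrimmed are hypothetical names, it throws plain IOException rather than the Fedora exception types, and unlike the method documented above it holds a run of whitespace in a growable buffer until it knows whether the run is trailing, so its memory use is not strictly fixed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; names and error handling are assumptions. */
final class StreamTrimSketch {

    static void appendTrimmed(InputStream in, PrintWriter writer, String encoding)
            throws IOException {
        Reader reader = new InputStreamReader(in, encoding);
        char[] buf = new char[4096];
        // Whitespace seen after some content; written only if more content follows.
        StringBuilder pending = new StringBuilder();
        boolean started = false;
        int n;
        while ((n = reader.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (Character.isWhitespace(c)) {
                    if (started) {
                        pending.append(c); // may turn out to be trailing
                    }
                    // Leading whitespace is simply skipped.
                } else {
                    if (pending.length() > 0) {
                        writer.append(pending); // it was interior whitespace
                        pending.setLength(0);
                    }
                    writer.print(c);
                    started = true;
                }
            }
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "  \n<a> x </a>\n ".getBytes(StandardCharsets.UTF_8);
        PrintWriter out = new PrintWriter(System.out);
        appendTrimmed(new ByteArrayInputStream(xml), out, "UTF-8");
        out.println();
        out.flush();
    }
}
```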

normalizeDatastreams

public static void normalizeDatastreams(DigitalObject obj,
                                        int transContext,
                                        String characterEncoding)
                                 throws UnsupportedEncodingException
Throws:
UnsupportedEncodingException

setDisseminatorDefaults

@Deprecated
public static Disseminator setDisseminatorDefaults(Disseminator diss)
                                            throws ObjectIntegrityException
Deprecated. 

Throws:
ObjectIntegrityException

oneString

protected static String oneString(String[] idList)

getStateAttribute

public static String getStateAttribute(DigitalObject obj)
                                throws ObjectIntegrityException
Reads the state attribute from a DigitalObject.

Null or empty strings are interpreted as "Active".

Parameters:
obj - Object that potentially contains object state data.
Returns:
String containing full state value (Active, Inactive, or Deleted)
Throws:
ObjectIntegrityException - thrown when the state cannot be parsed.

readStateAttribute

public static String readStateAttribute(String rawValue)
                                 throws ParseException
Parse and read the object state value from raw text.

Reads a text representation of object state, and returns a "state code" abbreviation corresponding to that state. Null or empty values are interpreted as "Active".

XXX: It might be clearer to nix state codes altogether and just use the full value

Parameters:
rawValue - Raw string to parse. May be null
Returns:
String containing the state code (A, D, or I)
Throws:
ParseException - thrown when state value cannot be determined
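
The mapping from a state value to its state code can be sketched as follows. This is an assumption-laden illustration: StateCodeSketch and readState are hypothetical names, the sketch accepts only the exact strings "Active", "Inactive" and "Deleted" (the real method may be more lenient about case or abbreviations), and java.text.ParseException is used here as a convenient stand-in for the declared exception.

```java
import java.text.ParseException;

/** Illustrative sketch only; accepted inputs are an assumption. */
final class StateCodeSketch {

    static String readState(String rawValue) throws ParseException {
        // Null or empty values are interpreted as "Active".
        if (rawValue == null || rawValue.isEmpty()) {
            return "A";
        }
        switch (rawValue) {
            case "Active":
                return "A";
            case "Inactive":
                return "I";
            case "Deleted":
                return "D";
            default:
                // The state value cannot be determined.
                throw new ParseException("Unknown object state: " + rawValue, 0);
        }
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(readState(null));
        System.out.println(readState("Inactive"));
    }
}
```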

getTypeAttribute

public static RDFName getTypeAttribute(DigitalObject obj)
                                throws ObjectIntegrityException
Throws:
ObjectIntegrityException

validateAudit

protected static void validateAudit(AuditRecord audit)
                             throws ObjectIntegrityException
The audit record is created by the system, so programmatic validation here is o.k. Normally, validation takes place via XML Schema and Schematron.

Parameters:
audit -
Throws:
ObjectIntegrityException

getAuditTrail

protected static String getAuditTrail(DigitalObject obj)
                               throws ObjectIntegrityException
Throws:
ObjectIntegrityException

appendAuditTrail

protected static void appendAuditTrail(DigitalObject obj,
                                       PrintWriter writer)
                                throws ObjectIntegrityException
Throws:
ObjectIntegrityException

getAuditRecords

protected static List<AuditRecord> getAuditRecords(XMLEventReader reader)
                                            throws XMLStreamException
Throws:
XMLStreamException

getAuditRecords

protected static List<AuditRecord> getAuditRecords(InputStream auditTrail)
                                            throws XMLStreamException
Parse an audit:auditTrail and return a list of AuditRecords.

Parameters:
auditTrail -
Returns:
a list of AuditRecords parsed from the audit trail
Throws:
XMLStreamException
Since:
3.0

getAuditRecords

protected static List<AuditRecord> getAuditRecords(Reader auditTrail)
                                            throws XMLStreamException
Throws:
XMLStreamException


Copyright © 2012 DuraSpace. All Rights Reserved.