Class MetadataWebService
- All Implemented Interfaces:
NamespaceContext, CurationTask
http://www.sherpa.ac.uk/romeo/api29.php?issn={dc.identifier.issn}
The task substitutes the value of the passed item's metadata field at the {parameter} position. If the item field holds multiple values, the first value is used.
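The substitution described above can be sketched as follows. This is a minimal illustration, not the task's actual code: the class name `UrlTemplateSketch`, the method `fill`, and the use of a plain `Map` in place of a DSpace item are all assumptions made for the example. It does not handle transform prefixes or URL encoding.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the DSpace API.
final class UrlTemplateSketch {
    // Matches one {field} placeholder, e.g. {dc.identifier.issn}
    private static final Pattern PARAM = Pattern.compile("\\{([^{}]+)\\}");

    // Returns the template with its placeholder replaced by the first
    // metadata value, or null if the item has no value for that field.
    static String fill(String template, Map<String, List<String>> metadata) {
        Matcher m = PARAM.matcher(template);
        if (!m.find()) {
            return template; // nothing to substitute
        }
        List<String> values = metadata.get(m.group(1));
        if (values == null || values.isEmpty()) {
            return null;
        }
        // Only the first value is used when the field is multi-valued.
        return template.substring(0, m.start()) + values.get(0)
                + template.substring(m.end());
    }

    public static void main(String[] args) {
        String url = fill(
                "http://www.sherpa.ac.uk/romeo/api29.php?issn={dc.identifier.issn}",
                Map.of("dc.identifier.issn", List.of("1234-5678", "8765-4321")));
        System.out.println(url);
    }
}
```

The ISSN values in `main` are placeholders chosen for the example.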
The task uses another property (the datamap) to determine what data to extract from the service response and how to use it, e.g.
//publisher/name=>dc.publisher,//romeocolour
The task evaluates the left-hand side (or the entire token) of each
comma-separated token in the property as an XPath 1.0 expression against
the response document. If the token has a mapping symbol (e.g. '=>') and
a value, the task assigns the response document value(s) to the named
metadata field in the passed item. If the response document contains
multiple values, they are all assigned to the item field. The
mapping symbol governs the nature of metadata field assignment:
- '->' mapping will add to any existing values in the item field
- '=>' mapping will replace any existing values in the item field
- '~>' mapping will add *only* if item field has no existing values
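To make the datamap syntax concrete, here is a hedged sketch of how such a property could be split into entries. The names `DataMapSketch`, `Entry`, and `Mode` are hypothetical and do not come from the class itself. The sketch splits naively on commas, so it would mis-handle XPath expressions that themselves contain commas, and it leaves any transform prefix as part of the field name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the datamap property; not the task's own code.
final class DataMapSketch {
    enum Mode { ADD, REPLACE, ADD_IF_EMPTY, REPORT_ONLY }

    static final class Entry {
        final String xpath;
        final Mode mode;
        final String field; // null when the token has no mapping symbol

        Entry(String xpath, Mode mode, String field) {
            this.xpath = xpath;
            this.mode = mode;
            this.field = field;
        }
    }

    private static final String[] SYMBOLS = {"->", "=>", "~>"};
    private static final Mode[] MODES = {Mode.ADD, Mode.REPLACE, Mode.ADD_IF_EMPTY};

    static List<Entry> parse(String datamap) {
        List<Entry> entries = new ArrayList<>();
        for (String token : datamap.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue;
            }
            // Find the earliest mapping symbol in the token, if any.
            int at = -1;
            Mode mode = Mode.REPORT_ONLY;
            for (int i = 0; i < SYMBOLS.length; i++) {
                int pos = t.indexOf(SYMBOLS[i]);
                if (pos >= 0 && (at < 0 || pos < at)) {
                    at = pos;
                    mode = MODES[i];
                }
            }
            if (at < 0) {
                // Unmapped: the whole token is the XPath expression.
                entries.add(new Entry(t, Mode.REPORT_ONLY, null));
            } else {
                entries.add(new Entry(t.substring(0, at).trim(), mode,
                        t.substring(at + 2).trim()));
            }
        }
        return entries;
    }
}
```

Parsing `//publisher/name=>dc.publisher,//romeocolour` with this sketch gives one REPLACE entry targeting `dc.publisher` and one REPORT_ONLY entry for `//romeocolour`.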
Unmapped data (without a mapping symbol) will simply be added to the task result string, prepended by the XPath expression (a little prettified). Each label/value pair in the result string is separated by a space, unless the optional 'separator' property is defined.
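As an illustration of reporting unmapped data, the sketch below evaluates XPath expressions with the standard `javax.xml.xpath` API and joins label/value pairs with a separator. The class `ResultStringSketch`, the sample XML, and in particular the label format (leading slashes stripped, followed by `: `) are assumptions; the exact prettification the task applies is not specified here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper; the label format is an assumption for illustration.
final class ResultStringSketch {
    static String report(String xml, List<String> expressions, String separator) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            StringBuilder sb = new StringBuilder();
            for (String expr : expressions) {
                NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
                // Assumed prettification: drop the leading slashes of the expression.
                String label = expr.replaceAll("^/+", "");
                for (int i = 0; i < nodes.getLength(); i++) {
                    if (sb.length() > 0) {
                        sb.append(separator);
                    }
                    sb.append(label).append(": ").append(nodes.item(i).getTextContent());
                }
            }
            return sb.toString();
        } catch (Exception e) {
            // Wrap parser and XPath failures so callers need no checked handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<romeoapi><romeocolour>green</romeocolour></romeoapi>";
        System.out.println(report(xml, List.of("//romeocolour"), " "));
    }
}
```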
A very rudimentary facility for transformation of data is supported, e.g.
http://www.crossref.org/openurl/?id={doi:dc.relation.isversionof}&format=unixref
The 'doi:' prefix will cause the task to look for a 'transform' with that name, which is applied to the metadata value before parameter substitution occurs. Transforms are defined in a task property such as the following:
transform.doi = match 10. trunc 60
This means: discard the part of the value string that precedes the occurrence of '10.', then truncate the result to 60 characters. The only transform functions currently defined are:
- 'cut' <number> = remove number leading characters
- 'trunc' <number> = remove trailing characters after number length
- 'match' <pattern> = start match at pattern
- 'text' <characters> = append literal characters (enclose in ' ' when whitespace needed)
If the transform results in an invalid state (e.g. cutting more characters than are in the value), the condition will be logged and the un-transformed value used.
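The transform functions can be approximated with the sketch below. It is an illustration under stated assumptions, not the task's implementation: `TransformSketch.apply` is a hypothetical name, 'match' is treated as a literal substring search, tokens are split on whitespace only (so quoted 'text' arguments containing spaces are not supported), and invalid states return the original value without logging.

```java
// Hypothetical re-implementation of the transform mini-language for illustration.
final class TransformSketch {
    static String apply(String value, String definition) {
        String[] tokens = definition.trim().split("\\s+");
        String result = value;
        try {
            // Tokens come in (function, argument) pairs.
            for (int i = 0; i + 1 < tokens.length; i += 2) {
                String arg = tokens[i + 1];
                switch (tokens[i]) {
                    case "cut": {
                        int n = Integer.parseInt(arg);
                        if (n < 0 || n > result.length()) {
                            return value; // invalid state: fall back to original
                        }
                        result = result.substring(n);
                        break;
                    }
                    case "trunc": {
                        int n = Integer.parseInt(arg);
                        if (n < 0) {
                            return value;
                        }
                        if (result.length() > n) {
                            result = result.substring(0, n);
                        }
                        break;
                    }
                    case "match": {
                        // Assumption: the pattern is a literal substring.
                        int pos = result.indexOf(arg);
                        if (pos < 0) {
                            return value;
                        }
                        result = result.substring(pos);
                        break;
                    }
                    case "text":
                        result = result + arg;
                        break;
                    default:
                        return value; // unknown function
                }
            }
        } catch (NumberFormatException e) {
            return value; // non-numeric argument to cut or trunc
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(apply("http://dx.doi.org/10.1000/182", "match 10. trunc 60"));
    }
}
```

With the definition `match 10. trunc 60`, the sample value in `main` (a placeholder DOI URL) is reduced to the part starting at `10.`.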
Transforms may also be used in datamaps, e.g.
//publisher/name=>shorten:dc.publisher,//romeocolour
which would apply the 'shorten' transform to the service response value(s) prior to metadata field assignment.
An optional property 'headers' may be defined to stipulate any HTTP headers required in the service call. The property syntax is double-pipe separated headers:
Accept: text/xml||Cache-Control: no-cache
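A short sketch of parsing the double-pipe syntax follows. The class `HeaderPropertySketch` and its `parse` method are hypothetical names for illustration; the task's own parsing may differ in details such as whitespace handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical parser for the 'headers' property; not the task's own code.
final class HeaderPropertySketch {
    static Map<String, String> parse(String property) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Pattern.quote is needed because '|' is a regex metacharacter.
        for (String part : property.split(Pattern.quote("||"))) {
            int colon = part.indexOf(':');
            if (colon <= 0) {
                continue; // skip entries without a header name
            }
            headers.put(part.substring(0, colon).trim(),
                    part.substring(colon + 1).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(parse("Accept: text/xml||Cache-Control: no-cache"));
    }
}
```

Splitting on the first colon only keeps header values that themselves contain colons intact.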
- Author:
- richardrodgers
-
Field Summary
Fields
Modifier and Type    Field    Description
protected List<MetadataWebServiceDataInfo>
protected DocumentBuilder
protected String
protected String
protected String
protected String
protected String
protected Pattern
protected String
Fields inherited from class org.dspace.curate.AbstractCurationTask
communityService, configurationService, curator, handleService, itemService, taskId
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method    Description
protected int    callService(String value, Item item, StringBuilder resultSb)
protected void    checkNamespaces(Document document)
protected int    getMapIndex(String mapping)
getNamespaceURI(String prefix)
getPrefixes(String uri)
void    Initializes task
protected String    mangleExpr(String expr, String prefix)
protected String[]    parseTransform(String field)
int    perform(DSpaceObject dso)    Perform the curation task upon passed DSO
protected int    processResponse(Document doc, Item item, StringBuilder resultSb)
protected String
Methods inherited from class org.dspace.curate.AbstractCurationTask
dereference, distribute, perform, performItem, performObject, report, setResult, taskArrayProperty, taskBooleanProperty, taskIntProperty, taskLongProperty, taskProperty
-
Field Details
-
ttPattern
-
urlTemplate
-
templateParam
-
lookupField
-
lookupTransform
-
dataList
-
docBuilder
-
lang
-
fieldSeparator
-
nsMap
-
headers
-
-
Constructor Details
-
MetadataWebService
public MetadataWebService()
-
-
Method Details
-
init
Initializes task
- Specified by:
init in interface CurationTask
- Overrides:
init in class AbstractCurationTask
- Parameters:
curator - Curator object performing this task
taskId - the configured local name of the task
- Throws:
IOException - if the parser could not be configured, or passed through.
-
perform
Description copied from interface: CurationTask
Perform the curation task upon passed DSO
- Specified by:
perform in interface CurationTask
- Specified by:
perform in class AbstractCurationTask
- Parameters:
dso - the DSpace object
- Returns:
status code
- Throws:
IOException - if error
-
callService
- Throws:
IOException
-
processResponse
- Throws:
IOException
-
transform
-
tokenize
-
getMapIndex
-
parseTransform
-
checkNamespaces
- Throws:
IOException
-
mangleExpr
-
getNamespaceURI
- Specified by:
getNamespaceURI in interface NamespaceContext
-
getPrefix
- Specified by:
getPrefix in interface NamespaceContext
-
getPrefixes
- Specified by:
getPrefixes in interface NamespaceContext
-