@Mutative @Suspendable public class MetadataWebService extends AbstractCurationTask implements NamespaceContext
http://www.sherpa.ac.uk/romeo/api29.php?issn={dc.identifier.issn}
Task will substitute the value of the passed item's metadata field
in the {parameter} position. If multiple values are present in the
item field, the first value is used.
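The substitution described above can be sketched as follows. This is only an illustration, not the task's own code: the class name `UrlTemplateSketch`, its methods, and the placeholder regex are hypothetical, and the sketch assumes a single `{parameter}` per template and does no URL encoding.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper illustrating {parameter} substitution in a URL template. */
final class UrlTemplateSketch {
    // Assumed placeholder shape: one metadata field name wrapped in braces.
    private static final Pattern PARAM = Pattern.compile("\\{([^{}]+)\\}");

    /** Returns the text inside the first {parameter}, or null if the template has none. */
    static String parameterName(String template) {
        Matcher m = PARAM.matcher(template);
        return m.find() ? m.group(1) : null;
    }

    /** Replaces the first {parameter} with the first of the item's field values. */
    static String fill(String template, List<String> fieldValues) {
        if (fieldValues.isEmpty()) {
            throw new IllegalArgumentException("item field has no values");
        }
        // Only the first value is used, even when the field is multi-valued.
        String first = fieldValues.get(0);
        return PARAM.matcher(template).replaceFirst(Matcher.quoteReplacement(first));
    }
}
```

For example, filling the template above with the values `1234-5678` and `0000-0000` would use only `1234-5678`.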
The task uses another property (the datamap) to determine what data
to extract from the service response and how to use it, e.g.
//publisher/name=>dc.publisher,//romeocolour
Task will evaluate the left-hand side (or entire token) of each
comma-separated token in the property as an XPath 1.0 expression into
the response document, and if there is a mapping symbol (e.g. '=>') and
value, it will assign the response document value(s) to the named
metadata field in the passed item. If the response document contains
multiple values, they will all be assigned to the item field. The
mapping symbol governs the nature of metadata field assignment:
'->' mapping will add to any existing values in the item field
'=>' mapping will replace any existing values in the item field
'~>' mapping will add *only* if item field has no existing values
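To make the three mapping symbols concrete, here is a minimal sketch that splits a datamap into tokens and applies each symbol to a list of existing field values. The class `DatamapSketch`, its `Entry` type, and both methods are hypothetical names introduced for illustration; the sketch assumes that XPath expressions in the datamap contain no commas and none of the mapping symbols.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of datamap tokens and the '->', '=>', '~>' semantics. */
final class DatamapSketch {
    /** One datamap token; symbol and field are null for unmapped data. */
    static final class Entry {
        final String xpath;
        final String symbol;
        final String field;

        Entry(String xpath, String symbol, String field) {
            this.xpath = xpath;
            this.symbol = symbol;
            this.field = field;
        }
    }

    private static final String[] SYMBOLS = {"->", "=>", "~>"};

    /** Splits the comma-separated datamap and separates XPath from target field. */
    static List<Entry> parse(String datamap) {
        List<Entry> entries = new ArrayList<>();
        for (String raw : datamap.split(",")) {
            String token = raw.trim();
            if (token.isEmpty()) {
                continue;
            }
            Entry entry = new Entry(token, null, null); // unmapped unless a symbol is found
            for (String symbol : SYMBOLS) {
                int idx = token.indexOf(symbol);
                if (idx > 0) {
                    entry = new Entry(token.substring(0, idx).trim(), symbol,
                            token.substring(idx + symbol.length()).trim());
                    break;
                }
            }
            entries.add(entry);
        }
        return entries;
    }

    /** Returns the field values that result from applying one mapping symbol. */
    static List<String> assign(String symbol, List<String> existing, List<String> incoming) {
        List<String> result = new ArrayList<>(existing);
        switch (symbol) {
            case "->": // add to any existing values
                result.addAll(incoming);
                break;
            case "=>": // replace any existing values
                result.clear();
                result.addAll(incoming);
                break;
            case "~>": // add only if the field is currently empty
                if (existing.isEmpty()) {
                    result.addAll(incoming);
                }
                break;
            default:
                throw new IllegalArgumentException("unknown mapping symbol: " + symbol);
        }
        return result;
    }
}
```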
Unmapped data (without a mapping symbol) will simply be added to the task
result string, prefixed by the XPath expression (a little prettified).
Each label/value pair in the result string is separated by a space,
unless the optional 'separator' property is defined.
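The handling of unmapped data can be sketched with the standard `javax.xml.xpath` API. Everything specific here is an assumption made for illustration: the class `UnmappedResultSketch`, the "label: value" layout, and the prettifying rule (keep only the last step of the expression) are not taken from the task itself. The sketch also assumes each expression selects a node set.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical illustration of building a result string from unmapped XPath expressions. */
final class UnmappedResultSketch {
    /** Assumed prettifying rule: "//publisher/name" becomes "name". */
    static String label(String expr) {
        int slash = expr.lastIndexOf('/');
        return slash >= 0 ? expr.substring(slash + 1) : expr;
    }

    /** Evaluates each expression against the XML and joins label/value pairs. */
    static String report(String xml, String[] exprs, String separator) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            StringBuilder sb = new StringBuilder();
            for (String expr : exprs) {
                NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    if (sb.length() > 0) {
                        sb.append(separator); // a single space unless 'separator' is configured
                    }
                    sb.append(label(expr)).append(": ").append(nodes.item(i).getTextContent());
                }
            }
            return sb.toString();
        } catch (Exception e) {
            // Parsing and XPath failures are wrapped to keep the sketch's signature simple.
            throw new IllegalStateException("could not evaluate response", e);
        }
    }
}
```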
A very rudimentary facility for transformation of data is supported, e.g.
http://www.crossref.org/openurl/?id={doi:dc.relation.isversionof}&format=unixref
The 'doi:' prefix will cause the task to look for a 'transform' with that
name, which is applied to the metadata value before parameter substitution
occurs. Transforms are defined in a task property such as the following:
transform.doi = match 10. trunc 60
This means: discard everything in the value before the occurrence of '10.',
then truncate the result to 60 characters. The only transform functions currently defined are:
'cut' <number> = remove the first <number> characters
'trunc' <number> = remove any characters beyond the first <number>
'match' <pattern> = start the value at the occurrence of <pattern>
'text' <characters> = append literal characters (enclose in ' ' when whitespace is needed)
If the transform results in an invalid state (e.g. cutting more characters
than are in the value), the condition will be logged and the
un-transformed value used.
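A transform definition such as `match 10. trunc 60` could be interpreted along the following lines. This is a sketch under stated assumptions, not the task's implementation: the class `TransformSketch` is hypothetical, `match` is treated as a literal substring search (the real task may treat the pattern differently), a failed `match` is treated as an invalid state, and `trunc` leaves values shorter than the limit unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical interpreter for transform definitions like "match 10. trunc 60". */
final class TransformSketch {
    // A token is either a single-quoted run (may contain spaces) or a run of non-whitespace.
    private static final Pattern TOKEN = Pattern.compile("'([^']*)'|(\\S+)");

    static List<String> tokenize(String definition) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(definition);
        while (m.find()) {
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    /** Applies each function/argument pair in order; falls back to the original on error. */
    static String transform(String value, String definition) {
        List<String> tokens = tokenize(definition);
        String result = value;
        try {
            for (int i = 0; i + 1 < tokens.size(); i += 2) {
                String function = tokens.get(i);
                String arg = tokens.get(i + 1);
                switch (function) {
                    case "cut": // drop leading characters; throws if the value is too short
                        result = result.substring(Integer.parseInt(arg));
                        break;
                    case "trunc": // keep at most the first N characters
                        result = result.substring(0, Math.min(Integer.parseInt(arg), result.length()));
                        break;
                    case "match": // assumption: literal search, start the value at the match
                        int idx = result.indexOf(arg);
                        if (idx < 0) {
                            throw new IllegalArgumentException("pattern not found: " + arg);
                        }
                        result = result.substring(idx);
                        break;
                    case "text": // append literal characters
                        result = result + arg;
                        break;
                    default:
                        throw new IllegalArgumentException("unknown function: " + function);
                }
            }
            return result;
        } catch (RuntimeException e) {
            // Invalid state: report the condition and use the un-transformed value.
            System.err.println("transform failed (" + e.getMessage() + "), using original value");
            return value;
        }
    }
}
```

With this sketch, `transform("doi:10.1000/xyz123", "match 10. trunc 60")` strips the `doi:` prefix, while `transform("abc", "cut 5")` returns `abc` unchanged because the cut is longer than the value.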
Transforms may also be used in datamaps, e.g.
//publisher/name=>shorten:dc.publisher,//romeocolour
which would apply the 'shorten' transform to the service response value(s)
prior to metadata field assignment.
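Both the URL parameter (`doi:dc.relation.isversionof`) and the datamap target (`shorten:dc.publisher`) use the same `name:field` convention. A minimal sketch of separating the two parts and looking up the matching `transform.<name>` property might look like this. The class `TransformPrefixSketch` and its methods are hypothetical, and the use of `java.util.Properties` to hold task properties is an assumption made to keep the example self-contained.

```java
import java.util.Properties;

/** Hypothetical helper for the "transformName:metadata.field" convention. */
final class TransformPrefixSketch {
    /** Returns {transformName, field}; transformName is null when no prefix is present. */
    static String[] split(String spec) {
        int colon = spec.indexOf(':');
        if (colon < 0) {
            return new String[] {null, spec};
        }
        return new String[] {spec.substring(0, colon), spec.substring(colon + 1)};
    }

    /** Looks up the definition for the prefix, e.g. "transform.doi"; null if none applies. */
    static String definitionFor(Properties taskProperties, String spec) {
        String name = split(spec)[0];
        return name == null ? null : taskProperties.getProperty("transform." + name);
    }
}
```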
An optional property 'headers' may be defined to stipulate any HTTP headers
required in the service call. The property syntax is double-pipe separated headers:
Accept: text/xml||Cache-Control: no-cache

| Modifier and Type | Field and Description |
|---|---|
| protected List<MetadataWebServiceDataInfo> | dataList |
| protected DocumentBuilder | docBuilder |
| protected String | fieldSeparator |
| protected Map<String,String> | headers |
| protected String | lang |
| protected String | lookupField |
| protected String | lookupTransform |
| protected Map<String,String> | nsMap |
| protected String | templateParam |
| protected Pattern | ttPattern |
| protected String | urlTemplate |

Fields inherited from class AbstractCurationTask: communityService, configurationService, curator, handleService, itemService, taskId

| Constructor and Description |
|---|
| MetadataWebService() |

| Modifier and Type | Method and Description |
|---|---|
| protected int | callService(String value, Item item, StringBuilder resultSb) |
| protected void | checkNamespaces(Document document) |
| protected int | getMapIndex(String mapping) |
| String | getNamespaceURI(String prefix) |
| String | getPrefix(String uri) |
| Iterator | getPrefixes(String uri) |
| void | init(Curator curator, String taskId): Initializes task |
| protected String | mangleExpr(String expr, String prefix) |
| protected String[] | parseTransform(String field) |
| int | perform(DSpaceObject dso): Perform the curation task upon passed DSO |
| protected int | processResponse(Document doc, Item item, StringBuilder resultSb) |
| protected String[] | tokenize(String text) |
| protected String | transform(String value, String transDef) |

Methods inherited from class AbstractCurationTask: dereference, distribute, perform, performItem, performObject, report, setResult, taskArrayProperty, taskBooleanProperty, taskIntProperty, taskLongProperty, taskProperty

protected Pattern ttPattern
protected String urlTemplate
protected String templateParam
protected String lookupField
protected String lookupTransform
protected List<MetadataWebServiceDataInfo> dataList
protected DocumentBuilder docBuilder
protected String lang
protected String fieldSeparator

public void init(Curator curator, String taskId) throws IOException
init in interface CurationTask
init in class AbstractCurationTask
Parameters:
curator - Curator object performing this task
taskId - the configured local name of the task
Throws:
IOException - if error

public int perform(DSpaceObject dso) throws IOException
perform in interface CurationTask
perform in class AbstractCurationTask
Parameters:
dso - the DSpace object
Throws:
IOException - if IO error

protected int callService(String value, Item item, StringBuilder resultSb) throws IOException
Throws:
IOException

protected int processResponse(Document doc, Item item, StringBuilder resultSb) throws IOException
Throws:
IOException

protected int getMapIndex(String mapping)

protected void checkNamespaces(Document document) throws IOException
Throws:
IOException

public String getNamespaceURI(String prefix)
getNamespaceURI in interface NamespaceContext

public String getPrefix(String uri)
getPrefix in interface NamespaceContext

public Iterator getPrefixes(String uri)
getPrefixes in interface NamespaceContext

Copyright © 2016 DuraSpace. All rights reserved.