Package org.dspace.ctask.general
Class MetadataWebService
- java.lang.Object
  - org.dspace.curate.AbstractCurationTask
    - org.dspace.ctask.general.MetadataWebService
- All Implemented Interfaces:
NamespaceContext, CurationTask
@Mutative @Suspendable public class MetadataWebService extends AbstractCurationTask implements NamespaceContext
MetadataWebService task calls a web service using metadata from passed item to obtain data. Depending on configuration, this data may be assigned to item metadata fields, or just recorded in the task result string. Task succeeds if web service call succeeds and configured updates occur, fails if task user not authorized or item lacks metadata to call service, and returns error in all other cases (except skip status for non-item objects). Intended use: cataloging tool in workflow and general curation.

The task uses a URL 'template' to compose the service call, e.g.

    http://www.sherpa.ac.uk/romeo/api29.php?issn={dc.identifier.issn}

Task will substitute the value of the passed item's metadata field in the {parameter} position. If multiple values are present in the item field, the first value is used.

The task uses another property (the datamap) to determine what data to extract from the service response and how to use it, e.g.

    //publisher/name=>dc.publisher,//romeocolour

Task will evaluate the left-hand side (or entire token) of each comma-separated token in the property as an XPath 1.0 expression into the response document, and if there is a mapping symbol (e.g. '=>') and value, it will assign the response document value(s) to the named metadata field in the passed item. If the response document contains multiple values, they will all be assigned to the item field. The mapping symbol governs the nature of metadata field assignment:
- '->' mapping will add to any existing values in the item field
- '=>' mapping will replace any existing values in the item field
- '~>' mapping will add *only* if item field has no existing values

Unmapped data (without a mapping symbol) will simply be added to the task result string, prepended by the XPath expression (a little prettified). Each label/value pair in the result string is separated by a space, unless the optional 'separator' property is defined.
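The template substitution and datamap conventions described above can be sketched in a few lines of plain Java. This is a minimal illustration under stated assumptions, not the task's actual code: the class `TemplateSketch`, its `Mapping` type, and the methods `templateParam`, `fill`, and `parseDatamap` are hypothetical names, the ISSN value is a made-up sample, and splitting the datamap on every comma is a simplification that would break XPath expressions containing commas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the DSpace API.
final class TemplateSketch {

    // Matches the first {parameter} placeholder in a URL template.
    private static final Pattern PARAM = Pattern.compile("\\{([^{}]+)\\}");

    // Mapping symbols as listed in the class description.
    private static final String[] SYMBOLS = {"->", "=>", "~>"};

    /** One datamap token: XPath expression plus optional symbol and target field. */
    static final class Mapping {
        final String xpath;
        final String symbol; // null when the token is unmapped
        final String field;  // null when the token is unmapped

        Mapping(String xpath, String symbol, String field) {
            this.xpath = xpath;
            this.symbol = symbol;
            this.field = field;
        }
    }

    /** Returns the text inside the first pair of braces, or null if there is none. */
    static String templateParam(String urlTemplate) {
        Matcher m = PARAM.matcher(urlTemplate);
        return m.find() ? m.group(1) : null;
    }

    /** Replaces the first {parameter} with the value (no URL encoding is applied here). */
    static String fill(String urlTemplate, String value) {
        return PARAM.matcher(urlTemplate).replaceFirst(Matcher.quoteReplacement(value));
    }

    /** Splits a datamap property into tokens and separates each into its parts. */
    static List<Mapping> parseDatamap(String datamap) {
        List<Mapping> result = new ArrayList<>();
        for (String token : datamap.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue;
            }
            Mapping parsed = new Mapping(t, null, null); // unmapped unless a symbol is found
            for (String symbol : SYMBOLS) {
                int idx = t.indexOf(symbol);
                if (idx > 0) {
                    parsed = new Mapping(t.substring(0, idx).trim(), symbol,
                            t.substring(idx + symbol.length()).trim());
                    break;
                }
            }
            result.add(parsed);
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "http://www.sherpa.ac.uk/romeo/api29.php?issn={dc.identifier.issn}";
        System.out.println(templateParam(template));
        System.out.println(fill(template, "1234-5678")); // sample value, not real data
        for (Mapping m : parseDatamap("//publisher/name=>dc.publisher,//romeocolour")) {
            System.out.println(m.xpath + " | " + m.symbol + " | " + m.field);
        }
    }
}
```

In this sketch an unmapped token such as `//romeocolour` keeps a null symbol and field, which corresponds to data that would only be reported in the task result string.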
A very rudimentary facility for transformation of data is supported, e.g.

    http://www.crossref.org/openurl/?id={doi:dc.relation.isversionof}&format=unixref

The 'doi:' prefix will cause the task to look for a 'transform' with that name, which is applied to the metadata value before parameter substitution occurs. Transforms are defined in a task property such as the following:

    transform.doi = match 10. trunc 60

This means exclude the value string up to the occurrence of '10.', then truncate after 60 characters. The only transform functions currently defined:
- 'cut' <number> = remove number leading characters
- 'trunc' <number> = remove trailing characters after number length
- 'match' <pattern> = start match at pattern
- 'text' <characters> = append literal characters (enclose in ' ' when whitespace needed)

If the transform results in an invalid state (e.g. cutting more characters than are in the value), the condition will be logged and the un-transformed value used.

Transforms may also be used in datamaps, e.g.

    //publisher/name=>shorten:dc.publisher,//romeocolour

which would apply the 'shorten' transform to the service response value(s) prior to metadata field assignment.

An optional property 'headers' may be defined to stipulate any HTTP headers required in the service call. The property syntax is double-pipe separated headers:

    Accept: text/xml||Cache-Control: no-cache

- Author:
- richardrodgers
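The transform definitions and the 'headers' syntax from the class description can be illustrated with a small standalone sketch. It is an approximation under stated assumptions, not the class's own `transform` or `tokenize` code: `TransformSketch` and its methods are hypothetical, 'match' is treated as a literal substring search (the description does not say whether patterns are regular expressions), 'trunc' leaves shorter values unchanged, and any invalid state silently returns the un-transformed value instead of logging. The DOI is a sample value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the DSpace API.
final class TransformSketch {

    // A token is either a single-quoted run (quotes stripped) or a run of non-whitespace.
    private static final Pattern TOKEN = Pattern.compile("'([^']*)'|(\\S+)");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    /** Applies function/argument pairs left to right; returns the input on any invalid state. */
    static String transform(String value, String transDef) {
        if (transDef == null || transDef.trim().isEmpty()) {
            return value;
        }
        List<String> tokens = tokenize(transDef);
        if (tokens.size() % 2 != 0) {
            return value; // every function needs exactly one argument
        }
        String result = value;
        try {
            for (int i = 0; i < tokens.size(); i += 2) {
                String fn = tokens.get(i);
                String arg = tokens.get(i + 1);
                switch (fn) {
                    case "cut": { // remove leading characters
                        int n = Integer.parseInt(arg);
                        if (n < 0 || n > result.length()) {
                            return value;
                        }
                        result = result.substring(n);
                        break;
                    }
                    case "trunc": { // keep at most n characters
                        int n = Integer.parseInt(arg);
                        if (n < 0) {
                            return value;
                        }
                        if (result.length() > n) {
                            result = result.substring(0, n);
                        }
                        break;
                    }
                    case "match": { // drop everything before the first occurrence (assumed literal)
                        int idx = result.indexOf(arg);
                        if (idx < 0) {
                            return value;
                        }
                        result = result.substring(idx);
                        break;
                    }
                    case "text": // append literal characters
                        result = result + arg;
                        break;
                    default:
                        return value; // unknown function name
                }
            }
        } catch (NumberFormatException e) {
            return value; // non-numeric argument to cut or trunc
        }
        return result;
    }

    /** Splits a double-pipe separated headers property into name/value pairs. */
    static Map<String, String> parseHeaders(String property) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String header : property.split("\\|\\|")) {
            int colon = header.indexOf(':');
            if (colon > 0) {
                headers.put(header.substring(0, colon).trim(), header.substring(colon + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(transform("doi:10.1000/182", "match 10. trunc 60")); // sample DOI
        System.out.println(parseHeaders("Accept: text/xml||Cache-Control: no-cache"));
    }
}
```

Returning the original value from every failure branch mirrors the documented fallback to the un-transformed value; a real implementation would also log the condition.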
Field Summary
Fields
Modifier and Type                           Field            Description
protected List<MetadataWebServiceDataInfo>  dataList
protected DocumentBuilder                   docBuilder
protected String                            fieldSeparator
protected Map<String,String>                headers
protected String                            lang
protected String                            lookupField
protected String                            lookupTransform
protected Map<String,String>                nsMap
protected String                            templateParam
protected Pattern                           ttPattern
protected String                            urlTemplate
Fields inherited from class org.dspace.curate.AbstractCurationTask
communityService, configurationService, curator, handleService, itemService, taskId
Constructor Summary
Constructors
Constructor           Description
MetadataWebService()
-
Method Summary
Modifier and Type   Method                                                            Description
protected int       callService(String value, Item item, StringBuilder resultSb)
protected void      checkNamespaces(Document document)
protected int       getMapIndex(String mapping)
String              getNamespaceURI(String prefix)
String              getPrefix(String uri)
Iterator            getPrefixes(String uri)
void                init(Curator curator, String taskId)                              Initializes task
protected String    mangleExpr(String expr, String prefix)
protected String[]  parseTransform(String field)
int                 perform(DSpaceObject dso)                                         Perform the curation task upon passed DSO
protected int       processResponse(Document doc, Item item, StringBuilder resultSb)
protected String[]  tokenize(String text)
protected String    transform(String value, String transDef)
Methods inherited from class org.dspace.curate.AbstractCurationTask
dereference, distribute, perform, performItem, performObject, report, setResult, taskArrayProperty, taskBooleanProperty, taskIntProperty, taskLongProperty, taskProperty
Field Detail
-
ttPattern
protected Pattern ttPattern
-
urlTemplate
protected String urlTemplate
-
templateParam
protected String templateParam
-
lookupField
protected String lookupField
-
lookupTransform
protected String lookupTransform
-
dataList
protected List<MetadataWebServiceDataInfo> dataList
-
docBuilder
protected DocumentBuilder docBuilder
-
lang
protected String lang
-
fieldSeparator
protected String fieldSeparator
Method Detail
-
init
public void init(Curator curator, String taskId) throws IOException
Initializes task
- Specified by:
init in interface CurationTask
- Overrides:
init in class AbstractCurationTask
- Parameters:
curator - Curator object performing this task
taskId - the configured local name of the task
- Throws:
IOException - if error
-
perform
public int perform(DSpaceObject dso) throws IOException
Perform the curation task upon passed DSO
- Specified by:
perform in interface CurationTask
- Specified by:
perform in class AbstractCurationTask
- Parameters:
dso - the DSpace object
- Returns:
status code
- Throws:
IOException - if IO error
-
callService
protected int callService(String value, Item item, StringBuilder resultSb) throws IOException
- Throws:
IOException
-
processResponse
protected int processResponse(Document doc, Item item, StringBuilder resultSb) throws IOException
- Throws:
IOException
-
getMapIndex
protected int getMapIndex(String mapping)
-
checkNamespaces
protected void checkNamespaces(Document document) throws IOException
- Throws:
IOException
-
getNamespaceURI
public String getNamespaceURI(String prefix)
- Specified by:
getNamespaceURI in interface NamespaceContext
-
getPrefix
public String getPrefix(String uri)
- Specified by:
getPrefix in interface NamespaceContext
-
getPrefixes
public Iterator getPrefixes(String uri)
- Specified by:
getPrefixes in interface NamespaceContext
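The three NamespaceContext methods above matter when a datamap XPath expression uses namespace prefixes. The following sketch shows, with standard JDK classes only, how a map-backed NamespaceContext lets a prefixed XPath 1.0 expression resolve against a namespaced response. It is an assumption about typical usage, not a copy of this class: `XPathSketch`, its `values` method, the `r` prefix, and the `http://example.org/ns` namespace are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of the DSpace API.
final class XPathSketch implements NamespaceContext {

    private final Map<String, String> nsMap; // prefix -> namespace URI

    XPathSketch(Map<String, String> nsMap) {
        this.nsMap = nsMap;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = nsMap.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String uri) {
        Iterator<String> it = getPrefixes(uri);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String uri) {
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> e : nsMap.entrySet()) {
            if (e.getValue().equals(uri)) {
                prefixes.add(e.getKey());
            }
        }
        return prefixes.iterator();
    }

    /** Evaluates an XPath expression and returns the text of every matching node. */
    List<String> values(String xml, String expr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations, since service responses are untrusted input.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(this); // prefixes in expr are resolved through nsMap
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> found = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                found.add(nodes.item(i).getTextContent());
            }
            return found;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

Because all matching nodes are collected, a response with several matches yields several values, which lines up with the documented behavior of assigning every response value to the item field.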