Package org.dspace.harvest
Class OAIHarvester
java.lang.Object
org.dspace.harvest.OAIHarvester
This class handles OAI harvesting of externally located records into this repository.
- Author:
- Alexey Maslov
-
Field Summary
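Fields (Modifier and Type, Field)
protected BitstreamFormatService bitstreamFormatService
protected BitstreamService bitstreamService
protected BundleService bundleService
protected CollectionService collectionService
protected ConfigurationService configurationService
protected HandleService handleService
protected HarvestedCollectionService harvestedCollectionService
protected HarvestedItemService harvestedItemService
protected InstallItemService installItemService
protected ItemService itemService
static final String OAI_ADDRESS_ERROR
static final String OAI_DMD_ERROR
static final String OAI_ORE_ERROR
static final String OAI_SET_ERROR
protected PluginService pluginService
protected WorkspaceItemService workspaceItemService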
Constructor Summary
Constructors
OAIHarvester(Context c, DSpaceObject dso, HarvestedCollection hc)
Method Summary
Modifier and Type, Method, Description
protected void alertAdmin(int status, Exception ex) - Generate and send an email to the administrator.
protected String extractHandle(Item item) - Scan an item's metadata, looking for the value "identifier.*".
getAvailableMetadataFormats - Return all available metadata formats.
static org.jdom2.Namespace getDMDNamespace(String metadataKey) - Cycle through the options and find the metadata namespace matching the provided key.
protected List<org.jdom2.Element> getMDrecord(String oaiSource, String itemOaiId, String metadataPrefix) - Query the OAI-PMH provider for a specific metadata record.
static org.jdom2.Namespace getORENamespace() - Search the configuration options and find the ORE serialization string.
static String oaiResolveNamespaceToPrefix(String oaiSource, String MDNamespace) - Query the OAI-PMH server for its mapping of the supplied namespace and metadata prefix.
protected void processRecord(org.jdom2.Element record, String OREPrefix, long currentRecord, long totalListSize) - Process an individual PMH record, making (or updating) a corresponding DSpace Item.
void runHarvest - Performs a harvest cycle on this collection.
verifyOAIharvester - Verify OAI settings for the current collection.
-
Field Details
-
OAI_ADDRESS_ERROR
-
OAI_SET_ERROR
-
OAI_DMD_ERROR
-
OAI_ORE_ERROR
-
bitstreamService
-
bitstreamFormatService
-
bundleService
-
collectionService
-
harvestedCollectionService
-
installItemService
-
itemService
-
handleService
-
harvestedItemService
-
workspaceItemService
-
pluginService
-
configurationService
-
-
Constructor Details
-
OAIHarvester
public OAIHarvester(Context c, DSpaceObject dso, HarvestedCollection hc) throws HarvestingException, SQLException
- Throws:
HarvestingException
SQLException
-
-
Method Details
-
getORENamespace
public static org.jdom2.Namespace getORENamespace()
Search the configuration options and find the ORE serialization string.
- Returns:
- Namespace of the supported ORE format. Returns null if not found.
-
getDMDNamespace
static org.jdom2.Namespace getDMDNamespace(String metadataKey)
Cycle through the options and find the metadata namespace matching the provided key.
- Parameters:
metadataKey
- Returns:
- Namespace of the designated metadata format. Returns null if not found.
-
runHarvest
Performs a harvest cycle on this collection. This will query the remote OAI-PMH provider, check for updates since the last harvest, and ingest the returned items.
- Throws:
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
SQLException - An exception that provides information on a database access error or other errors.
AuthorizeException - Exception indicating the current user of the context does not have permission to perform a particular action.
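The "check for updates since the last harvest" step corresponds to OAI-PMH selective harvesting, where a ListRecords request carries a `from` datestamp. The following is a minimal sketch of how such a request URL could be assembled. The class and method names are hypothetical, the endpoint and set name are placeholders, and nothing here is taken from the OAIHarvester source; it also assumes the provider supports seconds granularity for datestamps.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of DSpace.
class ListRecordsUrlSketch {

    // Builds a ListRecords request; setSpec and lastHarvest are optional (may be null).
    static String buildListRecordsUrl(String oaiSource, String metadataPrefix,
                                      String setSpec, Instant lastHarvest) {
        StringBuilder url = new StringBuilder(oaiSource)
            .append("?verb=ListRecords")
            .append("&metadataPrefix=").append(encode(metadataPrefix));
        if (setSpec != null) {
            url.append("&set=").append(encode(setSpec));
        }
        if (lastHarvest != null) {
            // OAI-PMH datestamps are UTC; drop fractional seconds before formatting.
            String from = DateTimeFormatter.ISO_INSTANT
                .format(lastHarvest.truncatedTo(ChronoUnit.SECONDS));
            url.append("&from=").append(encode(from));
        }
        return url.toString();
    }

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(buildListRecordsUrl(
            "http://example.org/oai/request", "oai_dc", "col_123",
            Instant.parse("2024-01-15T10:30:00Z")));
    }
}
```

Omitting the `from` argument on a first harvest requests every record in the set; later cycles would pass the time of the previous successful harvest.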
-
processRecord
protected void processRecord(org.jdom2.Element record, String OREPrefix, long currentRecord, long totalListSize) throws SQLException, AuthorizeException, IOException, CrosswalkException, HarvestingException, ParserConfigurationException, SAXException, XPathExpressionException
Process an individual PMH record, making (or updating) a corresponding DSpace Item.
- Parameters:
record - a JDOM Element containing the actual PMH record with descriptive metadata.
OREPrefix - the metadataPrefix value used by the remote PMH server to disseminate ORE. Only used for collections set up to harvest content.
currentRecord - the current record number, used for logging
totalListSize - the total number of records that this harvest contains
- Throws:
SQLException - An exception that provides information on a database access error or other errors.
AuthorizeException - Exception indicating the current user of the context does not have permission to perform a particular action.
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
CrosswalkException - if crosswalk error
HarvestingException - if harvesting error
ParserConfigurationException - XML parsing error
SAXException - if XML processing error
XPathExpressionException - if XPath error
-
extractHandle
protected String extractHandle(Item item)
Scan an item's metadata, looking for the value "identifier.*". If it meets the parameters that identify it as a valid handle, as set in dspace.cfg (harvester.acceptedHandleServer and harvester.rejectedHandlePrefix), use that handle instead of minting a new one.
- Parameters:
item - a newly created, but not yet installed, DSpace Item
- Returns:
- null or the handle to be used.
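The acceptance rule described above can be illustrated with a small standalone filter. This is only a sketch under stated assumptions: it assumes identifier values take the form scheme://server/prefix/suffix, that harvester.acceptedHandleServer and harvester.rejectedHandlePrefix each resolve to a list of strings, and that the first matching identifier wins. The class and method names are hypothetical and the real extractHandle may differ in its details.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DSpace.
class HandleFilterSketch {

    // Returns "prefix/suffix" for the first identifier hosted on an accepted
    // handle server whose prefix is not rejected, or null if none qualifies.
    static String pickHandle(List<String> identifiers,
                             List<String> acceptedServers,
                             List<String> rejectedPrefixes) {
        for (String identifier : identifiers) {
            // "http://hdl.handle.net/1234/12" splits into
            // ["http:", "", "hdl.handle.net", "1234", "12"]
            String[] parts = identifier.split("/");
            if (parts.length != 5) {
                continue; // not shaped like a handle URL
            }
            String server = parts[2];
            String prefix = parts[3];
            if (acceptedServers.contains(server) && !rejectedPrefixes.contains(prefix)) {
                return prefix + "/" + parts[4];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("doi:10.1000/xyz", "http://hdl.handle.net/1234/12");
        System.out.println(pickHandle(ids,
            Arrays.asList("hdl.handle.net"), Arrays.asList("123456789")));
    }
}
```

Returning null mirrors the documented contract: the caller then mints a new handle rather than reusing a remote one.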
-
oaiResolveNamespaceToPrefix
public static String oaiResolveNamespaceToPrefix(String oaiSource, String MDNamespace) throws IOException, ParserConfigurationException, SAXException, XPathExpressionException, ConnectException
Query the OAI-PMH server for its mapping of the supplied namespace and metadata prefix. For example, for a typical OAI-PMH server, a query for "http://www.openarchives.org/OAI/2.0/oai_dc/" would return "oai_dc".
- Parameters:
oaiSource - the address of the OAI-PMH provider
MDNamespace - the namespace that we are trying to resolve to the metadataPrefix
- Returns:
- metadataPrefix the OAI-PMH provider has assigned to the supplied namespace
- Throws:
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
ParserConfigurationException - XML parsing error
SAXException - if XML processing error
XPathExpressionException - if XPath error
ConnectException - if could not connect to OAI server
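In OAI-PMH, the namespace-to-prefix mapping is published in the ListMetadataFormats response, where each metadataFormat element pairs a metadataPrefix with a metadataNamespace. The sketch below shows one way the lookup could work on an already-fetched response. It is an illustration only: the class, the method, and the abridged sample document are hypothetical, and the network call that oaiResolveNamespaceToPrefix performs is deliberately left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of DSpace.
class PrefixResolverSketch {

    // Abridged, illustrative ListMetadataFormats response.
    static final String SAMPLE =
        "<OAI-PMH xmlns=\"http://www.openarchives.org/OAI/2.0/\">"
        + "<ListMetadataFormats><metadataFormat>"
        + "<metadataPrefix>oai_dc</metadataPrefix>"
        + "<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>"
        + "<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>"
        + "</metadataFormat></ListMetadataFormats></OAI-PMH>";

    // Returns the metadataPrefix paired with mdNamespace, or null if absent.
    static String resolvePrefix(String responseXml, String mdNamespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(responseXml)));
            // local-name() sidesteps registering the OAI-PMH namespace with XPath.
            NodeList formats = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//*[local-name()='metadataFormat']", doc, XPathConstants.NODESET);
            for (int i = 0; i < formats.getLength(); i++) {
                Element format = (Element) formats.item(i);
                if (mdNamespace.equals(childText(format, "metadataNamespace"))) {
                    return childText(format, "metadataPrefix");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse ListMetadataFormats response", e);
        }
    }

    // Text of the first descendant with the given local name, in any namespace.
    static String childText(Element parent, String localName) {
        NodeList nodes = parent.getElementsByTagNameNS("*", localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println(resolvePrefix(SAMPLE, "http://www.openarchives.org/OAI/2.0/oai_dc/"));
    }
}
```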
-
alertAdmin
protected void alertAdmin(int status, Exception ex)
Generate and send an email to the administrator. Prompted by errors encountered during harvesting.
- Parameters:
status - the current status of the collection, usually HarvestedCollection.STATUS_OAI_ERROR or HarvestedCollection.STATUS_UNKNOWN_ERROR
ex - the Exception that prompted this action
-
getMDrecord
protected List<org.jdom2.Element> getMDrecord(String oaiSource, String itemOaiId, String metadataPrefix) throws IOException, ParserConfigurationException, SAXException, XPathExpressionException, HarvestingException
Query the OAI-PMH provider for a specific metadata record.
- Parameters:
oaiSource - the address of the OAI-PMH provider
itemOaiId - the OAI identifier of the target item
metadataPrefix - the OAI metadataPrefix of the desired metadata
- Returns:
- list of JDOM elements corresponding to the metadata entries in the located record.
- Throws:
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
ParserConfigurationException - XML parsing error
SAXException - if XML processing error
XPathExpressionException - if XPath error
HarvestingException - if harvesting error
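The three parameters of getMDrecord line up with the arguments of the OAI-PMH GetRecord verb (`identifier` and `metadataPrefix`, sent to the provider's base URL). As a rough sketch of the request such a query would issue, the hypothetical helper below assembles the URL; the class name, the endpoint, and the sample identifier are placeholders, not values from DSpace.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of DSpace.
class GetRecordUrlSketch {

    // Builds a GetRecord request URL for a single item.
    static String buildGetRecordUrl(String oaiSource, String itemOaiId, String metadataPrefix) {
        return oaiSource
            + "?verb=GetRecord"
            + "&identifier=" + encode(itemOaiId)
            + "&metadataPrefix=" + encode(metadataPrefix);
    }

    // OAI identifiers contain ':' and '/', which must be escaped in a query string.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(buildGetRecordUrl(
            "http://example.org/oai/request", "oai:example.org:123456789/42", "oai_dc"));
    }
}
```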
-
verifyOAIharvester
Verify OAI settings for the current collection.
- Returns:
- list of errors encountered during verification. Empty list indicates a "success" condition.
-
getAvailableMetadataFormats
Return all available metadata formats.
- Returns:
- a list containing a map for each supported metadata format
-