Package org.dspace.harvest
Class OAIHarvester
- java.lang.Object
-
- org.dspace.harvest.OAIHarvester
-
public class OAIHarvester extends Object
This class handles OAI harvesting of externally located records into this repository.
- Author:
- Alexey Maslov
-
-
Field Summary
Fields
Modifier and Type, Field
- protected BitstreamFormatService bitstreamFormatService
- protected BitstreamService bitstreamService
- protected BundleService bundleService
- protected CollectionService collectionService
- protected ConfigurationService configurationService
- protected HandleService handleService
- protected HarvestedCollectionService harvestedCollectionService
- protected HarvestedItemService harvestedItemService
- protected InstallItemService installItemService
- protected ItemService itemService
- static String OAI_ADDRESS_ERROR
- static String OAI_DMD_ERROR
- static String OAI_ORE_ERROR
- static String OAI_SET_ERROR
- protected PluginService pluginService
- protected WorkspaceItemService workspaceItemService
-
Constructor Summary
Constructors
- OAIHarvester(Context c, DSpaceObject dso, HarvestedCollection hc)
-
Method Summary
Modifier and Type, Method, Description
- protected void alertAdmin(int status, Exception ex)
  Generate and send an email to the administrator.
- protected String extractHandle(Item item)
  Scan an item's metadata, looking for the value "identifier.*".
- static List<Map<String,String>> getAvailableMetadataFormats()
  Return all available metadata formats.
- static org.jdom2.Namespace getDMDNamespace(String metadataKey)
  Cycle through the options and find the metadata namespace matching the provided key.
- protected List<org.jdom2.Element> getMDrecord(String oaiSource, String itemOaiId, String metadataPrefix)
  Query the OAI-PMH provider for a specific metadata record.
- static org.jdom2.Namespace getORENamespace()
  Search the configuration options and find the ORE serialization string.
- static String oaiResolveNamespaceToPrefix(String oaiSource, String MDNamespace)
  Query the OAI-PMH server for its mapping of the supplied namespace and metadata prefix.
- protected void processRecord(org.jdom2.Element record, String OREPrefix, long currentRecord, long totalListSize)
  Process an individual PMH record, making (or updating) a corresponding DSpace Item.
- void runHarvest()
  Performs a harvest cycle on this collection.
- List<String> verifyOAIharvester()
  Verify OAI settings for the current collection.
-
-
-
Field Detail
-
OAI_ADDRESS_ERROR
public static final String OAI_ADDRESS_ERROR
- See Also:
- Constant Field Values
-
OAI_SET_ERROR
public static final String OAI_SET_ERROR
- See Also:
- Constant Field Values
-
OAI_DMD_ERROR
public static final String OAI_DMD_ERROR
- See Also:
- Constant Field Values
-
OAI_ORE_ERROR
public static final String OAI_ORE_ERROR
- See Also:
- Constant Field Values
-
bitstreamService
protected BitstreamService bitstreamService
-
bitstreamFormatService
protected BitstreamFormatService bitstreamFormatService
-
bundleService
protected BundleService bundleService
-
collectionService
protected CollectionService collectionService
-
harvestedCollectionService
protected HarvestedCollectionService harvestedCollectionService
-
installItemService
protected InstallItemService installItemService
-
itemService
protected ItemService itemService
-
handleService
protected HandleService handleService
-
harvestedItemService
protected HarvestedItemService harvestedItemService
-
workspaceItemService
protected WorkspaceItemService workspaceItemService
-
pluginService
protected PluginService pluginService
-
configurationService
protected ConfigurationService configurationService
-
-
Constructor Detail
-
OAIHarvester
public OAIHarvester(Context c, DSpaceObject dso, HarvestedCollection hc) throws HarvestingException, SQLException
- Throws:
HarvestingException
SQLException
-
-
Method Detail
-
getORENamespace
public static org.jdom2.Namespace getORENamespace()
Search the configuration options and find the ORE serialization string.
- Returns:
- Namespace of the supported ORE format. Returns null if not found.
-
getDMDNamespace
public static org.jdom2.Namespace getDMDNamespace(String metadataKey)
Cycle through the options and find the metadata namespace matching the provided key.
- Parameters:
metadataKey - the key to match
- Returns:
- Namespace of the designated metadata format. Returns null if not found.
-
runHarvest
public void runHarvest() throws SQLException, IOException, AuthorizeException
Performs a harvest cycle on this collection. This will query the remote OAI-PMH provider, check for updates since last harvest, and ingest the returned items.
- Throws:
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
SQLException - An exception that provides information on a database access error or other errors.
AuthorizeException - Exception indicating the current user of the context does not have permission to perform a particular action.
-
processRecord
protected void processRecord(org.jdom2.Element record, String OREPrefix, long currentRecord, long totalListSize) throws SQLException, AuthorizeException, IOException, CrosswalkException, HarvestingException, ParserConfigurationException, SAXException, XPathExpressionException
Process an individual PMH record, making (or updating) a corresponding DSpace Item.
- Parameters:
record - a JDOM Element containing the actual PMH record with descriptive metadata.
OREPrefix - the metadataPrefix value used by the remote PMH server to disseminate ORE. Only used for collections set up to harvest content.
currentRecord - the current record number, used for logging
totalListSize - the total number of records that this harvest contains
- Throws:
SQLException - An exception that provides information on a database access error or other errors.
AuthorizeException - Exception indicating the current user of the context does not have permission to perform a particular action.
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
CrosswalkException - if crosswalk error
HarvestingException - if harvesting error
ParserConfigurationException - XML parsing error
SAXException - if XML processing error
XPathExpressionException - if XPath error
-
extractHandle
protected String extractHandle(Item item)
Scan an item's metadata, looking for the value "identifier.*". If it meets the parameters that identify it as a valid handle, as set in dspace.cfg (harvester.acceptedHandleServer and harvester.rejectedHandlePrefix), use that handle instead of minting a new one.
- Parameters:
item - a newly created, but not yet installed, DSpace Item
- Returns:
- null or the handle to be used.
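The acceptance rule described for extractHandle can be sketched in isolation. This is a minimal illustration and not DSpace's implementation: the class HandleFilterSketch, the method pickHandle, and the exact matching rules (a candidate must be a URL on an accepted handle server, and the handle's prefix must not be a rejected one) are assumptions made for this example. Only the two configuration property names come from the description above.

```java
import java.util.List;

public class HandleFilterSketch {

    /**
     * Hypothetical helper: returns the first handle among the identifier values that
     * is a URL on an accepted handle server and whose prefix is not rejected,
     * or null if no value qualifies.
     *
     * acceptedHandleServers stands in for harvester.acceptedHandleServer and
     * rejectedHandlePrefixes for harvester.rejectedHandlePrefix (assumed semantics).
     */
    public static String pickHandle(List<String> identifierValues,
                                    List<String> acceptedHandleServers,
                                    List<String> rejectedHandlePrefixes) {
        for (String value : identifierValues) {
            for (String server : acceptedHandleServers) {
                for (String scheme : List.of("http://", "https://")) {
                    String base = scheme + server + "/";
                    if (value.startsWith(base)) {
                        String handle = value.substring(base.length());
                        if (!hasRejectedPrefix(handle, rejectedHandlePrefixes)) {
                            return handle;
                        }
                    }
                }
            }
        }
        return null; // the caller would mint a new handle in this case
    }

    // A handle "prefix/suffix" is rejected when its prefix part equals a rejected prefix.
    private static boolean hasRejectedPrefix(String handle, List<String> rejectedPrefixes) {
        for (String prefix : rejectedPrefixes) {
            if (handle.startsWith(prefix + "/")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> values = List.of(
                "doi:10.1000/xyz",
                "http://hdl.handle.net/1969.1/5678");
        // prints 1969.1/5678
        System.out.println(pickHandle(values, List.of("hdl.handle.net"), List.of("123456789")));
    }
}
```

Returning null when nothing qualifies mirrors the documented return value ("null or the handle to be used").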
-
oaiResolveNamespaceToPrefix
public static String oaiResolveNamespaceToPrefix(String oaiSource, String MDNamespace) throws IOException, ParserConfigurationException, SAXException, XPathExpressionException, ConnectException
Query the OAI-PMH server for its mapping of the supplied namespace and metadata prefix. For example, for a typical OAI-PMH server, a query for "http://www.openarchives.org/OAI/2.0/oai_dc/" would return "oai_dc".
- Parameters:
oaiSource - the address of the OAI-PMH provider
MDNamespace - the namespace that we are trying to resolve to the metadataPrefix
- Returns:
- metadataPrefix the OAI-PMH provider has assigned to the supplied namespace
- Throws:
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
ParserConfigurationException - XML parsing error
SAXException - if XML processing error
XPathExpressionException - if XPath error
ConnectException - if could not connect to OAI server
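To illustrate the namespace-to-prefix mapping, the following sketch resolves a namespace against a canned OAI-PMH ListMetadataFormats response instead of contacting a server. It is not the DSpace implementation: the class PrefixResolverSketch, the method resolvePrefix, and the sample response are assumptions for this example. The element names (metadataFormat, metadataPrefix, metadataNamespace) and the OAI-PMH namespace are those of the OAI-PMH 2.0 protocol.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PrefixResolverSketch {

    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    // Hand-written sample of a ListMetadataFormats response (illustrative only).
    public static final String SAMPLE_RESPONSE =
            "<OAI-PMH xmlns=\"" + OAI_NS + "\">"
            + "<ListMetadataFormats>"
            + "<metadataFormat>"
            + "<metadataPrefix>oai_dc</metadataPrefix>"
            + "<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>"
            + "<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>"
            + "</metadataFormat>"
            + "</ListMetadataFormats>"
            + "</OAI-PMH>";

    /**
     * Hypothetical helper: returns the metadataPrefix paired with mdNamespace in the
     * given response, or null if the namespace is not listed. Parse failures are
     * rethrown as IllegalArgumentException to keep the sketch free of checked exceptions.
     */
    public static String resolvePrefix(String listMetadataFormatsXml, String mdNamespace) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(listMetadataFormatsXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse response", e);
        }
        NodeList formats = doc.getElementsByTagNameNS(OAI_NS, "metadataFormat");
        for (int i = 0; i < formats.getLength(); i++) {
            Element format = (Element) formats.item(i);
            if (mdNamespace.equals(childText(format, "metadataNamespace"))) {
                return childText(format, "metadataPrefix");
            }
        }
        return null;
    }

    // Text of the first descendant element with the given local name, or null if absent.
    private static String childText(Element parent, String localName) {
        NodeList nodes = parent.getElementsByTagNameNS(OAI_NS, localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // prints oai_dc
        System.out.println(resolvePrefix(SAMPLE_RESPONSE, "http://www.openarchives.org/OAI/2.0/oai_dc/"));
    }
}
```

The documented method takes the provider address and performs the query itself; the sketch separates out the parsing step so it can run without network access.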
-
alertAdmin
protected void alertAdmin(int status, Exception ex)
Generate and send an email to the administrator. Prompted by errors encountered during harvesting.
- Parameters:
status - the current status of the collection, usually HarvestedCollection.STATUS_OAI_ERROR or HarvestedCollection.STATUS_UNKNOWN_ERROR
ex - the Exception that prompted this action
-
getMDrecord
protected List<org.jdom2.Element> getMDrecord(String oaiSource, String itemOaiId, String metadataPrefix) throws IOException, ParserConfigurationException, SAXException, XPathExpressionException, HarvestingException
Query the OAI-PMH provider for a specific metadata record.
- Parameters:
oaiSource - the address of the OAI-PMH provider
itemOaiId - the OAI identifier of the target item
metadataPrefix - the OAI metadataPrefix of the desired metadata
- Returns:
- list of JDOM elements corresponding to the metadata entries in the located record.
- Throws:
IOException - A general class of exceptions produced by failed or interrupted I/O operations.
ParserConfigurationException - XML parsing error
SAXException - if XML processing error
XPathExpressionException - if XPath error
HarvestingException - if harvesting error
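In OAI-PMH, a single record is requested with the GetRecord verb, which takes an identifier and a metadataPrefix argument. The sketch below shows how the three parameters of getMDrecord could map onto such a request URL. The class GetRecordUrlSketch, the method buildGetRecordUrl, and the example.org address are hypothetical; how DSpace actually builds and sends the request is not shown on this page.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class GetRecordUrlSketch {

    /**
     * Hypothetical helper: builds an OAI-PMH GetRecord request URL from the provider
     * address, the item's OAI identifier, and the desired metadataPrefix.
     * Argument values are percent-encoded because OAI identifiers typically
     * contain ':' and '/'.
     */
    public static String buildGetRecordUrl(String oaiSource, String itemOaiId, String metadataPrefix) {
        return oaiSource
                + "?verb=GetRecord"
                + "&identifier=" + URLEncoder.encode(itemOaiId, StandardCharsets.UTF_8)
                + "&metadataPrefix=" + URLEncoder.encode(metadataPrefix, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints http://example.org/oai/request?verb=GetRecord&identifier=oai%3Aexample.org%3A123456789%2F1&metadataPrefix=oai_dc
        System.out.println(buildGetRecordUrl(
                "http://example.org/oai/request",
                "oai:example.org:123456789/1",
                "oai_dc"));
    }
}
```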
-
verifyOAIharvester
public List<String> verifyOAIharvester()
Verify OAI settings for the current collection.
- Returns:
- list of errors encountered during verification. Empty list indicates a "success" condition.
-
-