public class OaiPmhHarvester extends GenericHarvester
This plugin harvests metadata records from an OAI-PMH compatible repository using the OAI-PMH protocol. If the repository returns an HTTP 503 response, the harvester checks the response headers for a Retry-After value so that it does not hammer the server.
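As an illustration of the Retry-After handling described above, the following is a minimal sketch of how a client might turn the header value into a wait time. It is not the plugin's own code: the class name `RetryAfterSketch`, the method `parse`, and the 30-second fallback are assumptions made for this example. HTTP allows the header to carry either a number of seconds or an HTTP date, so the sketch accepts both.

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper (not part of the plugin): interprets a Retry-After header value. */
final class RetryAfterSketch {
    // Assumed fallback used when the header is missing or unparseable.
    private static final Duration FALLBACK = Duration.ofSeconds(30);

    /** Returns how long to wait before retrying, measured from 'now'. */
    static Duration parse(String headerValue, ZonedDateTime now) {
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return FALLBACK;
        }
        String value = headerValue.trim();
        try {
            // Form 1: a delay in seconds, e.g. "120"
            return Duration.ofSeconds(Math.max(0L, Long.parseLong(value)));
        } catch (NumberFormatException notSeconds) {
            // Not a number; try the HTTP date form next.
        }
        try {
            // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            ZonedDateTime retryAt =
                    ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME);
            Duration wait = Duration.between(now, retryAt);
            return wait.isNegative() ? Duration.ZERO : wait;
        } catch (DateTimeParseException unparseable) {
            return FALLBACK;
        }
    }
}
```

A caller would sleep for the returned duration before repeating the request that produced the 503.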
Sample configuration file for OAI PMH harvester: usq.json
| Option | Description | Required | Default |
|---|---|---|---|
| url | The base URL of the OAI-PMH repository to harvest | Yes | None |
| maxRequests | Limit number of HTTP requests to make. Set this to -1 to configure the harvester to retrieve all records. | No | -1 |
| metadataPrefix | Set the type of metadata records to harvest. The first prefix in the list will be set as the source payload. | No | oai_dc |
| setSpec | Set the OAI-PMH set to harvest | No | None |
| useSetInStorage | If true and a 'setSpec' is provided, the set is used as part of OID generation for storage. This allows a record that belongs to two sets to be harvested twice as two objects, rather than each harvest using the same OID. | No | false |
| from | Harvest records from this date | No | None |
| until | Harvest records up to this date | No | None |
"harvester": {
"type": "oai-pmh",
"oai-pmh": {
"url": "http://eprints.usq.edu.au/cgi/oai2",
"maxRequests": 1
}
}
"harvester": {
"type": "oai-pmh",
"oai-pmh": {
"url": "http://eprints.usq.edu.au/cgi/oai2",
"recordID": "oai:eprints.usq.edu.au:5"
}
}
"harvester": {
"type": "oai-pmh",
"oai-pmh": {
"url": "http://eprints.usq.edu.au/cgi/oai2",
"from": "2009-01-01T00:00:00Z",
"until": "2009-01-31T00:00:00Z"
}
}
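To show how options such as those in the date-range configuration above relate to the protocol, here is a minimal sketch that builds an OAI-PMH `ListRecords` request URL. The query parameter names (`verb`, `metadataPrefix`, `set`, `from`, `until`) come from the OAI-PMH protocol; the class `ListRecordsUrlSketch` and its `build` method are hypothetical and are not claimed to match how the plugin constructs its requests.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of the plugin): builds a ListRecords request URL. */
final class ListRecordsUrlSketch {

    /** Any argument except baseUrl may be null, meaning "option not set". */
    static String build(String baseUrl, String metadataPrefix,
                        String setSpec, String from, String until) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("verb", "ListRecords");
        // Fall back to the documented default prefix when none is configured.
        params.put("metadataPrefix", metadataPrefix == null ? "oai_dc" : metadataPrefix);
        if (setSpec != null) params.put("set", setSpec);
        if (from != null) params.put("from", from);
        if (until != null) params.put("until", until);

        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(entry.getKey())
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `ListRecordsUrlSketch.build("http://eprints.usq.edu.au/cgi/oai2", null, null, "2009-01-01T00:00:00Z", "2009-01-31T00:00:00Z")` yields a URL whose query string carries the `from` and `until` values with the colons percent-encoded.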
Sample rule file for the OAI PMH harvester: usq.py
| Modifier and Type | Field and Description |
|---|---|
| static String | DATE_FORMAT: Date format |
| static String | DATETIME_FORMAT: Date and time format |
| static String | DEFAULT_METADATA_PREFIX: Default metadataPrefix (Dublin Core) |
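The values of DATE_FORMAT and DATETIME_FORMAT are not shown on this page. As a hedged illustration only, the sketch below assumes they correspond to the two date granularities defined by OAI-PMH (day and second, both in UTC) and shows how such patterns could produce values for the `from` and `until` options. The class `OaiDateSketch` and both pattern strings are assumptions, not the plugin's actual constants.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical helper (not part of the plugin): formats dates for OAI-PMH requests. */
final class OaiDateSketch {
    // Assumed patterns matching the OAI-PMH day and second granularities.
    static final String DATE_PATTERN = "yyyy-MM-dd";
    static final String DATETIME_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    /** Formats the given instant in UTC using one of the patterns above. */
    static String format(Date date, String pattern) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat formatter = new SimpleDateFormat(pattern);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(date);
    }
}
```

The literal `'Z'` in the second pattern is only correct because the formatter's time zone is forced to UTC.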
| Constructor and Description |
|---|
| OaiPmhHarvester(): Basic constructor. |
| Modifier and Type | Method and Description |
|---|---|
| Set<String> | getObjectIdList(): Gets a list of digital object IDs. |
| boolean | hasMoreObjects(): Tests whether there are more objects to retrieve. |
| void | init(): Basic init() function. |
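The two retrieval methods above suggest a paging pattern: fetch a batch of IDs, then ask whether more remain. The sketch below shows one plausible calling loop under that assumption. `PagedHarvester` is a stand-in interface declaring only those two methods, and `FakeHarvester` serves fixed pages so the example runs without a repository; neither is part of the real API, and the real `getObjectIdList()` additionally declares `HarvesterException`.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Stand-in for the two methods listed above; NOT the real harvester class. */
interface PagedHarvester {
    Set<String> getObjectIdList(); // the real method also throws HarvesterException
    boolean hasMoreObjects();
}

/** Hypothetical in-memory harvester that serves a fixed sequence of pages. */
final class FakeHarvester implements PagedHarvester {
    private final Deque<Set<String>> pages;

    FakeHarvester(List<Set<String>> pages) {
        this.pages = new ArrayDeque<>(pages);
    }

    @Override
    public Set<String> getObjectIdList() {
        Set<String> next = pages.poll();
        return next == null ? Collections.<String>emptySet() : next;
    }

    @Override
    public boolean hasMoreObjects() {
        return !pages.isEmpty();
    }
}

/** Assumed calling pattern: fetch a batch, then check whether more remain. */
final class HarvestLoopSketch {
    static Set<String> collectAll(PagedHarvester harvester) {
        Set<String> all = new LinkedHashSet<>();
        do {
            all.addAll(harvester.getObjectIdList());
        } while (harvester.hasMoreObjects());
        return all;
    }
}
```

A do-while loop is used so that the first batch is always requested, even before the harvester has any state to report through hasMoreObjects().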
Methods inherited from class GenericHarvester: getDeletedObjectIdList, getId, getJsonConfig, getName, getObjectId, getPluginDetails, getStorage, hasMoreDeletedObjects, init, init, setStorage, shutdown

public static final String DATE_FORMAT

public static final String DATETIME_FORMAT

public static final String DEFAULT_METADATA_PREFIX

public void init() throws HarvesterException
init in class GenericHarvester
Throws: HarvesterException - if there are problems during instantiation

public Set<String> getObjectIdList() throws HarvesterException
Throws: HarvesterException - if there was an error retrieving the objects

public boolean hasMoreObjects()
Copyright © 2009-2013. All Rights Reserved.