public class FileSystemHarvester extends GenericHarvester
This plugin harvests files from a specified directory, or a single specified file, on the local file system. It can use a cache to perform incremental harvests, which process only the files that have changed since the last run.
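The incremental behaviour described above can be pictured with a small sketch. It is an illustration only, not the plugin's cache implementation: it assumes change detection by comparing last-modified timestamps, and it uses an in-memory map where the plugin would use its database. The names `IncrementalHarvestSketch` and `changedSince` are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: not the plugin's actual cache.
final class IncrementalHarvestSketch {
    // Stand-in for the database-backed cache: file path -> timestamp last seen.
    private final Map<String, Long> lastSeen = new HashMap<>();

    /**
     * Returns the paths that are new, or whose timestamp differs from the
     * cached one, and records the current timestamps for the next run.
     * When force is true the cached values are ignored and every path is returned.
     */
    Set<String> changedSince(Map<String, Long> currentFiles, boolean force) {
        Set<String> changed = new LinkedHashSet<>();
        for (Map.Entry<String, Long> entry : currentFiles.entrySet()) {
            // put() returns the previously cached timestamp, or null if unseen.
            Long previous = lastSeen.put(entry.getKey(), entry.getValue());
            if (force || previous == null || !previous.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

On a second run over an unchanged directory, `changedSince` returns an empty set, which is the effect an incremental harvest aims for. A content-hash comparison could replace the timestamp check without changing the structure.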
Sample configuration file for file system harvester: local-files.json
| Option | Description | Required | Default |
|---|---|---|---|
| baseDir | Path of directory or file to be harvested | Yes | ${user.home}/Documents/public/ |
| facetDir | Used to specify the top level directory for the file_path facet | No | ${user.home}/Documents/public/ |
| ignoreFilter | Pipe-separated ('\|') list of filename patterns to ignore | No | .svn\|.ice\|.*\|~*\|Thumbs.db\|.DS_Store |
| recursive | Set true to harvest files recursively | No | true |
| force | Force harvest the specified directory or file again even when it's not modified (ignore cache) | No | false |
| link | Store the digital object as a link in the storage and point to the original file in the file system | No | true |
| caching | Caching method to use. Valid entries are 'basic' and 'hashed' | No | null |
| cacheId | The cache ID to use in the database if caching is in use. | Yes (if valid 'caching' value is provided) | null |
| derbyHome | Path to use for the file store of the database. Should match other Derby paths provided in the configuration file for the application. | Yes (if valid 'caching' value is provided) | null |
"harvester": {
    "type": "file-system",
    "file-system": {
        "targets": [
            {
                "baseDir": "${user.home}/Documents/public/",
                "facetDir": "${user.home}/Documents/public/",
                "ignoreFilter": ".svn|.ice|.*|~*|Thumbs.db|.DS_Store",
                "recursive": true,
                "force": false,
                "link": true
            }
        ],
        "caching": "basic",
        "cacheId": "default",
        "derbyHome": "${fascinator.home}/database"
    }
}
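To show how a pipe-separated `ignoreFilter` value such as the one above might be applied to file names, here is a minimal sketch. It assumes that `*` is a wildcard matching any run of characters and that every other character is literal; the plugin's real matching rules may differ. The class `IgnoreFilterSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: the plugin's real matching rules may differ.
final class IgnoreFilterSketch {
    private final List<Pattern> patterns = new ArrayList<>();

    IgnoreFilterSketch(String pipeSeparated) {
        // "\\|" splits on a literal pipe; a bare "|" would be regex alternation.
        for (String glob : pipeSeparated.split("\\|")) {
            if (!glob.isEmpty()) {
                patterns.add(toRegex(glob));
            }
        }
    }

    // Assumption: '*' matches any run of characters; everything else is literal.
    private static Pattern toRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            regex.append(c == '*' ? ".*" : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString());
    }

    // True if the whole file name matches at least one pattern.
    boolean isIgnored(String fileName) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(fileName).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, the sample filter would skip `.svn`, `Thumbs.db`, any name starting with `.` or `~`, and would keep a name such as `report.pdf`.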
Sample rule file for the file system harvester: local-files.py
| Constructor and Description |
|---|
| `FileSystemHarvester()`: File System Harvester constructor |
| Modifier and Type | Method and Description |
|---|---|
| `Set<String>` | `getDeletedObjectIdList()`: Delete cached references to files which no longer exist and return the set of IDs to delete from the system |
| `Set<String>` | `getObjectIdList()`: Harvest the next set of files and return their object IDs |
| `boolean` | `hasMoreDeletedObjects()`: Check if there are more objects to delete |
| `boolean` | `hasMoreObjects()`: Check if there are more objects to harvest |
| `void` | `init()`: Initialise the file system harvester plugin |
| `void` | `shutdown()`: Shut down the plugin |
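The methods summarised above suggest a batch-style calling pattern, sketched below. This is not taken from the harvester's client code: `HarvesterSketch` is a hypothetical stand-in that mirrors the method names (the real methods throw `HarvesterException`), `StubHarvester` exists only to exercise the loop, and the harvester is assumed to be initialised already. The do-while form assumes a batch should be fetched at least once before asking whether more remain.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in mirroring the method names in the summary table.
interface HarvesterSketch {
    boolean hasMoreObjects();
    Set<String> getObjectIdList() throws Exception;
    boolean hasMoreDeletedObjects();
    Set<String> getDeletedObjectIdList() throws Exception;
    void shutdown() throws Exception;
}

final class HarvestLoopSketch {
    // Drains harvested IDs, then deleted IDs, and always shuts the plugin down.
    static void run(HarvesterSketch harvester, Set<String> harvested,
                    Set<String> deleted) throws Exception {
        try {
            do {
                harvested.addAll(harvester.getObjectIdList());
            } while (harvester.hasMoreObjects());
            do {
                deleted.addAll(harvester.getDeletedObjectIdList());
            } while (harvester.hasMoreDeletedObjects());
        } finally {
            harvester.shutdown();
        }
    }
}

// In-memory stub that hands out pre-built batches; used only for illustration.
final class StubHarvester implements HarvesterSketch {
    private final Deque<Set<String>> toHarvest = new ArrayDeque<>();
    private final Deque<Set<String>> toDelete = new ArrayDeque<>();
    boolean shutDown = false;

    StubHarvester addBatch(String... ids) {
        toHarvest.add(new LinkedHashSet<>(Arrays.asList(ids)));
        return this;
    }

    StubHarvester addDeleted(String... ids) {
        toDelete.add(new LinkedHashSet<>(Arrays.asList(ids)));
        return this;
    }

    public boolean hasMoreObjects() { return !toHarvest.isEmpty(); }
    public Set<String> getObjectIdList() { return next(toHarvest); }
    public boolean hasMoreDeletedObjects() { return !toDelete.isEmpty(); }
    public Set<String> getDeletedObjectIdList() { return next(toDelete); }
    public void shutdown() { shutDown = true; }

    // Returns the next batch, or an empty set when nothing is queued.
    private static Set<String> next(Deque<Set<String>> queue) {
        Set<String> batch = queue.poll();
        return batch == null ? new LinkedHashSet<String>() : batch;
    }
}
```

Calling `HarvestLoopSketch.run(stub, harvested, deleted)` with a stub holding two harvest batches and one deletion batch collects every ID from both streams and leaves the stub shut down, even if one of the calls throws.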
Inherited methods: getId, getJsonConfig, getName, getObjectId, getPluginDetails, getStorage, init, init, setStorage

`public FileSystemHarvester()`

`public void init() throws HarvesterException`
init in class GenericHarvester
Throws: HarvesterException - if the plugin fails to initialise

`public void shutdown() throws HarvesterException`
shutdown in interface Plugin
shutdown in class GenericHarvester
Throws: HarvesterException - if there are errors

`public Set<String> getObjectIdList() throws HarvesterException`
Throws: HarvesterException - if there are errors

`public boolean hasMoreObjects()`
Returns: true if there are more, false otherwise

`public Set<String> getDeletedObjectIdList() throws HarvesterException`
getDeletedObjectIdList in interface Harvester
getDeletedObjectIdList in class GenericHarvester
Throws: HarvesterException - if there are errors

`public boolean hasMoreDeletedObjects()`
hasMoreDeletedObjects in interface Harvester
hasMoreDeletedObjects in class GenericHarvester
Returns: true if there are more, false otherwise

Copyright © 2009-2013. All Rights Reserved.