Package org.dspace.app.bulkedit
Class MetadataImport
- java.lang.Object
  - org.dspace.scripts.DSpaceRunnable<MetadataImportScriptConfiguration>
    - org.dspace.app.bulkedit.MetadataImport
- All Implemented Interfaces:
Runnable
- Direct Known Subclasses:
MetadataImportCLI
public class MetadataImport extends DSpaceRunnable<MetadataImportScriptConfiguration>
Metadata importer to allow the batch import of metadata from a file
- Author:
- Stuart Lewis
-
-
Field Summary
Fields
Modifier and Type, Field, Description
- protected static String AC_PREFIX: The prefix of the authority controlled field
- protected static Set<String> authorityControlled: The authority controlled fields
- protected AuthorityValueService authorityValueService
- protected CollectionService collectionService
- protected ConfigurationService configurationService
- protected Map<String,Set<Integer>> csvRefMap: Map of field:value to csv row number, used to resolve indirect entity target references.
- protected HashMap<Integer,UUID> csvRowMap: Map of csv row number to UUID, used to resolve indirect entity target references.
- protected HashMap<String,HashMap<String,ArrayList<String>>> entityRelationMap: Map of UUIDs to their relations that are referenced within any import with their referrers.
- protected EntityService entityService
- protected HashMap<UUID,String> entityTypeMap: Map of UUIDs to their entity types.
- protected EntityTypeService entityTypeService
- protected HandleService handleService
- protected InstallItemService installItemService
- protected ItemService itemService
- protected static org.apache.logging.log4j.Logger log: Logger
- protected RelationshipService relationshipService
- protected RelationshipTypeService relationshipTypeService
- protected ArrayList<String> relationValidationErrors: Collection of errors generated during relation validation process.
- protected Integer rowCount: Counter of rows processed in a CSV.
- protected boolean validateOnly
- protected WorkspaceItemService workspaceItemService
Fields inherited from class org.dspace.scripts.DSpaceRunnable
commandLine, handler
-
-
Constructor Summary
Constructors
- MetadataImport()
-
Method Summary
Modifier and Type, Method, Description
- protected void add(Context c, String[] fromCSV, String md, BulkEditChange changes): Add an item metadata with a line from CSV, and optionally update the item
- protected void assignCurrentUserInContext(Context context)
- protected String clean(String in): Clean elements before comparing
- protected void compare(Context c, Item item, List<String> collections, List<Collection> actualCollections, BulkEditChange bechange, boolean change): Compare changes between an items owning collection and mapped collections and what is in the CSV file
- protected void compareAndUpdate(Context c, Item item, String[] fromCSV, boolean change, String md, BulkEditChange changes, DSpaceCSVLine line): Compare an item metadata with a line from CSV, and optionally update the item.
- protected boolean contains(String needle, String[] haystack): Method to find if a String occurs in an array of Strings
- protected boolean determineChange(DSpaceRunnableHandler handler): This method determines whether the changes should be applied or not.
- protected BulkEditMetadataValue getBulkEditValueFromCSV(Context c, String language, String schema, String element, String qualifier, String value, AuthorityValue fromAuthority)
- MetadataImportScriptConfiguration getScriptConfiguration(): This method will return the Configuration that the implementing DSpaceRunnable uses
- void initMetadataImport(DSpaceCSV toImport): Create an instance of the metadata importer.
- void internalRun(): This method has to be included in every script and this will be the main execution block for the script that'll contain all the logic needed
- DSpaceCSVLine resolveEntityRefs(Context c, DSpaceCSVLine line): Gets a copy of the given csv line with all entity target references resolved to UUID strings.
- List<BulkEditChange> runImport(Context c, boolean change, boolean useWorkflow, boolean workflowNotify, boolean useTemplate): Run an import.
- void setup(): This method has to be included in every script and handles the setup of the script by parsing the CommandLine and setting the variables
- protected void simplyCopyValue(String value, BulkEditMetadataValue dcv)
Methods inherited from class org.dspace.scripts.DSpaceRunnable
getEpersonIdentifier, getFileNamesFromInputStreamOptions, initialize, printHelp, run, setEpersonIdentifier
-
-
-
-
Field Detail
-
authorityControlled
protected static Set<String> authorityControlled
The authority controlled fields
-
AC_PREFIX
protected static final String AC_PREFIX
The prefix of the authority controlled field
- See Also:
- Constant Field Values
-
csvRefMap
protected Map<String,Set<Integer>> csvRefMap
Map of field:value to csv row number, used to resolve indirect entity target references.
-
csvRowMap
protected HashMap<Integer,UUID> csvRowMap
Map of csv row number to UUID, used to resolve indirect entity target references.
-
entityRelationMap
protected HashMap<String,HashMap<String,ArrayList<String>>> entityRelationMap
Map of UUIDs to their relations that are referenced within any import with their referrers.
-
relationValidationErrors
protected ArrayList<String> relationValidationErrors
Collection of errors generated during relation validation process.
-
rowCount
protected Integer rowCount
Counter of rows processed in a CSV.
-
validateOnly
protected boolean validateOnly
-
log
protected static final org.apache.logging.log4j.Logger log
Logger
-
itemService
protected ItemService itemService
-
installItemService
protected InstallItemService installItemService
-
collectionService
protected CollectionService collectionService
-
handleService
protected HandleService handleService
-
workspaceItemService
protected WorkspaceItemService workspaceItemService
-
relationshipTypeService
protected RelationshipTypeService relationshipTypeService
-
relationshipService
protected RelationshipService relationshipService
-
entityTypeService
protected EntityTypeService entityTypeService
-
entityService
protected EntityService entityService
-
authorityValueService
protected AuthorityValueService authorityValueService
-
configurationService
protected ConfigurationService configurationService
-
-
Method Detail
-
initMetadataImport
public void initMetadataImport(DSpaceCSV toImport)
Create an instance of the metadata importer. Requires the CSV whose lines are to be examined.
- Parameters:
toImport - The DSpaceCSV holding the CSV lines to examine
-
internalRun
public void internalRun() throws Exception
Description copied from class: DSpaceRunnable
This method has to be included in every script and this will be the main execution block for the script that'll contain all the logic needed
- Specified by:
internalRun in class DSpaceRunnable<MetadataImportScriptConfiguration>
- Throws:
Exception - If something goes wrong
-
assignCurrentUserInContext
protected void assignCurrentUserInContext(Context context) throws org.apache.commons.cli.ParseException
- Throws:
org.apache.commons.cli.ParseException
-
determineChange
protected boolean determineChange(DSpaceRunnableHandler handler) throws IOException
This method determines whether the changes should be applied or not. It defaults to true for the REST script, since we don't want to interact with the caller. The CLI script overrides it to ask for confirmation.
- Parameters:
handler - Applicable DSpaceRunnableHandler
- Returns:
- boolean indicating whether the changes should be applied
- Throws:
IOException - If something goes wrong
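The override pattern described for determineChange can be pictured with a minimal, self-contained sketch. The class names RestStyleImportSketch and CliStyleImportSketch, the prompt wording, and the y/n convention are hypothetical stand-ins, not DSpace code; the real method also receives a DSpaceRunnableHandler and may throw IOException, both of which are left out here.

```java
import java.util.Scanner;

// Hypothetical stand-in for the REST-side script: never prompts, always applies.
class RestStyleImportSketch {
    protected boolean determineChange() {
        return true;
    }
}

// Hypothetical stand-in for the CLI subclass: asks the operator before applying.
class CliStyleImportSketch extends RestStyleImportSketch {
    private final Scanner input;

    CliStyleImportSketch(Scanner input) {
        this.input = input;
    }

    @Override
    protected boolean determineChange() {
        // Assumed prompt wording and y/n convention, for illustration only.
        System.out.print("Apply these changes? [y/n] ");
        String answer = input.hasNextLine() ? input.nextLine() : "";
        return answer.trim().equalsIgnoreCase("y");
    }
}
```

Passing `new Scanner(System.in)` to the CLI-style class would give an interactive prompt; anything other than "y" (including no input at all) is treated as a refusal in this sketch.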
-
getScriptConfiguration
public MetadataImportScriptConfiguration getScriptConfiguration()
Description copied from class: DSpaceRunnable
This method will return the Configuration that the implementing DSpaceRunnable uses
- Specified by:
getScriptConfiguration in class DSpaceRunnable<MetadataImportScriptConfiguration>
- Returns:
- The ScriptConfiguration that this implementing DSpaceRunnable uses
-
setup
public void setup() throws org.apache.commons.cli.ParseException
Description copied from class: DSpaceRunnable
This method has to be included in every script and handles the setup of the script by parsing the CommandLine and setting the variables
- Specified by:
setup in class DSpaceRunnable<MetadataImportScriptConfiguration>
- Throws:
org.apache.commons.cli.ParseException - If something goes wrong
-
runImport
public List<BulkEditChange> runImport(Context c, boolean change, boolean useWorkflow, boolean workflowNotify, boolean useTemplate) throws MetadataImportException, SQLException, AuthorizeException, WorkflowException, IOException
Run an import. The import can either be read-only to detect changes, or can write changes as it goes.
- Parameters:
change - Whether or not to write the changes to the database
useWorkflow - Whether the workflows should be used when creating new items
workflowNotify - If the workflows should be used, whether to send notifications or not
useTemplate - Whether to use the collection template when creating a new item
- Returns:
- A list of BulkEditChange elements representing the items that have changed
- Throws:
MetadataImportException - if something goes wrong
SQLException
AuthorizeException
WorkflowException
IOException
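The two modes of runImport (detect changes only, or write them) follow a common dry-run pattern. The sketch below illustrates that pattern with plain collections; DryRunImportSketch, its in-memory store, and the string-based change descriptions are all hypothetical and stand in for the DSpace Context, Item, and BulkEditChange types, which are not reproduced here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a "report first, write on request" import.
final class DryRunImportSketch {
    // Assumed in-memory repository: item id -> title.
    final Map<String, String> store = new LinkedHashMap<>();

    // Returns one description per row that differs from the store.
    // The store is modified only when change is true.
    List<String> runImport(Map<String, String> csvRows, boolean change) {
        List<String> changes = new ArrayList<>();
        for (Map.Entry<String, String> row : csvRows.entrySet()) {
            String current = store.get(row.getKey());
            if (!row.getValue().equals(current)) {
                changes.add(row.getKey() + ": " + current + " -> " + row.getValue());
                if (change) {
                    store.put(row.getKey(), row.getValue());
                }
            }
        }
        return changes;
    }
}
```

A caller would typically invoke it once with change set to false to preview the list, and a second time with change set to true once the preview has been accepted.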
-
compareAndUpdate
protected void compareAndUpdate(Context c, Item item, String[] fromCSV, boolean change, String md, BulkEditChange changes, DSpaceCSVLine line) throws SQLException, AuthorizeException, MetadataImportException
Compare an item's metadata with a line from the CSV, and optionally update the item.
- Parameters:
item - The current item metadata
fromCSV - The metadata from the CSV file
change - Whether or not to make the update
md - The element to compare
changes - The changes object to populate
line - line in CSV file
- Throws:
SQLException - if there is a problem accessing a Collection from the database, from its handle
AuthorizeException - if there is an authorization problem with permissions
MetadataImportException - custom exception for error handling within metadataimport
-
compare
protected void compare(Context c, Item item, List<String> collections, List<Collection> actualCollections, BulkEditChange bechange, boolean change) throws SQLException, AuthorizeException, IOException, MetadataImportException
Compare changes between an item's owning collection and mapped collections and what is in the CSV file
- Parameters:
item - The item in question
collections - The collection handles from the CSV file
actualCollections - The Collections from the actual item
bechange - The bulkedit change object for this item
change - Whether or not to actuate a change
- Throws:
SQLException - if there is a problem accessing a Collection from the database, from its handle
AuthorizeException - if there is an authorization problem with permissions
IOException - Can be thrown when moving items in communities
MetadataImportException - If something goes wrong to be reported back to the user
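At its core, comparing the collection handles in the CSV with the collections an item actually belongs to is a pair of set differences. The sketch below shows only that idea, on plain handle strings; CollectionDiffSketch and missingFrom are hypothetical names, and the treatment of the owning collection, authorization, and the BulkEditChange bookkeeping done by the real method are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: which entries of one list are absent from another.
final class CollectionDiffSketch {
    // Returns the elements of wanted that do not occur in present,
    // keeping the order in which they appear in wanted.
    static List<String> missingFrom(List<String> wanted, List<String> present) {
        List<String> result = new ArrayList<>(wanted);
        result.removeAll(present);
        return result;
    }
}
```

Calling `missingFrom(csvHandles, actualHandles)` gives the collections to add, and swapping the arguments gives the collections to remove.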
-
add
protected void add(Context c, String[] fromCSV, String md, BulkEditChange changes) throws SQLException, AuthorizeException
Add an item's metadata from a line of the CSV, and optionally update the item
- Parameters:
fromCSV - The metadata from the CSV file
md - The element to compare
changes - The changes object to populate
- Throws:
SQLException - when an SQL error has occurred (querying DSpace)
AuthorizeException - If the user can't make the changes
-
getBulkEditValueFromCSV
protected BulkEditMetadataValue getBulkEditValueFromCSV(Context c, String language, String schema, String element, String qualifier, String value, AuthorityValue fromAuthority)
-
simplyCopyValue
protected void simplyCopyValue(String value, BulkEditMetadataValue dcv)
-
contains
protected boolean contains(String needle, String[] haystack)
Method to find if a String occurs in an array of Strings
- Parameters:
needle - The String to look for
haystack - The array of Strings to search through
- Returns:
- Whether or not it is contained
-
clean
protected String clean(String in)
Clean elements before comparing
- Parameters:
in - The element to clean
- Returns:
- The cleaned up element
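The contains and clean helpers are described only briefly, so the following sketch shows one plausible way they could fit together. It assumes that cleaning means normalising Windows line endings and trimming surrounding whitespace, and that contains compares cleaned values; both are assumptions for illustration, and CleanContainsSketch is a hypothetical class, not the DSpace source.

```java
import java.util.Objects;

// Hypothetical stand-in showing clean-then-compare matching.
final class CleanContainsSketch {

    // Assumption: "cleaning" normalises \r\n to \n and trims whitespace.
    static String clean(String in) {
        if (in == null) {
            return null;
        }
        return in.replace("\r\n", "\n").trim();
    }

    // True if any haystack entry equals the needle once both are cleaned.
    static boolean contains(String needle, String[] haystack) {
        String cleanedNeedle = clean(needle);
        for (String candidate : haystack) {
            if (Objects.equals(clean(candidate), cleanedNeedle)) {
                return true;
            }
        }
        return false;
    }
}
```

Cleaning both sides before comparing means that a value exported on one operating system and re-imported on another is not reported as a change merely because its line endings or padding differ.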
-
resolveEntityRefs
public DSpaceCSVLine resolveEntityRefs(Context c, DSpaceCSVLine line) throws MetadataImportException
Gets a copy of the given csv line with all entity target references resolved to UUID strings. Keys being iterated over represent metadata fields or special columns to be processed.
- Parameters:
line - the csv line to process.
- Returns:
- a copy, with all references resolved.
- Throws:
MetadataImportException - if there is an error resolving any entity target reference.
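The csvRefMap and csvRowMap fields suggest a two-step lookup for indirect references: field:value to row number, then row number to UUID. The sketch below walks through that lookup using the same map shapes. EntityRefSketch, register, and resolve are hypothetical names; the rule that a reference matching several rows is rejected is an assumption; IllegalArgumentException stands in for MetadataImportException; and direct UUID references and special columns are not handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;

// Hypothetical illustration of resolving "field:value" references via two maps.
final class EntityRefSketch {
    // Same shapes as the documented csvRefMap and csvRowMap fields.
    final Map<String, Set<Integer>> csvRefMap = new HashMap<>();
    final Map<Integer, UUID> csvRowMap = new HashMap<>();

    // Record that CSV row `row` carries `fieldAndValue` and maps to `uuid`.
    void register(int row, String fieldAndValue, UUID uuid) {
        csvRefMap.computeIfAbsent(fieldAndValue, key -> new TreeSet<>()).add(row);
        csvRowMap.put(row, uuid);
    }

    // Resolve an indirect reference to a UUID string, or fail loudly.
    String resolve(String fieldAndValue) {
        Set<Integer> rows = csvRefMap.get(fieldAndValue);
        if (rows == null || rows.isEmpty()) {
            throw new IllegalArgumentException("No CSV row matches " + fieldAndValue);
        }
        if (rows.size() > 1) {
            // Assumed policy: an ambiguous reference is an error.
            throw new IllegalArgumentException("Ambiguous reference " + fieldAndValue + ", rows " + rows);
        }
        int row = rows.iterator().next();
        UUID uuid = csvRowMap.get(row);
        if (uuid == null) {
            throw new IllegalArgumentException("Row " + row + " has no UUID yet");
        }
        return uuid.toString();
    }
}
```

Keeping row numbers as the intermediate key lets a row refer to another row of the same CSV before either one has a permanent identifier in the repository.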
-
-