Class MetadataImport

All Implemented Interfaces:
Runnable
Direct Known Subclasses:
MetadataImportCLI

public class MetadataImport extends DSpaceRunnable<MetadataImportScriptConfiguration>
Metadata importer to allow the batch import of metadata from a file
Author:
Stuart Lewis
  • Field Details

    • authorityControlled

      protected Set<String> authorityControlled
      The authority controlled fields
    • AC_PREFIX

      protected static final String AC_PREFIX
      The prefix of the authority controlled field
    • csvRefMap

      protected Map<String,Set<Integer>> csvRefMap
      Map of field:value to csv row number, used to resolve indirect entity target references.
      See Also:
      • populateRefAndRowMap(DSpaceCSVLine, UUID)
    • csvRowMap

      protected HashMap<Integer,UUID> csvRowMap
      Map of csv row number to UUID, used to resolve indirect entity target references.
      See Also:
      • populateRefAndRowMap(DSpaceCSVLine, UUID)
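The two maps above can be read as a two-step lookup: a field:value key leads to one or more CSV row numbers, and a row number leads to a UUID. The sketch below illustrates that idea only. The class `CsvRefLookupSketch`, its methods `register` and `resolve`, and the rule that a reference must match exactly one row are assumptions made for illustration, not the DSpace implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

public class CsvRefLookupSketch {
    // field:value -> CSV row numbers carrying that value (same shape as csvRefMap)
    private final Map<String, Set<Integer>> csvRefMap = new HashMap<>();
    // CSV row number -> UUID of the item on that row (same shape as csvRowMap)
    private final Map<Integer, UUID> csvRowMap = new HashMap<>();

    // Hypothetical helper: record a row's UUID and one of its field:value keys.
    public void register(int row, UUID uuid, String fieldAndValue) {
        csvRowMap.put(row, uuid);
        csvRefMap.computeIfAbsent(fieldAndValue, k -> new HashSet<>()).add(row);
    }

    // Hypothetical helper: resolve a field:value reference to a single UUID.
    // Assumption: zero matches or several matches are both treated as errors.
    public UUID resolve(String fieldAndValue) {
        Set<Integer> rows = csvRefMap.get(fieldAndValue);
        if (rows == null || rows.isEmpty()) {
            throw new IllegalArgumentException("No row matches " + fieldAndValue);
        }
        if (rows.size() > 1) {
            throw new IllegalArgumentException("Ambiguous reference " + fieldAndValue);
        }
        return csvRowMap.get(rows.iterator().next());
    }
}
```

Keeping the row number as the intermediate key means a reference can be recorded before the target row's UUID is known, which is one plausible reason for splitting the lookup across two maps.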
    • entityTypeMap

      protected HashMap<UUID,String> entityTypeMap
      Map of UUIDs to their entity types.
      See Also:
      • populateRefAndRowMap(DSpaceCSVLine, UUID)
    • entityRelationMap

      protected HashMap<String,HashMap<String,ArrayList<String>>> entityRelationMap
      Map of UUIDs to the relations that reference them within an import, together with their referrers.
      See Also:
      • populateEntityRelationMap(String, String, String)
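A nested map of this shape is usually filled one level at a time. The sketch below shows one way to do that with `computeIfAbsent`. The class `EntityRelationMapSketch`, the method names, and the meaning assigned to the three String arguments (target, relation type name, referrer) are assumptions; the page above gives only the signature of `populateEntityRelationMap`.

```java
import java.util.ArrayList;
import java.util.HashMap;

public class EntityRelationMapSketch {
    // target -> (relation type name -> referrers), same shape as entityRelationMap
    private final HashMap<String, HashMap<String, ArrayList<String>>> entityRelationMap =
        new HashMap<>();

    // Hypothetical stand-in for populateEntityRelationMap; argument meaning is assumed.
    public void populate(String target, String typeName, String referrer) {
        entityRelationMap
            .computeIfAbsent(target, k -> new HashMap<>())
            .computeIfAbsent(typeName, k -> new ArrayList<>())
            .add(referrer);
    }

    // Returns the referrers recorded for a target and relation type, or an empty list.
    public ArrayList<String> referrers(String target, String typeName) {
        HashMap<String, ArrayList<String>> byType = entityRelationMap.get(target);
        if (byType == null) {
            return new ArrayList<>();
        }
        return byType.getOrDefault(typeName, new ArrayList<>());
    }
}
```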
    • relationValidationErrors

      protected ArrayList<String> relationValidationErrors
      Collection of errors generated during relation validation process.
    • rowCount

      protected Integer rowCount
      Counter of rows processed in a CSV.
    • validateOnly

      protected boolean validateOnly
    • log

      protected static final org.apache.logging.log4j.Logger log
      Logger
    • itemService

      protected ItemService itemService
    • installItemService

      protected InstallItemService installItemService
    • collectionService

      protected CollectionService collectionService
    • handleService

      protected HandleService handleService
    • workspaceItemService

      protected WorkspaceItemService workspaceItemService
    • relationshipTypeService

      protected RelationshipTypeService relationshipTypeService
    • relationshipService

      protected RelationshipService relationshipService
    • entityTypeService

      protected EntityTypeService entityTypeService
    • entityService

      protected EntityService entityService
    • authorityValueService

      protected AuthorityValueService authorityValueService
    • configurationService

      protected ConfigurationService configurationService
  • Constructor Details

    • MetadataImport

      public MetadataImport()
  • Method Details

    • initMetadataImport

      public void initMetadataImport(DSpaceCSV toImport)
      Initialise the metadata importer with the CSV to import.
      Parameters:
      toImport - The DSpaceCSV whose lines are to be examined
    • internalRun

      public void internalRun() throws Exception
      Description copied from class: DSpaceRunnable
      Every script must implement this method. It is the main execution block of the script and contains all the logic needed.
      Specified by:
      internalRun in class DSpaceRunnable<MetadataImportScriptConfiguration>
      Throws:
      Exception - If something goes wrong
    • assignCurrentUserInContext

      protected void assignCurrentUserInContext(Context context) throws org.apache.commons.cli.ParseException
      Throws:
      org.apache.commons.cli.ParseException
    • determineChange

      protected boolean determineChange(DSpaceRunnableHandler handler) throws IOException
      Determines whether the changes should be applied. It returns true by default for the REST script, because that script should not interact with the caller. The CLI script overrides it to ask for confirmation.
      Parameters:
      handler - Applicable DSpaceRunnableHandler
      Returns:
      true if the changes should be applied, false otherwise
      Throws:
      IOException - If something goes wrong
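The override pattern described for determineChange can be sketched in isolation. Everything in the block below is hypothetical: the classes `RestStyleImport` and `CliStyleImport`, the prompt text, and the use of a `BufferedReader` in place of the real `DSpaceRunnableHandler` parameter are assumptions chosen so the example runs without DSpace.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class DetermineChangeSketch {
    // Hypothetical stand-in for the non-interactive (REST) script.
    public static class RestStyleImport {
        protected boolean determineChange(BufferedReader in) throws IOException {
            return true; // never prompt: always apply the changes
        }
    }

    // Hypothetical stand-in for the CLI script, which asks before applying.
    public static class CliStyleImport extends RestStyleImport {
        @Override
        protected boolean determineChange(BufferedReader in) throws IOException {
            System.out.print("Do you want to make these changes? [y/n] ");
            String answer = in.readLine();
            return answer != null && answer.trim().equalsIgnoreCase("y");
        }
    }

    // Feeds a simulated typed answer to the script and returns its decision.
    public static boolean ask(RestStyleImport script, String typedAnswer) {
        try {
            return script.determineChange(new BufferedReader(new StringReader(typedAnswer)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this arrangement the shared import logic only ever calls `determineChange`, so the interactive and non-interactive variants differ in a single overridden method.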
    • getScriptConfiguration

      public MetadataImportScriptConfiguration getScriptConfiguration()
      Description copied from class: DSpaceRunnable
      This method will return the Configuration that the implementing DSpaceRunnable uses
      Specified by:
      getScriptConfiguration in class DSpaceRunnable<MetadataImportScriptConfiguration>
      Returns:
      The ScriptConfiguration that this implementing DSpaceRunnable uses
    • setup

      public void setup() throws org.apache.commons.cli.ParseException
      Description copied from class: DSpaceRunnable
      Every script must implement this method. It sets up the script by parsing the CommandLine and setting the corresponding variables.
      Specified by:
      setup in class DSpaceRunnable<MetadataImportScriptConfiguration>
      Throws:
      org.apache.commons.cli.ParseException - If something goes wrong
    • runImport

      public List<BulkEditChange> runImport(Context c, boolean change, boolean useWorkflow, boolean workflowNotify, boolean useTemplate) throws MetadataImportException, SQLException, AuthorizeException, WorkflowException, IOException
      Run an import. The import can either be read-only to detect changes, or can write changes as it goes.
      Parameters:
      change - Whether or not to write the changes to the database
      useWorkflow - Whether the workflows should be used when creating new items
      workflowNotify - If the workflows should be used, whether to send notifications or not
      useTemplate - Whether to use the collection template when creating a new item
      Returns:
      A list of BulkEditChange elements representing the items that have changed
      Throws:
      MetadataImportException - if something goes wrong
      SQLException
      AuthorizeException
      WorkflowException
      IOException
    • compareAndUpdate

      protected void compareAndUpdate(Context c, Item item, String[] fromCSV, boolean change, String md, BulkEditChange changes, DSpaceCSVLine line) throws SQLException, AuthorizeException, MetadataImportException
      Compare an item's metadata with a line from the CSV, and optionally update the item.
      Parameters:
      item - The current item metadata
      fromCSV - The metadata from the CSV file
      change - Whether or not to make the update
      md - The element to compare
      changes - The changes object to populate
      line - line in CSV file
      Throws:
      SQLException - if there is a problem accessing a Collection from the database, from its handle
      AuthorizeException - if there is an authorization problem with permissions
      MetadataImportException - custom exception for error handling within metadataimport
    • compare

      protected void compare(Context c, Item item, List<String> collections, List<Collection> actualCollections, BulkEditChange bechange, boolean change) throws SQLException, AuthorizeException, IOException, MetadataImportException
      Compare an item's owning collection and mapped collections with what is in the CSV file.
      Parameters:
      item - The item in question
      collections - The collection handles from the CSV file
      actualCollections - The Collections from the actual item
      bechange - The bulkedit change object for this item
      change - Whether or not to actuate a change
      Throws:
      SQLException - if there is a problem accessing a Collection from the database, from its handle
      AuthorizeException - if there is an authorization problem with permissions
      IOException - Can be thrown when moving items in communities
      MetadataImportException - If something goes wrong to be reported back to the user
    • add

      protected void add(Context c, String[] fromCSV, String md, BulkEditChange changes) throws SQLException, AuthorizeException
      Add an item's metadata from a line of the CSV, and optionally update the item.
      Parameters:
      fromCSV - The metadata from the CSV file
      md - The element to compare
      changes - The changes object to populate
      Throws:
      SQLException - when an SQL error has occurred (querying DSpace)
      AuthorizeException - If the user can't make the changes
    • getBulkEditValueFromCSV

      protected BulkEditMetadataValue getBulkEditValueFromCSV(Context c, String language, String schema, String element, String qualifier, String value, AuthorityValue fromAuthority)
    • simplyCopyValue

      protected void simplyCopyValue(String value, BulkEditMetadataValue dcv)
    • contains

      protected boolean contains(String needle, String[] haystack)
      Method to find if a String occurs in an array of Strings
      Parameters:
      needle - The String to look for
      haystack - The array of Strings to search through
      Returns:
      Whether or not it is contained
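A minimal sketch of such a membership check follows. The class name `ContainsSketch` is hypothetical, and the comparison rule is an assumption: this version uses plain null-safe equality, whereas the actual method may normalise values before comparing.

```java
import java.util.Objects;

public class ContainsSketch {
    // Returns true if needle equals any element of haystack (null-safe equality).
    public static boolean contains(String needle, String[] haystack) {
        for (String candidate : haystack) {
            if (Objects.equals(needle, candidate)) {
                return true;
            }
        }
        return false;
    }
}
```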
    • clean

      protected String clean(String in)
      Clean elements before comparing
      Parameters:
      in - The element to clean
      Returns:
      The cleaned up element
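The page does not say what cleaning involves. The sketch below assumes it means removing line breaks (which differ between operating systems) and trimming surrounding whitespace, with null passed through unchanged. The class name `CleanSketch` and that behaviour are assumptions for illustration.

```java
public class CleanSketch {
    // Assumed cleaning rule: drop CR and LF characters, then trim the ends.
    public static String clean(String in) {
        if (in == null) {
            return null; // nothing to clean
        }
        return in.replace("\r", "").replace("\n", "").trim();
    }
}
```

Normalising both sides this way before comparison would keep a value from being reported as changed merely because the CSV was saved with different line endings.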
    • resolveEntityRefs

      public DSpaceCSVLine resolveEntityRefs(Context c, DSpaceCSVLine line) throws MetadataImportException
      Gets a copy of the given csv line with all entity target references resolved to UUID strings. Keys being iterated over represent metadatafields or special columns to be processed.
      Parameters:
      line - the csv line to process.
      Returns:
      a copy, with all references resolved.
      Throws:
      MetadataImportException - if there is an error resolving any entity target reference.