Interface DuplicateDetectionService

All Known Implementing Classes:
DuplicateDetectionServiceImpl

public interface DuplicateDetectionService
Duplicate Detection Service handles get, search and validation operations for duplicate detection.
Author:
Kim Shepherd
  • Field Details

    • log

      static final org.apache.logging.log4j.Logger log
      Logger
  • Method Details

    • getPotentialDuplicates

      List<PotentialDuplicate> getPotentialDuplicates(Context context, Item item) throws SearchServiceException
      Get a list of PotentialDuplicate objects (wrappers with some metadata included for previewing) that are identified as potential duplicates of the given item.
      Parameters:
      context - DSpace context
      item - Item to check
      Returns:
      List of potential duplicates (empty if none found)
      Throws:
      SearchServiceException - if an error occurs performing the discovery search
    • validateDuplicateResult

      Optional<PotentialDuplicate> validateDuplicateResult(Context context, IndexableObject indexableObject, Item original) throws SQLException, AuthorizeException
      Validate an indexable object (returned by discovery search) to ensure it is permissible, readable and valid, and can be added to a list of results. An Optional is returned; if it is empty, the object was invalid or did not pass validation.
      Parameters:
      context - The DSpace context
      indexableObject - The discovery search result
      original - The original item (to compare IDs, submitters, etc.)
      Returns:
      An Optional potential duplicate
      Throws:
      SQLException
      AuthorizeException
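
The Optional-based contract described for validateDuplicateResult can be illustrated with a minimal, self-contained sketch. It does not use any DSpace classes: `OptionalValidationSketch`, `validate` and `collectValid` are hypothetical stand-ins, and the blank-string check is only an assumed placeholder for the real permission and validity checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, not DSpace code: shows how a caller can consume
// a validator that signals rejection with an empty Optional.
class OptionalValidationSketch {

    // Assumed placeholder check: null or blank candidates are treated as invalid.
    static Optional<String> validate(String candidate) {
        if (candidate == null || candidate.isBlank()) {
            return Optional.empty();
        }
        return Optional.of(candidate.trim());
    }

    // Keep only the candidates that passed validation.
    static List<String> collectValid(List<String> candidates) {
        List<String> results = new ArrayList<>();
        for (String candidate : candidates) {
            validate(candidate).ifPresent(results::add);
        }
        return results;
    }
}
```

Returning an empty Optional rather than null lets the caller skip rejected results without explicit null checks.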
    • searchDuplicates

      DiscoverResult searchDuplicates(Context context, Item item) throws SearchServiceException
      Search discovery for potential duplicates of a given item. The search uses Levenshtein distance (configurable) and a single-term "comparison value" constructed out of the item title.
      Parameters:
      context - DSpace context
      item - The item to check
      Returns:
      DiscoverResult from performing the search, or null if invalid.
      Throws:
      SearchServiceException - if an error was encountered during the discovery search itself.
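
To make the Levenshtein distance mentioned for searchDuplicates concrete, here is a small sketch of the metric itself. It is a plain Java illustration, not the way the discovery search computes matches; the names `LevenshteinSketch`, `distance`, `isCandidate` and the `maxDistance` parameter are hypothetical.

```java
// Hypothetical illustration of the edit-distance metric; not DSpace code.
class LevenshteinSketch {

    // Classic two-row dynamic programming: minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed usage: two comparison values match if they are within maxDistance edits.
    static boolean isCandidate(String a, String b, int maxDistance) {
        return distance(a, b) <= maxDistance;
    }
}
```

With a maximum distance of 0 only identical comparison values would match; larger values tolerate typos and small variations in the title.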
    • buildComparisonValue

      String buildComparisonValue(Context context, Item item)
      Build a comparison value string made up of values of configured fields, used when indexing and querying items for deduplication.
      Parameters:
      context - DSpace context
      item - The DSpace item
      Returns:
      a constructed, normalised string
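
As a rough idea of what a "constructed, normalised string" might look like, the sketch below collapses a title into a single lowercase term. The normalisation rules here (lowercasing, then dropping everything except letters and digits) are assumptions made for illustration, and `ComparisonValueSketch` and `normalise` are hypothetical names; the actual rules depend on the service implementation and its configuration.

```java
import java.util.Locale;

// Hypothetical illustration; the real normalisation rules may differ.
class ComparisonValueSketch {

    // Assumed rules: lowercase, then remove everything that is not a letter or digit,
    // so the result is a single term with no whitespace or punctuation.
    static String normalise(String title) {
        if (title == null) {
            return "";
        }
        return title.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{N}]", "");
    }
}
```

Producing a single term means that titles differing only in case, spacing or punctuation yield the same comparison value, and small spelling differences remain within a short edit distance.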