Class DuplicateDetectionServiceImpl

java.lang.Object
org.dspace.content.DuplicateDetectionServiceImpl
All Implemented Interfaces:
DuplicateDetectionService

public class DuplicateDetectionServiceImpl extends Object implements DuplicateDetectionService
Default implementation of DuplicateDetectionService, which handles retrieval, search and validation operations for duplicate detection.
Author:
Kim Shepherd
  • Constructor Details

    • DuplicateDetectionServiceImpl

      public DuplicateDetectionServiceImpl()
  • Method Details

    • getPotentialDuplicates

      public List<PotentialDuplicate> getPotentialDuplicates(Context context, Item item) throws SearchServiceException
      Get a list of PotentialDuplicate objects (wrappers that include some metadata for previewing) identified as potential duplicates of the given item.
      Specified by:
      getPotentialDuplicates in interface DuplicateDetectionService
      Parameters:
      context - DSpace context
      item - Item to check
      Returns:
      List of potential duplicates (empty if none found)
      Throws:
      SearchServiceException - if an error occurs performing the discovery search
    • validateDuplicateResult

      public Optional<PotentialDuplicate> validateDuplicateResult(Context context, IndexableObject indexableObject, Item original) throws SQLException, AuthorizeException
      Validate an indexable object (returned by discovery search) to ensure it is permissible, readable and valid, and can be added to a list of results. An Optional is returned; if it is empty, the object was invalid or did not pass validation.
      Specified by:
      validateDuplicateResult in interface DuplicateDetectionService
      Parameters:
      context - The DSpace context
      indexableObject - The discovery search result
      original - The original item (to compare IDs, submitters, etc)
      Returns:
      An Optional containing the potential duplicate, or an empty Optional if the object did not pass validation
      Throws:
      SQLException
      AuthorizeException
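
The empty-Optional contract described above lends itself to a simple collection loop in calling code. The following is a minimal sketch of that pattern only, not the service's own implementation: `OptionalCollectSketch`, `collectValid` and the `validator` function are hypothetical stand-ins, and the sketch uses plain generics instead of DSpace types. Note that the real validateDuplicateResult declares checked exceptions (SQLException, AuthorizeException), so it could not be passed as a plain `Function` without wrapping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of DSpace: keeps only results whose
// validation step returned a non-empty Optional.
final class OptionalCollectSketch {
    private OptionalCollectSketch() { }

    static <R, D> List<D> collectValid(List<R> results, Function<R, Optional<D>> validator) {
        List<D> valid = new ArrayList<>();
        for (R result : results) {
            // An empty Optional means "invalid or did not pass validation": skip it.
            validator.apply(result).ifPresent(valid::add);
        }
        return valid;
    }
}
```

Returning an Optional rather than null lets the caller skip rejected results without a separate null check.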
    • searchDuplicates

      public DiscoverResult searchDuplicates(Context context, Item item) throws SearchServiceException
      Search discovery for potential duplicates of a given item. The search uses Levenshtein distance (configurable) and a single-term "comparison value" constructed out of the item title.
      Specified by:
      searchDuplicates in interface DuplicateDetectionService
      Parameters:
      context - DSpace context
      item - The item to check
      Returns:
      The DiscoverResult produced by the search, or null if invalid.
      Throws:
      SearchServiceException - if an error was encountered during the discovery search itself.
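
To illustrate what a Levenshtein distance threshold means for two comparison values, here is a minimal sketch of the metric itself. It is an illustration only: the documentation above says the matching is done through the discovery search, so this class is not how the service computes distance. `LevenshteinSketch`, `distance` and `withinDistance` are hypothetical names.

```java
// Hypothetical illustration of edit distance, not DSpace code.
final class LevenshteinSketch {
    private LevenshteinSketch() { }

    /** Two-row dynamic programming edit distance (insert, delete, substitute each cost 1). */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Swap rows so prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True when the two values are within the given maximum edit distance. */
    static boolean withinDistance(String a, String b, int maxDistance) {
        return distance(a, b) <= maxDistance;
    }
}
```

With a maximum distance of 1, for example, two comparison values that differ by a single inserted, deleted or substituted character would still be treated as a match.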
    • buildComparisonValue

      public String buildComparisonValue(Context context, Item item)
      Build a comparison value string from the values of the configured fields, used when indexing and querying items for deduplication.
      Specified by:
      buildComparisonValue in interface DuplicateDetectionService
      Parameters:
      context - DSpace context
      item - The DSpace item
      Returns:
      A constructed, normalised string
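
The documentation does not spell out the normalisation rules, so the following is only a sketch of one plausible way to turn several field values into a single-term string. Every rule here (stripping diacritics, lower-casing, dropping everything except letters and digits) is an assumption, and `ComparisonValueSketch` and `build` are hypothetical names rather than DSpace APIs.

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;

// Hypothetical normalisation sketch, not the DSpace implementation.
final class ComparisonValueSketch {
    private ComparisonValueSketch() { }

    static String build(List<String> fieldValues) {
        StringBuilder joined = new StringBuilder();
        for (String value : fieldValues) {
            if (value != null) {
                joined.append(value);
            }
        }
        // Assumed rule 1: decompose accented characters, then drop the combining marks.
        String decomposed = Normalizer.normalize(joined.toString(), Normalizer.Form.NFD);
        return decomposed
            .replaceAll("\\p{M}", "")
            // Assumed rule 2: lower-case independently of the default locale.
            .toLowerCase(Locale.ROOT)
            // Assumed rule 3: keep only letters and digits, yielding a single term.
            .replaceAll("[^\\p{L}\\p{N}]", "");
    }
}
```

Collapsing the value to a single term without whitespace or punctuation means that minor formatting differences between two titles do not count towards the edit distance.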