Package org.dspace.content
Class DuplicateDetectionServiceImpl
java.lang.Object
org.dspace.content.DuplicateDetectionServiceImpl
- All Implemented Interfaces:
DuplicateDetectionService
Default implementation of DuplicateDetectionService.
The duplicate detection service handles retrieval, search and validation operations for duplicate detection.
- Author:
- Kim Shepherd
-
Field Summary
Fields inherited from interface org.dspace.content.service.DuplicateDetectionService
log -
Constructor Summary
Constructors -
Method Summary
Method and Description
- buildComparisonValue(Context context, Item item): Build a comparison value string made up of values of configured fields, used when indexing and querying items for deduplication.
- getPotentialDuplicates(Context context, Item item): Get a list of PotentialDuplicate objects (wrappers with some metadata included for previewing) that are identified as potential duplicates of the given item.
- searchDuplicates(Context context, Item item): Search discovery for potential duplicates of a given item.
- validateDuplicateResult(Context context, IndexableObject indexableObject, Item original): Validate an indexable object (returned by discovery search) to ensure it is permissible, readable and valid and can be added to a list of results.
-
Constructor Details
-
DuplicateDetectionServiceImpl
public DuplicateDetectionServiceImpl()
-
-
Method Details
-
getPotentialDuplicates
public List<PotentialDuplicate> getPotentialDuplicates(Context context, Item item) throws SearchServiceException
Get a list of PotentialDuplicate objects (wrappers with some metadata included for previewing) that are identified as potential duplicates of the given item.
- Specified by:
getPotentialDuplicates in interface DuplicateDetectionService
- Parameters:
context - DSpace context
item - Item to check
- Returns:
- List of potential duplicates (empty if none found)
- Throws:
SearchServiceException - if an error occurs performing the discovery search
-
validateDuplicateResult
public Optional<PotentialDuplicate> validateDuplicateResult(Context context, IndexableObject indexableObject, Item original) throws SQLException, AuthorizeException
Validate an indexable object (returned by discovery search) to ensure it is permissible, readable and valid and can be added to a list of results. An Optional is returned; if it is empty, the object was invalid or did not pass validation.
- Specified by:
validateDuplicateResult in interface DuplicateDetectionService
- Parameters:
context - The DSpace context
indexableObject - The discovery search result
original - The original item (to compare IDs, submitters, etc.)
- Returns:
- An Optional potential duplicate
- Throws:
SQLException
AuthorizeException
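The Optional-based contract described above can be illustrated with a small, self-contained sketch. It does not use the DSpace classes: `Candidate`, `validate` and `collectValid` are hypothetical stand-ins, and the two checks shown (skip the original item itself, skip unreadable results) are only assumed examples of what "permissible, readable and valid" might involve.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class ValidationSketch {
    /** Hypothetical stand-in for a discovery search result. */
    static final class Candidate {
        final String id;
        final boolean readable;

        Candidate(String id, boolean readable) {
            this.id = id;
            this.readable = readable;
        }
    }

    private ValidationSketch() { }

    /** Returns an empty Optional when the candidate fails any (assumed) check. */
    static Optional<Candidate> validate(Candidate candidate, String originalId) {
        if (candidate == null) {
            return Optional.empty();
        }
        if (candidate.id.equals(originalId)) {
            return Optional.empty(); // an item is not a duplicate of itself
        }
        if (!candidate.readable) {
            return Optional.empty(); // caller may not read this result
        }
        return Optional.of(candidate);
    }

    /** Keeps only the candidates whose validation result is present. */
    static List<Candidate> collectValid(List<Candidate> results, String originalId) {
        List<Candidate> valid = new ArrayList<>();
        for (Candidate c : results) {
            validate(c, originalId).ifPresent(valid::add);
        }
        return valid;
    }
}
```

Returning an empty Optional rather than null lets the caller drop rejected results with `ifPresent` and no explicit null checks.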
-
searchDuplicates
Search discovery for potential duplicates of a given item. The search uses Levenshtein distance (configurable) and a single-term "comparison value" constructed out of the item title.
- Specified by:
searchDuplicates in interface DuplicateDetectionService
- Parameters:
context - DSpace context
item - The item to check
- Returns:
- DiscoverResult as a result of performing search. Null if invalid.
- Throws:
SearchServiceException - if an error was encountered during the discovery search itself.
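For readers unfamiliar with the metric, the following is a minimal sketch of Levenshtein (edit) distance between two comparison values. It illustrates the measure only; the actual matching is performed by the discovery search, and the class name, method names and `maxDistance` parameter here are hypothetical.

```java
final class LevenshteinSketch {
    private LevenshteinSketch() { }

    /**
     * Minimum number of single-character insertions, deletions or
     * substitutions needed to turn a into b (two-row dynamic programming).
     */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> first j chars of b: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> empty string: i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True when the two values differ by at most maxDistance edits. */
    static boolean withinDistance(String a, String b, int maxDistance) {
        return distance(a, b) <= maxDistance;
    }
}
```

With a maximum distance of 1, for example, `withinDistance("colour", "color", 1)` holds, while two titles that differ by several characters would not match.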
-
buildComparisonValue
Build a comparison value string made up of values of configured fields, used when indexing and querying items for deduplication.
- Specified by:
buildComparisonValue in interface DuplicateDetectionService
- Parameters:
context - DSpace context
item - The DSpace item
- Returns:
- a constructed, normalised string
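As a rough illustration of what "a constructed, normalised string" could look like, the sketch below concatenates a list of field values and then applies two normalisation steps. The specific steps (lower-casing and whitespace removal), the flags that toggle them, and the class and method names are assumptions for illustration; this page does not state which rules the implementation applies.

```java
import java.util.List;
import java.util.Locale;

final class ComparisonValueSketch {
    private ComparisonValueSketch() { }

    /**
     * Joins the given field values into one term and normalises it.
     * Both normalisation steps are assumed, optional examples.
     */
    static String build(List<String> fieldValues, boolean lowercase, boolean stripWhitespace) {
        StringBuilder sb = new StringBuilder();
        for (String value : fieldValues) {
            if (value != null) {
                sb.append(value); // skip missing values rather than appending "null"
            }
        }
        String result = sb.toString();
        if (lowercase) {
            result = result.toLowerCase(Locale.ROOT); // locale-independent lower-casing
        }
        if (stripWhitespace) {
            result = result.replaceAll("\\s+", ""); // collapse to a single search term
        }
        return result;
    }
}
```

Producing a single whitespace-free term matters because the same value is used both when indexing and when querying, so both sides must be normalised identically for a distance-based match to work.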
-