Package org.dspace.content.service
Interface DuplicateDetectionService
- All Known Implementing Classes:
DuplicateDetectionServiceImpl
public interface DuplicateDetectionService
Duplicate Detection Service handles get, search and validation operations for duplicate detection.
- Author:
- Kim Shepherd
Field Summary
Fields:
- static final org.apache.logging.log4j.Logger log: Logger
Method Summary
Modifier and Type, Method, Description:
- String buildComparisonValue(Context context, Item item): Build a comparison value string made up of values of configured fields, used when indexing and querying items for deduplication.
- List<PotentialDuplicate> getPotentialDuplicates(Context context, Item item): Get a list of PotentialDuplicate objects (wrappers with some metadata included for previewing) that are identified as potential duplicates of the given item.
- DiscoverResult searchDuplicates(Context context, Item item): Search discovery for potential duplicates of a given item.
- Optional<PotentialDuplicate> validateDuplicateResult(Context context, IndexableObject indexableObject, Item original): Validate an indexable object (returned by discovery search) to ensure it is permissible, readable and valid and can be added to a list of results.
-
Field Details
-
log
static final org.apache.logging.log4j.Logger log
Logger
-
-
Method Details
-
getPotentialDuplicates
List<PotentialDuplicate> getPotentialDuplicates(Context context, Item item) throws SearchServiceException
Get a list of PotentialDuplicate objects (wrappers with some metadata included for previewing) that are identified as potential duplicates of the given item.
- Parameters:
context - DSpace context
item - Item to check
- Returns:
- List of potential duplicates (empty if none found)
- Throws:
SearchServiceException - if an error occurs performing the discovery search
-
validateDuplicateResult
Optional<PotentialDuplicate> validateDuplicateResult(Context context, IndexableObject indexableObject, Item original) throws SQLException, AuthorizeException
Validate an indexable object (returned by discovery search) to ensure it is permissible, readable and valid and can be added to a list of results. An Optional is returned; if it is empty, the object was invalid or did not pass validation.
- Parameters:
context - The DSpace context
indexableObject - The discovery search result
original - The original item (to compare IDs, submitters, etc.)
- Returns:
- An Optional potential duplicate
- Throws:
SQLException
AuthorizeException
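The empty-Optional convention described above lends itself to a simple collect-if-present loop. The following is a minimal, self-contained sketch of that pattern only; `Hit`, `validate` and `collectValid` are hypothetical stand-ins invented for illustration, not DSpace classes or methods, and the checks shown (readability, not being the original item) are assumed examples of what validation might reject.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class OptionalValidationSketch {

    // Hypothetical stand-in for a discovery search hit (not a DSpace type).
    static class Hit {
        final String id;
        final boolean readable;

        Hit(String id, boolean readable) {
            this.id = id;
            this.readable = readable;
        }
    }

    // Assumed rules: reject null hits, unreadable hits and the original item itself.
    // An empty Optional means "do not add this hit to the results".
    static Optional<String> validate(Hit hit, String originalId) {
        if (hit == null || !hit.readable || hit.id.equals(originalId)) {
            return Optional.empty();
        }
        return Optional.of("duplicate:" + hit.id);
    }

    // Keep only the hits that pass validation, preserving search order.
    static List<String> collectValid(List<Hit> hits, String originalId) {
        List<String> results = new ArrayList<>();
        for (Hit hit : hits) {
            validate(hit, originalId).ifPresent(results::add);
        }
        return results;
    }
}
```

Returning an Optional rather than null lets the caller skip rejected results without a separate null check per hit.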
-
searchDuplicates
DiscoverResult searchDuplicates(Context context, Item item) throws SearchServiceException
Search discovery for potential duplicates of a given item. The search uses Levenshtein distance (configurable) and a single-term "comparison value" constructed from the item title.
- Parameters:
context - DSpace context
item - The item to check
- Returns:
- DiscoverResult as a result of performing the search. Null if invalid.
- Throws:
SearchServiceException - if an error was encountered during the discovery search itself.
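For readers unfamiliar with the metric, Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below illustrates the metric on its own; it is not the service's code (the matching is performed by the discovery search), and `LevenshteinSketch`, `distance` and `withinDistance` are hypothetical names.

```java
class LevenshteinSketch {

    // Classic dynamic-programming edit distance using two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when two comparison values are within the given maximum distance.
    static boolean withinDistance(String a, String b, int max) {
        return distance(a, b) <= max;
    }
}
```

With a maximum distance of 1, for example, two comparison values that differ by a single character would be treated as a match, while values differing by two or more edits would not.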
-
buildComparisonValue
String buildComparisonValue(Context context, Item item)
Build a comparison value string made up of values of configured fields, used when indexing and querying items for deduplication.
- Parameters:
context - DSpace context
item - The DSpace item
- Returns:
- a constructed, normalised string
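As a rough illustration of what "a constructed, normalised string" could look like, the sketch below concatenates field values, lower-cases them and strips whitespace so that the result is a single term. These normalisation rules are assumptions made for the example; the actual rules are defined by the implementation and its configuration. `ComparisonValueSketch` and `build` are hypothetical names.

```java
import java.util.List;
import java.util.Locale;

class ComparisonValueSketch {

    // Assumed normalisation: concatenate non-null values, lower-case, remove all whitespace.
    static String build(List<String> fieldValues) {
        StringBuilder sb = new StringBuilder();
        for (String value : fieldValues) {
            if (value != null) {
                sb.append(value);
            }
        }
        // Locale.ROOT keeps lower-casing independent of the JVM's default locale.
        return sb.toString().toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
    }
}
```

Applying the same normalisation at index time and at query time is what allows the two values to be compared directly.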
-