Class ReIdentificationRisk


  • public class ReIdentificationRisk
    extends java.lang.Object
    • Constructor Summary

      Constructors 
      Constructor Description
      ReIdentificationRisk​(java.util.Map<java.lang.String,​java.lang.Double> measures, AttackerSuccess attackerSuccessRate, java.util.List<java.lang.String> quasiIdentifiers, java.lang.String populationModel)  
    • Method Summary

      Modifier and Type Method Description
      private static double averageProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
      Returns a double that shows the average prosecutor re-identification risk found in the data set, based on the population model that is defined.
      static ReIdentificationRisk create​(org.deidentifier.arx.DataHandle data, org.deidentifier.arx.ARXPopulationModel pModel)  
      boolean equals​(java.lang.Object o)  
      private static double estimatedJournalistRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
      Returns a double that shows the estimated journalist re-identification risk found in the data set, based on the population model that is defined.
      private static double estimatedMarketerRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
      Returns a double that shows the estimated marketer re-identification risk found in the data set, based on the population model that is defined.
      private static double estimatedProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
      Returns a double that shows the estimated prosecutor re-identification risk found in the data set, based on the population model that is defined.
      AttackerSuccess getAttackerSuccessRate()  
      java.util.Map<java.lang.String,​java.lang.Double> getMeasures()  
      java.lang.String getPopulationModel()  
      java.util.List<java.lang.String> getQuasiIdentifiers()  
      int hashCode()  
      private static double highestJournalistRisk​(org.deidentifier.arx.risk.RiskModelSampleSummary riskModelSampleSummary)
      Returns a double that shows the highest journalist re-identification risk found in the data set, based on the population model that is defined.
      private static double highestProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
      Returns a double that shows the highest prosecutor re-identification risk found in the data set, based on the population model that is defined.
      private static double lowestProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
      Returns a double that shows the lowest prosecutor re-identification risk found in the data set, based on the population model that is defined.
      private static org.deidentifier.arx.risk.RiskModelPopulationUniqueness.PopulationUniquenessModel populationUniquenessModel​(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
      Returns the name of the method used to estimate population uniqueness, which assumes that the data set is a uniform sample of the population.
      private static double populationUniques​(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
      Returns a double that shows the amount of unique records/fields in the data set that are also unique within the underlying population model of which the data is a part.
      private static java.util.List<java.lang.String> quasiIdentifiers​(org.deidentifier.arx.DataHandle data)
      Returns a list of strings containing the names of the fields in the data set whose attribute type is quasi-identifying.
      private static double recordsAffectByRisk​(org.deidentifier.arx.risk.RiskModelSampleRiskDistribution sampleRiskDistribution, double risk)
      Returns a double that shows the amount of records/fields that are affected by a specific amount of risk.
      private static java.util.Map<java.lang.String,​java.lang.Double> riskMeasures​(org.deidentifier.arx.DataHandle data, org.deidentifier.arx.ARXPopulationModel pModel)  
      private static double sampleUniques​(org.deidentifier.arx.risk.RiskModelSampleUniqueness riskModelSampleUniqueness)
      Returns a double that shows the amount of unique records/fields in the data set.
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Field Detail

      • measures

        private final java.util.Map<java.lang.String,​java.lang.Double> measures
      • quasiIdentifiers

        private final java.util.List<java.lang.String> quasiIdentifiers
      • populationModel

        private final java.lang.String populationModel
    • Constructor Detail

      • ReIdentificationRisk

        public ReIdentificationRisk​(java.util.Map<java.lang.String,​java.lang.Double> measures,
                                    AttackerSuccess attackerSuccessRate,
                                    java.util.List<java.lang.String> quasiIdentifiers,
                                    java.lang.String populationModel)
    • Method Detail

      • create

        public static ReIdentificationRisk create​(org.deidentifier.arx.DataHandle data,
                                                  org.deidentifier.arx.ARXPopulationModel pModel)
      • riskMeasures

        private static java.util.Map<java.lang.String,​java.lang.Double> riskMeasures​(org.deidentifier.arx.DataHandle data,
                                                                                           org.deidentifier.arx.ARXPopulationModel pModel)
      • lowestProsecutorRisk

        private static double lowestProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
        Returns a double that shows the lowest prosecutor re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleRisks - SampleRisks for the dataset
        Returns:
        lowest risk found in the data set
      • recordsAffectByRisk

        private static double recordsAffectByRisk​(org.deidentifier.arx.risk.RiskModelSampleRiskDistribution sampleRiskDistribution,
                                                  double risk)
        Returns a double that shows the amount of records/fields that are affected by a specific amount of risk.
        Parameters:
        sampleRiskDistribution - RiskModelSampleRiskDistribution for the dataset
        risk - specific amount of risk that affects one or more records
        Returns:
        amount of records/fields affected by a specific amount of risk
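        The idea behind recordsAffectByRisk can be sketched with plain Java. This is a minimal illustration, not the ARX implementation: it assumes the data set is summarised as a list of equivalence class sizes, that a record in a class of size k carries a risk of 1/k, and that the result is a fraction between 0.0 and 1.0. The class RiskDistributionSketch and its method are hypothetical names.

```java
import java.util.List;

/** Hypothetical stand-in; this is not the ARX RiskModelSampleRiskDistribution API. */
final class RiskDistributionSketch {

    /**
     * Returns the fraction of records (0.0 to 1.0) that sit in an equivalence
     * class whose risk 1/size matches the given risk.
     * Assumption: every class size is a positive integer.
     */
    static double fractionOfRecordsAtRisk(List<Integer> classSizes, double risk) {
        long total = 0;
        long affected = 0;
        for (int size : classSizes) {
            total += size;
            // Compare with a small tolerance because 1/size is a floating-point value.
            if (Math.abs(1.0 / size - risk) < 1e-9) {
                affected += size;
            }
        }
        return total == 0 ? 0.0 : (double) affected / total;
    }

    public static void main(String[] args) {
        // Classes of size 1, 1, 2 and 4: two of the eight records have risk 1.0.
        System.out.println(fractionOfRecordsAtRisk(List.of(1, 1, 2, 4), 1.0)); // prints 0.25
    }
}
```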
      • averageProsecutorRisk

        private static double averageProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
        Returns a double that shows the average prosecutor re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleRisks - SampleRisks for the dataset
        Returns:
        average risk found in the data set
      • highestProsecutorRisk

        private static double highestProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
        Returns a double that shows the highest prosecutor re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleRisks - SampleRisks for the dataset
        Returns:
        highest prosecutor risk found in the data set
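        To make the lowest, average, and highest prosecutor risk more concrete, here is a minimal sketch under the usual prosecutor-model assumption that a record in an equivalence class of size k can be re-identified with probability 1/k. It does not call ARX; ProsecutorRiskSketch and its methods are hypothetical names, and the input is assumed to be a non-empty list of positive class sizes.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration; the real values come from RiskModelSampleRisks. */
final class ProsecutorRiskSketch {

    /** Lowest risk: a record in the largest equivalence class. */
    static double lowestRisk(List<Integer> classSizes) {
        return 1.0 / Collections.max(classSizes);
    }

    /** Highest risk: a record in the smallest equivalence class. */
    static double highestRisk(List<Integer> classSizes) {
        return 1.0 / Collections.min(classSizes);
    }

    /**
     * Average risk over all records. Each class of size k contributes
     * k records with risk 1/k, so the sum of risks equals the number of
     * classes and the average is classes / records.
     */
    static double averageRisk(List<Integer> classSizes) {
        long records = 0;
        for (int size : classSizes) {
            records += size;
        }
        return (double) classSizes.size() / records;
    }

    public static void main(String[] args) {
        List<Integer> sizes = List.of(1, 1, 2, 4); // 4 classes, 8 records
        System.out.println(lowestRisk(sizes));  // prints 0.25
        System.out.println(averageRisk(sizes)); // prints 0.5
        System.out.println(highestRisk(sizes)); // prints 1.0
    }
}
```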
      • estimatedProsecutorRisk

        private static double estimatedProsecutorRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
        Returns a double that shows the estimated prosecutor re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleRisks - SampleRisks for the dataset
        Returns:
        estimated prosecutor risk found in the data set
      • highestJournalistRisk

        private static double highestJournalistRisk​(org.deidentifier.arx.risk.RiskModelSampleSummary riskModelSampleSummary)
        Returns a double that shows the highest journalist re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleSummary - containing summary of the dataset risks
        Returns:
        highest journalist risk found in the data set
      • estimatedJournalistRisk

        private static double estimatedJournalistRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
        Returns a double that shows the estimated journalist re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleRisks - SampleRisks for the dataset
        Returns:
        estimated journalist risk found in the data set
      • estimatedMarketerRisk

        private static double estimatedMarketerRisk​(org.deidentifier.arx.risk.RiskModelSampleRisks riskModelSampleRisks)
        Returns a double that shows the estimated marketer re-identification risk found in the data set, based on the population model that is defined.
        Parameters:
        riskModelSampleRisks - SampleRisks for the dataset
        Returns:
        estimated marketer risk found in the data set
      • sampleUniques

        private static double sampleUniques​(org.deidentifier.arx.risk.RiskModelSampleUniqueness riskModelSampleUniqueness)
        Returns a double that shows the amount of unique records/fields in the data set.
        Parameters:
        riskModelSampleUniqueness - RiskModelSampleUniqueness for the dataset
        Returns:
        amount of unique records/fields found in the data set
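        A rough sketch of what a sample-uniqueness measure captures, assuming it is expressed as the fraction of records that sit alone in their equivalence class. Whether the documented method returns a fraction or a count is not stated here, so treat that as an assumption; SampleUniquenessSketch is a hypothetical name and no ARX call is used.

```java
import java.util.List;

/** Hypothetical illustration; not the ARX RiskModelSampleUniqueness API. */
final class SampleUniquenessSketch {

    /**
     * Returns the fraction of records (0.0 to 1.0) that are unique in the
     * sample, i.e. that belong to an equivalence class of size 1.
     */
    static double fractionOfUniqueRecords(List<Integer> classSizes) {
        long total = 0;
        long unique = 0;
        for (int size : classSizes) {
            total += size;
            if (size == 1) {
                unique++;
            }
        }
        return total == 0 ? 0.0 : (double) unique / total;
    }

    public static void main(String[] args) {
        // Two singleton classes among eight records.
        System.out.println(fractionOfUniqueRecords(List.of(1, 1, 2, 4))); // prints 0.25
    }
}
```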
      • populationUniques

        private static double populationUniques​(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
        Returns a double that shows the amount of unique records/fields in the data set that are also unique within the underlying population model of which the data is a part.
        Parameters:
        builder - RiskEstimateBuilder for the dataset
        Returns:
        amount of unique records/fields found in the data set which are also unique in the population model
      • populationUniquenessModel

        private static org.deidentifier.arx.risk.RiskModelPopulationUniqueness.PopulationUniquenessModel populationUniquenessModel​(org.deidentifier.arx.risk.RiskEstimateBuilder builder)
        Returns the name of the method used to estimate population uniqueness, which assumes that the data set is a uniform sample of the population.
        Parameters:
        builder - RiskEstimateBuilder for the dataset
        Returns:
        PopulationUniquenessModel for the dataset
      • quasiIdentifiers

        private static java.util.List<java.lang.String> quasiIdentifiers​(org.deidentifier.arx.DataHandle data)
        Returns a list of strings containing the names of the fields in the data set whose attribute type is quasi-identifying.
        Parameters:
        data - tabular data set to be analysed against re-identification risk
        Returns:
        list of strings containing the quasi-identifying fields
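        The filtering step described for quasiIdentifiers might look roughly like the following. This is a sketch only: the Kind enum and the map of field names to attribute kinds are hypothetical stand-ins for the attribute types held by a DataHandle, and nothing here is taken from the actual implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of selecting quasi-identifying field names. */
final class QuasiIdentifierSketch {

    /** Hypothetical attribute kinds; not the ARX AttributeType class. */
    enum Kind { IDENTIFYING, QUASI_IDENTIFYING, SENSITIVE, INSENSITIVE }

    /** Returns the names of all fields marked as quasi-identifying, in map order. */
    static List<String> quasiIdentifiers(Map<String, Kind> attributeKinds) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Kind> entry : attributeKinds.entrySet()) {
            if (entry.getValue() == Kind.QUASI_IDENTIFYING) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, Kind> kinds = new LinkedHashMap<>();
        kinds.put("name", Kind.IDENTIFYING);
        kinds.put("age", Kind.QUASI_IDENTIFYING);
        kinds.put("zipcode", Kind.QUASI_IDENTIFYING);
        kinds.put("diagnosis", Kind.SENSITIVE);
        System.out.println(quasiIdentifiers(kinds)); // prints [age, zipcode]
    }
}
```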
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
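        Because the class overrides toString, equals, and hashCode, it presumably behaves as a value object. The sketch below shows one common way to implement such overrides over the three documented fields. RiskResultSketch is a hypothetical stand-in, the attackerSuccessRate field is left out because the AttackerSuccess type is not described here, and the measure key used in main is made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical value class mirroring the documented fields; not the real class. */
final class RiskResultSketch {
    private final Map<String, Double> measures;
    private final List<String> quasiIdentifiers;
    private final String populationModel;

    RiskResultSketch(Map<String, Double> measures,
                     List<String> quasiIdentifiers,
                     String populationModel) {
        // Defensive copies keep the object immutable after construction.
        this.measures = Map.copyOf(measures);
        this.quasiIdentifiers = List.copyOf(quasiIdentifiers);
        this.populationModel = populationModel;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RiskResultSketch)) return false;
        RiskResultSketch other = (RiskResultSketch) o;
        return measures.equals(other.measures)
                && quasiIdentifiers.equals(other.quasiIdentifiers)
                && Objects.equals(populationModel, other.populationModel);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, so equal objects share a hash code.
        return Objects.hash(measures, quasiIdentifiers, populationModel);
    }

    @Override
    public String toString() {
        return "RiskResultSketch{measures=" + measures
                + ", quasiIdentifiers=" + quasiIdentifiers
                + ", populationModel='" + populationModel + "'}";
    }

    public static void main(String[] args) {
        // "highest_risk" is a made-up key used only for this example.
        RiskResultSketch a = new RiskResultSketch(Map.of("highest_risk", 1.0), List.of("age"), "USA");
        RiskResultSketch b = new RiskResultSketch(Map.of("highest_risk", 1.0), List.of("age"), "USA");
        System.out.println(a.equals(b));                  // prints true
        System.out.println(a.hashCode() == b.hashCode()); // prints true
    }
}
```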
      • getQuasiIdentifiers

        public java.util.List<java.lang.String> getQuasiIdentifiers()
      • getPopulationModel

        public java.lang.String getPopulationModel()
      • getMeasures

        public java.util.Map<java.lang.String,​java.lang.Double> getMeasures()