Class NormalizedModifiedPurity<V>

    • Constructor Detail

      • NormalizedModifiedPurity

        public NormalizedModifiedPurity()
        Construct a normalized modified purity calculator.
      • NormalizedModifiedPurity

        public NormalizedModifiedPurity(boolean normalized,
                                        boolean modified)
        Construct a purity calculator whose normalized and modified options can each be turned on or off.
        Parameters:
        normalized - whether normalized purity is enabled
        modified - whether modified purity is enabled
    • Method Detail

      • transform

        public static <V> Collection<Map<V,Double>> transform(Collection<Collection<V>> clusters)
        Transform a collection of clusters into a collection of weighted cluster elements.
        Type Parameters:
        V - the type of cluster elements
        Parameters:
        clusters - the collection of clusters
        Returns:
        a collection of weighted cluster elements
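        The transformation above can be illustrated with a minimal standalone sketch. It assumes that every element of a hard cluster receives the weight 1.0, which this page does not state; the class TransformSketch and the method toWeighted are hypothetical names and are not part of this API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TransformSketch {
    // Hypothetical stand-in for transform: each element of a hard cluster
    // is given the weight 1.0 (an assumption, not confirmed by the docs).
    static <V> Collection<Map<V, Double>> toWeighted(Collection<? extends Collection<V>> clusters) {
        List<Map<V, Double>> weighted = new ArrayList<>();
        for (Collection<V> cluster : clusters) {
            Map<V, Double> weights = new LinkedHashMap<>();
            for (V element : cluster) {
                weights.put(element, 1.0);
            }
            weighted.add(weights);
        }
        return weighted;
    }

    public static void main(String[] args) {
        Collection<Map<String, Double>> weighted =
                toWeighted(List.of(List.of("a", "b"), List.of("c")));
        System.out.println(weighted); // prints [{a=1.0, b=1.0}, {c=1.0}]
    }
}
```

        A LinkedHashMap is used here only to keep the element order stable for printing.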
      • normalize

        public static <V> Collection<Map<V,Double>> normalize(Collection<Map<V,Double>> clusters)
        Normalize weights of the cluster elements to allow using normalized (modified) purity.
        Type Parameters:
        V - the type of cluster elements
        Parameters:
        clusters - the collection of clusters
        Returns:
        a collection of weight-normalized clusters
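        One plausible reading of weight normalization is sketched below. It assumes that each weight is divided by the total weight of the same element across all clusters, so that the weights of an element sum to 1; this page does not confirm that rule. NormalizeSketch and normalizeWeights are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NormalizeSketch {
    // Hypothetical normalization: divide each weight by the element's total
    // weight over all clusters (an assumption about what normalize does).
    static <V> List<Map<V, Double>> normalizeWeights(Collection<Map<V, Double>> clusters) {
        // First pass: accumulate the total weight of every element.
        Map<V, Double> totals = new HashMap<>();
        for (Map<V, Double> cluster : clusters) {
            for (Map.Entry<V, Double> entry : cluster.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Double::sum);
            }
        }
        // Second pass: rescale every weight by its element total.
        List<Map<V, Double>> normalized = new ArrayList<>();
        for (Map<V, Double> cluster : clusters) {
            Map<V, Double> weights = new LinkedHashMap<>();
            for (Map.Entry<V, Double> entry : cluster.entrySet()) {
                weights.put(entry.getKey(), entry.getValue() / totals.get(entry.getKey()));
            }
            normalized.add(weights);
        }
        return normalized;
    }
}
```

        Under this assumption, an element that occurs in two clusters with weight 1.0 each ends up with weight 0.5 in both, while an element that occurs in only one cluster keeps weight 1.0.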
      • evaluate

        public static <V> PrecisionRecall evaluate(NormalizedModifiedPurity<V> precision,
                                                   NormalizedModifiedPurity<V> recall,
                                                   Collection<Map<V,Double>> clusters,
                                                   Collection<Map<V,Double>> classes)
        Compute precision and recall using purity and inverse purity, respectively.
        Type Parameters:
        V - the type of cluster elements
        Parameters:
        precision - the purity
        recall - the inverse purity
        clusters - the collection of the clusters to evaluate
        classes - the collection of the gold standard clusters
        Returns:
        precision and recall wrapped in an instance of PrecisionRecall
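        The relationship between purity and inverse purity can be sketched as the same scorer applied with its arguments swapped. The sketch below is standalone and does not use this class: EvaluateSketch, Scorer, precisionRecall and plainPurity are hypothetical names, the result is returned as a plain array rather than a PrecisionRecall, and the plain weighted purity used for illustration is an assumption.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

final class EvaluateSketch {
    // Hypothetical scorer type standing in for a purity calculator.
    interface Scorer<V>
            extends ToDoubleBiFunction<Collection<Map<V, Double>>, Collection<Map<V, Double>>> {
    }

    // Returns {precision, recall}: purity scores clusters against classes,
    // inverse purity scores classes against clusters.
    static <V> double[] precisionRecall(Scorer<V> precision, Scorer<V> recall,
                                        Collection<Map<V, Double>> clusters,
                                        Collection<Map<V, Double>> classes) {
        double p = precision.applyAsDouble(clusters, classes);
        double r = recall.applyAsDouble(classes, clusters);
        return new double[] {p, r};
    }

    // Illustrative plain purity: best overlap per cluster over total weight (an assumption).
    static <V> double plainPurity(Collection<Map<V, Double>> clusters,
                                  Collection<Map<V, Double>> classes) {
        double total = 0.0;
        double matched = 0.0;
        for (Map<V, Double> cluster : clusters) {
            double best = 0.0;
            for (Map<V, Double> klass : classes) {
                double overlap = 0.0;
                for (Map.Entry<V, Double> entry : cluster.entrySet()) {
                    if (klass.containsKey(entry.getKey())) {
                        overlap += entry.getValue();
                    }
                }
                best = Math.max(best, overlap);
            }
            matched += best;
            for (double weight : cluster.values()) {
                total += weight;
            }
        }
        return total == 0.0 ? 0.0 : matched / total;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> clusters = List.of(Map.of("a", 1.0, "b", 1.0, "c", 1.0));
        List<Map<String, Double>> classes = List.of(Map.of("a", 1.0, "b", 1.0), Map.of("c", 1.0));
        Scorer<String> purity = EvaluateSketch::plainPurity;
        double[] pr = precisionRecall(purity, purity, clusters, classes);
        System.out.println("precision=" + pr[0] + " recall=" + pr[1]);
    }
}
```

        In this example one large cluster covers two gold classes, so precision is 2/3 while recall is 1: every gold class is fully contained in some cluster.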
      • purity

        public double purity(Collection<Map<V,Double>> clusters,
                             Collection<Map<V,Double>> classes)
        Compute the (modified) purity of the given clusters according to the gold standard clustering, classes.
        Parameters:
        clusters - the collection of the clusters to evaluate
        classes - the collection of the gold standard clusters
        Returns:
        (modified) purity
        See Also:
        Kawahara et al. (ACL 2014)
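        A rough standalone sketch of the computation follows. It assumes that purity is the sum, over clusters, of the best overlap with any gold class, divided by the total weight of all cluster elements, and that the modified variant gives singleton clusters a score of zero; the exact denominator used by this class is not stated on this page. PuritySketch and its modified parameter are hypothetical.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;

final class PuritySketch {
    // Hypothetical purity: best-overlap sum divided by total cluster weight.
    // With modified == true, singleton clusters contribute nothing to the numerator.
    static <V> double purity(Collection<Map<V, Double>> clusters,
                             Collection<Map<V, Double>> classes,
                             boolean modified) {
        double total = 0.0;
        double matched = 0.0;
        for (Map<V, Double> cluster : clusters) {
            for (double weight : cluster.values()) {
                total += weight;
            }
            if (modified && cluster.size() <= 1) {
                continue; // singleton clusters are ignored in the modified variant
            }
            double best = 0.0;
            for (Map<V, Double> klass : classes) {
                double overlap = 0.0;
                for (Map.Entry<V, Double> entry : cluster.entrySet()) {
                    if (klass.containsKey(entry.getKey())) {
                        overlap += entry.getValue();
                    }
                }
                best = Math.max(best, overlap);
            }
            matched += best;
        }
        return total == 0.0 ? 0.0 : matched / total;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> clusters = List.of(Map.of("a", 1.0, "b", 1.0), Map.of("c", 1.0));
        List<Map<String, Double>> classes = List.of(Map.of("a", 1.0, "b", 1.0, "c", 1.0));
        System.out.println(purity(clusters, classes, false));
        System.out.println(purity(clusters, classes, true));
    }
}
```

        With these inputs the plain variant yields 1, whereas the modified variant yields 2/3 because the singleton cluster no longer earns credit. This is the motivation for the modification: trivially pure singletons cannot inflate the score.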
      • score

        public double score(Map<V,Double> cluster,
                            Collection<Map<V,Double>> classes)
        Compute the (modified) score of a cluster against the given collection of classes.
        Parameters:
        cluster - the cluster to evaluate
        classes - the collection of the gold standard clusters
        Returns:
        cluster score
      • delta

        public double delta(Map<V,Double> cluster,
                            Map<V,Double> klass)
        Compute the fuzzy overlap between two clusters, cluster and klass.

        In the case of modified purity, singleton clusters are ignored.

        Parameters:
        cluster - the first cluster
        klass - the second cluster
        Returns:
        cluster overlap measure
        See Also:
        Kawahara et al. (ACL 2014)
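        The fuzzy overlap can be sketched as follows, assuming it is the sum of the weights in cluster of the elements that also occur in klass, and that "ignored" means a singleton cluster scores zero in the modified case. Both points are assumptions rather than statements from this page; DeltaSketch, overlap and the modified parameter are hypothetical names.

```java
import java.util.Map;

final class DeltaSketch {
    // Hypothetical overlap: sum of the cluster-side weights of shared elements.
    // With modified == true, a singleton cluster yields 0.
    static <V> double overlap(Map<V, Double> cluster, Map<V, Double> klass, boolean modified) {
        if (modified && cluster.size() <= 1) {
            return 0.0;
        }
        double sum = 0.0;
        for (Map.Entry<V, Double> entry : cluster.entrySet()) {
            if (klass.containsKey(entry.getKey())) {
                sum += entry.getValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> cluster = Map.of("a", 0.5, "b", 1.0);
        Map<String, Double> klass = Map.of("b", 1.0, "c", 1.0);
        System.out.println(overlap(cluster, klass, false)); // prints 1.0
    }
}
```

        Note that the measure in this sketch is asymmetric: only the weights on the cluster side are summed, so swapping the arguments can change the result.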