Class MultiQuantileDigest
- java.lang.Object
  - org.pipecraft.infra.monitoring.quantile.MultiQuantileDigest
- All Implemented Interfaces:
ConcurrentQuantileEstimator
public class MultiQuantileDigest extends Object implements ConcurrentQuantileEstimator
Thread-safe, high-throughput quantile and cdf estimation.
Maintains multiple (digestCount) QuantileDigests and responds to quantile queries by merging the digests into a single digest and querying the merged (i.e. summarized) digest. This increases the throughput of the QuantileDigest by routing add operations to the different digests. A rule of thumb for setting digestCount:
no_of_write_threads < digestCount < 2 * no_of_write_threads
Most applications would be fine with MUCH smaller numbers than that. Set digestCount = 1 for single-threaded apps. With digestCount = 1, this should be equivalent to a QuantileDigest with some overhead.
When merging multiple digests, it is recommended that the individual digests have a higher compression factor than the final merged digest. This leads to a more accurate merged digest. The compression factor of each of the internal digests is:
compressionInflation * compression
The merged digest that responds to queries has a compression factor equal to compression. Empirical results show that a compression value of 100 performs well for most use cases. The average serialized size of each QuantileDigest with compression 100 is less than 1 KB.
- Author:
- Mojtaba Kohram
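The routing-and-merge idea in the class description can be sketched with JDK classes only. This is not the library's implementation: `StripedSampleStore` and all of its methods are hypothetical names, and each stripe keeps raw samples in a list instead of a compressed QuantileDigest, so the quantile it returns is exact rather than estimated. The routing policy (start at a random stripe, take the first free lock) is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in: stripes of raw samples instead of real digests. */
final class StripedSampleStore {
    private final List<List<Double>> stripes = new ArrayList<>();
    private final List<ReentrantLock> locks = new ArrayList<>();

    StripedSampleStore(int stripeCount) {
        if (stripeCount < 1) {
            throw new IllegalArgumentException("stripeCount must be >= 1");
        }
        for (int i = 0; i < stripeCount; i++) {
            stripes.add(new ArrayList<>());
            locks.add(new ReentrantLock());
        }
    }

    /** Routes the sample to the first uncontended stripe, starting at a random one. */
    void add(double x) {
        int n = stripes.size();
        int start = ThreadLocalRandom.current().nextInt(n);
        for (int i = 0; i < n; i++) {
            int idx = (start + i) % n;
            ReentrantLock lock = locks.get(idx);
            if (lock.tryLock()) {
                try {
                    stripes.get(idx).add(x);
                } finally {
                    lock.unlock();
                }
                return;
            }
        }
        // Every stripe was busy: block on the starting stripe.
        ReentrantLock lock = locks.get(start);
        lock.lock();
        try {
            stripes.get(start).add(x);
        } finally {
            lock.unlock();
        }
    }

    /** Merges all stripes into one sorted, independent snapshot. */
    List<Double> summarize() {
        List<Double> merged = new ArrayList<>();
        for (int i = 0; i < stripes.size(); i++) {
            ReentrantLock lock = locks.get(i);
            lock.lock();
            try {
                merged.addAll(stripes.get(i));
            } finally {
                lock.unlock();
            }
        }
        Collections.sort(merged);
        return merged;
    }

    /** Exact nearest-rank quantile over the merged snapshot; q must be in [0, 1]. */
    double quantile(double q) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        List<Double> merged = summarize();
        if (merged.isEmpty()) {
            return Double.NaN;
        }
        int rank = (int) Math.ceil(q * merged.size());
        return merged.get(Math.max(rank, 1) - 1);
    }
}
```

With more stripes than writer threads, a writer rarely finds every lock taken, which is the intuition behind the rule of thumb above. The cost moves to the query side, where every stripe has to be visited and merged.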
-
-
Constructor Summary
Constructors
Constructor - Description
MultiQuantileDigest(int digestCount) - Constructor with default values
MultiQuantileDigest(int digestCount, double compression) - Constructor with default values
MultiQuantileDigest(int digestCount, double compression, double compressionInflationMultiplier) - Fully specified constructor
-
Method Summary
Modifier and Type, Method - Description
void add(double x, int w) - Add a weighted sample.
double cdf(double x) - Get an estimate for the cdf of the distribution at x.
List<Double> cdf(List<Double> coords) - Get an estimate for the cdf of the distribution at every coordinate of input list.
double getCompression() - The compression factor
double getCompressionInflation() - The compression factor of each internal digest is equal to compressionInflation * compression
double quantile(double q) - Get an estimate of the quantile at q.
List<Double> quantile(List<Double> qs) - Get an estimated quantile for every value in the input list.
QuantileDigest reset() - Reset the digest.
long size() - Returns the number of samples added to the current Estimator.
QuantileDigest summarize() - Merge internal digests into a single QuantileDigest.
boolean tryAdd(double x, int w) - Attempts to add a weighted sample to this estimator.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.pipecraft.infra.monitoring.quantile.ConcurrentQuantileEstimator
add, tryAdd
-
Constructor Detail
-
MultiQuantileDigest
public MultiQuantileDigest(int digestCount)
Constructor with default values
- Parameters:
digestCount - number of digests
-
MultiQuantileDigest
public MultiQuantileDigest(int digestCount, double compression)
Constructor with default values
- Parameters:
digestCount - number of digests to keep
compression - compression factor of each digest
-
MultiQuantileDigest
public MultiQuantileDigest(int digestCount, double compression, double compressionInflationMultiplier)
Fully specified constructor
- Parameters:
digestCount - number of digests to keep
compression - compression factor of final digest
compressionInflationMultiplier - compression factor multiplier
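How the three constructor parameters relate can be checked with a little arithmetic. The helper below is a hypothetical sketch, not part of the library: `DigestSizing` and its methods are invented names, and the multiplier value 2.0 used in the usage note is purely illustrative, since the default values are not stated on this page.

```java
/** Hypothetical helper that restates the sizing rules from the class description. */
final class DigestSizing {
    private DigestSizing() {}

    /** Compression of each internal digest: compressionInflation * compression. */
    static double internalCompression(double compression, double compressionInflation) {
        return compressionInflation * compression;
    }

    /**
     * Checks no_of_write_threads < digestCount < 2 * no_of_write_threads.
     * For a single writer thread the description recommends digestCount = 1.
     */
    static boolean withinRuleOfThumb(int digestCount, int writeThreads) {
        if (writeThreads <= 1) {
            return digestCount == 1;
        }
        return digestCount > writeThreads && digestCount < 2 * writeThreads;
    }
}
```

For example, with compression 100 and an assumed multiplier of 2.0, `internalCompression(100.0, 2.0)` gives 200.0 for each internal digest while the merged digest stays at 100. With 4 writer threads, digest counts of 5 to 7 fall inside the rule of thumb, which the description treats as generous rather than required.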
-
-
Method Detail
-
getCompressionInflation
public double getCompressionInflation()
The compression factor of each internal digest is equal to compressionInflation * compression
- Returns:
- the compression inflation factor
-
getCompression
public double getCompression()
The compression factor
- Returns:
- the compression factor
-
reset
public QuantileDigest reset()
Reset the digest. Returns a QuantileDigest representing the final state of this object prior to reset.
- Returns:
- a digest representing the final state of this object prior to reset
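A method that resets and returns the prior state in one call suits interval reporting: each reporting tick takes the finished window and leaves an empty one behind. The sketch below shows that swap pattern with JDK classes only. `ResettableWindow` is a hypothetical stand-in that stores raw samples, and it makes no claim about how the library implements reset internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in illustrating a reset that returns the prior state. */
final class ResettableWindow {
    private final Object lock = new Object();
    private List<Double> current = new ArrayList<>();

    void add(double x) {
        synchronized (lock) {
            current.add(x);
        }
    }

    /** Swaps in an empty window and returns the samples collected so far. */
    List<Double> reset() {
        synchronized (lock) {
            List<Double> finished = current;
            current = new ArrayList<>();
            return finished;
        }
    }

    /** Number of samples in the current (not yet reset) window. */
    long size() {
        synchronized (lock) {
            return current.size();
        }
    }
}
```

Because the swap happens under the same lock as add, no sample is lost or counted twice between two consecutive windows in this sketch.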
-
summarize
public QuantileDigest summarize()
Merge internal digests into a single QuantileDigest. The returned QuantileDigest is independent of this object and the caller is free to modify it.
- Returns:
- the merged digest
-
quantile
public double quantile(double q)
Get an estimate of the quantile at q. This function could be expensive depending on implementation. Consider using ConcurrentQuantileEstimator.quantile(List) when querying more than one quantile value.
- Specified by:
quantile in interface ConcurrentQuantileEstimator
- Parameters:
q - quantile to query, must be between 0 and 1
- Returns:
- the estimated quantile value at q
-
quantile
public List<Double> quantile(List<Double> qs)
Get an estimated quantile for every value in the input list.
- Specified by:
quantile in interface ConcurrentQuantileEstimator
- Parameters:
qs - list of quantiles to query, must all be between 0 and 1
- Returns:
- the estimated quantile value for every element in qs, in order
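The advice to prefer the list overload follows from where the cost sits: the expensive step is preparing one consolidated view of the data, after which each extra quantile is cheap. The sketch below illustrates that with an exact nearest-rank computation over a sorted copy. `BatchQuantiles` is a hypothetical name, and sorting raw samples is only a stand-in for merging digests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration: pay the preparation cost once, answer many queries. */
final class BatchQuantiles {
    private BatchQuantiles() {}

    /** Answers every q in qs, in order, from one sorted copy of the samples. */
    static List<Double> quantiles(double[] samples, List<Double> qs) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted); // the one-time, expensive step
        List<Double> out = new ArrayList<>(qs.size());
        for (double q : qs) {
            if (q < 0.0 || q > 1.0) {
                throw new IllegalArgumentException("q must be in [0, 1]");
            }
            // Nearest-rank: smallest value with at least a fraction q of samples at or below it.
            int rank = (int) Math.ceil(q * sorted.length);
            out.add(sorted[Math.max(rank, 1) - 1]);
        }
        return out;
    }
}
```

Calling a single-value query three times would repeat the preparation step three times, while one call with a three-element list repeats only the cheap lookup.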
-
size
public long size()
Returns the number of samples added to the current Estimator.
- Specified by:
size in interface ConcurrentQuantileEstimator
- Returns:
- the number of samples currently added
-
cdf
public double cdf(double x)
Get an estimate for the cdf of the distribution at x. This function could be expensive depending on implementation. Consider using ConcurrentQuantileEstimator.cdf(List) when querying more than one cdf value.
- Specified by:
cdf in interface ConcurrentQuantileEstimator
- Parameters:
x - the value to get cdf at
- Returns:
- the estimated cumulative distribution function value at x, always between 0 and 1
-
cdf
public List<Double> cdf(List<Double> coords)
Get an estimate for the cdf of the distribution at every coordinate of input list.
- Specified by:
cdf in interface ConcurrentQuantileEstimator
- Parameters:
coords - the list of coordinates to compute the cdf at
- Returns:
- the estimated cumulative distribution function value for every element in coords, in order; results are always between 0 and 1
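For intuition about what the returned values mean, the exact (empirical) cdf at x is the fraction of samples less than or equal to x, which is why every result lies between 0 and 1. The sketch below computes that exact value for a list of coordinates. `EmpiricalCdf` is a hypothetical name, and a digest would return an approximation of these numbers rather than the exact fractions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the exact cdf that a digest approximates. */
final class EmpiricalCdf {
    private EmpiricalCdf() {}

    /** Returns, for each coordinate in order, the fraction of samples <= that coordinate. */
    static List<Double> cdf(double[] samples, List<Double> coords) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        List<Double> out = new ArrayList<>(coords.size());
        for (double x : coords) {
            out.add((double) countAtMost(sorted, x) / sorted.length);
        }
        return out;
    }

    /** Binary search for the number of elements <= x in a sorted array. */
    private static int countAtMost(double[] sorted, double x) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= x) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

For the samples 1, 2, 3, 4, the coordinates 0.0, 2.5 and 4.0 map to 0.0, 0.5 and 1.0 respectively.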
-
tryAdd
public boolean tryAdd(double x, int w)
Attempts to add a weighted sample to this estimator. Returns false if a lock is held by another thread.
- Specified by:
tryAdd in interface ConcurrentQuantileEstimator
- Parameters:
x - data to add
w - weight
- Returns:
- false if this object's lock is held by another thread, true otherwise
-
add
public void add(double x, int w)
Add a weighted sample. This is a blocking call; for non-blocking adds see tryAdd(double, int).
- Specified by:
add in interface ConcurrentQuantileEstimator
- Parameters:
x - data to add
w - weight
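The difference between the blocking and non-blocking calls can be illustrated with a plain `ReentrantLock`. The class below is a hypothetical stand-in with the same add/tryAdd shape. It only accumulates total weight, and it assumes a single lock, which may differ from how the library coordinates its internal digests. `addPreferNonBlocking` is an invented convenience method showing one way a caller might combine the two.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in: tracks total weight behind one lock. */
final class LockedWeightCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long totalWeight;

    /** Non-blocking: returns false at once if another thread holds the lock. */
    boolean tryAdd(double x, int w) {
        if (!lock.tryLock()) {
            return false;
        }
        try {
            totalWeight += w; // x is ignored in this stand-in
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Blocking: waits until the lock becomes available. */
    void add(double x, int w) {
        lock.lock();
        try {
            totalWeight += w;
        } finally {
            lock.unlock();
        }
    }

    /** Tries the non-blocking path first and falls back to the blocking call. */
    void addPreferNonBlocking(double x, int w) {
        if (!tryAdd(x, w)) {
            add(x, w);
        }
    }

    long totalWeight() {
        lock.lock();
        try {
            return totalWeight;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller on a latency-sensitive path might instead drop or locally buffer the sample when tryAdd returns false. Which choice is right depends on whether losing or delaying a sample is acceptable for the metric being recorded.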
-
-