org.apache.spark.sql.catalyst.util
HyperLogLogPlusPlusHelper
Companion object HyperLogLogPlusPlusHelper
class HyperLogLogPlusPlusHelper extends Serializable
Inheritance: HyperLogLogPlusPlusHelper → Serializable → AnyRef → Any
Instance Constructors
- new HyperLogLogPlusPlusHelper(relativeSD: Double)
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def estimateBias(e: Double): Double
  Estimate the bias using the raw estimates with their respective biases from the HLL++ appendix. We currently use KNN interpolation to determine the bias (as suggested in the paper).
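A minimal sketch of what k-nearest-neighbor bias interpolation can look like. The `rawEstimates`/`biases` arrays below are illustrative stand-ins, not the appendix tables, and `k = 6` follows the paper's suggestion; none of this is the Spark source.

```scala
// Illustrative sketch of KNN bias interpolation (NOT the Spark source).
// rawEstimates/biases stand in for the HLL++ appendix tables.
object BiasSketch {
  def estimateBias(e: Double, rawEstimates: Array[Double],
                   biases: Array[Double], k: Int = 6): Double = {
    // Take the k raw estimates closest to e and average their biases.
    val nearest = rawEstimates.zip(biases)
      .sortBy { case (raw, _) => math.abs(raw - e) }
      .take(k)
    nearest.map(_._2).sum / nearest.length
  }
}
```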
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- def merge(buffer1: InternalRow, buffer2: InternalRow, offset1: Int, offset2: Int): Unit
  Merge two HLL buffers by iterating through the registers in both buffers and selecting the maximum number of leading zeros for each register.
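The register-wise maximum can be sketched at the word level. This is an illustrative sketch, not the Spark source; it assumes the layout described under `numWords` below: ten 6-bit registers packed into each `Long`, using 60 of its 64 bits.

```scala
// Illustrative sketch of the register-wise merge (NOT the Spark source).
// Each Long word packs ten 6-bit registers; merging extracts each
// register from both words, keeps the maximum, and re-packs it.
object MergeSketch {
  val RegisterSize = 6
  val Mask: Long = (1L << RegisterSize) - 1

  def mergeWords(w1: Long, w2: Long): Long = {
    var result = 0L
    var shift = 0
    while (shift < 60) {               // only 60 of the 64 bits hold registers
      val r = math.max((w1 >>> shift) & Mask, (w2 >>> shift) & Mask)
      result |= r << shift
      shift += RegisterSize
    }
    result
  }
}
```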
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- val numWords: Int
  The number of words used to store the registers. We use Longs for storage because this is the most compact form of storage; Spark aligns to 8-byte words or uses Long wrappers. We only store whole registers per word in order to prevent overly complex bitwise operations. In practice this means we only use 60 out of 64 bits.
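One way this value could be derived from the constructor's `relativeSD` is sketched below. The precision formula is an assumption based on the HLL++ paper's rsd ≈ 1.04 / sqrt(2^p); it is not the Spark source.

```scala
// Illustrative sketch (NOT the Spark source): deriving numWords from the
// requested relative standard deviation.
object NumWordsSketch {
  val RegisterSize = 6                       // bits per HLL register
  val RegistersPerWord = 64 / RegisterSize   // 10 whole registers per Long

  def numWords(relativeSD: Double): Int = {
    // Smallest precision p whose guaranteed rsd meets the request
    // (assumption based on rsd ≈ 1.04 / sqrt(2^p)).
    val p = math.ceil(2.0 * math.log(1.106 / relativeSD) / math.log(2.0)).toInt
    val m = 1 << p                           // number of registers
    (m + RegistersPerWord - 1) / RegistersPerWord
  }
}
```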
- def query(buffer: InternalRow, bufferOffset: Int): Long
  Compute the HyperLogLog estimate. Variable names in the HLL++ paper match variable names in the code.
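The raw estimate at the core of the query can be sketched with the paper's names. This illustrative sketch omits the HLL++ bias and small-range corrections and is not the Spark source.

```scala
// Illustrative sketch of the raw HyperLogLog estimate (NOT the Spark
// source; no HLL++ bias/small-range corrections). Names follow the
// paper: m registers M(j), constant alpha_m, harmonic mean.
object QuerySketch {
  def rawEstimate(registers: Array[Int]): Double = {
    val m = registers.length
    val alphaM = 0.7213 / (1.0 + 1.079 / m)   // asymptotic constant
    val z = registers.map(r => math.pow(2.0, -r)).sum
    alphaM * m * m / z                        // alpha_m * m^2 / sum 2^-M(j)
  }
}
```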
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: AnyRef → Any
- def trueRsd: Double
  The rsd of HLL++ is always equal to or better than the rsd requested. This method returns the rsd this instance actually guarantees.
  - returns: the actual rsd.
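The guarantee can be illustrated with the paper's formula. This is a sketch under the assumption that the guaranteed rsd is 1.04 / sqrt(m) for m = 2^p registers, not the Spark source.

```scala
// Illustrative sketch (assumption, NOT the Spark source): with m = 2^p
// registers the guaranteed relative standard deviation is 1.04 / sqrt(m),
// so the actual rsd is always <= the rsd requested.
object TrueRsdSketch {
  def trueRsd(p: Int): Double = 1.04 / math.sqrt((1L << p).toDouble)
}
```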
- def update(buffer: InternalRow, bufferOffset: Int, value: Any, dataType: DataType): Unit
  Update the HLL++ buffer. Variable names in the HLL++ paper match variable names in the code.
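The per-value update can be sketched with paper-style names, operating on an unpacked register array for clarity. This is an illustrative sketch, not the Spark source: a 64-bit hash x splits into an index idx (the top p bits) and the remaining bits w, and register M(idx) keeps the longest observed run of leading zeros in w, plus one.

```scala
// Illustrative sketch of the update step (NOT the Spark source).
object UpdateSketch {
  def updateRegister(registers: Array[Int], x: Long, p: Int): Unit = {
    val idx = (x >>> (64 - p)).toInt
    val w = x << p                       // remaining 64 - p bits, left-aligned
    val rho =
      if (w == 0L) (64 - p) + 1         // all remaining bits are zero
      else java.lang.Long.numberOfLeadingZeros(w) + 1
    registers(idx) = math.max(registers(idx), rho)
  }
}
```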
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()