org.apache.spark.sql.catalyst.expressions

RowBasedKeyValueBatch

abstract class RowBasedKeyValueBatch extends MemoryConsumer with Closeable

RowBasedKeyValueBatch stores key-value pairs in a contiguous memory region.

Each key or value is stored as a single UnsafeRow. Each record contains one key, one value, and some auxiliary data, which differs by implementation: FixedLengthRowBasedKeyValueBatch or VariableLengthRowBasedKeyValueBatch.

We use FixedLengthRowBasedKeyValueBatch if all fields in the key and the value are fixed-length data types. Otherwise we use VariableLengthRowBasedKeyValueBatch.
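The selection rule above can be sketched as a small toy model. This is only an illustration: FieldType, its four case objects, and BatchChoice.choose are hypothetical names invented here, not Spark's DataType hierarchy or its actual allocation code.

```scala
// Toy model: a tiny, hypothetical subset of field types (not Spark's DataType hierarchy).
sealed trait FieldType { def fixedLength: Boolean }
case object IntField extends FieldType { val fixedLength = true }
case object LongField extends FieldType { val fixedLength = true }
case object DoubleField extends FieldType { val fixedLength = true }
case object StringField extends FieldType { val fixedLength = false }

object BatchChoice {
  /** Name of the batch implementation the rule would select for the given schemas. */
  def choose(keySchema: Seq[FieldType], valueSchema: Seq[FieldType]): String =
    if ((keySchema ++ valueSchema).forall(_.fixedLength)) "FixedLengthRowBasedKeyValueBatch"
    else "VariableLengthRowBasedKeyValueBatch"
}
```

In this sketch a single variable-length field in either the key or the value schema is enough to select the variable-length implementation.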

RowBasedKeyValueBatch is backed by a single page / MemoryBlock (ranging from 1MB to 64MB depending on the system configuration). If the page is full, the aggregate logic should fall back to a second-level, larger hash map. We intentionally use the single-page design because it simplifies memory address encoding and decoding for each key-value pair. Because the maximum capacity of RowBasedKeyValueBatch is only 2^16 rows, it is unlikely we need a second page anyway: filling a 64MB page before reaching that limit requires the average key-value pair to be larger than 1024 bytes (64MB / 2^16).
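The 1024-byte figure follows from dividing the largest page size by the row capacity. A minimal sketch of that arithmetic, where PageMath and its members are hypothetical names used only for this illustration:

```scala
object PageMath {
  val MinPageBytes: Long = 1L * 1024 * 1024  // 1MB, the lower end of the page-size range
  val MaxPageBytes: Long = 64L * 1024 * 1024 // 64MB, the upper end of the page-size range
  val MaxRows: Int = 1 << 16                 // 2^16 = 65536 rows

  /** Average bytes per record at which a page of `pageBytes` fills exactly at `rows` records. */
  def breakEvenRecordSize(pageBytes: Long, rows: Int): Long = pageBytes / rows
}
```

With these numbers, breakEvenRecordSize(MaxPageBytes, MaxRows) is 1024, which matches the text. For a 1MB page the same division gives 16, so the threshold depends on the configured page size.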

Linear Supertypes
Closeable, AutoCloseable, MemoryConsumer, AnyRef, Any

Instance Constructors

  1. new RowBasedKeyValueBatch(keySchema: StructType, valueSchema: StructType, maxRows: Int, manager: TaskMemoryManager)
    Attributes
    protected[expressions]

Abstract Value Members

  1. abstract def appendRow(kbase: Any, koff: Long, klen: Int, vbase: Any, voff: Long, vlen: Int): UnsafeRow

    Appends a key-value pair by copying its data into the backing MemoryBlock. Returns an UnsafeRow pointing to the value on success; otherwise returns null.

  2. abstract def getKeyRow(rowId: Int): UnsafeRow

    Returns the key row in this batch at rowId. The returned key row is reused across calls.

  3. abstract def getValueFromKey(rowId: Int): UnsafeRow

    Returns the value row in two steps: 1) look up the key row with the same id (skipped if the key row is cached); 2) retrieve the value row by reusing the metadata from step 1. In most cases step 1 is skipped, because getKeyRow(id) is often called before getValueRow(id).

    Attributes
    protected[expressions]
  4. abstract def rowIterator(): KVIterator[UnsafeRow, UnsafeRow]

    Returns an iterator over all rows.
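The append and lookup behavior described for these members can be illustrated with a toy single-page batch. This sketch assumes a plain byte array as the page and byte arrays as rows. ToyBatch and all of its members are hypothetical and are not Spark's implementation. Unlike the real API, it returns Option[Int] rather than an UnsafeRow or null, and it copies bytes out instead of reusing a row object.

```scala
import java.util.Arrays
import scala.collection.mutable.ArrayBuffer

/** Toy single-page batch. All names here are hypothetical, not Spark's implementation. */
final class ToyBatch(pageBytes: Int, maxRows: Int) {
  private val page = new Array[Byte](pageBytes)
  private val index = ArrayBuffer.empty[(Int, Int, Int)] // (offset, keyLen, valueLen) per row
  private var used = 0
  private var cachedRowId = -1
  private var cached: (Int, Int, Int) = (0, 0, 0)
  var lookups = 0 // how many times step 1 (the key lookup) actually ran

  def numRows: Int = index.length

  /** Copies the pair into the page. Returns the new row id, or None when the batch is full. */
  def appendRow(key: Array[Byte], value: Array[Byte]): Option[Int] = {
    val needed = key.length + value.length
    if (numRows >= maxRows || used + needed > pageBytes) None
    else {
      System.arraycopy(key, 0, page, used, key.length)
      System.arraycopy(value, 0, page, used + key.length, value.length)
      index += ((used, key.length, value.length))
      used += needed
      Some(numRows - 1)
    }
  }

  // Step 1: find the record for rowId; skipped when that row is already cached.
  private def locate(rowId: Int): Unit =
    if (cachedRowId != rowId) {
      cached = index(rowId)
      cachedRowId = rowId
      lookups += 1
    }

  def getKeyRow(rowId: Int): Array[Byte] = {
    locate(rowId)
    val (off, klen, _) = cached
    Arrays.copyOfRange(page, off, off + klen)
  }

  // Step 2: reuse the cached record metadata to read the value stored right after the key.
  def getValueRow(rowId: Int): Array[Byte] = {
    locate(rowId)
    val (off, klen, vlen) = cached
    Arrays.copyOfRange(page, off + klen, off + klen + vlen)
  }
}
```

A caller would treat None from appendRow as the signal to fall back to its larger hash map. Calling getKeyRow(id) followed by getValueRow(id) with the same id increments lookups only once, which is the caching effect the description of getValueFromKey relies on.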

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def acquireMemory(arg0: Long): Long
    Definition Classes
    MemoryConsumer
  5. def allocateArray(arg0: Long): LongArray
    Definition Classes
    MemoryConsumer
  6. def allocatePage(arg0: Long): MemoryBlock
    Attributes
    protected[memory]
    Definition Classes
    MemoryConsumer
  7. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  8. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  9. final def close(): Unit
    Definition Classes
    RowBasedKeyValueBatch → Closeable → AutoCloseable
  10. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  12. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. def freeArray(arg0: LongArray): Unit
    Definition Classes
    MemoryConsumer
  14. def freeMemory(arg0: Long): Unit
    Definition Classes
    MemoryConsumer
  15. def freePage(arg0: MemoryBlock): Unit
    Attributes
    protected[memory]
    Definition Classes
    MemoryConsumer
  16. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  17. def getMode(): MemoryMode
    Definition Classes
    MemoryConsumer
  18. def getUsed(): Long
    Definition Classes
    MemoryConsumer
  19. final def getValueRow(rowId: Int): UnsafeRow

    Returns the value row in this batch at rowId. The returned value row is reused across calls. Because getValueRow(id) is always called after getKeyRow(id) with the same id, we use getValueFromKey(id) to retrieve the value row, which reuses metadata from the cached key.

  20. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  21. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  22. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  23. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  24. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  25. final def numRows(): Int
  26. final def spill(size: Long, trigger: MemoryConsumer): Long

    The TaskMemoryManager may sometimes call spill() on its associated MemoryConsumers to make space for new consumers. RowBasedKeyValueBatch does not actually spill and returns 0. It should not throw an OutOfMemory exception here, because other associated consumers might spill.

    Definition Classes
    RowBasedKeyValueBatch → MemoryConsumer
  27. def spill(): Unit
    Definition Classes
    MemoryConsumer
    Annotations
    @throws( classOf[java.io.IOException] )
  28. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  29. def toString(): String
    Definition Classes
    AnyRef → Any
  30. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
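
The reasoning behind spill(size, trigger) returning 0 instead of throwing can be sketched with a toy memory manager. ToyConsumer, NonSpillingBatch, SpillableBuffer, and ToyManager are hypothetical names for this illustration only. They do not reproduce Spark's MemoryConsumer or TaskMemoryManager.

```scala
/** Toy stand-in for a memory consumer; hypothetical, not Spark's MemoryConsumer. */
trait ToyConsumer {
  /** Asked to release up to `size` bytes; returns the number of bytes actually freed. */
  def spill(size: Long): Long
}

/** Mirrors the behavior described for RowBasedKeyValueBatch: never spills, never throws. */
final class NonSpillingBatch extends ToyConsumer {
  def spill(size: Long): Long = 0L
}

/** A consumer that can give memory back. */
final class SpillableBuffer(private var held: Long) extends ToyConsumer {
  def spill(size: Long): Long = {
    val freed = math.min(size, held)
    held -= freed
    freed
  }
}

object ToyManager {
  /** Asks consumers in turn until `needed` bytes are freed; returns the total freed. */
  def reclaim(needed: Long, consumers: Seq[ToyConsumer]): Long =
    consumers.foldLeft(0L) { (freed, c) =>
      if (freed >= needed) freed else freed + c.spill(needed - freed)
    }
}
```

Because the non-spilling consumer reports 0 rather than failing, the toy manager simply moves on to the next consumer, which may be able to free the requested memory.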
