Class

net.gonzberg.spark.sorting

SecondarySortGroupingPairRDDFunctions

final class SecondarySortGroupingPairRDDFunctions[K, V] extends Serializable

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new SecondarySortGroupingPairRDDFunctions(rdd: RDD[(K, V)])(implicit arg0: Ordering[K], arg1: ClassTag[K], arg2: Ordering[V], arg3: ClassTag[V])

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. def mapValuesWithKeyedPreparedResource[R, R1, A](resources: RDD[(K, R)], prepareResource: (R) ⇒ R1, op: (R1, V) ⇒ A)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    R1

    the type of the prepared resource returned by prepareResource

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    prepareResource

    a function to transform the resource into what will be used in op

    op

    the operation to apply to each value. The operation takes the prepared resource (the result of prepareResource) and a value, and returns the transformed value

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource
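
The behaviour described above can be modelled with plain Scala collections. The sketch below is not the library's implementation and involves no Spark; the object name `KeyedPreparedResourceModel` and the use of `Seq` in place of `RDD` are assumptions made for illustration. It shows the point of the prepared variant: prepareResource runs once per key, and its result is reused for every value under that key.

```scala
object KeyedPreparedResourceModel {
  // Plain-collections model (hypothetical helper, not the library's code).
  // Assumes exactly one resource per key, as the documentation requires.
  def mapValuesWithKeyedPreparedResource[K, V, R, R1, A](
      values: Seq[(K, V)],
      resources: Seq[(K, R)],
      prepareResource: R => R1,
      op: (R1, V) => A
  ): Seq[(K, A)] = {
    val resourceByKey: Map[K, R] = resources.toMap
    values
      .groupBy { case (key, _) => key }
      .toSeq
      .flatMap { case (key, group) =>
        // Prepare the resource once per key, then reuse it for every value.
        val prepared = prepareResource(resourceByKey(key))
        group.map { case (_, value) => key -> op(prepared, value) }
      }
  }
}
```

The model makes no statement about the order of the output; this page does not document an output order for the real method either.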

  13. def mapValuesWithKeyedPreparedResource[R, R1, A](resources: RDD[(K, R)], prepareResource: (R) ⇒ R1, op: (R1, V) ⇒ A, numPartitions: Int)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    R1

    the type of the prepared resource returned by prepareResource

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    prepareResource

    a function to transform the resource into what will be used in op

    op

    the operation to apply to each value. The operation takes the prepared resource (the result of prepareResource) and a value, and returns the transformed value

    numPartitions

    the number of partitions for shuffling

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource

  14. def mapValuesWithKeyedPreparedResource[R, R1, A](resources: RDD[(K, R)], prepareResource: (R) ⇒ R1, op: (R1, V) ⇒ A, partitioner: Partitioner)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    R1

    the type of the prepared resource returned by prepareResource

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    prepareResource

    a function to transform the resource into what will be used in op

    op

    the operation to apply to each value. The operation takes the prepared resource (the result of prepareResource) and a value, and returns the transformed value

    partitioner

    the partitioner for shuffling

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource

  15. def mapValuesWithKeyedPreparedResource[R, A](resources: RDD[(K, R)], op: (R) ⇒ (V) ⇒ A)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    op

    The operation to apply to each value. The operation takes a resource and returns a function, which will then be applied to each value.

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource
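
In this overload op is curried: it takes a resource and returns the function that is applied to each value. A plain-collections sketch of that shape follows. It is a model under assumptions, not the library's code: `CurriedResourceModel` is a hypothetical name, `Seq` stands in for `RDD`, and the once-per-key evaluation of the outer function is inferred from the word "Prepared" in the method name, not stated on this page.

```scala
object CurriedResourceModel {
  // Plain-collections model (hypothetical helper, not the library's code).
  // Assumes exactly one resource per key, as the documentation requires.
  def mapValuesWithKeyedPreparedResource[K, V, R, A](
      values: Seq[(K, V)],
      resources: Seq[(K, R)],
      op: R => V => A
  ): Seq[(K, A)] = {
    val resourceByKey: Map[K, R] = resources.toMap
    values.groupBy(_._1).toSeq.flatMap { case (key, group) =>
      // The outer function is evaluated once per key; any expensive setup
      // it performs is shared by all values under that key.
      val applyToValue: V => A = op(resourceByKey(key))
      group.map { case (_, value) => key -> applyToValue(value) }
    }
  }
}
```

A typical op in this style builds a lookup structure in its outer body, for example `words => { val set = words.toSet; v => set.contains(v) }`, so the set is built per key and not per value.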

  16. def mapValuesWithKeyedPreparedResource[R, A](resources: RDD[(K, R)], op: (R) ⇒ (V) ⇒ A, numPartitions: Int)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    op

    The operation to apply to each value. The operation takes a resource and returns a function, which will then be applied to each value.

    numPartitions

    the number of partitions for shuffling

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource

  17. def mapValuesWithKeyedPreparedResource[R, A](resources: RDD[(K, R)], op: (R) ⇒ (V) ⇒ A, partitioner: Partitioner)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    op

    The operation to apply to each value. The operation takes a resource and returns a function, which will then be applied to each value.

    partitioner

    the partitioner for shuffling

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource

  18. def mapValuesWithKeyedResource[R, A](resources: RDD[(K, R)], op: (R, V) ⇒ A)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    op

    the operation to apply to each value. Takes a resource and value and returns the transformed value

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource
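
The geospatial use case mentioned above can be sketched with plain collections. Everything here is illustrative: `KeyedResourceModel`, `Box`, `Point` and `contains` are hypothetical names, `Seq` stands in for `RDD`, and the code models only the documented contract (one resource per key, op applied to each value with its key's resource), not the library's implementation.

```scala
object KeyedResourceModel {
  // Hypothetical types for illustration only.
  final case class Box(minX: Double, minY: Double, maxX: Double, maxY: Double)
  final case class Point(x: Double, y: Double)

  // Plain-collections model: look up the resource for each value's key
  // and apply op to the (resource, value) pair.
  def mapValuesWithKeyedResource[K, V, R, A](
      values: Seq[(K, V)],
      resources: Seq[(K, R)],
      op: (R, V) => A
  ): Seq[(K, A)] = {
    val resourceByKey: Map[K, R] = resources.toMap
    values.map { case (key, value) => key -> op(resourceByKey(key), value) }
  }

  // Example op: is the point inside the region's bounding box?
  def contains(box: Box, p: Point): Boolean =
    p.x >= box.minX && p.x <= box.maxX && p.y >= box.minY && p.y <= box.maxY
}
```

Here the keys are region names, each region has exactly one bounding box as its resource, and each point is tested only against the box of its own region.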

  19. def mapValuesWithKeyedResource[R, A](resources: RDD[(K, R)], op: (R, V) ⇒ A, numPartitions: Int)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    op

    the operation to apply to each value. Takes a resource and value and returns the transformed value

    numPartitions

    the number of partitions for shuffling

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource

  20. def mapValuesWithKeyedResource[R, A](resources: RDD[(K, R)], op: (R, V) ⇒ A, partitioner: Partitioner)(implicit arg0: ClassTag[R]): RDD[(K, A)]

    Applies op to every value with some resource, where values and resources share the same key.

    This allows you to send data to executors based on key, so that (1) the entire set of resources is not held in memory on all executors, and (2) a specific resource is not duplicated; it is reused for all corresponding data values.

    One example usage is a geospatial operation. If the keys indicate a geographic area, and the resource for each key contains the geospatial data for that area, one can apply a method using geospatially local resources to all values while reducing data duplication and shuffling.

    R

    the type of resources being used

    A

    the type returned by applying the operation with the resource to each value

    resources

    a PairRDD of keys and resources, where keys are used to determine what data the resource is associated with for the operation. There must be exactly one resource for each key in the RDD this method is applied to

    op

    the operation to apply to each value. Takes a resource and value and returns the transformed value

    partitioner

    the partitioner for shuffling

    returns

    PairRDD of values transformed by applying the operation with the appropriate resource

  21. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. final def notify(): Unit

    Definition Classes
    AnyRef
  23. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  24. def sortedFoldLeftByKey[A](startValue: A, op: (A, V) ⇒ A): RDD[(K, A)]

    Groups by key and applies a binary operation using foldLeft over the values, sorted by the implicit Ordering[V].

    A

    the result type of the folding operation

    startValue

    the start value for the fold

    op

    the binary operation for folding

    returns

    PairRDD with keys and values, where values are the result of applying foldLeft across the sorted values
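
The result for each key can be stated with plain collections: sort that key's values, then fold from the left. The sketch below is a model only (`SortedFoldLeftModel` is a hypothetical name and `Seq`/`Map` stand in for `RDD`); it does not reflect how the library computes the result.

```scala
object SortedFoldLeftModel {
  // Plain-collections model: per key, sort the values by the implicit
  // Ordering[V], then foldLeft over them starting from startValue.
  def sortedFoldLeftByKey[K, V: Ordering, A](
      pairs: Seq[(K, V)],
      startValue: A,
      op: (A, V) => A
  ): Map[K, A] =
    pairs
      .groupBy(_._1)
      .map { case (key, group) =>
        key -> group.map(_._2).sorted.foldLeft(startValue)(op)
      }
}
```

The sort matters whenever op is order-sensitive. With op appending each value to a list, the input `("a", 3), ("a", 1), ("a", 2)` folds to `List(1, 2, 3)` for key `"a"` regardless of input order.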

  25. def sortedFoldLeftByKey[A](startValue: A, op: (A, V) ⇒ A, numPartitions: Int): RDD[(K, A)]

    Groups by key and applies a binary operation using foldLeft over the values, sorted by the implicit Ordering[V].

    A

    the result type of the folding operation

    startValue

    the start value for the fold

    op

    the binary operation for folding

    numPartitions

    the number of partitions for shuffling

    returns

    PairRDD with keys and values, where values are the result of applying foldLeft across the sorted values

  26. def sortedFoldLeftByKey[A](startValue: A, op: (A, V) ⇒ A, partitioner: Partitioner): RDD[(K, A)]

    Groups by key and applies a binary operation using foldLeft over the values, sorted by the implicit Ordering[V].

    A

    the result type of the folding operation

    startValue

    the start value for the fold

    op

    the binary operation for folding

    partitioner

    the partitioner for shuffling

    returns

    PairRDD with keys and values, where values are the result of applying foldLeft across the sorted values

  27. def sortedGroupByKey: RDD[(K, Iterable[V])]

    Groups by key and sorts the values within each group by the implicit Ordering[V].

    returns

    a PairRDD of keys and sorted values
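
As a rough model of the result, using plain collections in place of an RDD (`SortedGroupByKeyModel` is a hypothetical name, and this is not the library's implementation): group the pairs by key, then sort each group with whatever Ordering[V] is in implicit scope.

```scala
object SortedGroupByKeyModel {
  // Plain-collections model: group by key, then sort each group's values
  // by the implicit Ordering[V].
  def sortedGroupByKey[K, V: Ordering](pairs: Seq[(K, V)]): Map[K, Seq[V]] =
    pairs
      .groupBy(_._1)
      .map { case (key, group) => key -> group.map(_._2).sorted }
}
```

Because the ordering is implicit, supplying a different one changes the result: passing `Ordering[Int].reverse` explicitly to the model yields descending values per key.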

  28. def sortedGroupByKey(numPartitions: Int): RDD[(K, Iterable[V])]

    Groups by key and sorts the values within each group by the implicit Ordering[V].

    numPartitions

    the number of partitions for shuffling

    returns

    PairRDD of keys and sorted values
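
This page does not say how numPartitions is turned into a partitioner. The sketch below assumes hash partitioning on the key, the scheme Spark's HashPartitioner uses, and shows how such a scheme would assign a key to one of numPartitions partitions. `HashPartitionModel` and `partitionFor` are hypothetical names, and null keys are not handled.

```scala
object HashPartitionModel {
  // Non-negative modulo of the key's hash code, so that keys with negative
  // hash codes still map to a valid partition index in [0, numPartitions).
  def partitionFor(key: Any, numPartitions: Int): Int = {
    val raw = key.hashCode % numPartitions
    if (raw < 0) raw + numPartitions else raw
  }
}
```

Under this assumption, all pairs with equal keys land in the same partition, which is what allows grouping and per-key sorting to happen within a partition.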

  29. def sortedGroupByKey(partitioner: Partitioner): RDD[(K, Iterable[V])]

    Groups by key and sorts the values within each group by the implicit Ordering[V].

    partitioner

    the partitioner for shuffling

    returns

    PairRDD of keys and sorted values

  30. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  31. def toString(): String

    Definition Classes
    AnyRef → Any
  32. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
