Converts this bag into a distributed collection of type DColl[A].
Find the bottom n elements in the collection with respect to the natural ordering of the
elements' type.
number of elements to return
the implicit Ordering of elements
an ordered (ascending) List of the bottom n elements
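The contract described above can be illustrated on a plain Scala `Seq`. This is only a sketch of the semantics, not the library's implementation: the object name `BottomSketch` and the standalone `bottom` function are hypothetical, and a real DataBag backend may well avoid a full sort.

```scala
object BottomSketch {
  // Hypothetical stand-in for the documented operation, written against a plain Seq.
  // Sorts ascending by the implicit Ordering and keeps at most n elements.
  def bottom[A](xs: Seq[A], n: Int)(implicit o: Ordering[A]): List[A] =
    xs.sorted(o).take(n).toList
}
```

For example, `BottomSketch.bottom(Seq(5, 1, 4, 2, 3), 2)` yields `List(1, 2)`; if the input has fewer than `n` elements, all of them are returned in ascending order.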
Converts the DataBag back into a Scala Seq.
Count the number of elements in the collection that satisfy a predicate.
the predicate to test against
the number of elements that satisfy p
Removes duplicate entries from the bag, e.g.
Test if at least one element of the collection satisfies p.
predicate to test against the elements of the collection
true if the collection contains an element that satisfies the predicate
Finds some element in the collection that satisfies a given predicate.
the predicate to test against
Some element if one exists, None otherwise
Monad flatMap.
Structural recursion over the bag. Assumes an algebraic specification of the DataBag type using three constructors:
    sealed trait DataBag[A]
    case class Sng[A](x: A) extends DataBag[A]
    case class Union[A](xs: DataBag[A], ys: DataBag[A]) extends DataBag[A]
    case object Empty extends DataBag[Nothing]
The function then denotes the following recursive computation:
    this match {
      case Empty => agg.zero
      case Sng(x) => agg.init(x)
      case Union(xs, ys) => agg.plus(xs.fold(agg), ys.fold(agg))
    }
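The recursion above can be tried out on a toy model. The following is a minimal sketch, not the library's code: the names `FoldSketch`, `Bag`, and the standalone `fold` function are hypothetical, `Alg` is assumed to carry exactly the three components `zero`, `init`, and `plus`, and the toy type is made covariant so that `Empty` can be a single object.

```scala
object FoldSketch {
  // Assumed shape of the algebra: a zero, a per-element init, and a binary plus.
  final case class Alg[A, B](zero: B, init: A => B, plus: (B, B) => B)

  // Toy bag type mirroring the three constructors of the specification.
  sealed trait Bag[+A]
  case object Empty extends Bag[Nothing]
  final case class Sng[A](x: A) extends Bag[A]
  final case class Union[A](xs: Bag[A], ys: Bag[A]) extends Bag[A]

  // Structural recursion: one case per constructor.
  def fold[A, B](bag: Bag[A])(agg: Alg[A, B]): B = bag match {
    case Empty         => agg.zero
    case Sng(x)        => agg.init(x)
    case Union(xs, ys) => agg.plus(fold(xs)(agg), fold(ys)(agg))
  }

  // A bag containing 1, 2, 3, built from nested unions.
  val bag: Bag[Int] = Union(Union(Sng(1), Sng(2)), Union(Empty, Sng(3)))

  val sum: Int  = fold(bag)(Alg[Int, Int](0, x => x, _ + _)) // 6
  val size: Int = fold(bag)(Alg[Int, Int](0, _ => 1, _ + _)) // 3
}
```

Because the shape of the `Union` tree is not fixed, the result is only well defined when `plus` is associative and commutative with `zero` as its unit, which is why both example algebras use integer addition.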
Delegates to fold(Alg(zero, init, plus)).
Test if all elements of the collection satisfy p.
predicate to test against the elements of the collection
true if all the elements of the collection satisfy the predicate
Groups the bag by key.
Test the collection for emptiness.
true if the collection contains no elements at all
Monad map.
Find the largest element in the collection with respect to the natural ordering of the elements' type.
the implicit natural Ordering of the elements
Exception if the collection is empty
Find the smallest element in the collection with respect to the natural ordering of the elements' type.
the implicit natural Ordering of the elements
Exception if the collection is empty
Test the collection for non-emptiness.
true if the collection has at least one element
Calculate the product over all elements in the collection.
implicit Numeric operations of the elements
one if the collection is empty
Shortcut for fold(z)(identity, f).
return type (super class of the element type)
the result of combining all elements into one
Shortcut for fold(None)(Some, Option.lift2(f)), which is the same as reducing the collection
to a single element by applying a binary operator.
the result of reducing all elements into one
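To make the `fold(None)(Some, Option.lift2(f))` shortcut concrete, here is a minimal sketch over a plain `Seq`. Everything in it is an assumption made for illustration: `lift2` is a hypothetical reimplementation of what `Option.lift2` is taken to do (combine two optional values, keeping whichever side is defined), and `fold` is a stand-in for the three-argument fold described in this API.

```scala
object ReduceOptionSketch {
  // Hypothetical lift2: applies f when both sides are defined,
  // otherwise returns whichever side is defined (or None).
  def lift2[A](f: (A, A) => A): (Option[A], Option[A]) => Option[A] = {
    case (Some(x), Some(y)) => Some(f(x, y))
    case (x, None)          => x
    case (None, y)          => y
  }

  // Stand-in for fold(zero)(init, plus), evaluated over a plain Seq.
  def fold[A, B](xs: Seq[A])(zero: B)(init: A => B, plus: (B, B) => B): B =
    xs.map(init).foldLeft(zero)(plus)

  // Reduction expressed as a fold whose zero is None.
  def reduceOption[A](xs: Seq[A])(f: (A, A) => A): Option[A] =
    fold[A, Option[A]](xs)(None)(Some(_), lift2(f))
}
```

With this sketch, `ReduceOptionSketch.reduceOption(Seq(3, 1, 2))(_ max _)` gives `Some(3)`, and reducing an empty sequence gives `None` instead of throwing.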
Creates a sample of up to k elements using reservoir sampling initialized with the given seed.
If the collection represented by the DataBag instance contains fewer than k elements,
the resulting sample is correspondingly smaller.
The method should be deterministic for a fixed DataBag instance with a materialized result.
In other words, calling xs.sample(k)(seed) two times in succession will return the same result.
The result, however, might vary between program runs and DataBag implementations.
the maximum number of elements in the sample
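For readers unfamiliar with reservoir sampling, the following sketch shows the classic single-pass variant (often called Algorithm R) over a plain `Seq`. It is not taken from the library: the object name `SampleSketch`, the parameter order, and the `Vector` return type are assumptions, and an actual DataBag backend may sample differently.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object SampleSketch {
  // Algorithm R: keep the first k elements, then let element i (0-based)
  // replace a uniformly chosen slot with probability k / (i + 1).
  def sample[A](xs: Seq[A], k: Int, seed: Long): Vector[A] = {
    val rng = new Random(seed)            // fixed seed => repeatable choices
    val reservoir = ArrayBuffer.empty[A]
    xs.iterator.zipWithIndex.foreach { case (x, i) =>
      if (i < k) reservoir += x
      else {
        val j = rng.nextInt(i + 1)        // uniform in [0, i]
        if (j < k) reservoir(j) = x
      }
    }
    reservoir.toVector                    // shorter than k if xs is shorter than k
  }
}
```

Calling the sketch twice with the same input order and seed returns the same sample. That matches the determinism described above only as long as the traversal order is stable, which is why the result can still differ between runs or implementations.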
Calculate the sum over all elements in the collection.
implicit Numeric operations of the elements
zero if the collection is empty
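The "zero if the collection is empty" behavior follows directly from folding with the identity of the implicit `Numeric`. Below is a small sketch over a plain `Seq`, with the analogous product included for comparison. The object name `NumericFoldSketch` and both function signatures are hypothetical and only mirror the documented contract.

```scala
object NumericFoldSketch {
  // Sum: start from the additive identity, combine with Numeric.plus.
  def sum[A](xs: Seq[A])(implicit num: Numeric[A]): A =
    xs.foldLeft(num.zero)(num.plus)

  // Product: start from the multiplicative identity, combine with Numeric.times.
  def product[A](xs: Seq[A])(implicit num: Numeric[A]): A =
    xs.foldLeft(num.one)(num.times)
}
```

Here `NumericFoldSketch.sum(Seq(1, 2, 3))` is `6`, while the sum of an empty sequence is `num.zero` and the product of an empty sequence is `num.one`.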
Find the top n elements in the collection with respect to the natural ordering of the
elements' type.
number of elements to return
the implicit Ordering of elements
an ordered (descending) List of the top n elements
Union operator.
Monad filter.
Writes a DataBag into the specified path in a CSV format.
Writes a DataBag into the specified path as plain text.
Zips the elements of this collection with a unique dense index.
The method should be deterministic for a fixed DataBag instance with a materialized result.
In other words, calling xs.zipWithIndex() two times in succession will return the same result.
The result, however, might vary between program runs and DataBag implementations.
A DataBag implementation backed by a Scala Seq.