A function that converts a batch of objects of some input class into a sequence of FeatureObservations for an output class O.
A bag-of-words featurizer that splits the input String on whitespace and creates an observation for each token.
A DataMatrix stores a double-valued label along with the double-valued features that go with it.
A feature map that stores all feature strings and their indices in an in-memory Map.
Represents a single example from a collection of data.
Indexes the labels and features of a series of examples.
A trait for classes that can index features represented as Strings.
A feature with its observed magnitude in some context.
A function that converts objects of some input class into a sequence of FeatureObservations for an output class O.
Indexes the labels and features of a series of examples.
A feature map that hashes each feature with MurmurHash3 and takes the result modulo a prime, which bounds the largest feature index that can be used.
A trait for classes that index labels and look up labels by index.
Something that has a label.
For any class that has one or more labels.
Represents a single example from a collection of data.
Represents a single unlabeled example from a collection of data.
Dataset of the form
A BatchFeaturizer that computes the tf-idf score of the terms in each Example.
Provides useful utilities for dealing with datasets that have a defined order.
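
The summaries above mention a whitespace bag-of-words featurizer, an in-memory feature map, and a hashed feature map. The following is a minimal sketch of how those three ideas might fit together, not the library's actual implementation. The names FeaturizerSketch, FeatureObs, bowFeaturize, ExactIndex, and HashedIndex are hypothetical and chosen only for illustration; the only library call relied on is scala.util.hashing.MurmurHash3.stringHash from the Scala standard library.

```scala
import scala.collection.mutable
import scala.util.hashing.MurmurHash3

object FeaturizerSketch {

  // Hypothetical stand-in for a feature observation:
  // a feature string together with its observed magnitude.
  final case class FeatureObs(feature: String, magnitude: Double = 1.0)

  // Bag-of-words: split on runs of whitespace and emit one
  // observation per token (empty tokens are dropped).
  def bowFeaturize(text: String): Seq[FeatureObs] =
    text.trim.split("\\s+").toSeq.filter(_.nonEmpty).map(tok => FeatureObs(tok))

  // Exact indexing: every distinct feature string is stored in an
  // in-memory map and receives the next unused index.
  final class ExactIndex {
    private val indices = mutable.LinkedHashMap.empty[String, Int]
    def indexOf(feature: String): Int =
      indices.getOrElseUpdate(feature, indices.size)
    def size: Int = indices.size
  }

  // Hashed indexing: no strings are stored. The MurmurHash3 hash of
  // the feature is reduced modulo a prime, so every index falls in
  // the range 0 until prime. Distinct features may collide.
  final class HashedIndex(prime: Int) {
    require(prime > 0, "prime must be positive")
    def indexOf(feature: String): Int =
      java.lang.Math.floorMod(MurmurHash3.stringHash(feature), prime)
  }
}
```

In this sketch the exact index grows with the vocabulary but never confuses two features, while the hashed index uses constant memory at the cost of possible collisions. floorMod is used instead of % because the hash can be negative.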