org.pmml4s.metadata

Type members

Classlikes

abstract class AbstractField extends Field

Abstract class for field in a PMML with common implementations.

object Algorithm extends Enumeration

Specifies which scoring algorithm to use when computing the output value. It applies only to Association Rules models.

trait Attribute extends HasLabels with HasMissingValues with HasInvalidValues with HasValidValues with HasIntervals with ValueIndexer with Serializable
Companion: object
object Attribute
Companion: class
sealed trait AttributeType
Companion: object
object CastInteger extends Enumeration

If a regression model should predict integers, use the attribute castInteger to control how decimal places should be handled.

abstract class CategoricalAttribute(val invalidValues: Set[Any], val missingValues: Set[Any], val labels: Map[Any, String]) extends Attribute
Companion: object
class ContinuousAttribute(val intervals: Array[Interval], val validValues: Array[Any], val invalidValues: Set[Any], val missingValues: Set[Any], val labels: Map[Any, String]) extends Attribute with HasIntervals
Companion: object
class DataDictionary(val fields: Array[DataField]) extends Dictionary[DataField] with PmmlElement

Contains definitions for fields as used in mining models. It specifies the types and value ranges. These definitions are assumed to be independent of specific data sets as used for training or scoring a specific model.

Companion: object
class DataField(val name: String, val displayName: Option[String], val dataType: DataType, val opType: OpType, val intervals: Array[Interval], val values: Array[Value], val taxonomy: Option[String], val isCyclic: Boolean) extends AbstractField with PmmlElement

Defines a field as used in mining models. It specifies the types and value ranges.

class Decision(val value: String, val displayValue: Option[String], val description: Option[String]) extends PmmlElement
class Decisions(val decisions: Array[Decision], val businessProblem: Option[String], val description: Option[String]) extends PmmlElement

The Decisions element contains an element Decision for every possible value of the decision.

abstract class Dictionary[T <: Field] extends Seq[T] with HasField
abstract class Field extends HasDataType with HasOpType with Attribute

Abstract class for field in a PMML.

trait FieldScope extends HasField
sealed trait FieldType
Companion: object
object FieldType
Companion: class
trait HasField
trait HasLabels
trait HasOutput

The Output section in the model specifies names for columns in an output table and describes how to compute the corresponding values.

class ImmutableCategoricalAttribute(val validValues: Array[Any], val invalidValues: Set[Any], val missingValues: Set[Any], val labels: Map[Any, String]) extends CategoricalAttribute
object InvalidValueTreatment extends Enumeration

This field specifies how invalid input values are handled.

  • returnInvalid is the default and specifies that, when an invalid input is encountered, the model should return a value indicating an invalid result has been returned.
  • asIs means to use the input without modification.
  • asMissing specifies that an invalid input value should be treated as a missing value and follow the behavior specified by the missingValueReplacement attribute if present (see above). If asMissing is specified but there is no respective missingValueReplacement present, a missing value is passed on for eventual handling by successive transformations via DerivedFields or in the actual mining model.
  • asValue specifies that an invalid input value should be replaced with the value specified by attribute invalidValueReplacement which must be present in this case, or the PMML is invalid.
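
The four treatments listed above can be sketched as a small dispatch function. This is only an illustration of the described behavior: the types Treatment and Prepared and the function prepare are hypothetical stand-ins, not part of the pmml4s API, and the isInvalid predicate is assumed to be supplied by the caller.

```scala
// Hypothetical stand-ins for illustration; these are not the pmml4s types.
object InvalidValueSketch {
  sealed trait Treatment
  case object ReturnInvalid extends Treatment
  case object AsIs extends Treatment
  case object AsMissing extends Treatment
  case object AsValue extends Treatment

  sealed trait Prepared
  final case class Use(value: Any) extends Prepared // value handed to the model
  case object Missing extends Prepared              // missing value passed on
  case object InvalidResult extends Prepared        // model reports an invalid result

  def prepare(
      input: Any,
      isInvalid: Any => Boolean,
      treatment: Treatment,
      missingValueReplacement: Option[Any] = None,
      invalidValueReplacement: Option[Any] = None): Prepared =
    if (!isInvalid(input)) Use(input)
    else treatment match {
      case ReturnInvalid => InvalidResult
      case AsIs          => Use(input)
      case AsMissing =>
        // Follow missingValueReplacement if present, otherwise stay missing.
        missingValueReplacement match {
          case Some(v) => Use(v)
          case None    => Missing
        }
      case AsValue =>
        // Without invalidValueReplacement the PMML itself is invalid.
        invalidValueReplacement match {
          case Some(v) => Use(v)
          case None =>
            throw new IllegalArgumentException("asValue requires invalidValueReplacement")
        }
    }
}
```

For example, with a predicate that flags -1 as invalid, prepare(-1, isBad, AsMissing) yields Missing, while the same call with missingValueReplacement = Some(0) yields Use(0).
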
class MiningField(val name: String, val usageType: UsageType, val opType: Option[OpType], val importance: Option[Double], val outliers: OutlierTreatmentMethod, val lowValue: Option[Double], val highValue: Option[Double], val missingValueReplacement: Option[Any], val missingValueTreatment: Option[MissingValueTreatment], val invalidValueTreatment: InvalidValueTreatment, val invalidValueReplacement: Option[Any]) extends HasUsageType with PmmlElement

MiningFields also define the usage of each field (active, supplementary, target, ...) as well as policies for treating missing, invalid or outlier values.

Value parameters:
importance

States the relative importance of the field.

invalidValueTreatment

Specifies how invalid input values are handled.

missingValueReplacement

If this attribute is specified then a missing input value is automatically replaced by the given value. That is, the model itself works as if the given value was found in the original input. For example the surrogate operator in TreeModel does not apply if the MiningField specifies a replacement value.

missingValueTreatment

This field is for information only.

name

Symbolic name of field, must refer to a field in the scope of the parent of the MiningSchema's model element.

opType

The attribute value overrides the corresponding value in the DataField. That is, a DataField can be used with different optypes in different models. For example, a 0/1 indicator could be used as a numeric input field in a regression model while the same field is used as a categorical field in a tree model.
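
The override rule for opType can be sketched as a fallback chain. The Target description further down adds one more level (a Target overrides a MiningField), so the sketch includes it. The function effectiveOpType is a hypothetical helper for illustration, and plain strings stand in for the real OpType values.

```scala
// Hypothetical helper; strings stand in for the real OpType values.
object OpTypeSketch {
  // A Target overrides a MiningField, which in turn overrides the DataField.
  def effectiveOpType(
      dataField: String,
      miningField: Option[String],
      target: Option[String]): String =
    target.orElse(miningField).getOrElse(dataField)
}
```

With this chain, a 0/1 indicator declared as continuous in the DataField resolves to categorical in a model whose MiningField sets opType to categorical.
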

class MiningSchema(val miningFields: Array[MiningField]) extends HasTargetFields with PmmlElement

The MiningSchema is the Gate Keeper for its model element. All data entering a model must pass through the MiningSchema. Each model element contains one MiningSchema which lists fields as used in that model. While the MiningSchema contains information that is specific to a certain model, the DataDictionary contains data definitions which do not vary per model. The main purpose of the MiningSchema is to list the fields that have to be provided in order to apply the model.
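
As a rough illustration of the gatekeeper role, the following sketch lists the fields a record still has to supply before a model can be applied. It assumes that only fields with usage type active are required as inputs. FieldUse and absentInputs are hypothetical names, not pmml4s classes or methods.

```scala
// Hypothetical sketch; assumes only "active" fields are required as inputs.
object MiningSchemaSketch {
  final case class FieldUse(name: String, usageType: String)

  // Names of the fields the model expects as inputs.
  def activeNames(schema: Seq[FieldUse]): Seq[String] =
    schema.filter(_.usageType == "active").map(_.name)

  // Active fields that the given record does not provide.
  def absentInputs(schema: Seq[FieldUse], record: Map[String, Any]): Seq[String] =
    activeNames(schema).filterNot(n => record.contains(n))
}
```
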

object MissingValueTreatment extends Enumeration

In a PMML consumer this field is for information only, unless the value is returnInvalid, in which case if a missing value is encountered in the given field, the model should return a value indicating an invalid result; otherwise, the consumer only looks at missingValueReplacement - if a value is present it replaces missing values. Except as described above, the missingValueTreatment attribute just indicates how the missingValueReplacement was derived, but places no behavioral requirement on the consumer.
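
The consumer-side rule can be sketched as follows, using hypothetical names. The input is modeled as an Option (None means missing), the treatment as an optional string, and an invalid result as a Left. None of these choices reflect the actual pmml4s types.

```scala
// Hypothetical sketch of the consumer behavior described above.
object MissingValueSketch {
  // Right(Some(v)): value handed to the model; Right(None): still missing;
  // Left(msg): the model should return an invalid result.
  def resolveMissing(
      input: Option[Any],
      missingValueTreatment: Option[String],
      missingValueReplacement: Option[Any]): Either[String, Option[Any]] =
    input match {
      case Some(v) => Right(Some(v))
      case None if missingValueTreatment.contains("returnInvalid") =>
        Left("invalid result")
      // Any other treatment is informational: only the replacement matters.
      case None => Right(missingValueReplacement)
    }
}
```
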

class MutableCategoricalAttribute(val invalidValues: Set[Any], val missingValues: Set[Any], val labels: Map[Any, String]) extends CategoricalAttribute
class MutableFieldScope[T <: Field] extends FieldScope
object OutlierTreatmentMethod extends Enumeration

Outliers

  • asIs: field values treated at face value.
  • asMissingValues: outlier values are treated as if they were missing.
  • asExtremeValues: outlier values are changed to a specific high or low value defined in MiningField.
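
A minimal sketch of the three methods, assuming that a value counts as an outlier when it lies outside the lowValue and highValue bounds of the MiningField. The names Method and treat are hypothetical, and None stands for a missing value.

```scala
// Hypothetical sketch; assumes outliers are values outside [lowValue, highValue].
object OutlierSketch {
  sealed trait Method
  case object AsIs extends Method
  case object AsMissingValues extends Method
  case object AsExtremeValues extends Method

  // Returns None when the value is to be treated as missing.
  def treat(x: Double, method: Method, lowValue: Double, highValue: Double): Option[Double] = {
    val isOutlier = x < lowValue || x > highValue
    method match {
      case AsIs            => Some(x)
      case AsMissingValues => if (isOutlier) None else Some(x)
      case AsExtremeValues => Some(math.min(math.max(x, lowValue), highValue))
    }
  }
}
```
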
class Output(val outputFields: Array[OutputField]) extends HasOutputFields with HasField with PmmlElement

Output element describes a set of result values that can be returned from a model.

class OutputField(val name: String, val displayName: Option[String], val dataType: DataType, val opType: OpType, val feature: ResultFeature, val targetField: Option[String], val value: Option[Any], val ruleFeature: RuleFeature, val algorithm: Algorithm, val rank: Int, val rankBasis: RankBasis, val rankOrder: RankOrder, val isMultiValued: Boolean, val segmentId: Option[String], val isFinalResult: Boolean, val decisions: Option[Decisions], val expr: Option[Expression]) extends AbstractField with PmmlElement

OutputField elements specify names, types and rules for calculating specific result features. This information can be used while writing an output table.

Companion: object
object RankBasis extends Enumeration

Applies only to Association Rules and is used to specify which criterion is used to sort the output result. For instance, the result could be sorted by the confidence, support or lift of the rules.

object RankOrder extends Enumeration

Determines the sorting order when ranking the results. The default behavior (rankOrder="descending") indicates that the result with the highest rank will appear first on the sorted list.
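
How a rank basis and a rank order combine can be sketched like this. Rule and ranked are hypothetical names used only for illustration. The basis is passed as a function that selects confidence, support or lift.

```scala
// Hypothetical sketch of ranking association rules by a chosen criterion.
object RankSketch {
  final case class Rule(id: String, confidence: Double, support: Double, lift: Double)

  // descending = true puts the rule with the highest basis value first.
  def ranked(rules: Seq[Rule], basis: Rule => Double, descending: Boolean = true): Seq[Rule] =
    rules.sortWith { (a, b) =>
      if (descending) basis(a) > basis(b) else basis(a) < basis(b)
    }
}
```

A rank of 1 would then pick the head of the sorted sequence, a rank of 2 the next element, and so on.
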

object ResultFeature extends Enumeration

Result Features

@PmmlDeprecated(since = "4.2")
object RuleFeature extends Enumeration

Specifies which feature of an association rule to return. This attribute has been deprecated as of PMML 4.2. The rule feature values can now be specified in the feature attribute.

class Target(val field: Option[String], val optype: Option[OpType], val castInteger: Option[CastInteger], val min: Option[Double], val max: Option[Double], val rescaleConstant: Double, val rescaleFactor: Double, val targetValues: Array[TargetValue]) extends PmmlElement

Note that castInteger, min, max, rescaleConstant and rescaleFactor only apply to models of type regression. Furthermore, they must be applied in sequence, which is:

  • min and max
  • rescaleFactor
  • rescaleConstant
  • castInteger

Value parameters:
castInteger

If a regression model should predict integers, use the attribute castInteger to control how decimal places should be handled.

field

Must refer to the name of a DataField or DerivedField. It can be absent when the model is used inside a Segment of a MiningModel and does not have a real target field in the input data.

max

If max is present and the predicted value is larger than max, the predicted value is set to max.

min

If min is present and the predicted value is smaller than min, the predicted value is set to min.

optype

When Target specifies optype then it overrides the optype attribute in a corresponding MiningField, if it exists. If the target does not specify optype then the MiningField is used as default. And, in turn, if the MiningField does not specify an optype, it is taken from the corresponding DataField. In other words, a MiningField overrides a DataField, and a Target overrides a MiningField.

rescaleConstant

Added to the predicted value after it has been multiplied by rescaleFactor.

rescaleFactor

Can be used for a simple rescale of the predicted value: the predicted value is first multiplied by rescaleFactor; after that, rescaleConstant is added.

targetValues

In classification models, TargetValue is required. For regression models, TargetValue is only optional.
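
The post-processing sequence for regression targets (min and max, then rescaleFactor, then rescaleConstant, then castInteger) can be sketched as a single function. The function postProcess is hypothetical. The defaults of 1.0 and 0.0 and the castInteger values "round", "ceiling" and "floor" are assumptions taken from the PMML Target element, not read from the pmml4s source.

```scala
// Hypothetical sketch of the Target post-processing order for regression.
object TargetSketch {
  def postProcess(
      predicted: Double,
      min: Option[Double] = None,
      max: Option[Double] = None,
      rescaleFactor: Double = 1.0,
      rescaleConstant: Double = 0.0,
      castInteger: Option[String] = None): Double = {
    // 1. Clamp to min and max when they are present.
    val lowerBounded = min.fold(predicted)(m => math.max(predicted, m))
    val clamped      = max.fold(lowerBounded)(m => math.min(lowerBounded, m))
    // 2. Multiply by rescaleFactor, 3. add rescaleConstant.
    val rescaled = clamped * rescaleFactor + rescaleConstant
    // 4. Handle decimal places as castInteger requests.
    castInteger match {
      case Some("round")   => math.round(rescaled).toDouble
      case Some("ceiling") => math.ceil(rescaled)
      case Some("floor")   => math.floor(rescaled)
      case _               => rescaled
    }
  }
}
```

For instance, a prediction of 12.0 with max = 10.0, rescaleFactor = 2.5, rescaleConstant = 0.3 and castInteger = floor is first capped at 10.0, then rescaled to 25.3, and finally floored to 25.0.
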

class TargetValue(val value: Option[Any], val displayValue: Option[String], val priorProbability: Option[Double], val defaultValue: Option[Double]) extends PmmlElement
Value parameters:
defaultValue

the counterpart of prior probabilities for continuous fields. Usually the value is the mean of the target values in the training data. The attribute defaultValue is used only if the optype of the field is continuous.

displayValue

usually more readable version which can be used by PMML consumers to display values in scoring results or other applications.

priorProbability

specifies a default probability for the corresponding target category. It is used if the prediction logic itself did not produce a result. The attribute priorProbability is used only if the optype of the field is categorical or ordinal.

value

corresponds to the class labels in a classification model.
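
A sketch of how these parameters could serve as fallbacks when the prediction logic produces no result. TV is a hypothetical, trimmed-down stand-in for TargetValue, and both helper functions are assumptions made for illustration.

```scala
// Hypothetical, trimmed-down stand-in for TargetValue.
object TargetValueSketch {
  final case class TV(
      value: Option[Any],
      priorProbability: Option[Double],
      defaultValue: Option[Double])

  // Categorical or ordinal target: fall back to the prior probabilities.
  def fallbackProbabilities(tvs: Seq[TV]): Map[Any, Double] =
    tvs.collect { case TV(Some(v), Some(p), _) => v -> p }.toMap

  // Continuous target: fall back to the first defaultValue, if any.
  def fallbackPrediction(tvs: Seq[TV]): Option[Double] =
    tvs.collectFirst { case TV(_, _, Some(d)) => d }
}
```
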

class Targets(val targets: Array[Target]) extends HasTargetFields with PmmlElement
object UsageType extends Enumeration

Usage type

  • active: field used as input (independent field).
  • target: field that was used as a training target for supervised models.
  • predicted: field whose value is predicted by the model. As of PMML 4.2, this is deprecated and it has been replaced by the usage type target.
  • supplementary: field holding additional descriptive information. Supplementary fields are not required to apply a model. They are provided as additional information for explanatory purpose, though. When some field has gone through preprocessing transformations before a model is built, then an additional supplementary field is typically used to describe the statistics for the original field values.
  • group: field similar to the SQL GROUP BY. For example, this is used by AssociationModel and SequenceModel to group items into transactions by customerID or by transactionID.
  • order: This field defines the order of items or transactions and is currently used in SequenceModel and TimeSeriesModel. Similarly to group, it is motivated by the SQL syntax, namely by the ORDER BY statement.
  • frequencyWeight and analysisWeight: These fields are not needed for scoring, but provide very important information on how the model was built. Frequency weight usually has positive integer values and is sometimes called "replication weight". Its values can be interpreted as the number of times each record appears in the data. Analysis weight can have fractional positive values, it could be used for regression weight in regression models or for case weight in trees, etc. It can be interpreted as different importance of the cases in the model. Counts in ModelStats and Partitions can be computed using frequency weight, mean and standard deviation values can be computed using both weights.
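
The "replication weight" reading of frequencyWeight can be illustrated with a small sketch: counting and averaging with the weights gives the same result as physically repeating each record. Record, count, mean and expand are hypothetical names, and analysisWeight is left out to keep the example minimal.

```scala
// Hypothetical sketch of frequency weights as replication counts.
object WeightSketch {
  final case class Record(value: Double, frequencyWeight: Int)

  // Weighted count: each record counts frequencyWeight times.
  def count(rs: Seq[Record]): Long = rs.map(_.frequencyWeight.toLong).sum

  // Weighted mean; yields NaN for an empty sequence.
  def mean(rs: Seq[Record]): Double =
    rs.map(r => r.value * r.frequencyWeight).sum / count(rs)

  // The same data with every record repeated frequencyWeight times.
  def expand(rs: Seq[Record]): Seq[Double] =
    rs.flatMap(r => Seq.fill(r.frequencyWeight)(r.value))
}
```
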
class WrappedField(val name: String) extends Field

Defines a wrapped field that contains an internal field, which performs all operations.