package org.pmml4s.transformations

Mining models use simple functions in various places to map user data to values that are easier to use in a specific model. For example, neural networks work internally with numbers, usually in the range from 0 to 1: numeric input data are mapped to the range [0..1], and categorical fields are mapped to a series of 0/1 indicators.

PMML defines various kinds of simple data transformations:

  • Normalization: map values to numbers; the input can be continuous or discrete.
  • Discretization: map continuous values to discrete values.
  • Value mapping: map discrete values to discrete values.
  • Text Indexing: derive a frequency-based value for a given term.
  • Functions: derive a value by applying a function to one or more parameters.
  • Aggregation: summarize or collect groups of values, e.g., compute average.
  • Lag: use a previous value of the given input field.

Type Members

  1. class Apply extends Expression

    Apply defines the application of a function. The function itself is identified by name through the function attribute. The actual parameters of the function application are given in the content of the element. Each actual argument value is given by an EXPRESSION, and the arguments are mapped by position to the formal parameters in the corresponding function definition.
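
    As a rough illustration of the positional mapping described here, the following self-contained Scala sketch evaluates a tiny expression tree. It does not use the pmml4s classes: `Expr`, `Const`, `Ref`, `Call`, `evalFunction`, and `eval` are hypothetical names made up for this example, and only four arithmetic functions are handled.

    ```scala
    object ApplySketch {
      // Hypothetical mini expression tree; these are not the pmml4s types.
      sealed trait Expr
      final case class Const(value: Double) extends Expr
      final case class Ref(field: String) extends Expr
      final case class Call(function: String, args: List[Expr]) extends Expr

      // The function is selected by name; arguments are bound by position.
      def evalFunction(name: String, args: List[Double]): Double = (name, args) match {
        case ("+", List(a, b)) => a + b
        case ("-", List(a, b)) => a - b
        case ("*", List(a, b)) => a * b
        case ("/", List(a, b)) => a / b
        case _ =>
          throw new IllegalArgumentException(s"unsupported call: $name with ${args.size} argument(s)")
      }

      // Evaluate arguments left to right, then apply the named function.
      def eval(expr: Expr, fields: Map[String, Double]): Double = expr match {
        case Const(v)       => v
        case Ref(name)      => fields(name)
        case Call(fn, args) => evalFunction(fn, args.map(eval(_, fields)))
      }
    }
    ```

    Because arguments are positional, swapping the two arguments of a "-" call changes the result, which is the behavior the description above implies.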

  2. trait BinaryArithmetic extends BinaryFunction
  3. trait BinaryBoolean extends BinaryFunction
  4. trait BinaryCompare extends BinaryFunction
  5. trait BinaryFunction extends Function
  6. trait BinaryString extends BinaryFunction
  7. class Constant extends LeafExpression

    Constant values can be used in expressions which have multiple arguments. The actual value of a constant is given by the content of the element. For example, <Constant>1.05</Constant> represents the number 1.05. The dataType of a Constant can optionally be specified.

  8. class DefineFunction extends Function with HasOpType with HasDataType with PmmlElement

    Defines new (user-defined) functions as variations or compositions of existing functions or transformations. The function's name must be unique and must not conflict with other function names, either defined by PMML or other user-defined functions. The EXPRESSION in the content of DefineFunction is the function body that actually defines the meaning of the new function. The function body must not refer to fields other than the parameter fields.

  9. class DerivedField extends DataField with Expression

    Provides a common element for the various mappings. Derived fields can also appear at several places in the definition of specific models, such as neural network or Naive Bayes models. Transformed fields have a name so that statistics and the model can refer to them.

  10. class Discretize extends FieldExpression

    Discretization of numerical input fields is a mapping from continuous to discrete values using intervals.
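
    A minimal sketch of interval-based binning, written independently of the pmml4s classes. The `Bin` type and the `discretize` helper are hypothetical, and the sketch assumes every interval is closed on the left and open on the right.

    ```scala
    object DiscretizeSketch {
      // Hypothetical bin type: a half-open interval [left, right) and its label.
      final case class Bin(left: Double, right: Double, label: String)

      // Returns the label of the first bin containing x, or None when no bin matches.
      def discretize(bins: Seq[Bin], x: Double): Option[String] =
        bins.collectFirst { case Bin(l, r, label) if x >= l && x < r => label }
    }
    ```

    Returning an `Option` keeps the "no bin matched" case explicit, so a caller can decide what a non-matching value should map to.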

  11. class DiscretizeBin extends PmmlElement
  12. trait Expression extends Evaluator with PmmlElement

    Trait of Expression that defines how the values of the new field are computed.

  13. class FieldColumnPair extends PmmlElement
  14. trait FieldExpression extends UnaryExpression
  15. class FieldRef extends FieldExpression with MixedEvaluator

    Field references are simply pass-throughs to fields previously defined in the DataDictionary, a DerivedField, or a result field. For example, they are used in clustering models in order to define center coordinates for fields that don't need further normalization.

    A missing input will produce a missing result. The optional attribute mapMissingTo may be used to map a missing result to the value specified by the attribute. If the attribute is not present, the result remains missing.

  16. trait Function extends PmmlElement

  17. trait FunctionProvider extends AnyRef
  18. trait HasFunctionProvider extends AnyRef
  19. trait HasLocalTransformations extends AnyRef
  20. trait LeafExpression extends Expression
  21. class LinearNorm extends PmmlElement
  22. class LocalTransformations extends TransformationDictionary

    LocalTransformations holds derived fields that are local to the model.

  23. class MapValues extends Expression

    Any discrete value can be mapped to any possibly different discrete value by listing the pairs of values. This list is implemented by a table, so it can be given inline as a sequence of XML elements or by a reference to an external table.
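
    Conceptually, the list of value pairs behaves like a lookup table. The sketch below assumes the pairs are already loaded into an in-memory `Map`; the `mapValue` helper and its `default` parameter are hypothetical and not part of the pmml4s API.

    ```scala
    object MapValuesSketch {
      // Hypothetical helper: look the input up in the table of value pairs.
      // An input without a matching pair yields `default` (None when unset).
      def mapValue(table: Map[String, String], default: Option[String])(in: String): Option[String] =
        table.get(in).orElse(default)
    }
    ```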

  24. trait MultipleArithmetic extends Function
  25. trait MultipleBoolean extends Function
  26. class MutableFunctionProvider extends FunctionProvider
  27. class NormContinuous extends NumericFieldExpression

    Normalization provides a basic framework for mapping input values to specific value ranges, usually the numeric range [0 .. 1]. Normalization is used, e.g., in neural networks and clustering models.

    Defines how to normalize an input field by piecewise linear interpolation. The mapMissingTo attribute defines the value the output is to take if the input is missing. If the mapMissingTo attribute is not specified, then missing input values produce a missing result.
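
    The interpolation and the missing-value rule can be sketched in plain Scala as follows. This is not the pmml4s implementation: `Knot` and `normalize` are hypothetical names, the knots are assumed to be sorted by strictly increasing original value, and inputs outside the knot range are extrapolated from the nearest segment, which is an assumption of this sketch.

    ```scala
    object NormContinuousSketch {
      // Hypothetical knot type: an original value and its normalized value.
      final case class Knot(orig: Double, norm: Double)

      // Piecewise linear interpolation over at least two knots sorted by `orig`.
      // A missing input (None) yields `mapMissingTo`, which is None when unset.
      def normalize(knots: Seq[Knot], mapMissingTo: Option[Double])(in: Option[Double]): Option[Double] =
        in match {
          case None => mapMissingTo
          case Some(x) =>
            require(knots.size >= 2, "need at least two knots")
            val segments = knots.zip(knots.tail)
            // First segment whose right end is >= x; otherwise the last segment.
            val (a, b) = segments.find { case (_, hi) => x <= hi.orig }.getOrElse(segments.last)
            Some(a.norm + (x - a.orig) / (b.orig - a.orig) * (b.norm - a.norm))
        }
    }
    ```

    With knots (0, 0), (3, 0.5), and (10, 1), an input of 1.5 falls in the first segment and an input of 6.5 in the second.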

  28. class NormDiscrete extends FieldExpression

    Encode string values into numeric values in order to perform mathematical computations. For example, regression and neural network models often split categorical and ordinal fields into multiple dummy fields. This kind of normalization is supported in PMML by the element NormDiscrete.

    An element (f, v) defines that the unit has value 1.0 if the value of input field f is v, otherwise it is 0.

    The set of NormDiscrete instances that refer to a given input field defines a fan-out function that maps a single input field to a set of normalized fields.

    If the input value is missing and the attribute mapMissingTo is not specified then the result is a missing value as well. If the input value is missing and the attribute mapMissingTo is specified then the result is the value of the attribute mapMissingTo.
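
    The indicator rule, the fan-out, and the missing-value handling can be sketched like this. The `indicator` and `fanOut` helpers are hypothetical names for illustration and do not come from pmml4s; a missing input is modeled as `None`.

    ```scala
    object NormDiscreteSketch {
      // Indicator for the pair (f, v): 1.0 when the input equals v, otherwise 0.0.
      // A missing input yields `mapMissingTo`, which is None when unset.
      def indicator(value: String, mapMissingTo: Option[Double])(in: Option[String]): Option[Double] =
        in match {
          case Some(x) => Some(if (x == value) 1.0 else 0.0)
          case None    => mapMissingTo
        }

      // Fan-out: one indicator per category, in the order given.
      def fanOut(categories: Seq[String], mapMissingTo: Option[Double])(in: Option[String]): Seq[Option[Double]] =
        categories.map(v => indicator(v, mapMissingTo)(in))
    }
    ```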

  29. trait NumericFieldExpression extends FieldExpression
  30. class ParameterField extends AbstractField
  31. trait TernaryArithmetic extends TernaryFunction
  32. trait TernaryFunction extends Function
  33. class TextIndex extends NumericFieldExpression

    The TextIndex element fully configures how the text in textField should be processed and translated into a frequency metric for a particular term of interest. The actual frequency metric to be returned is defined through the localTermWeights attribute.

  34. class TextIndexNormalization extends PmmlElement

    A TextIndexNormalization element offers more advanced ways of normalizing text input into a more controlled vocabulary that corresponds to the terms being used in invocations of this indexing function. The normalization operation is defined through a translation table, specified through a TableLocator or InlineTable element.

  35. class TransformationDictionary extends Dictionary[DerivedField] with Transformer with FunctionProvider with PmmlElement

    The TransformationDictionary allows for transformations to be defined once and used by any model element in the PMML document.

  36. trait UnaryArithmetic extends UnaryFunction
  37. trait UnaryBoolean extends UnaryFunction
  38. trait UnaryExpression extends Expression
  39. trait UnaryFunction extends Function
  40. trait UnaryString extends UnaryFunction

Value Members

  1. object ACos extends UnaryArithmetic
  2. object ASin extends UnaryArithmetic
  3. object ATan extends UnaryArithmetic
  4. object Abs extends UnaryArithmetic
  5. object Add extends BinaryArithmetic
  6. object And extends MultipleBoolean
  7. object Avg extends MultipleArithmetic
  8. object BuiltInFunctions extends FunctionProvider
  9. object Ceil extends UnaryArithmetic
  10. object Concat extends Function
  11. object Cos extends UnaryArithmetic
  12. object CosH extends UnaryArithmetic
  13. object CountHits extends Enumeration

    - allHits: count all hits
    - bestHits: count all hits with the lowest Levenshtein distance
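
    To make the difference between the two modes concrete, here is a self-contained sketch for single-word terms over pre-tokenized text. The helpers `levenshtein`, `countAllHits`, and `countBestHits`, as well as the `maxDistance` parameter, are assumptions of this example and not the pmml4s API; a hit is taken to be a token within `maxDistance` edits of the term.

    ```scala
    object CountHitsSketch {
      // Classic dynamic-programming Levenshtein distance, one row at a time.
      def levenshtein(a: String, b: String): Int = {
        var prev = (0 to b.length).toArray
        for (i <- 1 to a.length) {
          val curr = new Array[Int](b.length + 1)
          curr(0) = i
          for (j <- 1 to b.length) {
            val cost = if (a(i - 1) == b(j - 1)) 0 else 1
            curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
          }
          prev = curr
        }
        prev(b.length)
      }

      // allHits: every token within maxDistance edits of the term counts.
      def countAllHits(term: String, tokens: Seq[String], maxDistance: Int): Int =
        tokens.count(t => levenshtein(term, t) <= maxDistance)

      // bestHits: only the hits that share the lowest distance count.
      def countBestHits(term: String, tokens: Seq[String], maxDistance: Int): Int = {
        val hits = tokens.map(levenshtein(term, _)).filter(_ <= maxDistance)
        if (hits.isEmpty) 0 else hits.count(_ == hits.min)
      }
    }
    ```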

  14. object DateDaysSinceYear extends BinaryFunction
  15. object DateSecondsSinceMidnight extends UnaryFunction
  16. object DateSecondsSinceYear extends BinaryFunction
  17. object Divide extends BinaryArithmetic
  18. object Equal extends BinaryBoolean
  19. object Erf extends UnaryArithmetic
  20. object Exp extends UnaryArithmetic
  21. object Expm1 extends UnaryArithmetic
  22. object Expression extends Serializable
  23. object Floor extends UnaryArithmetic
  24. object FormatDatetime extends BinaryFunction
  25. object FormatNumber extends BinaryFunction
  26. object GreaterOrEqual extends BinaryCompare
  27. object GreaterThan extends BinaryCompare
  28. object Hypot extends BinaryArithmetic
  29. object If extends Function
  30. object IsIn extends Function
  31. object IsMissing extends UnaryBoolean
  32. object IsNotIn extends Function
  33. object IsNotMissing extends UnaryBoolean
  34. object IsNotValid extends UnaryBoolean
  35. object IsValid extends UnaryBoolean
  36. object LessOrEqual extends BinaryCompare
  37. object LessThan extends BinaryCompare
  38. object Ln extends UnaryArithmetic
  39. object Ln1p extends UnaryArithmetic
  40. object LocalTermWeights extends Enumeration

    - termFrequency: use the number of times the term occurs in the document (x = freq_i).
    - binary: use 1 if the term occurs in the document or 0 if it doesn't (x = χ(freq_i)).
    - logarithmic: take the logarithm (base 10) of 1 + the number of times the term occurs in the document (x = log(1 + freq_i)).
    - augmentedNormalizedTermFrequency: this formula adds to the binary frequency a "normalized" component expressing the frequency of a term relative to the highest frequency of terms observed in that document (x = 0.5 * (χ(freq_i) + (freq_i / max_k(freq_k)))).
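
    The four weighting formulas translate directly into code. The sketch below uses hypothetical function names that mirror the enumeration values; `freq` is the number of occurrences of the term in the document and `maxFreq` is the highest frequency of any term in that document, assumed to be positive.

    ```scala
    object LocalTermWeightsSketch {
      // chi(freq): 1 when the term occurs at least once, otherwise 0.
      private def chi(freq: Int): Double = if (freq > 0) 1.0 else 0.0

      def termFrequency(freq: Int): Double = freq.toDouble

      def binary(freq: Int): Double = chi(freq)

      // Base-10 logarithm of 1 + freq.
      def logarithmic(freq: Int): Double = math.log10(1.0 + freq)

      // Binary component plus the frequency relative to the document maximum.
      def augmentedNormalizedTermFrequency(freq: Int, maxFreq: Int): Double =
        0.5 * (chi(freq) + freq.toDouble / maxFreq)
    }
    ```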

  41. object Log10 extends UnaryArithmetic
  42. object Lowercase extends UnaryString
  43. object Matches extends BinaryBoolean
  44. object Max extends MultipleArithmetic
  45. object Median extends MultipleArithmetic
  46. object Min extends MultipleArithmetic
  47. object Modulo extends BinaryArithmetic
  48. object Multiply extends BinaryArithmetic
  49. object NormalCDF extends TernaryArithmetic
  50. object NormalIDF extends TernaryArithmetic
  51. object NormalPDF extends TernaryArithmetic
  52. object Not extends UnaryFunction
  53. object NotEqual extends BinaryBoolean
  54. object Or extends MultipleBoolean
  55. object Pow extends BinaryArithmetic
  56. object Product extends MultipleArithmetic
  57. object RInt extends UnaryArithmetic
  58. object Replace extends TernaryFunction
  59. object Round extends UnaryArithmetic
  60. object SAS-EM-String-Normalize extends BinaryFunction

    <DefineFunction name="SAS-EM-String-Normalize" optype="categorical" dataType="string">
     <ParameterField name="FMTWIDTH" optype="continuous"/>
     <ParameterField name="AnyCInput" optype="categorical"/>
     <Apply function="trimBlanks">
       <Apply function="uppercase">
         <Apply function="substring">
           <FieldRef field="AnyCInput"/>
           <Constant>1</Constant>
           <Constant>FMTWIDTH</Constant>
         </Apply>
       </Apply>
     </Apply>
    </DefineFunction>
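
    Read as plain Scala, the nested Apply elements above amount to roughly the following. This is only a sketch of the intended behavior, not the pmml4s implementation: the `normalize` helper is hypothetical, and it assumes that the PMML substring function is 1-based and takes a start position and a length, so a start of 1 with length FMTWIDTH keeps the leading FMTWIDTH characters.

    ```scala
    object SasStringNormalizeSketch {
      // Keep at most `width` leading characters, upper-case them, then trim blanks.
      def normalize(width: Int, input: String): String =
        input.take(width).toUpperCase.trim
    }
    ```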
  61. object SAS-FORMAT-$CHARw extends BinaryFunction

    <DefineFunction name="SAS-FORMAT-$CHARw" optype="categorical" dataType="string">
     <ParameterField name="FMTWIDTH" optype="continuous"/>
     <ParameterField name="AnyCInput" optype="continuous"/>
     <Apply function="substring">
       <FieldRef field="AnyCInput"/>
       <Constant>1</Constant>
       <Constant>FMTWIDTH</Constant>
     </Apply>
    </DefineFunction>
  62. object SAS-FORMAT-BESTw extends BinaryFunction

    <DefineFunction name="SAS-FORMAT-BESTw" optype="categorical" dataType="string">
     <ParameterField name="FMTWIDTH" optype="continuous"/>
     <ParameterField name="AnyNInput" optype="continuous"/>
     <Apply function="formatNumber">
       <FieldRef field="AnyNInput"/>
       <Constant>FMTWIDTH</Constant>
     </Apply>
    </DefineFunction>
  63. object Sin extends UnaryArithmetic
  64. object SinH extends UnaryArithmetic
  65. object Sqrt extends UnaryArithmetic
  66. object StdNormalCDF extends UnaryArithmetic
  67. object StdNormalIDF extends UnaryArithmetic
  68. object StdNormalPDF extends UnaryArithmetic
  69. object StringLength extends UnaryFunction
  70. object Substring extends TernaryFunction
  71. object Subtract extends BinaryArithmetic
  72. object Sum extends MultipleArithmetic
  73. object Tan extends UnaryArithmetic
  74. object TanH extends UnaryArithmetic
  75. object TextIndex extends Serializable
  76. object Threshold extends BinaryArithmetic
  77. object TrimBlanks extends UnaryString
  78. object Uppercase extends UnaryString
  79. object UserDefinedFunctions extends FunctionProvider

    Defines several user-defined functions produced by various vendors. A well-defined DefineFunction is fully supported by pmml4s, but some vendor-produced definitions are not well defined; this object is the place for those functions.
