Packages

  • package root
    Definition Classes
    root
  • package org
    Definition Classes
    root
  • package apache
    Definition Classes
    org
  • package spark
    Definition Classes
    apache
  • package sql
    Definition Classes
    spark
  • package catalyst

    Catalyst is a library for manipulating relational query plans. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.

    Definition Classes
    sql
  • package expressions

    A set of classes that can be used to represent trees of relational expressions. A key goal of the expression library is to hide the details of naming and scoping from developers who want to manipulate trees of relational operators. As such, the library defines a special type of expression, a NamedExpression, in addition to the standard collection of expressions.

    Standard Expressions

    A library of standard expressions (e.g., Add, EqualTo), aggregates (e.g., SUM, COUNT), and other computations (e.g. UDFs). Each expression type is capable of determining its output schema as a function of its children's output schema.

    Named Expressions

    Some expressions are named and thus can be referenced by later operators in the dataflow graph. The two types of named expressions are AttributeReferences and Aliases. AttributeReferences refer to attributes of the input tuple for a given operator and form the leaves of some expression trees. Aliases assign a name to intermediate computations. For example, in the SQL statement SELECT a+b AS c FROM ..., the expressions a and b would be represented by AttributeReferences and c would be represented by an Alias.

    During analysis, all named expressions are assigned a globally unique expression id, which can be used for equality comparisons. While the original names are kept around for debugging purposes, they should never be used to check if two attributes refer to the same value, as plan transformations can result in the introduction of naming ambiguity. For example, consider a plan that contains subqueries, both of which are reading from the same table. If an optimization removes the subqueries, scoping information would be destroyed, eliminating the ability to reason about which subquery produced a given attribute.
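
    The point about comparing ids rather than names can be illustrated with a minimal sketch. ExprId and Attr below are hypothetical stand-ins written for this example, not the Catalyst classes themselves.

```scala
object ExprIdSketch {
  // Hypothetical stand-ins for an expression id and an attribute reference.
  final case class ExprId(id: Long)
  final case class Attr(name: String, exprId: ExprId)

  private val counter = new java.util.concurrent.atomic.AtomicLong(0L)

  // Every new attribute receives a fresh, globally unique id.
  def newAttr(name: String): Attr = Attr(name, ExprId(counter.getAndIncrement()))

  // Two attributes refer to the same value only if their ids match.
  def sameValue(a: Attr, b: Attr): Boolean = a.exprId == b.exprId
}
```

    Two attributes created with the same name (for example, one per subquery over the same table) share a name but not an id, so sameValue reports them as distinct, while a renamed copy of an attribute still matches the original.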

    Evaluation

    The result of an expression can be evaluated using the Expression.eval(InternalRow) method.
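
    As a rough illustration of how an expression tree derives its output type from its children and is evaluated against a row, consider the simplified model below. All types here (Expr, BoundRef, Row, and the two data types) are assumptions made for the sketch and are far smaller than the real Catalyst classes.

```scala
object EvalSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object BoolType extends DataType

  // A row is modelled as an indexed sequence of values.
  type Row = IndexedSeq[Any]

  sealed trait Expr {
    def children: Seq[Expr]
    def dataType: DataType
    def eval(row: Row): Any
  }

  // Leaf expression: reads the value at a fixed position of the input row.
  final case class BoundRef(ordinal: Int, dataType: DataType) extends Expr {
    val children: Seq[Expr] = Nil
    def eval(row: Row): Any = row(ordinal)
  }

  // The output type of Add follows the type of its left child.
  final case class Add(left: Expr, right: Expr) extends Expr {
    val children: Seq[Expr] = Seq(left, right)
    def dataType: DataType = left.dataType
    def eval(row: Row): Any =
      left.eval(row).asInstanceOf[Int] + right.eval(row).asInstanceOf[Int]
  }

  // EqualTo always produces a boolean, whatever its children produce.
  final case class EqualTo(left: Expr, right: Expr) extends Expr {
    val children: Seq[Expr] = Seq(left, right)
    val dataType: DataType = BoolType
    def eval(row: Row): Any = left.eval(row) == right.eval(row)
  }
}
```

    With this model, Add(BoundRef(0, IntType), BoundRef(1, IntType)) evaluated against the row Vector(1, 2) adds the two columns, and wrapping it in EqualTo changes the reported dataType to BoolType.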

    Definition Classes
    catalyst
  • package aggregate
    Definition Classes
    expressions
  • package codegen

    A collection of generators that build custom bytecode at runtime to evaluate Catalyst expressions.

    Definition Classes
    expressions
  • package objects
    Definition Classes
    expressions
  • AssertNotNull
  • CatalystToExternalMap
  • CreateExternalRow
  • DecodeUsingSerializer
  • EncodeUsingSerializer
  • ExternalMapToCatalyst
  • GetExternalRowField
  • InitializeJavaBean
  • Invoke
  • InvokeLike
  • LambdaVariable
  • MapObjects
  • NewInstance
  • SerializerSupport
  • StaticInvoke
  • UnresolvedCatalystToExternalMap
  • UnresolvedMapObjects
  • UnwrapOption
  • ValidateExternalType
  • WrapOption
  • package xml
    Definition Classes
    expressions

package objects

Type Members

  1. case class AssertNotNull(child: Expression, walkedTypePath: Seq[String] = Nil) extends UnaryExpression with NonSQLExpression with Product with Serializable

    Asserts that input values of a non-nullable child expression are not null.

    Note that there are cases where child.nullable == true, while we still need to add this assertion. Consider a nullable column s whose data type is a struct containing a non-nullable Int field named i. Expression s.i is nullable because s can be null. However, for all non-null s, s.i can't be null.
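
    The struct example above can be sketched in plain Scala. The case class S and both helper functions are hypothetical and only mimic the described behaviour: s.i may be null when s is null, but a non-null s must not carry a null i.

```scala
object AssertNotNullSketch {
  // Hypothetical struct whose field i is declared non-nullable.
  final case class S(i: java.lang.Integer)

  // Throws if the value is null, otherwise returns it unchanged.
  def assertNotNull[T <: AnyRef](value: T, path: String): T = {
    if (value == null) {
      throw new NullPointerException(s"Null value appeared in non-nullable field: $path")
    }
    value
  }

  // s.i is nullable because s can be null; the assertion only applies to non-null s.
  def readI(s: S): java.lang.Integer =
    if (s == null) null else assertNotNull(s.i, "s.i")
}
```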

  2. case class CatalystToExternalMap extends Expression with NonSQLExpression with Product with Serializable

    Expression used to convert a Catalyst Map to an external Scala Map. The collection is constructed using the associated builder, obtained by calling newBuilder on the collection's companion object.

  3. case class CreateExternalRow(children: Seq[Expression], schema: StructType) extends Expression with NonSQLExpression with Product with Serializable

    Constructs a new external row, using the result of evaluating the specified expressions as content.

    children

    A list of expressions to use as the content of the external row.

  4. case class DecodeUsingSerializer[T](child: Expression, tag: ClassTag[T], kryo: Boolean) extends UnaryExpression with NonSQLExpression with SerializerSupport with Product with Serializable

    Deserializes an input object using a generic serializer (Kryo or Java). Note that the ClassTag is not an implicit parameter because TreeNode cannot copy implicit parameters.

    kryo

    If true, use Kryo; otherwise, use Java serialization.

  5. case class EncodeUsingSerializer(child: Expression, kryo: Boolean) extends UnaryExpression with NonSQLExpression with SerializerSupport with Product with Serializable

    Serializes an input object using a generic serializer (Kryo or Java).

    kryo

    If true, use Kryo; otherwise, use Java serialization.
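
    For orientation, the Java branch (kryo = false) of an encode/decode pair corresponds roughly to a standard Java serialization round trip. The sketch below uses only java.io; the object and method names are invented for the example, and the Kryo branch is not modelled because it depends on an external library.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object JavaSerializerSketch {
  // Serializes an object into a byte array using Java serialization.
  def encode(obj: AnyRef): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.toByteArray
  }

  // Reads an object back from a byte array produced by encode.
  def decode[T](data: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(data))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```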

  6. case class ExternalMapToCatalyst extends Expression with NonSQLExpression with Product with Serializable

    Converts a Scala/Java map object into Catalyst format by applying the key and value converters while iterating over the map.

  7. case class GetExternalRowField(child: Expression, index: Int, fieldName: String) extends UnaryExpression with NonSQLExpression with Product with Serializable

    Returns the value of the field at position index in the external row child. This class can be viewed as GetStructField for Rows instead of InternalRows.

    Note that the input row and the field being read are both expected to be non-null; if either is null, a runtime exception is thrown.

  8. case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Expression]) extends Expression with NonSQLExpression with Product with Serializable

    Initializes a Java Bean instance by setting its field values via setters.
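
    The idea of driving a bean through a map of setter names can be sketched with plain Java reflection. The Person class and the initialize helper are hypothetical; in this sketch the map holds already evaluated values rather than expressions.

```scala
object BeanSketch {
  // Hypothetical bean with conventional getters and setters.
  class Person {
    private var name: String = null
    private var age: Int = 0
    def getName: String = name
    def setName(value: String): Unit = { name = value }
    def getAge: Int = age
    def setAge(value: Int): Unit = { age = value }
  }

  // Calls each named one-argument setter on the bean with the given value.
  def initialize[T <: AnyRef](bean: T, setters: Map[String, Any]): T = {
    setters.foreach { case (setterName, value) =>
      val method = bean.getClass.getMethods
        .find(m => m.getName == setterName && m.getParameterCount == 1)
        .getOrElse(throw new NoSuchMethodException(setterName))
      // Primitive values are boxed here and unboxed again by the reflective call.
      method.invoke(bean, value.asInstanceOf[AnyRef])
    }
    bean
  }
}
```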

  9. case class Invoke(targetObject: Expression, functionName: String, dataType: DataType, arguments: Seq[Expression] = Nil, propagateNull: Boolean = true, returnNullable: Boolean = true) extends Expression with InvokeLike with Product with Serializable

    Calls the specified function on an object, optionally passing arguments. If the targetObject expression evaluates to null then null will be returned.

    In some cases, due to erasure, the schema may expect a primitive type when in fact the method is returning java.lang.Object. In this case, we will generate code that attempts to unbox the value automatically.

    targetObject

    An expression that will return the object to call the method on.

    functionName

    The name of the method to call.

    dataType

    The expected return type of the function.

    arguments

    An optional list of expressions whose evaluated results are passed to the function.

    propagateNull

    When true, if any of the arguments is null, null is returned instead of calling the function.

    returnNullable

    When false, indicates that the invoked method always returns a non-null value.
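
    The null-handling rules described for targetObject and propagateNull can be approximated with Java reflection. This is only a sketch under simplifying assumptions: arguments are already evaluated values, and the method is looked up by name and argument count alone, which is not sufficient for overloaded methods.

```scala
object InvokeSketch {
  def invoke(
      target: AnyRef,
      functionName: String,
      arguments: Seq[AnyRef] = Nil,
      propagateNull: Boolean = true): AnyRef = {
    if (target == null) {
      null // a null target yields null
    } else if (propagateNull && arguments.contains(null)) {
      null // a null argument short-circuits the call
    } else {
      val method = target.getClass.getMethods
        .find(m => m.getName == functionName && m.getParameterCount == arguments.length)
        .getOrElse(throw new NoSuchMethodException(functionName))
      // The result comes back as java.lang.Object, so primitive results are boxed.
      method.invoke(target, arguments: _*)
    }
  }
}
```

    Because the reflective result is always an Object, a caller that expects a primitive has to unbox it, which is the same situation the erasure note above describes.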

  10. trait InvokeLike extends Expression with NonSQLExpression

    Common base class for StaticInvoke, Invoke, and NewInstance.

  11. case class LambdaVariable(name: String, dataType: DataType, nullable: Boolean, id: Long = ...) extends LeafExpression with NonSQLExpression with Product with Serializable

    A placeholder for the loop variable used in MapObjects. This should never be constructed manually, but will instead be passed into the provided lambda function.

  12. case class MapObjects extends Expression with NonSQLExpression with Product with Serializable

    Applies the given expression to every element of a collection of items, returning the result as an ArrayType or ObjectType. This is similar to a typical map operation, but where the lambda function is expressed using catalyst expressions.

    The type of the result is determined as follows:
      - ArrayType, when customCollectionCls is None
      - ObjectType(collection), when customCollectionCls contains a collection class

    The following collection ObjectTypes are currently supported on input: Seq, Array, ArrayData, java.util.List
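
    A loose analogue of this behaviour in plain Scala is a function that accepts several input collection kinds and maps a function over their elements. The sketch covers only the default case (no custom collection class), returns a Vector in place of an array type, and does not handle ArrayData, which is Spark-specific.

```scala
object MapObjectsSketch {
  // Applies the function to every element of a Seq, Array, or java.util.List.
  def mapObjects(function: Any => Any, input: Any): Vector[Any] = {
    val elements: Iterator[Any] = input match {
      case seq: scala.collection.Seq[_] => seq.iterator
      case array: Array[_]              => array.iterator
      case list: java.util.List[_]      => Iterator.tabulate(list.size)(i => list.get(i))
      case other =>
        throw new IllegalArgumentException(s"Unsupported input: ${other.getClass.getName}")
    }
    elements.map(function).toVector
  }
}
```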

  13. case class NewInstance(cls: Class[_], arguments: Seq[Expression], propagateNull: Boolean, dataType: DataType, outerPointer: Option[() ⇒ AnyRef]) extends Expression with InvokeLike with Product with Serializable

    Constructs a new instance of the given class, using the result of evaluating the specified expressions as arguments.

    cls

    The class to construct.

    arguments

    A list of expressions to use as arguments to the constructor.

    propagateNull

    When true, if any of the arguments is null, then null will be returned instead of trying to construct the object.

    dataType

    The type of object being constructed, as a Spark SQL datatype. This allows you to manually specify the type when the object in question is a valid internal representation (e.g., ArrayData) instead of an object.

    outerPointer

    If the object being constructed is an inner class, the outerPointer for the containing class must be specified. This parameter is defined as an optional function, which allows the outer pointer to be obtained lazily; this is useful if the inner class is defined in the REPL.
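
    The constructor call with null propagation can be sketched with reflection as follows. Point and newInstance are hypothetical names; the constructor is chosen by argument count only, and the example class is nested in an object, so no outer pointer is needed.

```scala
object NewInstanceSketch {
  // Hypothetical target class with a single public constructor.
  final case class Point(x: Int, y: Int)

  def newInstance(cls: Class[_], arguments: Seq[AnyRef], propagateNull: Boolean): Any = {
    if (propagateNull && arguments.contains(null)) {
      null // skip construction when any argument is null
    } else {
      val ctor = cls.getConstructors
        .find(_.getParameterCount == arguments.length)
        .getOrElse(throw new NoSuchMethodException(cls.getName))
      ctor.newInstance(arguments: _*)
    }
  }
}
```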

  14. trait SerializerSupport extends AnyRef

    Common trait for DecodeUsingSerializer and EncodeUsingSerializer.

  15. case class StaticInvoke(staticObject: Class[_], dataType: DataType, functionName: String, arguments: Seq[Expression] = Nil, propagateNull: Boolean = true, returnNullable: Boolean = true) extends Expression with InvokeLike with Product with Serializable

    Invokes a static function, returning the result. By default, any of the arguments being null will result in returning null instead of calling the function.

    staticObject

    The target of the static call. This can either be the object itself (for methods defined on Scala objects), or the class object (for static methods defined in Java).

    dataType

    The expected return type of the function call.

    functionName

    The name of the method to call.

    arguments

    An optional list of expressions to pass as arguments to the function.

    propagateNull

    When true, if any of the arguments is null, null is returned instead of calling the function.

    returnNullable

    When false, indicates that the invoked method always returns a non-null value.
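
    For the Java case (a static method on a class object), the call can be sketched with reflection and a null receiver. The helper below is an illustration only: it handles static Java methods, not methods on Scala objects, and resolves the method by name and argument count.

```scala
object StaticInvokeSketch {
  def staticInvoke(
      staticObject: Class[_],
      functionName: String,
      arguments: Seq[AnyRef] = Nil,
      propagateNull: Boolean = true): AnyRef = {
    if (propagateNull && arguments.contains(null)) {
      null // default behaviour: a null argument yields null
    } else {
      val method = staticObject.getMethods
        .find { m =>
          m.getName == functionName &&
          m.getParameterCount == arguments.length &&
          java.lang.reflect.Modifier.isStatic(m.getModifiers)
        }
        .getOrElse(throw new NoSuchMethodException(functionName))
      // Static methods are invoked without a receiver.
      method.invoke(null, arguments: _*)
    }
  }
}
```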

  16. case class UnresolvedCatalystToExternalMap(child: Expression, keyFunction: (Expression) ⇒ Expression, valueFunction: (Expression) ⇒ Expression, collClass: Class[_]) extends UnaryExpression with Unevaluable with Product with Serializable

    Similar to UnresolvedMapObjects, this is a placeholder for CatalystToExternalMap.

    child

    An expression that when evaluated returns a map object.

    keyFunction

    The function applied to the elements of the key collection.

    valueFunction

    The function applied to the elements of the value collection.

    collClass

    The type of the resulting collection.

  17. case class UnresolvedMapObjects(function: (Expression) ⇒ Expression, child: Expression, customCollectionCls: Option[Class[_]] = None) extends UnaryExpression with Unevaluable with Product with Serializable

    When constructing MapObjects, the element type must be given, which may not be available before analysis. This class acts as a placeholder for MapObjects and is replaced by MapObjects during analysis, after the input data is resolved. Ideally, unresolved expressions should not be serialized and sent to executors, but users may do this accidentally (e.g., by mistakenly referencing an encoder instance when implementing an Aggregator). The function is marked as transient because it may reference a Scala Type, which is not serializable. As a result, even if users mistakenly reference an unresolved expression and serialize it, the only cost is a performance issue (more network traffic), not a failure.

  18. case class UnwrapOption(dataType: DataType, child: Expression) extends UnaryExpression with NonSQLExpression with ExpectsInputTypes with Product with Serializable

    Given an expression that returns an object of type Option[_], this expression unwraps the option into the specified Spark SQL datatype. In the case of None, the null bit is set instead.

    dataType

    The expected unwrapped option type.

    child

    An expression that returns an Option.

  19. case class ValidateExternalType(child: Expression, expected: DataType) extends UnaryExpression with NonSQLExpression with ExpectsInputTypes with Product with Serializable

    Validates the actual data type of the input expression at runtime. If it does not match the expected type, an exception is thrown.
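
    A runtime type check of this kind can be sketched as below. As a simplification, the sketch takes a Java Class as the expected type rather than a Spark SQL DataType, and the exception type and message wording are assumptions.

```scala
object ValidateSketch {
  // Returns the value unchanged if it is null or an instance of the expected class.
  def validate(value: Any, expected: Class[_]): Any = {
    if (value != null && !expected.isInstance(value)) {
      throw new RuntimeException(
        s"${value.getClass.getName} is not a valid external type for ${expected.getName}")
    }
    value
  }
}
```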

  20. case class WrapOption(child: Expression, optType: DataType) extends UnaryExpression with NonSQLExpression with ExpectsInputTypes with Product with Serializable

    Converts the result of evaluating child into an option, checking both the isNull bit and (in the case of reference types) equality with null.

    child

    The expression to evaluate and wrap.

    optType

    The type of this option.
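
    In plain Scala, the wrapping and unwrapping behaviour for reference types corresponds closely to Option(value) and orNull. The helper names below are invented for the sketch, which does not model the separate isNull bit used for primitive values.

```scala
object OptionSketch {
  // null becomes None; any other reference becomes Some(value).
  def wrapOption(value: AnyRef): Option[AnyRef] = Option(value)

  // None becomes null; Some(value) yields the wrapped value.
  def unwrapOption(opt: Option[AnyRef]): AnyRef = opt.orNull
}
```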

Value Members

  1. object CatalystToExternalMap extends Serializable
  2. object ExternalMapToCatalyst extends Serializable
  3. object LambdaVariable extends Serializable
  4. object MapObjects extends Serializable
  5. object NewInstance extends Serializable
  6. object SerializerSupport
