Packages

  • package root
    Definition Classes
    root
  • package org
    Definition Classes
    root
  • package apache
    Definition Classes
    org
  • package spark
    Definition Classes
    apache
  • package sql
    Definition Classes
    spark
  • package catalyst

    Catalyst is a library for manipulating relational query plans. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.

    Definition Classes
    sql
  • package analysis

    Provides a logical query plan Analyzer and supporting classes for performing analysis. Analysis consists of translating UnresolvedAttributes and UnresolvedRelations into fully typed objects using information in a schema Catalog.

    Definition Classes
    catalyst
  • package catalog
    Definition Classes
    catalyst
  • package csv
    Definition Classes
    catalyst
  • package dsl

    A collection of implicit conversions that create a DSL for constructing catalyst data structures.

    scala> import org.apache.spark.sql.catalyst.dsl.expressions._
    
    // Standard operators are added to expressions.
    scala> import org.apache.spark.sql.catalyst.expressions.Literal
    scala> Literal(1) + Literal(1)
    res0: org.apache.spark.sql.catalyst.expressions.Add = (1 + 1)
    
    // There is a conversion from 'symbols to unresolved attributes.
    scala> 'a.attr
    res1: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 'a
    
    // These unresolved attributes can be used to create more complicated expressions.
    scala> 'a === 'b
    res2: org.apache.spark.sql.catalyst.expressions.EqualTo = ('a = 'b)
    
    // SQL verbs can be used to construct logical query plans.
    scala> import org.apache.spark.sql.catalyst.plans.logical._
    scala> import org.apache.spark.sql.catalyst.dsl.plans._
    scala> LocalRelation('key.int, 'value.string).where('key === 1).select('value).analyze
    res3: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
    Project [value#3]
     Filter (key#2 = 1)
      LocalRelation [key#2,value#3], []
    Definition Classes
    catalyst
  • package encoders
    Definition Classes
    catalyst
  • DummyExpressionHolder
  • ExpressionEncoder
  • OuterScopes
  • RowEncoder
  • package errors

    Functions for attaching and retrieving trees that are associated with errors.

    Definition Classes
    catalyst
  • package expressions

    A set of classes that can be used to represent trees of relational expressions. A key goal of the expression library is to hide the details of naming and scoping from developers who want to manipulate trees of relational operators. As such, the library defines a special type of expression, a NamedExpression, in addition to the standard collection of expressions.

    Standard Expressions

    A library of standard expressions (e.g., Add, EqualTo), aggregates (e.g., SUM, COUNT), and other computations (e.g. UDFs). Each expression type is capable of determining its output schema as a function of its children's output schema.

    Named Expressions

    Some expressions are named and thus can be referenced by later operators in the dataflow graph. The two types of named expressions are AttributeReferences and Aliases. AttributeReferences refer to attributes of the input tuple for a given operator and form the leaves of some expression trees. Aliases assign a name to intermediate computations. For example, in the SQL statement SELECT a+b AS c FROM ..., the expressions a and b would be represented by AttributeReferences and c would be represented by an Alias.

    During analysis, all named expressions are assigned a globally unique expression id, which can be used for equality comparisons. While the original names are kept around for debugging purposes, they should never be used to check if two attributes refer to the same value, as plan transformations can result in the introduction of naming ambiguity. For example, consider a plan that contains subqueries, both of which are reading from the same table. If an optimization removes the subqueries, scoping information would be destroyed, eliminating the ability to reason about which subquery produced a given attribute.
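
The reason for comparing ids rather than names can be illustrated with a small standalone sketch. It is a toy model, not Catalyst code: ExprId, Attr, sameRef, and ExprIdDemo are hypothetical names chosen for the example, and the id counter is deliberately simplistic.

```scala
// Toy model only: these types are hypothetical and are not Catalyst classes.
final case class ExprId(id: Long)

// An attribute carries a human-readable name plus a unique id.
final case class Attr(name: String, exprId: ExprId) {
  // Two attributes denote the same value only if their ids match.
  def sameRef(other: Attr): Boolean = exprId == other.exprId
}

object ExprIdDemo {
  private var next = 0L

  // Hands out a fresh id on every call (not thread-safe; fine for a demo).
  def freshId(): ExprId = { next += 1; ExprId(next) }

  // Two scans of the same table each expose a column called "key".
  val leftKey: Attr = Attr("key", freshId())
  val rightKey: Attr = Attr("key", freshId())
}
```

Here leftKey and rightKey share the name "key" but sameRef reports them as different, while renaming an attribute with copy(name = ...) leaves its identity intact.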

    Evaluation

    An expression is evaluated against an input row with the Expression.eval(InternalRow) method.

    Definition Classes
    catalyst
  • package json
    Definition Classes
    catalyst
  • package optimizer
    Definition Classes
    catalyst
  • package parser
    Definition Classes
    catalyst
  • package planning

    Contains classes for enumerating possible physical plans for a given logical query plan.

    Definition Classes
    catalyst
  • package plans

    A collection of common abstractions for query plans as well as a base logical plan representation.

    Definition Classes
    catalyst
  • package rules

    A framework for applying batches of rewrite rules to trees, possibly to fixed point.
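
The idea of running a batch "to fixed point" can be sketched in plain Scala. This is a simplified model under stated assumptions, not the rules package itself: Rule, RuleExecutorSketch, toFixedPoint, and maxIterations are hypothetical names, and convergence is detected with ordinary equality.

```scala
import scala.annotation.tailrec

// Toy model only: Rule and RuleExecutorSketch are hypothetical names, not Catalyst's API.
object RuleExecutorSketch {
  // A rule rewrites a tree of type T into another tree of the same type.
  type Rule[T] = T => T

  // Applies the whole batch repeatedly until the tree stops changing
  // or maxIterations passes have run.
  def toFixedPoint[T](rules: Seq[Rule[T]], maxIterations: Int)(start: T): T = {
    @tailrec
    def loop(current: T, remaining: Int): T = {
      // One pass: run every rule of the batch in order.
      val next = rules.foldLeft(current)((tree, rule) => rule(tree))
      if (next == current || remaining <= 1) next
      else loop(next, remaining - 1)
    }
    if (maxIterations <= 0) start else loop(start, maxIterations)
  }
}
```

With a single rule that halves even integers, starting from 48 the loop settles at 3; capping the run at two passes stops early at 12. The iteration cap is what keeps a batch of non-converging rules from looping forever.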

    Definition Classes
    catalyst
  • package trees

    A library for easily manipulating trees of operators. Operators that extend TreeNode are granted the following interface:

    • Scala-collection-like methods (foreach, map, flatMap, collect, etc.)

    • transform - accepts a partial function that is used to generate a new tree. When the partial function can be applied to a given tree segment, that segment is replaced with the result. After attempting to apply the partial function to a given node, the transform function recursively attempts to apply the function to that node's children.

    • debugging support - pretty printing, easy splicing of trees, etc.
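
The transform behaviour described in the list can be modelled with a minimal standalone tree. This is an illustrative sketch, not the TreeNode implementation: Expr, Lit, Add, withChildren, and TransformDemo are hypothetical names, and only the top-down order described above is modelled.

```scala
// Toy model only: Expr, Lit, and Add are hypothetical stand-ins for TreeNode subclasses.
sealed trait Expr {
  def children: Seq[Expr]
  def withChildren(newChildren: Seq[Expr]): Expr

  // Top-down transform: try the rule on this node first, then recurse
  // into the children of whatever node results.
  def transform(rule: PartialFunction[Expr, Expr]): Expr = {
    val afterRule = if (rule.isDefinedAt(this)) rule(this) else this
    afterRule.withChildren(afterRule.children.map(_.transform(rule)))
  }
}

final case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def withChildren(newChildren: Seq[Expr]): Expr = this
}

final case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def withChildren(newChildren: Seq[Expr]): Expr = Add(newChildren(0), newChildren(1))
}

object TransformDemo {
  // Folds an addition of two literals into a single literal.
  val foldConstants: PartialFunction[Expr, Expr] = {
    case Add(Lit(a), Lit(b)) => Lit(a + b)
  }

  val tree: Expr = Add(Add(Lit(1), Lit(2)), Lit(3))
  val once: Expr = tree.transform(foldConstants)  // Add(Lit(3), Lit(3))
  val twice: Expr = once.transform(foldConstants) // Lit(6)
}
```

Because the rule is tried on a node before its children are rewritten, one pass folds only the inner addition; a second pass finishes the job. That is one reason rule batches are often run repeatedly rather than once.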
    Definition Classes
    catalyst
  • package util
    Definition Classes
    catalyst

package encoders

Linear Supertypes
AnyRef, Any

Type Members

  1. case class DummyExpressionHolder(exprs: Seq[Expression]) extends LeafNode with Product with Serializable
  2. case class ExpressionEncoder[T](objSerializer: Expression, objDeserializer: Expression, clsTag: ClassTag[T]) extends Encoder[T] with Product with Serializable

    A generic encoder for JVM objects that uses Catalyst Expressions for a serializer and a deserializer.

    objSerializer

    An expression that can be used to encode a raw object to the corresponding Spark SQL representation, which can be a primitive column, array, map, or struct. This represents how Spark SQL generally serializes an object of type T.

    objDeserializer

    An expression that will construct an object given a Spark SQL representation. This represents how Spark SQL generally deserializes a serialized value in Spark SQL representation back to an object of type T.

    clsTag

    A ClassTag for T.

Value Members

  1. def encoderFor[A](implicit arg0: Encoder[A]): ExpressionEncoder[A]

    Returns an internal encoder object that can be used to serialize / deserialize JVM objects into Spark SQL rows. The implicit encoder should always be unresolved (i.e., have no attribute references from a specific schema). This requirement allows us to preserve whether a given object type is being bound by name or by ordinal when doing resolution.

  2. object ExpressionEncoder extends Serializable

    A factory for constructing encoders that convert objects and primitives to and from the internal row format using catalyst expressions and code generation. By default, the expressions used to retrieve values from an input row when producing an object will be created as follows:

    • Classes will have their subfields extracted by name using UnresolvedAttribute expressions and UnresolvedExtractValue expressions.
    • Tuples will have their subfields extracted by position using BoundReference expressions.
    • Primitives will have their values extracted from the first ordinal with a schema whose field name defaults to value.
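
The practical difference between extracting by name and extracting by position can be shown with a small standalone model. It is only a sketch: FieldRef, ByName, ByOrdinal, and BindingSketch are hypothetical names that loosely play the roles of unresolved attributes and bound references, and a row is modelled as a plain Seq[Any].

```scala
// Toy model only: these names are hypothetical and do not mirror Catalyst's classes.
sealed trait FieldRef
final case class ByName(name: String) extends FieldRef  // resolved later against a schema
final case class ByOrdinal(index: Int) extends FieldRef // already tied to a position

object BindingSketch {
  // Resolves a reference to a column position in the given schema, if one exists.
  def resolve(ref: FieldRef, schema: Seq[String]): Option[Int] = ref match {
    case ByName(n)    => Some(schema.indexOf(n)).filter(_ >= 0)
    case ByOrdinal(i) => Some(i).filter(idx => idx >= 0 && idx < schema.length)
  }

  // Reads the referenced value out of a row laid out according to schema.
  def read(ref: FieldRef, schema: Seq[String], row: Seq[Any]): Option[Any] =
    resolve(ref, schema).map(i => row(i))
}
```

Against the schema (name, age) both ByName("name") and ByOrdinal(0) read the same column. If the columns are reordered to (age, name), the by-name reference still finds the name while the ordinal reference silently picks up the age, which is why it matters that an encoder stays unresolved until it is bound to a concrete schema.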
  3. object OuterScopes
  4. object RowEncoder

    A factory for constructing encoders that convert external rows to and from the Spark SQL internal binary representation.

    The following is a mapping between Spark SQL types and their allowed external types:

    BooleanType -> java.lang.Boolean
    ByteType -> java.lang.Byte
    ShortType -> java.lang.Short
    IntegerType -> java.lang.Integer
    FloatType -> java.lang.Float
    DoubleType -> java.lang.Double
    StringType -> String
    DecimalType -> java.math.BigDecimal or scala.math.BigDecimal or Decimal
    
    DateType -> java.sql.Date if spark.sql.datetime.java8API.enabled is false
    DateType -> java.time.LocalDate if spark.sql.datetime.java8API.enabled is true
    
    TimestampType -> java.sql.Timestamp if spark.sql.datetime.java8API.enabled is false
    TimestampType -> java.time.Instant if spark.sql.datetime.java8API.enabled is true
    
    BinaryType -> byte array
    ArrayType -> scala.collection.Seq or Array
    MapType -> scala.collection.Map
    StructType -> org.apache.spark.sql.Row
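
A few rows of the table above can be read as a type check on external values. The sketch below models that check with JDK classes only; it is not RowEncoder's validation logic. SqlType, its case objects, ExternalTypeSketch, and the java8Api flag (standing in for spark.sql.datetime.java8API.enabled) are hypothetical names, and Spark's own Decimal class is left out because it would require a Spark dependency.

```scala
// Toy model only: SqlType and its cases are hypothetical stand-ins for Spark SQL data types.
sealed trait SqlType
case object IntegerT extends SqlType
case object StringT extends SqlType
case object DecimalT extends SqlType
case object DateT extends SqlType
case object TimestampT extends SqlType

object ExternalTypeSketch {
  // Returns true if value is an allowed external representation for tpe,
  // following the mapping table. java8Api plays the role of
  // spark.sql.datetime.java8API.enabled.
  def isAllowed(tpe: SqlType, value: Any, java8Api: Boolean): Boolean =
    (tpe, value) match {
      case (IntegerT, _: java.lang.Integer)                               => true
      case (StringT, _: String)                                           => true
      case (DecimalT, _: java.math.BigDecimal | _: scala.math.BigDecimal) => true
      case (DateT, _: java.time.LocalDate)                                => java8Api
      case (DateT, _: java.sql.Date)                                      => !java8Api
      case (TimestampT, _: java.time.Instant)                             => java8Api
      case (TimestampT, _: java.sql.Timestamp)                            => !java8Api
      case _                                                              => false
    }
}
```

In this model a java.time.LocalDate is accepted for a date column only when the flag is on, and a java.sql.Date only when it is off, mirroring the two DateType rows of the table.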
