org.apache.spark.sql.connector.catalog

StagingTableCatalog

trait StagingTableCatalog extends TableCatalog

An optional mix-in for implementations of TableCatalog that support staging the creation of a table before committing the table's metadata along with its contents in CREATE TABLE AS SELECT or REPLACE TABLE AS SELECT operations.

It is highly recommended to implement this trait whenever possible so that CREATE TABLE AS SELECT and REPLACE TABLE AS SELECT operations are atomic. For example, when one runs a REPLACE TABLE AS SELECT operation and the catalog does not implement this trait, the planner will first drop the table via TableCatalog#dropTable(Identifier), then create the table via TableCatalog#createTable(Identifier, StructType, Transform[], Map), and then perform the write via SupportsWrite#newWriteBuilder(LogicalWriteInfo). However, if the write operation fails, the catalog will have already dropped the table, and the planner cannot roll back the dropping of the table.
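
To make the failure mode concrete, the following is a minimal sketch that models the non-staging path with a plain in-memory map instead of Spark's real classes. `InMemoryCatalog`, `NonAtomicReplace`, `produceRows`, and `demo` are hypothetical names introduced for illustration; only the drop, create, write order is taken from the description above.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical stand-in for a catalog plus its storage: table name -> rows.
// This is not Spark's TableCatalog; it only models the order of operations.
final class InMemoryCatalog {
  private val tables = mutable.Map.empty[String, Seq[String]]

  def createTable(name: String): Unit = {
    require(!tables.contains(name), s"table already exists: $name")
    tables.update(name, Seq.empty)
  }
  def dropTable(name: String): Boolean = tables.remove(name).isDefined
  def write(name: String, rows: Seq[String]): Unit = tables.update(name, rows)
  def read(name: String): Option[Seq[String]] = tables.get(name)
}

object NonAtomicReplace {
  // Mirrors the fallback order described above: drop, create, then write.
  // `produceRows` plays the role of the SELECT and may throw.
  def replaceTableAsSelect(
      catalog: InMemoryCatalog,
      name: String,
      produceRows: () => Seq[String]): Unit = {
    catalog.dropTable(name)            // the old table is gone from here on
    catalog.createTable(name)
    catalog.write(name, produceRows()) // a failure here cannot restore the old rows
  }

  // Runs a replace whose write fails and reports what is left of the old data.
  def demo(): (Boolean, Option[Seq[String]]) = {
    val catalog = new InMemoryCatalog
    catalog.createTable("t")
    catalog.write("t", Seq("old row"))
    val attempt = Try(
      replaceTableAsSelect(catalog, "t", () => throw new RuntimeException("write failed")))
    (attempt.isFailure, catalog.read("t"))
  }
}
```

In this model the original rows cannot be recovered once the write fails. What a real planner does with the newly created, empty table after such a failure is not modeled here.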

If the catalog implements this plugin, the catalog can implement the methods to "stage" the creation and the replacement of a table. After the table's BatchWrite#commit(WriterCommitMessage[]) is called, StagedTable#commitStagedChanges() is called, at which point the staged table can complete both the data write and the metadata swap operation atomically.
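
The staged flow can be sketched in the same simplified style. This is an illustrative model under stated assumptions, not Spark's implementation: `StagingCatalogModel`, `StagedTableModel`, `StagedReplace`, and `produceRows` are hypothetical names, and writes are buffered in memory. `commitStagedChanges` follows the method named above, and `abortStagedChanges` mirrors the method of the same name on StagedTable, which this page does not describe.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical, simplified model; not Spark's StagedTable or TableCatalog.
final class StagingCatalogModel {
  // Only committed tables are visible to readers.
  private val committed = mutable.Map.empty[String, Seq[String]]

  def read(name: String): Option[Seq[String]] = committed.get(name)
  def put(name: String, rows: Seq[String]): Unit = committed.update(name, rows)

  // Returns a handle that buffers writes until commitStagedChanges() is called.
  def stageReplace(name: String): StagedTableModel = new StagedTableModel(name)

  final class StagedTableModel(name: String) {
    private val buffer = mutable.ArrayBuffer.empty[String]

    def write(rows: Seq[String]): Unit = { buffer ++= rows; () }

    // One step that makes the new table and its data visible together.
    def commitStagedChanges(): Unit = committed.update(name, buffer.toList)

    // Discards buffered rows; the previously committed table is untouched.
    def abortStagedChanges(): Unit = buffer.clear()
  }
}

object StagedReplace {
  // `produceRows` plays the role of the SELECT and may throw.
  def replaceTableAsSelect(
      catalog: StagingCatalogModel,
      name: String,
      produceRows: () => Seq[String]): Unit = {
    val staged = catalog.stageReplace(name)
    try {
      staged.write(produceRows())
      staged.commitStagedChanges()
    } catch {
      case NonFatal(e) =>
        staged.abortStagedChanges()
        throw e
    }
  }
}
```

Because nothing is dropped before the commit step, a failed write in this model leaves the existing table readable with its old contents.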

Since

3.0.0

Linear Supertypes
TableCatalog, CatalogPlugin, AnyRef, Any

Abstract Value Members

  1. abstract def alterTable(ident: Identifier, changes: TableChange*): Table

    Apply a set of changes to a table in the catalog.

    Implementations may reject the requested changes. If any change is rejected, none of the changes should be applied to the table.

    The requested changes must be applied in the order given.

    If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.

    ident

    a table identifier

    changes

    changes to apply to the table

    returns

    updated metadata for the table

    Definition Classes
    TableCatalog
    Exceptions thrown

    IllegalArgumentException If any change is rejected by the implementation.

    NoSuchTableException If the table doesn't exist or is a view

  2. abstract def createTable(ident: Identifier, schema: StructType, partitions: Array[Transform], properties: Map[String, String]): Table

    Create a table in the catalog.

    ident

    a table identifier

    schema

    the schema of the new table, as a struct type

    partitions

    transforms to use for partitioning data in the table

    properties

    a string map of table properties

    returns

    metadata for the new table

    Definition Classes
    TableCatalog
    Exceptions thrown

    NoSuchNamespaceException If the identifier namespace does not exist (optional)

    TableAlreadyExistsException If a table or view already exists for the identifier

    UnsupportedOperationException If a requested partition transform is not supported

  3. abstract def dropTable(ident: Identifier): Boolean

    Drop a table in the catalog.

    If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.

    ident

    a table identifier

    returns

    true if a table was deleted, false if no table exists for the identifier

    Definition Classes
    TableCatalog
  4. abstract def initialize(name: String, options: CaseInsensitiveStringMap): Unit

    Called to initialize configuration.

    This method is called once, just after the provider is instantiated.

    name

    the name used to identify and load this catalog

    options

    a case-insensitive string map of configuration

    Definition Classes
    CatalogPlugin
  5. abstract def listTables(namespace: Array[String]): Array[Identifier]

    List the tables in a namespace from the catalog.

    If the catalog supports views, this must return identifiers for only tables and not views.

    namespace

    a multi-part namespace

    returns

    an array of Identifiers for tables

    Definition Classes
    TableCatalog
    Exceptions thrown

    NoSuchNamespaceException If the namespace does not exist (optional).

  6. abstract def loadTable(ident: Identifier): Table

    Load table metadata by identifier from the catalog.

    If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.

    ident

    a table identifier

    returns

    the table's metadata

    Definition Classes
    TableCatalog
    Exceptions thrown

    NoSuchTableException If the table doesn't exist or is a view

  7. abstract def name(): String

    Called to get this catalog's name.

    This method is only called after initialize(String, CaseInsensitiveStringMap) is called to pass the catalog's name.

    Definition Classes
    CatalogPlugin
  8. abstract def renameTable(oldIdent: Identifier, newIdent: Identifier): Unit

    Renames a table in the catalog.

    If the catalog supports views and contains a view for the old identifier and not a table, this throws NoSuchTableException. Additionally, if the new identifier is a table or a view, this throws TableAlreadyExistsException.

    If the catalog does not support table renames between namespaces, it throws UnsupportedOperationException.

    oldIdent

    the table identifier of the existing table to rename

    newIdent

    the new table identifier of the table

    Definition Classes
    TableCatalog
    Exceptions thrown

    NoSuchTableException If the table to rename doesn't exist or is a view

    TableAlreadyExistsException If the new table name already exists or is a view

    UnsupportedOperationException If the namespaces of old and new identifiers do not match (optional)

  9. abstract def stageCreate(ident: Identifier, schema: StructType, partitions: Array[Transform], properties: Map[String, String]): StagedTable

    Stage the creation of a table, preparing it to be committed into the metastore.

    When the table is committed, the contents of any writes performed by the Spark planner are committed along with the metadata about the table passed into this method's arguments. If the table exists when this method is called, the method should throw an exception accordingly. If another process concurrently creates the table before this table's staged changes are committed, an exception should be thrown by StagedTable#commitStagedChanges().

    ident

    a table identifier

    schema

    the schema of the new table, as a struct type

    partitions

    transforms to use for partitioning data in the table

    properties

    a string map of table properties

    returns

    metadata for the new table

    Exceptions thrown

    NoSuchNamespaceException If the identifier namespace does not exist (optional)

    TableAlreadyExistsException If a table or view already exists for the identifier

    UnsupportedOperationException If a requested partition transform is not supported

  10. abstract def stageCreateOrReplace(ident: Identifier, schema: StructType, partitions: Array[Transform], properties: Map[String, String]): StagedTable

    Stage the creation or replacement of a table, preparing it to be committed into the metastore when the returned table's StagedTable#commitStagedChanges() is called.

    When the table is committed, the contents of any writes performed by the Spark planner are committed along with the metadata about the table passed into this method's arguments. If the table exists, the metadata and the contents of this table replace the metadata and contents of the existing table. If a concurrent process commits changes to the table's data or metadata while the write is being performed but before the staged changes are committed, the catalog can decide whether to move forward with the table replacement anyway or abort the commit operation.

    If the table does not exist when the changes are committed, the table should be created in the backing data source. This differs from the expected semantics of stageReplace(Identifier, StructType, Transform[], Map), which should fail when the staged changes are committed but the table doesn't exist at commit time.

    ident

    a table identifier

    schema

    the schema of the new table, as a struct type

    partitions

    transforms to use for partitioning data in the table

    properties

    a string map of table properties

    returns

    metadata for the new table

    Exceptions thrown

    NoSuchNamespaceException If the identifier namespace does not exist (optional)

    UnsupportedOperationException If a requested partition transform is not supported

  11. abstract def stageReplace(ident: Identifier, schema: StructType, partitions: Array[Transform], properties: Map[String, String]): StagedTable

    Stage the replacement of a table, preparing it to be committed into the metastore when the returned table's StagedTable#commitStagedChanges() is called.

    When the table is committed, the contents of any writes performed by the Spark planner are committed along with the metadata about the table passed into this method's arguments. If the table exists, the metadata and the contents of this table replace the metadata and contents of the existing table. If a concurrent process commits changes to the table's data or metadata while the write is being performed but before the staged changes are committed, the catalog can decide whether to move forward with the table replacement anyway or abort the commit operation.

    If the table does not exist, committing the staged changes should fail with NoSuchTableException. This differs from the semantics of stageCreateOrReplace(Identifier, StructType, Transform[], Map), which should create the table in the data source if the table does not exist at the time of committing the operation.

    ident

    a table identifier

    schema

    the schema of the new table, as a struct type

    partitions

    transforms to use for partitioning data in the table

    properties

    a string map of table properties

    returns

    metadata for the new table

    Exceptions thrown

    NoSuchNamespaceException If the identifier namespace does not exist (optional)

    NoSuchTableException If the table does not exist

    UnsupportedOperationException If a requested partition transform is not supported
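
The commit-time differences between stageCreate, stageReplace, and stageCreateOrReplace can be summarized with a small model. This is a sketch under stated assumptions rather than a catalog implementation: `StageSemanticsModel` and `Staged` are hypothetical, a table is reduced to a name and a schema string, and standard JVM exceptions stand in for TableAlreadyExistsException and NoSuchTableException.

```scala
import scala.collection.mutable

// Hypothetical model: metadata only, table name -> schema description.
final class StageSemanticsModel {
  private val tables = mutable.Map.empty[String, String]

  def tableExists(name: String): Boolean = tables.contains(name)
  def schemaOf(name: String): Option[String] = tables.get(name)

  // A staged change is modeled as an action that runs at commit time.
  final class Staged(commit: () => Unit) {
    def commitStagedChanges(): Unit = commit()
  }

  def stageCreate(name: String, schema: String): Staged = {
    // Fails when staging starts if the table already exists.
    if (tables.contains(name))
      throw new IllegalStateException(s"table already exists: $name")
    new Staged(() => {
      // Fails at commit time if the table was created concurrently.
      if (tables.contains(name))
        throw new IllegalStateException(s"table already exists: $name")
      tables.update(name, schema)
    })
  }

  def stageReplace(name: String, schema: String): Staged =
    new Staged(() => {
      // Replacement requires the table to exist at commit time.
      if (!tables.contains(name))
        throw new NoSuchElementException(s"no such table: $name")
      tables.update(name, schema)
    })

  def stageCreateOrReplace(name: String, schema: String): Staged =
    // Creates the table if it is missing, replaces it otherwise.
    new Staged(() => tables.update(name, schema))
}
```

For example, in this model committing `stageReplace("missing", ...)` throws, while committing `stageCreateOrReplace("missing", ...)` creates the table.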

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def defaultNamespace(): Array[String]

    Return a default namespace for the catalog.

    When this catalog is set as the current catalog, the namespace returned by this method will be set as the current namespace.

    The namespace returned by this method is not required to exist.

    returns

    a multi-part namespace

    Definition Classes
    CatalogPlugin
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def invalidateTable(ident: Identifier): Unit

    Invalidate cached table metadata for an identifier.

    If the table is already loaded or cached, drop cached data. If the table does not exist or is not cached, do nothing. Calling this method should not query remote services.

    ident

    a table identifier

    Definition Classes
    TableCatalog
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  18. def tableExists(ident: Identifier): Boolean

    Test whether a table exists using an identifier from the catalog.

    If the catalog supports views and contains a view for the identifier and not a table, this must return false.

    ident

    a table identifier

    returns

    true if the table exists, false otherwise

    Definition Classes
    TableCatalog
  19. def toString(): String
    Definition Classes
    AnyRef → Any
  20. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
