package streaming

Type Members

  1. trait StreamingDataWriterFactory extends Serializable

    A factory of DataWriter returned by StreamingWrite#createStreamingWriterFactory(PhysicalWriteInfo), responsible for creating and initializing the actual data writer on the executor side.

    Note that the writer factory will be serialized and sent to executors, where the data writer is then created to do the actual writing. This interface must therefore be serializable, while DataWriter need not be.

    Since

    3.0.0
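To make the serialization contract concrete, here is a minimal sketch in Java. The `DataWriter` and `StreamingDataWriterFactory` interfaces below are simplified local stand-ins for the real Spark interfaces in `org.apache.spark.sql.connector.write.streaming` (and `LogWriterFactory` is a hypothetical implementation); the serialization round-trip simulates Spark shipping the factory from the driver to an executor, where the non-serializable writer is created.

```java
import java.io.*;

// Simplified stand-in for Spark's DataWriter (hypothetical, for illustration).
interface DataWriter<T> {
    void write(T record) throws IOException;
    String commit();   // returns a commit message for the driver
    void abort();
}

// Simplified stand-in for StreamingDataWriterFactory: it extends Serializable
// because Spark serializes the factory and sends it to executors.
interface StreamingDataWriterFactory extends Serializable {
    DataWriter<String> createWriter(int partitionId, long taskId, long epochId);
}

public class FactoryDemo {
    // Hypothetical factory implementation; it holds no non-serializable state.
    static class LogWriterFactory implements StreamingDataWriterFactory {
        @Override
        public DataWriter<String> createWriter(int partitionId, long taskId, long epochId) {
            // The buffer is created on the executor, so it need not be serializable.
            StringBuilder buffer = new StringBuilder();
            return new DataWriter<String>() {
                public void write(String record) { buffer.append(record).append('\n'); }
                public String commit() { return "partition " + partitionId + " wrote " + buffer.length() + " bytes"; }
                public void abort() { buffer.setLength(0); }
            };
        }
    }

    public static String demo() throws Exception {
        // Simulate Spark serializing the factory on the driver...
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new ObjectOutputStream(bytes).writeObject(new LogWriterFactory());
        // ...and deserializing it on an executor, where the writer is created.
        StreamingDataWriterFactory onExecutor = (StreamingDataWriterFactory)
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())).readObject();
        DataWriter<String> writer = onExecutor.createWriter(0, 0L, 0L);
        writer.write("hello");
        return writer.commit();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The key design point mirrored here: only the factory crosses the driver/executor boundary, so any non-serializable resources (buffers, connections, file handles) belong inside the writer that the factory creates, not in the factory itself.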

  2. trait StreamingWrite extends AnyRef

    An interface that defines how to write data to a data source in streaming queries.

    The writing procedure is:

    1. Create a writer factory by #createStreamingWriterFactory(PhysicalWriteInfo), serialize it, and send it to all the partitions of the input data (RDD).
    2. For each epoch in each partition, create the data writer and write the data of the epoch in that partition with this writer. If all the data are written successfully, call DataWriter#commit(). If an exception happens during the writing, call DataWriter#abort().
    3. If the writers in all partitions of one epoch are successfully committed, call StreamingWrite#commit(long, WriterCommitMessage[]). If some writers are aborted, or the job fails for an unknown reason, call StreamingWrite#abort(long, WriterCommitMessage[]).

    While Spark will retry failed writing tasks, Spark won't retry failed writing jobs. Users should do it manually in their Spark applications if they want to retry.

    Please refer to the documentation of commit/abort methods for detailed specifications.

    Since

    3.0.0
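The three-step procedure above can be sketched as follows. This is a toy, single-process simulation using simplified local stand-ins for the Spark interfaces (`EpochDataWriter`, `EpochWriterFactory`, and the `DemoStreamingWrite` class are all hypothetical names); in a real connector, step 1 runs on the driver, step 2 runs in tasks on executors, and step 3 runs back on the driver once all task commit messages arrive.

```java
import java.io.*;
import java.util.*;

// Simplified stand-ins (hypothetical) for Spark's DataWriter and
// StreamingDataWriterFactory interfaces.
interface EpochDataWriter {
    void write(String row);
    String commit();   // task-level commit: returns a commit message
    void abort();
}

interface EpochWriterFactory extends Serializable {
    EpochDataWriter createWriter(int partition, long epochId);
}

// A toy StreamingWrite: it commits an epoch only after every
// partition's writer has committed successfully.
class DemoStreamingWrite {
    final List<String> committedEpochs = new ArrayList<>();

    EpochWriterFactory createStreamingWriterFactory() {
        return (partition, epoch) -> new EpochDataWriter() {
            final List<String> rows = new ArrayList<>();
            public void write(String row) { rows.add(row); }
            public String commit() { return "p" + partition + ":" + rows.size(); }
            public void abort() { rows.clear(); }
        };
    }

    // Job-level commit: all partitions of this epoch succeeded.
    void commit(long epochId, String[] messages) {
        committedEpochs.add("epoch " + epochId + " -> " + String.join(",", messages));
    }

    // Job-level abort: clean up partial output for this epoch.
    void abort(long epochId, String[] messages) { }
}

public class WriteProcedureDemo {
    public static String run() {
        DemoStreamingWrite write = new DemoStreamingWrite();
        // Step 1: create the factory (driver side) and "send" it to partitions.
        EpochWriterFactory factory = write.createStreamingWriterFactory();
        long epochId = 0;
        String[] messages = new String[2];
        // Step 2: per epoch and partition, create a writer, write, and commit the task.
        for (int partition = 0; partition < 2; partition++) {
            EpochDataWriter writer = factory.createWriter(partition, epochId);
            writer.write("row");
            writer.write("row");
            messages[partition] = writer.commit();
        }
        // Step 3: all task commits succeeded, so commit the epoch as a whole.
        write.commit(epochId, messages);
        return write.committedEpochs.get(0);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that, as the text warns, Spark retries failed tasks (step 2) but not failed jobs: if the job-level commit in step 3 cannot complete, the application itself must decide whether to rerun the epoch.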
