Class

ch.cern.sparkmeasure

StageMetrics

Related Doc: package sparkmeasure


case class StageMetrics(sparkSession: SparkSession) extends Product with Serializable

Stage Metrics: collects metrics at stage granularity and provides aggregation and reporting functions for the end user.

Example usage for stage metrics:

    val stageMetrics = ch.cern.sparkmeasure.StageMetrics(spark)
    stageMetrics.runAndMeasure(spark.sql("select count(*) from range(1000) cross join range(1000) cross join range(1000)").show)

The tool uses Spark Listeners as its data source and collects metrics in a ListBuffer of a case class that encapsulates Spark task metrics. The ListBuffer is then transformed into a DataFrame for ease of reporting and analysis.
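The collection pattern described above can be sketched in plain Scala, without a Spark dependency. This is only an illustration of the idea, not sparkMeasure's implementation: `StageRecord`, `StageRecorder`, and the metric names are hypothetical, and the `onStageCompleted` method here merely stands in for a real Spark listener callback.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical record type for illustration; not sparkMeasure's actual case class.
final case class StageRecord(stageId: Int, numTasks: Int, executorRunTimeMs: Long, bytesRead: Long)

// Stand-in for a listener: each completed stage appends one record to the buffer.
final class StageRecorder {
  private val buffer = ListBuffer.empty[StageRecord]

  // Callbacks may arrive from another thread, so guard the mutable buffer.
  def onStageCompleted(record: StageRecord): Unit =
    buffer.synchronized { buffer += record }

  // Immutable snapshot of everything collected so far.
  def records: List[StageRecord] =
    buffer.synchronized { buffer.toList }
}

object StageRecorder {
  // Sum the per-stage values into one total per metric.
  def aggregate(records: Seq[StageRecord]): Map[String, Long] = Map(
    "numStages"         -> records.size.toLong,
    "numTasks"          -> records.map(_.numTasks.toLong).sum,
    "executorRunTimeMs" -> records.map(_.executorRunTimeMs).sum,
    "bytesRead"         -> records.map(_.bytesRead).sum
  )
}

object ListBufferSketch {
  def main(args: Array[String]): Unit = {
    val recorder = new StageRecorder
    recorder.onStageCompleted(StageRecord(0, 8, 1200L, 4096L))
    recorder.onStageCompleted(StageRecord(1, 1, 300L, 0L))
    println(StageRecorder.aggregate(recorder.records))
  }
}
```

Keeping the raw records in a buffer and aggregating only on demand is what allows the same data to feed both a summary report and a row-per-stage table.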

Stage metrics are stored in memory and used to produce a report that aggregates resource consumption. They can also be consumed "raw" (transformed into a DataFrame and/or saved to a file).
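A minimal sketch of both consumption paths, using only members listed on this page (`begin`, `end`, `printReport`, `createStageMetricsDF`, `saveData`). It assumes that sparkMeasure and Spark are on the classpath and that `begin()` and `end()` bracket the measured workload, as their names suggest. The local master, the application name, and the output path `/tmp/stagemetrics_sketch` are hypothetical choices for illustration.

```scala
import org.apache.spark.sql.SparkSession
import ch.cern.sparkmeasure.StageMetrics

object StageMetricsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is sufficient for a quick experiment.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("stage-metrics-sketch")
      .getOrCreate()

    val stageMetrics = StageMetrics(spark)

    // Assumed usage: mark the start and end of the measured workload explicitly.
    stageMetrics.begin()
    spark.sql("select count(*) from range(1000) cross join range(1000)").show()
    stageMetrics.end()

    // Aggregated report printed to standard output.
    stageMetrics.printReport()

    // "Raw" consumption: metrics as a DataFrame, then saved to a file.
    val df = stageMetrics.createStageMetricsDF("PerfStageMetrics")
    // "overwrite" is passed so that re-running the sketch does not fail on an existing path.
    stageMetrics.saveData(df, "/tmp/stagemetrics_sketch", "json", "overwrite")

    spark.stop()
  }
}
```

The `runAndMeasure` method shown in the usage example is the more compact form when the workload fits in a single expression. The explicit form may be preferable when the measured code spans several statements.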

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new StageMetrics(sparkSession: SparkSession)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def aggregateStageMetrics(nameTempView: String = "PerfStageMetrics"): DataFrame

  5. def aggregateStageMetrics(): LinkedHashMap[String, Long]

  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def begin(): Long

  8. var beginSnapshot: Long

  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. def createStageMetricsDF(nameTempView: String = "PerfStageMetrics"): DataFrame

  11. def end(): Long

  12. var endSnapshot: Long

  13. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. val listenerStage: StageInfoRecorderListener

  18. lazy val logger: Logger

  19. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. final def notify(): Unit

    Definition Classes
    AnyRef
  21. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  22. def printReport(): Unit

  23. def registerListener(spark: SparkSession, listener: StageInfoRecorderListener): Unit

  24. def removeListener(): Unit

  25. def report(): String

  26. def reportUsingDataFrame(): String

  27. def runAndMeasure[T](f: ⇒ T): T

  28. def saveData(df: DataFrame, fileName: String, fileFormat: String = "json", saveMode: String = "default"): Unit

  29. def sendReportPrometheus(serverIPnPort: String, metricsJob: String, labelName: String = sparkSession.sparkContext.appName, labelValue: String = ...): Unit


    Sends the metrics to a Prometheus Pushgateway.

    serverIPnPort: address of the Prometheus Pushgateway, in the format hostIP:Port
    metricsJob: job name
    labelName: metrics label name; defaults to sparkSession.sparkContext.appName
    labelValue: metrics label value; defaults to sparkSession.sparkContext.applicationId

  30. val sparkSession: SparkSession

  31. def stagesDuration(): LinkedHashMap[Int, Long]

  32. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  33. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
