case class StageMetrics(sparkSession: SparkSession) extends Product with Serializable
Stage Metrics: collects Spark metrics at stage granularity and provides aggregation and reporting functions for the end user.
Example:

  val stageMetrics = ch.cern.sparkmeasure.StageMetrics(spark)
  stageMetrics.runAndMeasure(
    spark.sql("select count(*) from range(1000) cross join range(1000) cross join range(1000)").show
  )
The tool uses Spark listeners as its data source, collecting metrics into a ListBuffer of a case class that encapsulates Spark task metrics. The ListBuffer may optionally be transformed into a DataFrame for ease of reporting and analysis.

Stage metrics are stored in memory and used to produce a report showing the aggregated resource consumption over the measured period.
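As an alternative to runAndMeasure, a measurement can be driven explicitly with begin() and end(); a minimal sketch using members listed below (assumes a SparkSession named spark is in scope, as in spark-shell):

  // Explicit instrumentation: snapshot, run the workload, snapshot again, report.
  val stageMetrics = ch.cern.sparkmeasure.StageMetrics(spark)

  stageMetrics.begin()   // record the starting snapshot
  spark.sql("select count(*) from range(1000) cross join range(1000)").show
  stageMetrics.end()     // record the ending snapshot

  stageMetrics.printReport()   // aggregated stage metrics for the measured period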
Instance Constructors
- new StageMetrics(sparkSession: SparkSession)
Value Members
- def aggregateStageMetrics(nameTempView: String = "PerfStageMetrics"): DataFrame
- def aggregateStageMetrics(): LinkedHashMap[String, Long]
- def aggregateStageMetricsJavaMap(): Map[String, Long]
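  A sketch of how the two aggregateStageMetrics flavours might be used after a measurement; the DataFrame overload aggregates over the temporary view of per-stage metrics (see createStageMetricsDF below), so the sketch builds that view first. Metric key names returned in the map are not spelled out on this page.

    // Reuses `spark` and `stageMetrics` from the sketch above.
    // Build the per-stage metrics DataFrame / temp view, then aggregate over it.
    stageMetrics.createStageMetricsDF("PerfStageMetrics")
    val aggDF = stageMetrics.aggregateStageMetrics("PerfStageMetrics")
    aggDF.show

    // The no-argument overload returns the aggregates as an in-memory map instead.
    val aggMap = stageMetrics.aggregateStageMetrics()
    aggMap.foreach { case (metric, value) => println(s"$metric -> $value") }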
- def begin(): Long
- var beginSnapshot: Long
- def createStageMetricsDF(nameTempView: String = "PerfStageMetrics"): DataFrame
- def end(): Long
- var endSnapshot: Long
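  Where ad-hoc analysis is preferred over the built-in reports, the collected metrics can be queried directly; a sketch (the column names used in the SQL are illustrative, not a guaranteed schema):

    // Materialize the collected metrics as a DataFrame; this also registers
    // a temporary view with the given name (default "PerfStageMetrics").
    val stageDF = stageMetrics.createStageMetricsDF("PerfStageMetrics")

    // Query the view with plain Spark SQL; column names here are illustrative.
    spark.sql("select stageId, stageDuration from PerfStageMetrics order by stageId").show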
- val executorMetricsNames: Array[String]
- val listenerStage: StageInfoRecorderListener
- lazy val logger: Logger
- def printMemoryReport(): Unit
- def printReport(): Unit
- def registerListener(spark: SparkSession, listener: StageInfoRecorderListener): Unit
- def removeListener(): Unit
- def report(): String
- def reportMemory(): String
- def reportUsingDataFrame(): String
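  The report* methods return the report as a String rather than printing it, which suits logging; a minimal sketch:

    // Capture the reports as Strings instead of writing to stdout.
    val reportText: String = stageMetrics.report()         // aggregated stage metrics
    val memoryText: String = stageMetrics.reportMemory()   // executor memory metrics
    // Route to any logging facility of your choice, or simply:
    println(reportText)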
- def runAndMeasure[T](f: ⇒ T): T
- def saveData(df: DataFrame, fileName: String, fileFormat: String = "json", saveMode: String = "default"): Unit
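  A sketch of persisting the collected metrics for offline analysis (the output path is a placeholder):

    // Save the per-stage metrics DataFrame; fileFormat is a Spark DataFrame
    // writer format ("json" is the default), saveMode as in DataFrameWriter.mode.
    val perfDF = stageMetrics.createStageMetricsDF()
    stageMetrics.saveData(perfDF, "/tmp/stage_metrics", "json")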
- def sendReportPrometheus(serverIPnPort: String, metricsJob: String, labelName: String = sparkSession.sparkContext.appName, labelValue: String = ...): Unit
  Send the metrics to Prometheus. serverIPnPort: the Prometheus Pushgateway address, in hostIP:port format; metricsJob: the job name; labelName: the metrics label name, defaulting to sparkSession.sparkContext.appName; labelValue: the metrics label value, defaulting to sparkSession.sparkContext.applicationId.
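  A sketch of pushing a measurement to a Prometheus Pushgateway; the gateway address and job name below are placeholders:

    // Push the aggregated metrics; labelName/labelValue fall back to the
    // Spark application name and application id when omitted.
    stageMetrics.sendReportPrometheus(
      "pushgateway.example.com:9091",   // hostIP:port of the Pushgateway
      "spark_stage_metrics"             // metricsJob
    )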
- val sparkSession: SparkSession
- val stageInfoVerbose: Boolean
- def stagesDuration(): LinkedHashMap[Int, Long]