Class

ch.cern.sparkmeasure

FlightRecorderStageMetrics

Related Doc: package sparkmeasure


class FlightRecorderStageMetrics extends StageInfoRecorderListener

Spark Measure package: a proof-of-concept tool for measuring Spark performance metrics. It uses Spark Listeners as the data source and collects metrics in a ListBuffer. The ListBuffer is then transformed into a DataFrame for analysis.

Stage Metrics: collects and aggregates metrics at the end of each stage.
Task Metrics: collects data at task granularity.

Use modes:
  Interactive mode: run from the REPL.
  Flight recorder mode: records data and saves it for later processing.

Supported languages: the tool is written in Scala, but it can be used from both Scala and Python.

Example usage for stage metrics:

    val stageMetrics = ch.cern.sparkmeasure.StageMetrics(spark)
    stageMetrics.runAndMeasure(spark.sql("select count(*) from range(1000) cross join range(1000) cross join range(1000)").show())

Example usage for task metrics:

    val taskMetrics = ch.cern.sparkmeasure.TaskMetrics(spark)
    spark.sql("select count(*) from range(1000) cross join range(1000) cross join range(1000)").show()
    val df = taskMetrics.createTaskMetricsDF()

To use flight recorder mode, add:

    --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderStageMetrics
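
The same listener can presumably also be registered programmatically, by setting spark.extraListeners before the session is created. The following is a minimal configuration sketch, not the author's documented usage: the application name, the local master, and the sample query are assumptions made for illustration, and it expects Spark and the sparkmeasure jar on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object FlightRecorderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("flight-recorder-example") // hypothetical application name
      .master("local[*]")                 // assumption: local run for illustration
      // Same setting as the --conf option: Spark instantiates the listener at startup.
      .config("spark.extraListeners", "ch.cern.sparkmeasure.FlightRecorderStageMetrics")
      .getOrCreate()

    // Any workload; stage metrics are recorded as each stage completes.
    spark.sql("select count(*) from range(1000) cross join range(1000)").show()

    // Stopping the session ends the application, which is when the listener's
    // onApplicationEnd callback is invoked.
    spark.stop()
  }
}
```

The setting has to be in place before the SparkContext starts, because listeners named in spark.extraListeners are only instantiated at startup.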

Created by Luca.Canali@cern.ch, March 2017

Linear Supertypes
StageInfoRecorderListener, SparkListener, SparkListenerInterface, AnyRef, Any

Instance Constructors

  1. new FlightRecorderStageMetrics(conf: SparkConf)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val StageIdtoJobGroup: HashMap[Int, String]

    Definition Classes
    StageInfoRecorderListener
  5. val StageIdtoJobId: HashMap[Int, Int]

    Definition Classes
    StageInfoRecorderListener
  6. val accumulablesMetricsData: ListBuffer[StageAccumulablesInfo]

    Definition Classes
    StageInfoRecorderListener
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  10. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  11. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. val fullPath: String

  13. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  14. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  15. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  16. lazy val logger: Logger

  17. val metricsFileName: String

  18. val metricsFormat: String

  19. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. final def notify(): Unit

    Definition Classes
    AnyRef
  21. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  22. def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit


    When the application stops, serializes the content of stageMetricsData into a file on the driver's filesystem.

    Definition Classes
    FlightRecorderStageMetrics → SparkListener → SparkListenerInterface
  23. def onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  24. def onBlockManagerAdded(blockManagerAdded: SparkListenerBlockManagerAdded): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  25. def onBlockManagerRemoved(blockManagerRemoved: SparkListenerBlockManagerRemoved): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  26. def onBlockUpdated(blockUpdated: SparkListenerBlockUpdated): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  27. def onEnvironmentUpdate(environmentUpdate: SparkListenerEnvironmentUpdate): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  28. def onExecutorAdded(executorAdded: SparkListenerExecutorAdded): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  29. def onExecutorBlacklisted(executorBlacklisted: SparkListenerExecutorBlacklisted): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  30. def onExecutorMetricsUpdate(executorMetricsUpdate: SparkListenerExecutorMetricsUpdate): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  31. def onExecutorRemoved(executorRemoved: SparkListenerExecutorRemoved): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  32. def onExecutorUnblacklisted(executorUnblacklisted: SparkListenerExecutorUnblacklisted): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  33. def onJobEnd(jobEnd: SparkListenerJobEnd): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  34. def onJobStart(jobStart: SparkListenerJobStart): Unit

    Definition Classes
    StageInfoRecorderListener → SparkListener → SparkListenerInterface
  35. def onNodeBlacklisted(nodeBlacklisted: SparkListenerNodeBlacklisted): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  36. def onNodeUnblacklisted(nodeUnblacklisted: SparkListenerNodeUnblacklisted): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  37. def onOtherEvent(event: SparkListenerEvent): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  38. def onSpeculativeTaskSubmitted(speculativeTask: SparkListenerSpeculativeTaskSubmitted): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  39. def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit


    This method fires at the end of each stage and collects the metrics, flattened, into the stageMetricsData ListBuffer. Note: all times are in ms. CPU time and shuffle write time are originally in nanoseconds, so the code divides them by 1e6.

    Definition Classes
    StageInfoRecorderListener → SparkListener → SparkListenerInterface
  40. def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  41. def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  42. def onTaskGettingResult(taskGettingResult: SparkListenerTaskGettingResult): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  43. def onTaskStart(taskStart: SparkListenerTaskStart): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  44. def onUnpersistRDD(unpersistRDD: SparkListenerUnpersistRDD): Unit

    Definition Classes
    SparkListener → SparkListenerInterface
  45. val stageMetricsData: ListBuffer[StageVals]

    Definition Classes
    StageInfoRecorderListener
  46. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  47. def toString(): String

    Definition Classes
    AnyRef → Any
  48. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  49. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
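
The unit handling described for onStageCompleted (times reported in ms, with nanosecond values divided by 1e6) can be illustrated with a small Spark-free sketch. The record types RawStageTimes and StageTimesMs, the helper names, and the use of truncating integer division are assumptions made for this example; they are not the StageVals type or the code used by this class.

```scala
import scala.collection.mutable.ListBuffer

object StageTimeUnits {
  // Hypothetical input: raw timing values in the units they are reported in.
  final case class RawStageTimes(
    stageId: Int,
    executorRunTimeMs: Long,  // already in milliseconds
    executorCpuTimeNs: Long,  // nanoseconds
    shuffleWriteTimeNs: Long  // nanoseconds
  )

  // Hypothetical flattened record in which every time is in milliseconds.
  final case class StageTimesMs(
    stageId: Int,
    executorRunTime: Long,
    executorCpuTime: Long,
    shuffleWriteTime: Long
  )

  private val NanosPerMilli = 1000000L

  // Integer division truncates toward zero (an assumption of this sketch).
  def nanosToMillis(nanos: Long): Long = nanos / NanosPerMilli

  // Convert the nanosecond fields so that the whole record is in milliseconds.
  def flatten(raw: RawStageTimes): StageTimesMs =
    StageTimesMs(
      raw.stageId,
      raw.executorRunTimeMs,
      nanosToMillis(raw.executorCpuTimeNs),
      nanosToMillis(raw.shuffleWriteTimeNs)
    )

  def main(args: Array[String]): Unit = {
    // Append one flattened record per completed stage, as a listener might.
    val buffer = ListBuffer.empty[StageTimesMs]
    buffer += flatten(RawStageTimes(0, 1200L, 950000000L, 2500000L))
    buffer.foreach(println)
  }
}
```

Converting at collection time keeps every time column in the same unit, so later aggregation needs no per-column scaling.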
