This method fires at the end of each task and collects its metrics, flattened, into the taskMetricsData ListBuffer. Note: all times are in ms. CPU time and shuffle write time are originally in nanoseconds, so the code divides them by 1e6.
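The unit handling described in this note can be sketched as follows. This is a minimal illustration, not the listener's actual code: the `TaskRow` case class, the `record` helper and its parameter names are hypothetical, and the sketch assumes integer division, which drops the fractional millisecond.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical, reduced record: the real listener collects many more fields.
case class TaskRow(
  taskId: Long,
  durationMs: Long,
  executorCpuTimeMs: Long,
  shuffleWriteTimeMs: Long
)

object TaskMetricsSketch {
  // 1e6 nanoseconds per millisecond, kept as a Long so the division stays integral.
  private val NanosPerMilli = 1000000L

  // Flattened per-task rows, appended once per finished task.
  val taskMetricsData: ListBuffer[TaskRow] = ListBuffer.empty

  // Hypothetical helper: duration is already in ms, the other two arrive in ns.
  def record(taskId: Long, durationMs: Long, cpuTimeNanos: Long, shuffleWriteNanos: Long): Unit =
    taskMetricsData += TaskRow(
      taskId,
      durationMs,
      cpuTimeNanos / NanosPerMilli,
      shuffleWriteNanos / NanosPerMilli
    )
}
```

Keeping every time column in the same unit means the saved rows can be summed or compared without per-column conversion.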
FlightRecorderTaskMetrics - Uses a Spark Listener to record task metrics data and save it to a file
Use: add the following to the spark-submit (or Spark Session) configuration:
  --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderTaskMetrics
Additional configuration parameters:
  --conf spark.sparkmeasure.outputFormat=<format>, valid values: java, json, json_to_hadoop; default: "json"
        Note: the json and java serialization formats write to the driver's local filesystem;
        json_to_hadoop writes JSON serialized metrics to HDFS or to a Hadoop-compliant filesystem, such as s3a.
  --conf spark.sparkmeasure.outputFilename=<output file>, default: "/tmp/taskMetrics_flightRecorder"
  --conf spark.sparkmeasure.printToStdout=<true|false>, default: false. Set to true to print JSON serialized metrics to stdout.
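As an illustration, the settings listed here can be gathered in one place and rendered as spark-submit arguments. This is only a sketch: the `FlightRecorderConf` object and its `toSubmitArgs` helper are hypothetical names, and the values shown are simply the documented defaults.

```scala
object FlightRecorderConf {
  // Keys and default values taken from the parameter list in this comment.
  val settings: Seq[(String, String)] = Seq(
    "spark.extraListeners" -> "ch.cern.sparkmeasure.FlightRecorderTaskMetrics",
    "spark.sparkmeasure.outputFormat" -> "json",
    "spark.sparkmeasure.outputFilename" -> "/tmp/taskMetrics_flightRecorder",
    "spark.sparkmeasure.printToStdout" -> "false"
  )

  // Hypothetical helper: renders each pair as a "--conf key=value" argument.
  def toSubmitArgs(conf: Seq[(String, String)]): String =
    conf.map { case (key, value) => s"--conf $key=$value" }.mkString(" ")
}
```

The same key/value pairs could instead be applied one by one with `SparkSession.builder().config(key, value)` when the session is created programmatically.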