Object

org.cert.netsa.mothra.tools

RollupDayMain

object RollupDayMain extends App with StrictLogging

Implements the RollupDay application.

Typical usage in a Spark environment:

spark-submit --class org.cert.netsa.mothra.tools.RollupDayMain mothra-tools.jar <s1> [<s2> <s3> ...]

where:

s1..sn: Directories to process, as Hadoop URIs

RollupDay reduces the number of data files in a Mothra repository. It may also be used to modify the files' compression.

RollupDay runs as a batch process, not as a daemon.

RollupDay makes a single recursive scan of the source directories <s1>, <s2>, ... for files whose names begin with "YYYYMMDD.HH." or "YYYYMMDD.HH-PTddH.", that is, names matching the regular expression ^\d{8}\.\d{2}(?:-PT\d\d?H)?\. (the number before the H may have one or two digits). Matching files that reside in the same directory are combined into a single new file in that directory containing the records from all of them, subject to the size limit described in the next paragraph.
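
As an illustration of the name matching described above, the following Scala sketch applies the quoted regular expression to file names and groups the matching paths by directory. The object RollupNameFilter and its methods are hypothetical names introduced for this example, and the string-based path handling is a simplification; it is not RollupDay's actual implementation.

```scala
import scala.util.matching.Regex

object RollupNameFilter {
  // The regular expression quoted in the text; it matches a file-name prefix.
  val RollupName: Regex = """^\d{8}\.\d{2}(?:-PT\d\d?H)?\.""".r

  /** True when the name starts with YYYYMMDD.HH. or YYYYMMDD.HH-PTddH. */
  def isCandidate(fileName: String): Boolean =
    RollupName.findPrefixOf(fileName).isDefined

  /** Groups candidate paths by parent directory (hypothetical helper;
    * paths are treated as plain '/'-separated strings). */
  def groupByDirectory(paths: Seq[String]): Map[String, Seq[String]] =
    paths
      .filter(p => isCandidate(p.substring(p.lastIndexOf('/') + 1)))
      .groupBy(p => p.substring(0, math.max(p.lastIndexOf('/'), 0)))
}
```

Each group returned by groupByDirectory corresponds to one set of files that would be joined together.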

By default, RollupDay joins all the files in a directory into a single file. The mothra.rollupday.maximumSize Java property may be used to limit the size of each output file; when compression is active, the limit applies to the compressed size. The limit is approximate: the size is checked only after data reaches the disk, and data arrives there in large blocks because of buffering by the Java stream code and the compression algorithm.
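
The effect of checking the size only after data has been written can be sketched in Scala as follows. This is a simplified model under stated assumptions, not RollupDay's code: the object SizeRollover and its split method are hypothetical, and each element of recordSizes stands for one block of bytes reaching the output file.

```scala
object SizeRollover {
  /** Distributes written block sizes over output files, starting a new file
    * once the current file's size exceeds `maximumSize` (None = no limit). */
  def split(recordSizes: Seq[Long], maximumSize: Option[Long]): Vector[Vector[Long]] = {
    val files = Vector.newBuilder[Vector[Long]]
    var current = Vector.empty[Long]
    var written = 0L
    for (size <- recordSizes) {
      current :+= size
      written += size
      // The check happens after the write, so a file may overshoot the limit.
      if (maximumSize.exists(written > _)) {
        files += current
        current = Vector.empty[Long]
        written = 0L
      }
    }
    if (current.nonEmpty) files += current
    files.result()
  }
}
```

With a limit of 100 and blocks of 40 bytes, the first file in this model closes at 120 bytes, which is why the limit is described as approximate.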

A single thread always performs the recursive directory scan. The number of threads that join files may be set with the mothra.rollupday.maxThreads Java property; if it is not specified, the default is 6.
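
A minimal Scala sketch of reading such a thread-count property is shown below. The object JoinerThreads is hypothetical, the fallback to the default on an invalid or non-positive value is an assumption made for this example, and the fixed-size thread pool is only one plausible way to run the joining tasks.

```scala
import java.util.concurrent.{ExecutorService, Executors}
import scala.util.Try

object JoinerThreads {
  // Default stated in the text above.
  val DefaultMaxThreads = 6

  /** Reads mothra.rollupday.maxThreads from `props`; falls back to the
    * default when the property is absent, unparsable, or not positive
    * (the fallback on bad input is an assumption of this sketch). */
  def maxThreads(props: Map[String, String]): Int =
    props.get("mothra.rollupday.maxThreads")
      .flatMap(v => Try(v.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(DefaultMaxThreads)

  /** Creates a fixed-size pool for the joining tasks (assumed design). */
  def newJoinerPool(props: Map[String, String]): ExecutorService =
    Executors.newFixedThreadPool(maxThreads(props))
}
```

In a running JVM, the current system properties could be passed as JoinerThreads.maxThreads(sys.props.toMap).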

By default, RollupDay does not compress the files it writes. (NOTE: It should support writing the output using the same compression as the input.) To select a compression codec, set the mothra.rollupday.compression Java property. Values typically supported by Hadoop include bzip2, gzip, lz4, lzo, lzop, snappy, and default. The empty string indicates no compression.
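
The rule that an empty string means no compression can be sketched in Scala as follows. The object CompressionSetting is hypothetical, and its default parameter stands in for DEFAULT_COMPRESSION, whose actual value is not shown on this page; the empty string is assumed here because the text says output is uncompressed by default.

```scala
object CompressionSetting {
  val Property = "mothra.rollupday.compression"

  /** Returns the requested codec name, or None when no compression is
    * requested. `default` stands in for DEFAULT_COMPRESSION; "" (no
    * compression) is an assumption of this sketch. */
  def codecName(props: Map[String, String], default: String = ""): Option[String] =
    Option(props.getOrElse(Property, default)).map(_.trim).filter(_.nonEmpty)
}
```

In a Hadoop environment, a name obtained this way could be resolved to a codec with CompressionCodecFactory.getCodecByName, though this page does not state whether RollupDay does so.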

Linear Supertypes
StrictLogging, App, DelayedInit, AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val DEFAULT_COMPRESSION: String

    The default compression codec to use for files written to HDFS. This may be modified by specifying the following property: mothra.rollupday.compression.

    Values typically supported by Hadoop include bzip2, gzip, lz4, lzo, lzop, snappy, and default. The empty string indicates no compression.

  5. val DEFAULT_MAX_THREADS: Int

    The default number of threads to run for joining files when the mothra.rollupday.maxThreads Java property is not set. (The scanning task always runs in its own thread.)

  6. def args: Array[String]

    Attributes
    protected
    Definition Classes
    App
    Annotations
    @deprecatedOverriding( "args should not be overridden" , "2.11.0" )
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. val compressCodec: Option[CompressionCodec]

    The compression codec used for files written to HDFS. It is set through the mothra.rollupday.compression property; if that property is not set, DEFAULT_COMPRESSION is used.

  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  12. val executionStart: Long

    Definition Classes
    App
  13. val fileSystem: FileSystem

  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  16. implicit val hadoopConf: Configuration

    The Hadoop configuration

  17. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  18. implicit val infoModel: InfoModel

    The information model

  19. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  20. val logTaskCountInterval: Int

    How often to print log messages regarding the number of tasks, in seconds.

  21. val logger: Logger

    Attributes
    protected
    Definition Classes
    StrictLogging
  22. def main(args: Array[String]): Unit

    Definition Classes
    App
    Annotations
    @deprecatedOverriding( "main should not be overridden" , "2.11.0" )
  23. val maxThreads: Int

    The maximum number of file-joining threads to start. It defaults to DEFAULT_MAX_THREADS and may be overridden by setting the mothra.rollupday.maxThreads Java property.

  24. val maximumSize: Option[Long]

    The (approximate) maximum size of a file to create. The default is no maximum. When a file's size exceeds this value, the file is closed and a new file is started. Typically a file's size will not exceed this value by more than 64 KiB, the maximum size of an IPFIX message.

  25. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  26. final def notify(): Unit

    Definition Classes
    AnyRef
  27. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  28. val positionalArgs: Array[String]

  29. val switches: Array[String]

  30. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  31. def toString(): String

    Definition Classes
    AnyRef → Any
  32. def usage(full: Boolean = false): Unit

  33. def version(): Unit

  34. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  36. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Deprecated Value Members

  1. def delayedInit(body: ⇒ Unit): Unit

    Definition Classes
    App → DelayedInit
    Annotations
    @deprecated
    Deprecated

    (Since version 2.11.0) The delayedInit mechanism will disappear.
