Class Pipeline


  • public class Pipeline
    extends Object
    • Field Detail

      • RATE_LIMITED_LOG

        public static final com.swrve.ratelimitedlogger.RateLimitedLog RATE_LIMITED_LOG
    • Constructor Detail

      • Pipeline

        public Pipeline()
    • Method Detail

      • create

        public static org.apache.beam.sdk.Pipeline create(NephronOptions options)
        Creates a new pipeline from the given set of runtime options.
        Parameters:
        options - runtime options
        Returns:
        a new pipeline
      • create

        public static org.apache.beam.sdk.Pipeline create(NephronOptions options,
                                                          org.apache.beam.sdk.io.kafka.TimestampPolicyFactory<byte[],org.opennms.netmgt.flows.persistence.model.FlowDocument> timestampPolicyFactory)
        Creates a new pipeline from the given set of runtime options using the given TimestampPolicyFactory.
      • accumulateSummariesIfNecessary

        public static org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>>
            accumulateSummariesIfNecessary(NephronOptions options,
                                           org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> flowSummaries)
      • accumulateFlowSummaries

        public static org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>>
            accumulateFlowSummaries(org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> input,
                                    org.joda.time.Duration accumulationDelay)
      • attachWriteToElastic

        public static void attachWriteToElastic(NephronOptions options,
                                                org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> flowSummaries)
      • attachWriteToKafka

        public static void attachWriteToKafka(NephronOptions options,
                                              org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> flowSummaries)
      • attachWriteToCortex

        public static void attachWriteToCortex(NephronOptions options,
                                               org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> flowSummaries)
      • attachWriteToCortex

        public static void attachWriteToCortex(NephronOptions options,
                                               org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> flowSummaries,
                                               Consumer<org.opennms.nephron.cortex.CortexIo.Write<CompoundKey,Aggregate>> additionalConfig)
        Parameters:
        additionalConfig - Allows additional configuration of the Cortex writer; the benchmark application uses it to add a label that differentiates benchmark runs.
      • registerCoders

        public static void registerCoders(org.apache.beam.sdk.Pipeline p)
      • getKafkaInputTimestampPolicyFactory

        public static org.apache.beam.sdk.io.kafka.TimestampPolicyFactory<byte[],org.opennms.netmgt.flows.persistence.model.FlowDocument> getKafkaInputTimestampPolicyFactory(org.joda.time.Duration maxDelay)
      • attachTimestamps

        public static org.apache.beam.sdk.transforms.ParDo.SingleOutput<org.opennms.netmgt.flows.persistence.model.FlowDocument,org.opennms.netmgt.flows.persistence.model.FlowDocument>
            attachTimestamps(org.joda.time.Duration fixedWindowSize,
                             org.joda.time.Duration maxFlowDuration)
        Dispatches a FlowDocument to all of the windows that overlap with the flow range.
        Returns:
        a ParDo transform that performs this dispatch
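        The dispatch described above can be illustrated with plain timestamps. The following is a minimal sketch, not the Nephron implementation: it assumes fixed windows aligned to the epoch and a flow range given as inclusive start and end times in milliseconds. The names WindowDispatchSketch and overlappingWindowStarts are hypothetical, and the effect of maxFlowDuration is not modeled because it is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Nephron.
final class WindowDispatchSketch {

    /**
     * Returns the start timestamps (in ms) of every fixed window that overlaps
     * the inclusive flow range [flowStartMs, flowEndMs].
     * Assumption: windows are aligned to the epoch and have length windowSizeMs.
     */
    static List<Long> overlappingWindowStarts(long flowStartMs, long flowEndMs, long windowSizeMs) {
        if (windowSizeMs <= 0) {
            throw new IllegalArgumentException("windowSizeMs must be positive");
        }
        if (flowEndMs < flowStartMs) {
            throw new IllegalArgumentException("flow must not end before it starts");
        }
        List<Long> starts = new ArrayList<>();
        // Start of the window that contains the first millisecond of the flow.
        long first = Math.floorDiv(flowStartMs, windowSizeMs) * windowSizeMs;
        // Walk forward one window at a time until the window begins after the flow ends.
        for (long start = first; start <= flowEndMs; start += windowSizeMs) {
            starts.add(start);
        }
        return starts;
    }
}
```

        Under these assumptions, a flow that lasts from 65000 ms to 185000 ms with a 60000 ms window size is assigned to the three windows starting at 60000, 120000, and 180000 ms, while a flow that lies entirely inside one window is assigned to that window only.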
      • toFlowSummary

        public static FlowSummary toFlowSummary(org.apache.beam.sdk.values.KV<CompoundKey,Aggregate> fsd,
                                                org.apache.beam.sdk.transforms.windowing.IntervalWindow window)
      • toWindow

        public static org.apache.beam.sdk.transforms.windowing.Window<org.opennms.netmgt.flows.persistence.model.FlowDocument>
            toWindow(org.joda.time.Duration fixedWindowSize,
                     org.joda.time.Duration earlyProcessingDelay,
                     org.joda.time.Duration lateProcessingDelay,
                     org.joda.time.Duration allowedLateness)
      • bytesInWindow

        public static long bytesInWindow(long deltaSwitched,
                                         long lastSwitchedInclusive,
                                         double multipliedNumBytes,
                                         long windowStart,
                                         long windowEndInclusive)
      • aggregatize

        public static Aggregate aggregatize(org.apache.beam.sdk.transforms.windowing.IntervalWindow window,
                                            org.opennms.netmgt.flows.persistence.model.FlowDocument flow,
                                            String hostname,
                                            String hostname2)
      • aggregateParentTotal

        public static Pipeline.TotalAndSummary aggregateParentTotal(String transformPrefix,
                                                                    org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> child)
        Aggregates over parent keys.

        The resulting collection is a "total" collection, i.e. it is not capped by a topK transform.

        Parameters:
        child - A total collection that is keyed by subkeys.
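        To illustrate aggregation over parent keys, the sketch below sums the byte counts of child keys under their parent. It is a simplified model, not the Beam transform: keys are modeled as lists of strings whose parent is obtained by dropping the last element, and aggregates are modeled as plain byte counts. ParentTotalSketch and sumByParentKey are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Nephron.
final class ParentTotalSketch {

    /**
     * Sums the values of all child keys that share the same parent key.
     * Assumption: a key is a list of dimensions and its parent is the same
     * list without the last dimension.
     */
    static Map<List<String>, Long> sumByParentKey(Map<List<String>, Long> child) {
        Map<List<String>, Long> parentTotals = new LinkedHashMap<>();
        for (Map.Entry<List<String>, Long> entry : child.entrySet()) {
            List<String> key = entry.getKey();
            if (key.isEmpty()) {
                throw new IllegalArgumentException("a key without dimensions has no parent");
            }
            // Drop the last dimension to obtain the parent key.
            List<String> parent = List.copyOf(key.subList(0, key.size() - 1));
            // No entry is discarded, so the result stays a "total" collection.
            parentTotals.merge(parent, entry.getValue(), Long::sum);
        }
        return parentTotals;
    }
}
```

        Because every child entry contributes to its parent and nothing is cut off, the sums in the result add up to the sum of the input, which is what distinguishes a total collection from one capped by a topK transform.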
      • aggregateSumsAndTopKs

        public static Pipeline.SumsAndTopKs aggregateSumsAndTopKs(String transformPrefix,
                                                                  org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> groupedByKeyWithTos,
                                                                  CompoundKeyType typeWithoutTos,
                                                                  int k,
                                                                  org.apache.beam.sdk.transforms.SerializableFunction<CompoundKey,Boolean> includeKeyInTopK)
        Aggregates the sums and topKs for the input collection and for a projection of the input collection in which the tos (i.e. dscp) key dimension is ignored.
        Parameters:
        groupedByKeyWithTos - a total collection that is a multimap (i.e. the collection may contain several entries with the same CompoundKey but different values)
        typeWithoutTos - a type that considers the same dimensions as the entries in the input collection but ignores the dscp field
        k - count for the topK calculation
        includeKeyInTopK - filters the entries that are considered in topK calculations
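        The projection that ignores the tos (dscp) dimension can be pictured with plain collections. This is a rough sketch under simplifying assumptions, not the Nephron transform: keys carry only an application name and a dscp value, aggregates are plain byte counts, and the topK step is left out. TosProjectionSketch, KeyWithTos, and KeyedBytes are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Nephron.
final class TosProjectionSketch {

    /** Hypothetical stand-in for a key that includes the dscp dimension. */
    record KeyWithTos(String application, int dscp) {}

    /** Hypothetical stand-in for one multimap entry: a key and a byte count. */
    record KeyedBytes(KeyWithTos key, long bytes) {}

    /** Sums the entries per full key, keeping the dscp dimension. */
    static Map<KeyWithTos, Long> sumWithTos(List<KeyedBytes> entries) {
        Map<KeyWithTos, Long> sums = new LinkedHashMap<>();
        for (KeyedBytes entry : entries) {
            sums.merge(entry.key(), entry.bytes(), Long::sum);
        }
        return sums;
    }

    /** Sums the entries over the projection that ignores the dscp dimension. */
    static Map<String, Long> sumWithoutTos(List<KeyedBytes> entries) {
        Map<String, Long> sums = new LinkedHashMap<>();
        for (KeyedBytes entry : entries) {
            // Entries that differ only in dscp collapse onto the same projected key.
            sums.merge(entry.key().application(), entry.bytes(), Long::sum);
        }
        return sums;
    }
}
```

        In this model the same multimap input feeds both sums, so the totals with and without the dscp dimension stay consistent with each other.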
      • aggregateSumAndTopK

        public static Pipeline.SumAndTopK aggregateSumAndTopK(String transformPrefix,
                                                              org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<CompoundKey,Aggregate>> groupedByKey,
                                                              int k,
                                                              org.apache.beam.sdk.transforms.SerializableFunction<CompoundKey,Boolean> includeKeyInTopK)
        Reduces the input multimap collection into a collection with unique keys and summed aggregates, and calculates the topK entries of these sums when selected over their parent keys.
        Parameters:
        groupedByKey - a multimap that may contain several entries with the same key but different values
        k - count for the topK calculation
        includeKeyInTopK - filters the entries that are considered in topK calculations
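        The two steps described for aggregateSumAndTopK, summing a multimap into unique keys and then selecting the topK entries per parent key, might look as follows on plain collections. This is a sketch under stated assumptions, not the Nephron transform: keys consist of a parent and a child name, aggregates are plain byte counts, and entries are ranked by descending byte count. SumAndTopKSketch, Key, KeyedBytes, sumByKey, and topKPerParent are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper, for illustration only; not part of Nephron.
final class SumAndTopKSketch {

    /** Hypothetical stand-in for a key with a parent dimension and a child dimension. */
    record Key(String parent, String child) {}

    /** Hypothetical stand-in for one multimap entry: a key and a byte count. */
    record KeyedBytes(Key key, long bytes) {}

    /** Step 1: reduce the multimap to unique keys with summed values. */
    static Map<Key, Long> sumByKey(List<KeyedBytes> multimap) {
        Map<Key, Long> sums = new LinkedHashMap<>();
        for (KeyedBytes entry : multimap) {
            sums.merge(entry.key(), entry.bytes(), Long::sum);
        }
        return sums;
    }

    /** Step 2: for each parent, keep the k included keys with the largest sums. */
    static Map<String, List<Key>> topKPerParent(Map<Key, Long> sums, int k,
                                                Predicate<Key> includeKeyInTopK) {
        // Group the included entries by their parent key.
        Map<String, List<Map.Entry<Key, Long>>> byParent = new LinkedHashMap<>();
        for (Map.Entry<Key, Long> entry : sums.entrySet()) {
            if (includeKeyInTopK.test(entry.getKey())) {
                byParent.computeIfAbsent(entry.getKey().parent(), p -> new ArrayList<>()).add(entry);
            }
        }
        // Rank each group by descending sum and cut it off after k entries.
        Map<String, List<Key>> topK = new LinkedHashMap<>();
        for (Map.Entry<String, List<Map.Entry<Key, Long>>> group : byParent.entrySet()) {
            List<Key> top = group.getValue().stream()
                    .sorted(Map.Entry.<Key, Long>comparingByValue().reversed())
                    .limit(k)
                    .map(Map.Entry::getKey)
                    .toList();
            topK.put(group.getKey(), top);
        }
        return topK;
    }
}
```

        In this sketch the filter only affects the topK selection; the sums returned by sumByKey still cover every key, mirroring the separation between the summed collection and the topK result.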