Class SparkRowSource

java.lang.Object
org.gorpipe.gor.model.GenomicIteratorBase
gorsat.process.ProcessSource
gorsat.process.SparkRowSource
All Implemented Interfaces:
AutoCloseable, Iterator<org.gorpipe.gor.model.Row>, org.gorpipe.gor.model.GenomicIterator, org.gorpipe.gor.model.RowSourceStats

public class SparkRowSource extends gorsat.process.ProcessSource
Created by sigmar on 12/02/16.
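Because the class implements both AutoCloseable and Iterator<Row>, a consumer would typically wrap it in try-with-resources and loop on hasNext()/next(). The sketch below shows that pattern only. DemoRowSource is a hypothetical stand-in (it iterates plain strings, not org.gorpipe.gor.model.Row), since constructing a real SparkRowSource requires a GorSession and a Spark runtime that this page does not describe.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class RowSourceUsageSketch {
    /** Hypothetical stand-in for SparkRowSource: iterates rows and records close(). */
    static class DemoRowSource implements Iterator<String>, AutoCloseable {
        private final Iterator<String> rows;
        boolean closed = false;

        DemoRowSource(List<String> rows) {
            this.rows = rows.iterator();
        }

        @Override public boolean hasNext() { return rows.hasNext(); }
        @Override public String next() { return rows.next(); }
        @Override public void close() { closed = true; }
    }

    /** Drains the source inside try-with-resources so close() runs even on failure. */
    static List<String> drain(DemoRowSource source) {
        List<String> out = new ArrayList<>();
        try (DemoRowSource s = source) {
            while (s.hasNext()) {
                out.add(s.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        DemoRowSource source = new DemoRowSource(List.of("chr1\t100\tA", "chr1\t200\tC"));
        List<String> rows = drain(source);
        System.out.println(rows.size() + " rows, closed=" + source.closed);
    }
}
```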
  • Field Summary

    Fields inherited from class org.gorpipe.gor.model.GenomicIteratorBase

    statsSenderAnnotation, statsSenderName
  • Constructor Summary

    Constructors
    Constructor
    Description
    SparkRowSource(String[] cmds, String type, boolean nor, org.gorpipe.gor.session.GorSession gpSession, String chr, int pos, int end, int bs)
     
    SparkRowSource(String sql, String profile, String parquet, String type, boolean nor, org.gorpipe.spark.GorSparkSession gpSession, String filter, String filterFile, String filterColumn, String splitFile, String chr, int pos, int end, boolean usestreaming, String jobId, boolean useCpp, String parts, int buckets, boolean tag, String ddl, String format, String option)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
    analyse(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataset, String gor)
     
    String
    checkNested(String cmd, org.gorpipe.gor.session.GorSession gpSession, String[] errorStr)
     
    static boolean
    checkNor(org.apache.spark.sql.types.StructField[] fields)
     
    org.apache.spark.sql.Dataset<? extends org.gorpipe.gor.model.Row>
    checkRowFormat(org.apache.spark.sql.Dataset<? extends org.apache.spark.sql.Row> dataset)
     
    void
    close()
     
    org.apache.spark.sql.Dataset<? extends org.apache.spark.sql.Row>
    getDataset()
     
    String
    getHeader()
     
    void
    gor()
     
    void
    gorpipe(gorsat.Commands.Analysis pipeStep, boolean gor)
     
    static org.apache.spark.sql.Dataset<org.gorpipe.gor.model.Row>
    gorpipe(org.apache.spark.sql.Dataset<? extends org.apache.spark.sql.Row> dataset, String gor)
     
    boolean
    hasNext()
     
    void
    init()
     
    boolean
    isBuffered()
     
    boolean
    isNor()
     
    org.gorpipe.gor.model.Row
    next()
     
    boolean
    pushdownCalc(String formula, String colName)
     
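    boolean
    pushdownCmd(String cmd)
     
    boolean
    pushdownFilter(String gorwhere)
     
    boolean
    pushdownGor(String gor)
     
    boolean
    pushdownSelect(String[] cols)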
     
    boolean
    pushdownTop(int limit)
     
    boolean
    pushdownWrite(String filename)
     
    static org.apache.spark.sql.types.StructType
    schemaFromRow(String[] header, org.gorpipe.gor.model.Row row)
     
    boolean
    seek(String seekChr, int seekPos)
     
    InputStream
    setRange(String seekChr, int startPos, int endPos)
     

    Methods inherited from class org.gorpipe.gor.model.GenomicIteratorBase

    clone, decStat, getBufferSize, getContext, getSourceName, getTypes, incStat, init, initStats, isSourceAlreadyInserted, setBufferSize, setContext, setHeader, setSourceAlreadyInserted, setSourceName, setTypes

    Methods inherited from class java.lang.Object

    equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.gorpipe.gor.model.GenomicIterator

    filter, getMonitor, moveToPosition, moveToPosition, seek, select, setRequestedRange

    Methods inherited from interface java.util.Iterator

    forEachRemaining, remove

    Methods inherited from interface org.gorpipe.gor.model.RowSourceStats

    getAvgBasesPerMilliSecond, getAvgBatchSize, getAvgRowsPerMilliSecond, getAvgSeekTimeMilliSecond, getCurrentBatchLoc, getCurrentBatchRow, getCurrentBatchSize
  • Constructor Details

  • Method Details

    • init

      public void init()
    • isNor

      public boolean isNor()
    • getDataset

      public org.apache.spark.sql.Dataset<? extends org.apache.spark.sql.Row> getDataset()
    • gorpipe

      public void gorpipe(gorsat.Commands.Analysis pipeStep, boolean gor)
    • gorpipe

      public static org.apache.spark.sql.Dataset<org.gorpipe.gor.model.Row> gorpipe(org.apache.spark.sql.Dataset<? extends org.apache.spark.sql.Row> dataset, String gor)
    • gor

      public void gor()
    • schemaFromRow

      public static org.apache.spark.sql.types.StructType schemaFromRow(String[] header, org.gorpipe.gor.model.Row row)
    • checkNested

      public String checkNested(String cmd, org.gorpipe.gor.session.GorSession gpSession, String[] errorStr)
    • checkNor

      public static boolean checkNor(org.apache.spark.sql.types.StructField[] fields)
    • hasNext

      public boolean hasNext()
    • next

      public org.gorpipe.gor.model.Row next()
    • seek

      public boolean seek(String seekChr, int seekPos)
    • close

      public void close()
    • setRange

      public InputStream setRange(String seekChr, int startPos, int endPos)
      Specified by:
      setRange in class gorsat.process.ProcessSource
    • getHeader

      public String getHeader()
      Specified by:
      getHeader in interface org.gorpipe.gor.model.GenomicIterator
      Overrides:
      getHeader in class org.gorpipe.gor.model.GenomicIteratorBase
    • isBuffered

      public boolean isBuffered()
    • checkRowFormat

      public org.apache.spark.sql.Dataset<? extends org.gorpipe.gor.model.Row> checkRowFormat(org.apache.spark.sql.Dataset<? extends org.apache.spark.sql.Row> dataset)
    • pushdownFilter

      public boolean pushdownFilter(String gorwhere)
    • pushdownCalc

      public boolean pushdownCalc(String formula, String colName)
    • pushdownSelect

      public boolean pushdownSelect(String[] cols)
    • pushdownWrite

      public boolean pushdownWrite(String filename)
    • pushdownCmd

      public boolean pushdownCmd(String cmd)
    • analyse

      public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> analyse(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataset, String gor)
    • pushdownGor

      public boolean pushdownGor(String gor)
    • pushdownTop

      public boolean pushdownTop(int limit)
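
The pushdown methods above all return boolean, but this page does not say what the value means. A common convention, assumed here and not confirmed by the source, is that true signals the source has absorbed the operation and false means the caller must still apply it downstream. The sketch below illustrates that convention with a hypothetical DemoSource; it is not SparkRowSource and does not use Spark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class PushdownSketch {
    /** Hypothetical source (NOT SparkRowSource) that may or may not accept a row limit. */
    static class DemoSource {
        private final List<Integer> data;
        private final boolean supportsTop;
        private int limit = Integer.MAX_VALUE;

        DemoSource(List<Integer> data, boolean supportsTop) {
            this.data = data;
            this.supportsTop = supportsTop;
        }

        /** Assumed contract: true means the source took over the limit. */
        boolean pushdownTop(int n) {
            if (!supportsTop) {
                return false;
            }
            limit = Math.min(limit, n);
            return true;
        }

        /** Returns the rows, honoring any limit that was pushed down. */
        List<Integer> rows() {
            return data.stream().limit(limit).collect(Collectors.toList());
        }
    }

    /** Tries the pushdown first; if it is refused, truncates the rows downstream. */
    static List<Integer> top(DemoSource source, int n) {
        if (source.pushdownTop(n)) {
            return source.rows();
        }
        List<Integer> all = source.rows();
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        System.out.println(top(new DemoSource(data, true), 2));
        System.out.println(top(new DemoSource(data, false), 2));
    }
}
```

Either path yields the same rows; the difference is only where the limit is applied, which is the point of checking the return value before falling back.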