ml.shifu.guagua.yarn.util
Class InputSplitUtils

java.lang.Object
  extended by ml.shifu.guagua.yarn.util.InputSplitUtils

public final class InputSplitUtils
extends Object

Helper class to get input splits.


Method Summary
static String expandInputFolder(org.apache.hadoop.conf.Configuration conf)
          Expand a folder into all of its files so that every file in the folder is supported as input.
static int getBlockIndex(org.apache.hadoop.fs.BlockLocation[] blkLocations, long offset)
           
static List<List<org.apache.hadoop.mapreduce.InputSplit>> getCombineGuaguaSplits(List<org.apache.hadoop.mapreduce.InputSplit> oneInputSplits, long maxCombinedSplitSize)
           
static List<org.apache.hadoop.mapreduce.InputSplit> getFileSplits(org.apache.hadoop.conf.Configuration conf, long splitSize)
          Generate the list of files and make them into FileSplits.
static List<org.apache.hadoop.mapreduce.InputSplit> getFinalCombineGuaguaSplits(List<org.apache.hadoop.mapreduce.InputSplit> newSplits, long combineSize)
          Copied from the Pig implementation; this logic still needs to be checked.
static List<org.apache.hadoop.mapreduce.InputSplit> getGuaguaSplits(org.apache.hadoop.conf.Configuration conf, long splitSize)
          Generate the list of files and make them into FileSplits.
static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.conf.Configuration conf)
          List all the input files.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getFinalCombineGuaguaSplits

public static List<org.apache.hadoop.mapreduce.InputSplit> getFinalCombineGuaguaSplits(List<org.apache.hadoop.mapreduce.InputSplit> newSplits,
                                                                                       long combineSize)
                                                                                throws IOException
Copied from the Pig implementation; this logic still needs to be checked.

Throws:
IOException

listStatus

public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.conf.Configuration conf)
                                                    throws IOException
List all the input files. It would be better to follow FileInputFormat#listStatus.

Throws:
IOException

expandInputFolder

public static String expandInputFolder(org.apache.hadoop.conf.Configuration conf)
                                throws IOException
Expand a folder into all of its files so that every file in the folder is supported as input.

Throws:
IOException
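
The idea of expanding a folder into its files can be illustrated with a minimal sketch on the local file system. This is not the code of expandInputFolder: ExpandFolderSketch and expand are hypothetical names, java.nio.file stands in for the Hadoop FileSystem API, and the comma-separated return format is an assumption that this page does not confirm.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of ml.shifu.guagua.yarn.util.
final class ExpandFolderSketch {

    /**
     * If input is a directory, returns its regular files (non-recursive)
     * as a sorted, comma-separated string. Otherwise returns the path as-is.
     * The comma-separated format is an assumption made for this sketch.
     */
    static String expand(Path input) throws IOException {
        if (!Files.isDirectory(input)) {
            return input.toString();
        }
        // Files.list must be closed, hence try-with-resources.
        try (Stream<Path> children = Files.list(input)) {
            return children
                    .filter(Files::isRegularFile)
                    .map(Path::toString)
                    .sorted()
                    .collect(Collectors.joining(","));
        }
    }

    public static void main(String[] args) throws IOException {
        // Expands the current working directory, or the first argument if given.
        Path input = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(expand(input));
    }
}
```

Sorting the paths keeps the result deterministic, since directory listing order is not guaranteed.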

getFileSplits

public static List<org.apache.hadoop.mapreduce.InputSplit> getFileSplits(org.apache.hadoop.conf.Configuration conf,
                                                                         long splitSize)
                                                                  throws IOException
Generate the list of files and make them into FileSplits.

Throws:
IOException
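
A minimal sketch of how a file of a given length could be cut into ranges no longer than splitSize follows. It is not the code of getFileSplits: FileSplitSketch, Range and split are hypothetical names, plain byte ranges stand in for Hadoop FileSplit objects, and concerns such as block locations or unsplittable files are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of ml.shifu.guagua.yarn.util.
final class FileSplitSketch {

    /** A byte range inside one file; stands in for a Hadoop FileSplit. */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts [0, fileLength) into consecutive ranges of at most splitSize bytes. */
    static List<Range> split(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            // The last range may be shorter than splitSize.
            ranges.add(new Range(start, Math.min(splitSize, fileLength - start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : split(250L, 100L)) {
            System.out.println("start=" + r.start + " length=" + r.length);
        }
    }
}
```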

getGuaguaSplits

public static List<org.apache.hadoop.mapreduce.InputSplit> getGuaguaSplits(org.apache.hadoop.conf.Configuration conf,
                                                                           long splitSize)
                                                                    throws IOException
Generate the list of files and make them into FileSplits.

Throws:
IOException

getBlockIndex

public static int getBlockIndex(org.apache.hadoop.fs.BlockLocation[] blkLocations,
                                long offset)
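
This page gives no description for getBlockIndex. The sketch below shows the usual meaning of such a lookup, modeled on Hadoop's FileInputFormat#getBlockIndex: find the block whose byte range contains the offset. Whether InputSplitUtils behaves identically is an assumption; BlockIndexSketch and Block are hypothetical names, and Block stands in for org.apache.hadoop.fs.BlockLocation.

```java
// Hypothetical illustration only; not part of ml.shifu.guagua.yarn.util.
final class BlockIndexSketch {

    /** Stand-in for BlockLocation: only the offset and length are modeled. */
    static final class Block {
        final long offset;
        final long length;

        Block(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Returns the index of the block whose range [offset, offset + length) contains the given offset. */
    static int getBlockIndex(Block[] blocks, long offset) {
        for (int i = 0; i < blocks.length; i++) {
            long start = blocks[i].offset;
            if (start <= offset && offset < start + blocks[i].length) {
                return i;
            }
        }
        throw new IllegalArgumentException("Offset " + offset + " is outside of the file");
    }

    public static void main(String[] args) {
        Block[] blocks = { new Block(0L, 128L), new Block(128L, 128L), new Block(256L, 64L) };
        System.out.println(getBlockIndex(blocks, 130L));
    }
}
```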

getCombineGuaguaSplits

public static List<List<org.apache.hadoop.mapreduce.InputSplit>> getCombineGuaguaSplits(List<org.apache.hadoop.mapreduce.InputSplit> oneInputSplits,
                                                                                        long maxCombinedSplitSize)
                                                                                 throws IOException,
                                                                                        InterruptedException
Throws:
IOException
InterruptedException
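
The nested return type suggests that each inner list holds the splits that are combined together. A minimal sketch of one way to form such groups, assuming a simple greedy, order-preserving strategy, follows. It is not the Guagua or Pig algorithm, and it ignores data locality entirely; CombineSplitsSketch and combine are hypothetical names, and each split is represented only by its length in bytes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of ml.shifu.guagua.yarn.util.
final class CombineSplitsSketch {

    /**
     * Greedily packs consecutive split lengths into groups whose total size
     * does not exceed maxCombinedSize. A single split larger than the limit
     * is placed in a group of its own.
     */
    static List<List<Long>> combine(List<Long> splitLengths, long maxCombinedSize) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0L;
        for (long length : splitLengths) {
            // Close the current group if adding this split would exceed the limit.
            if (!current.isEmpty() && currentSize + length > maxCombinedSize) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0L;
            }
            current.add(length);
            currentSize += length;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(combine(Arrays.asList(40L, 50L, 30L, 200L, 10L), 100L));
    }
}
```

A locality-aware implementation would also need to consider which nodes host each split before grouping them, which this sketch leaves out.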


Copyright © 2014. All Rights Reserved.