public static class InputSampler.SplitSampler<K,V> extends Object implements InputSampler.Sampler<K,V>
| Constructor | Description |
|---|---|
| SplitSampler(int numSamples) | Create a SplitSampler sampling all splits. |
| SplitSampler(int numSamples, int maxSplitsSampled) | Create a new SplitSampler. |

| Modifier and Type | Method | Description |
|---|---|---|
| K[] | getSample(org.apache.hadoop.mapreduce.InputFormat<K,V> inf, org.apache.hadoop.mapreduce.Job job) | From each split sampled, take the first numSamples / numSplits records. |

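The sampling rule in the method summary ("take the first numSamples / numSplits records") can be illustrated with a minimal, Hadoop-free sketch. Everything here is an assumption made for illustration: `SplitSamplerSketch` and its `sample` methods are hypothetical names, each inner list stands in for the keys of one input split, and the sketch simply examines the leading splits. It is not the library's implementation, which reads records through the `InputFormat` and may choose splits or handle short splits differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the sampling rule; not the Hadoop class. */
final class SplitSamplerSketch {

    /**
     * Takes the first (numSamples / numSplits) keys from each examined split.
     * Each inner list plays the role of one input split's keys.
     */
    static <K> List<K> sample(List<List<K>> splits, int numSamples, int maxSplitsSampled) {
        List<K> samples = new ArrayList<>();
        // Examine at most maxSplitsSampled splits (assumption: the leading ones).
        int splitsToSample = Math.min(maxSplitsSampled, splits.size());
        if (splitsToSample <= 0) {
            return samples; // nothing to sample
        }
        // Integer division: any remainder is dropped.
        int samplesPerSplit = numSamples / splitsToSample;
        for (int i = 0; i < splitsToSample; i++) {
            List<K> split = splits.get(i);
            // A split shorter than the quota contributes only what it has.
            int take = Math.min(samplesPerSplit, split.size());
            samples.addAll(split.subList(0, take));
        }
        return samples;
    }

    /** Mirrors the one-argument constructor: every split is eligible. */
    static <K> List<K> sample(List<List<K>> splits, int numSamples) {
        return sample(splits, numSamples, Integer.MAX_VALUE);
    }
}
```

In this sketch, because the per-split quota is computed with integer division, asking for 5 samples across 2 splits yields 2 keys per split, so 4 keys in total rather than 5. Choosing `numSamples` as a multiple of the number of splits examined avoids that shortfall here.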
public SplitSampler(int numSamples)
Parameters:
numSamples - Total number of samples to obtain from all selected splits.

public SplitSampler(int numSamples, int maxSplitsSampled)
Parameters:
numSamples - Total number of samples to obtain from all selected splits.
maxSplitsSampled - The maximum number of splits to examine.

public K[] getSample(org.apache.hadoop.mapreduce.InputFormat<K,V> inf, org.apache.hadoop.mapreduce.Job job) throws IOException, InterruptedException
Specified by:
getSample in interface InputSampler.Sampler<K,V>
Throws:
IOException
InterruptedException

Copyright © 2016. All rights reserved.