Class RandomSelector<T>

java.lang.Object
org.plumelib.util.RandomSelector<T>
Type Parameters:
T - the type of elements being selected over

public class RandomSelector<T> extends Object
RandomSelector selects k elements uniformly at random from an arbitrary iterator, using O(k) space. A naive algorithm would use O(n) space. For example, selecting 1 element from a FileStream containing 1000 elements will take O(1) space. The class takes as input the number k during initialization and then can accept() any number of Objects in the future. At any point in time, getValues() will either return k randomly selected elements from the elements previous accepted or if accept() was called fewer than k times, will return all elements previously accepted.

The random selection is independent between every constructed instance of RandomSelector objects, but for the same instance, multiple calls to getValues() are not independent. Making two calls to consecutive getValues() without an accept() in between will return two new Lists containing the same elements.

A second mode allows for a fixed probability of randomly keeping each item as opposed to a fixed number of samples.

SPECFIELDS:
values : Set : The values chosen based on the Objects observed
observed : int : The number of Objects observed
numElts : int : The number of elements to choose ('k' above)
keepProbability: double : The percentage of elements to keep
selector_mode : {FIXED,PERCENT} : either fixed amount of samples or fixed percent.

Example use:


 // randomly selects 100 lines of text from a file
 List selectedLines = null;
 try {
    BufferedReader br = new BufferedReader(new FileReader("myfile.txt"));
    RandomSelector selector = new RandomSelector(100);
    while (br.ready()) {
      selector.accept(br.readLine());
    }
    selectedLines = selector.getValues();
  }
  catch (IOException e2) { e2.printStackTrace(); }
 
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private boolean
    If true, numElts and observers are -1.
    private Random
    The Random instance to use (for reproducibility).
    private double
    The percentage of elements to keep.
    private int
    The number of objects to choose, or -1.
    private int
    The number of objects observed.
    private ArrayList<T>
    The values chosen.
  • Constructor Summary

    Constructors
    Constructor
    Description
    RandomSelector(double keepProbability, Random r)
    Creates a new RandomSelector.
    RandomSelector(int numElts)
    Creates a new RandomSelector.
    RandomSelector(int numElts, Random r)
    Creates a new RandomSelector.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    accept(T next)
    When in fixed sample mode, increments the number of observed elements i by 1, then with probability k / i, the Object 'next' will be added to the currently selected values 'values' where k is equal to 'numElts'.
    Returns values, modifies none.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • coinTossMode

      private boolean coinTossMode
      If true, numElts and observers are -1. If false, keepProbability = -1.
    • keepProbability

      private double keepProbability
      The percentage of elements to keep.
    • numElts

      private int numElts
      The number of objects to choose, or -1.
    • observed

      private int observed
      The number of objects observed.
    • generator

      private Random generator
      The Random instance to use (for reproducibility).
    • values

      private ArrayList<T> values
      The values chosen.
  • Constructor Details

    • RandomSelector

      public RandomSelector(int numElts)
      Creates a new RandomSelector.
      Parameters:
      numElts - the number of elements intended to be selected from the input elements
    • RandomSelector

      public RandomSelector(int numElts, Random r)
      Creates a new RandomSelector.
      Parameters:
      numElts - the number of elements intended to be selected from the input elements
      r - the seed to give for random number generation
    • RandomSelector

      public RandomSelector(double keepProbability, Random r)
      Creates a new RandomSelector.
      Parameters:
      keepProbability - the probability that each element is selected from the oncoming Iteration
      r - the seed to give for random number generation
  • Method Details

    • accept

      public void accept(T next)
      When in fixed sample mode, increments the number of observed elements i by 1, then with probability k / i, the Object 'next' will be added to the currently selected values 'values' where k is equal to 'numElts'. If the size of values exceeds numElts, then one of the existing elements in values will be removed at random.

      When in probability mode, adds next to 'values' with probability equal to 'keepProbability'.

      Parameters:
      next - value to be added to this selector
    • getValues

      public List<T> getValues()
      Returns values, modifies none.
      Returns:
      values