public class IndexShuffle extends java.lang.Object implements Shuffle
An implementation of Shuffle that delegates the core calculation to a specified function.
For the specified Columns, this shuffle creates an int array
holding the size of each column's dictionary. For example, if the specified Columns contains 3 Column
instances, each with a dictionary of 3 elements, the generated array is [3, 3, 3].
The array is passed to the specified "index calculator" function of type (int[] -> int[][]).
The two-dimensional int array returned by the function is then transformed into a Rows instance:
each inner array (int[]) represents a single Row, and each int in it is the index
of an element in the dictionary of the corresponding column. Example (pseudo code):
// there are 3 columns with 3 elements each, like this:
col1 = ["qwe", "rty", "qaz"]
col2 = ['a', 'b', 'c']
col3 = [true, false, null]
// index shuffle creates a "size array":
int[] sizes = [3, 3, 3]
// then the function produces an "index map", where each index
// is between 0 (inclusive) and the corresponding "size array" element (exclusive):
int[][] indexes = [
[0, 0, 0],
[1, 1, 1],
[2, 2, 2]
]
// The Columns object is indexed, so each input column can be identified by its index.
// The index shuffle takes the resulting "index map" and transforms it into rows, like this:
row = [col1, col2, col3]
// Each element of an int[] row corresponds to a specific column
// and holds the index of an element in that column's dictionary:
row1 = ["qwe", 'a', true]
row2 = ["rty", 'b', false]
row3 = ["qaz", 'c', null]
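The "index calculator" used in the example above could be sketched in Java roughly as follows. This is only an illustration: the class name `DiagonalIndexCalculator` and the constant `DIAGONAL` are hypothetical, and the choice to emit -1 for columns that are too short is an assumption, not documented behavior of this library.

```java
import java.util.Arrays;
import java.util.function.Function;

public class DiagonalIndexCalculator {

    // Hypothetical calculator: row r selects element r from every column.
    // If a column has fewer than r + 1 elements, -1 is emitted for it (assumption).
    public static final Function<int[], int[][]> DIAGONAL = sizes -> {
        int rowCount = Arrays.stream(sizes).max().orElse(0);
        int[][] indexes = new int[rowCount][sizes.length];
        for (int r = 0; r < rowCount; r++) {
            for (int c = 0; c < sizes.length; c++) {
                indexes[r][c] = r < sizes[c] ? r : -1;
            }
        }
        return indexes;
    };

    public static void main(String[] args) {
        int[][] indexes = DIAGONAL.apply(new int[] {3, 3, 3});
        System.out.println(Arrays.deepToString(indexes));
        // prints [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
    }
}
```

Given the constructor signature listed on this page, such a function would presumably be supplied as `new IndexShuffle(DiagonalIndexCalculator.DIAGONAL)`.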
Note: the "index map" returned by the delegate function might contain negative indexes.
In that case the value is ignored, and the resulting Row instance will not contain a RowEntry
with the corresponding ColumnKey. This might happen when a column in the input data is empty.

| Constructor and Description |
|---|
| IndexShuffle(java.util.function.Function<int[],int[][]> indexCalculator) |

| Modifier and Type | Method and Description |
|---|---|
| Rows | apply(Columns cols) |
| java.util.function.Function<int[],int[][]> | getIndexCalculator() |
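The handling of negative indexes described in the note above can be illustrated with a minimal sketch. It does not use the library's Rows, Row, RowEntry or ColumnKey types, whose APIs are not shown here. Instead, plain Java collections stand in for them, and the class `IndexMapResolver` and its `resolve` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexMapResolver {

    // Hypothetical stand-in for the Rows transformation:
    // each row maps a column name to the value selected from that column's dictionary.
    public static List<Map<String, Object>> resolve(
            List<String> names, List<List<Object>> dictionaries, int[][] indexes) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int[] indexRow : indexes) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (int c = 0; c < indexRow.length; c++) {
                int index = indexRow[c];
                if (index < 0) {
                    continue; // negative index: the row gets no entry for this column
                }
                row.put(names.get(c), dictionaries.get(c).get(index));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> names = List.of("col1", "col2");
        // col2 is empty, so the index map carries -1 for it
        List<List<Object>> dictionaries =
                List.of(List.<Object>of("qwe", "rty"), List.<Object>of());
        int[][] indexes = {{0, -1}, {1, -1}};
        System.out.println(resolve(names, dictionaries, indexes));
        // prints [{col1=qwe}, {col1=rty}]
    }
}
```

In this sketch, a row simply omits the entry for any column whose index is negative, which mirrors the documented behavior of leaving out the RowEntry for that ColumnKey.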