public class DataFrame extends Object implements DataContainer<DataFrameHeader,DataRow>
| Modifier and Type | Field and Description |
|---|---|
| `static String` | `PRIMARY_INDEX_NAME` |
| Constructor and Description |
|---|
| `DataFrame()` |
| `DataFrame(DataFrameHeader header, Collection<DataRow> rows)`: Creates a new data frame using a data frame header and a collection of data rows. |
| Modifier and Type | Method and Description |
|---|---|
| `<T extends Comparable<T>,C extends DataFrameColumn<T,C>> DataFrame` | `addColumn(Class<C> type, String name, ColumnAppender<T> appender)`: Creates and adds a column to this data frame based on a provided column class. |
| `<T extends Comparable<T>> DataFrame` | `addColumn(Class<T> type, String name)`: Creates a column for a specified column value type using the default ColumnTypeMap. |
| `<T extends Comparable<T>> DataFrame` | `addColumn(Class<T> type, String name, ColumnTypeMap columnTypeMap)`: Creates a column for a specified column value type using the provided ColumnTypeMap. |
| `<T extends Comparable<T>,C extends DataFrameColumn<T,C>> DataFrame` | `addColumn(Class<T> type, String name, ColumnTypeMap columnTypeMap, ColumnAppender<T> appender)`: Creates and adds a new column based on a specified column value type and a ColumnTypeMap. |
| `DataFrame` | `addColumn(DataFrameColumn column)`: Adds a column to the data frame. |
| `DataFrame` | `addColumns(Collection<DataFrameColumn> columns)`: Adds a collection of columns to this data frame. |
| `DataFrame` | `addColumns(DataFrameColumn... columns)`: Adds an array of columns to this data frame. |
| `DataFrame` | `addIndex(String indexName, DataFrameColumn... columns)`: Adds a new index based on one or multiple index columns. |
| `DataFrame` | `addIndex(String indexName, String... columnNames)`: Adds a new index based on one or multiple index columns. |
| `DataFrame` | `append(Comparable... values)`: Appends a new row based on Comparable values. |
| `DataFrame` | `append(DataRow row)`: Appends a new data row. |
| `DataFrame` | `concat(Collection<DataFrame> dataFrames)`: Appends the rows from a collection of data frames to this data frame. |
| `DataFrame` | `concat(DataFrame... dataFrames)`: Appends the rows from an array of data frames to this data frame. |
| `DataFrame` | `concat(DataFrame other)`: Concatenates two data frames. |
| `boolean` | `containsColumn(DataFrameColumn column)`: Returns true if this data frame contains the input column. |
| `DataFrame` | `copy()`: Returns a copy of this data frame. |
| `DataFrame` | `createSubset(int from, int to)`: Creates a new data frame from a subset of this data frame. |
| `boolean` | `equals(Object o)` |
| `DataFrame` | `filter(FilterPredicate predicate)`: Filters data rows that are not valid according to an input predicate. |
| `DataFrame` | `filter(String predicateString)`: Filters data rows that are not valid according to an input predicate. |
| `DataFrame` | `find(FilterPredicate predicate)`: Deprecated. Use `select(FilterPredicate)` instead. |
| `DataFrame` | `find(String colName, Comparable value)`: Deprecated. Use `select(String,Comparable)` instead. |
| `List<DataRow>` | `findByIndex(String name, Comparable... values)`: Finds matching data rows using an index and the corresponding index values. |
| `DataRow` | `findByPrimaryKey(Comparable... keyValues)`: Finds a data row using the primary key. |
| `DataRow` | `findFirst(FilterPredicate predicate)`: Deprecated. Use `selectFirst(FilterPredicate)` instead. |
| `DataRow` | `findFirst(String colName, Comparable value)`: Deprecated. Use `selectFirst(String,Comparable)` instead. |
| `DataRow` | `findFirstByIndex(String name, Comparable... values)`: Finds the first data row matching an index and the corresponding index values. |
| `List<DataRow>` | `findRows(FilterPredicate predicate)`: Deprecated. Use `selectRows(FilterPredicate)` instead. |
| `BooleanColumn` | `getBooleanColumn(String name)`: Returns a BooleanColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `ByteColumn` | `getByteColumn(String name)`: Returns a ByteColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `DataFrameColumn` | `getColumn(String name)`: Returns a column based on its name. |
| `<T extends DataFrameColumn> T` | `getColumn(String name, Class<T> cl)`: Returns a column as a specified column type. |
| `Collection<String>` | `getColumnNames()`: Returns a collection of the column names in this data frame. |
| `Collection<DataFrameColumn>` | `getColumns()`: Returns a collection of all columns in this data frame. |
| `DoubleColumn` | `getDoubleColumn(String name)`: Returns a DoubleColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `FloatColumn` | `getFloatColumn(String name)`: Returns a FloatColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `DataFrameHeader` | `getHeader()`: Returns the header of this data frame. |
| `protected Indices` | `getIndices()`: Returns the indices of this data frame. |
| `IntegerColumn` | `getIntegerColumn(String name)`: Returns an IntegerColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `LongColumn` | `getLongColumn(String name)`: Returns a LongColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `NumberColumn` | `getNumberColumn(String name)`: Returns a NumberColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `DataRow` | `getRow(int i)`: Returns the data row at a specified index. |
| `List<DataRow>` | `getRows()`: Returns all rows in this data frame. |
| `List<DataRow>` | `getRows(int from, int to)`: Returns the list of rows between from and to. |
| `Comparable[]` | `getRowValues(int i)`: Returns the values of a row at a specified index. |
| `ShortColumn` | `getShortColumn(String name)`: Returns a ShortColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `StringColumn` | `getStringColumn(String name)`: Returns a StringColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown. |
| `DataGrouping` | `groupBy(String... column)`: Groups this data frame using one or more columns. |
| `boolean` | `isCompatible(DataFrame input)`: Returns true if the header of an input data frame equals the header of this data frame. |
| `boolean` | `isIndexColumn(DataFrameColumn column)`: Returns true if the input column is part of at least one index. |
| `Iterator<DataRow>` | `iterator()`: Returns an iterator over the rows in this data frame. |
| `JoinedDataFrame` | `joinInner(DataFrame dataFrame, JoinColumn... joinColumns)`: Joins this data frame with another data frame using the INNER JOIN method. |
| `JoinedDataFrame` | `joinInner(DataFrame dataFrame, String... joinColumns)`: Joins this data frame with another data frame using the INNER JOIN method. |
| `JoinedDataFrame` | `joinInner(DataFrame dataFrame, String suffixA, String suffixB, JoinColumn... joinColumns)`: Joins this data frame with another data frame using the INNER JOIN method. |
| `JoinedDataFrame` | `joinLeft(DataFrame dataFrame, JoinColumn... joinColumns)`: Joins this data frame with another data frame using the LEFT JOIN method. |
| `JoinedDataFrame` | `joinLeft(DataFrame dataFrame, String... joinColumns)`: Joins this data frame with another data frame using the LEFT JOIN method. |
| `JoinedDataFrame` | `joinLeft(DataFrame dataFrame, String suffixA, String suffixB, JoinColumn... joinColumns)`: Joins this data frame with another data frame using the LEFT JOIN method. |
| `JoinedDataFrame` | `joinRight(DataFrame dataFrame, JoinColumn... joinColumns)`: Joins this data frame with another data frame using the RIGHT JOIN method. |
| `JoinedDataFrame` | `joinRight(DataFrame dataFrame, String... joinColumns)`: Joins this data frame with another data frame using the RIGHT JOIN method. |
| `JoinedDataFrame` | `joinRight(DataFrame dataFrame, String suffixA, String suffixB, JoinColumn... joinColumns)`: Joins this data frame with another data frame using the RIGHT JOIN method. |
| `<T> List<T>` | `map(Class<T> cl)`: Maps this data container to a list of entities. |
| `protected void` | `notifyColumnChanged(DataFrameColumn column)`: Notifies this data frame about a changed column. |
| `protected void` | `notifyColumnValueChanged(DataFrameColumn column, int index, Comparable value)`: Notifies this data frame about a changed value in a column. |
| `DataFrame` | `removeColumn(DataFrameColumn column)`: Removes a column from this data frame. |
| `DataFrame` | `removeColumn(String header)`: Removes a column from this data frame. |
| `DataFrame` | `removeIndex(String name)`: Removes the index with the specified name. |
| `DataFrame` | `removePrimaryKey()`: Removes the current primary key. |
| `DataFrame` | `renameColumn(String name, String newName)`: Renames a column. |
| `DataFrame` | `reverse()`: Reverses all columns. |
| `DataFrame` | `select(FilterPredicate predicate)`: Returns a new data frame based on filtered rows from this data frame. |
| `DataFrame` | `select(String predicateString)`: Returns a new data frame based on filtered rows from this data frame. |
| `DataFrame` | `select(String colName, Comparable value)`: Returns a new data frame with all rows from this data frame where a specified column value equals an input value. |
| `DataRow` | `selectFirst(FilterPredicate predicate)`: Returns the first data row in this data frame that matches an input predicate. |
| `DataRow` | `selectFirst(String predicateString)`: Returns the first data row in this data frame that matches an input predicate. |
| `DataRow` | `selectFirst(String colName, Comparable value)`: Returns the first data row in this data frame where a specified column value equals an input value. |
| `List<DataRow>` | `selectRows(FilterPredicate predicate)`: Finds data rows using a FilterPredicate. |
| `List<DataRow>` | `selectRows(String predicateString)`: Finds data rows using a FilterPredicate. |
| `DataFrame` | `set(Collection<DataRow> rows)`: Clears all rows in this data frame and sets new rows using the provided DataRow collection. |
| `DataFrame` | `set(DataFrameHeader header, Collection<DataRow> rows)`: Removes all columns and rows from this data frame and sets the provided header and rows. |
| `DataFrame` | `setPrimaryKey(DataFrameColumn... cols)`: Sets the primary key columns using column objects. |
| `DataFrame` | `setPrimaryKey(String... colNames)`: Sets the primary key columns using column names. |
| `DataFrame` | `shuffle()`: Shuffles all rows. |
| `int` | `size()`: Returns the number of rows in this data frame. |
| `DataFrame` | `sort(Comparator<DataRow> comp)`: Sorts the rows in this data frame using a custom Comparator. |
| `DataFrame` | `sort(SortColumn... columns)`: Sorts the rows in this data frame by one or more SortColumn. |
| `DataFrame` | `sort(String name)`: Sorts the rows in this data frame using one column and the default sort direction (ascending). |
| `DataFrame` | `sort(String name, SortColumn.Direction dir)`: Sorts the rows in this data frame using one column and sort direction. |
| `DataFrame` | `subset(int from, int to)`: Sets this data frame to a subset of itself. |
| `DataFrame` | `transform(DataFrameTransform transformer)`: Converts this data frame into another data frame using a specified transformer. |
| `DataFrame` | `update(DataRow dataRow)`: Persists the updated values of a data row. |
Methods inherited from class Object: clone, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Iterable: forEach, spliterator

public static final String PRIMARY_INDEX_NAME

public DataFrame()

public DataFrame(DataFrameHeader header, Collection<DataRow> rows)
header - data frame header
rows - collection of data rows

public DataFrame setPrimaryKey(String... colNames)
colNames - primary key columns

public DataFrame setPrimaryKey(DataFrameColumn... cols)
cols - primary key columns

public DataFrame removePrimaryKey()

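The primary key methods above (together with findByPrimaryKey) suggest a lookup structure keyed by the values of the key columns. The following is a minimal sketch of that idea only; it does not use the library. `PrimaryKeySketch`, its `Object[]` rows, the use of `IllegalStateException`, and the `null` result for a missing key are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the library's DataFrame: rows are plain Object arrays.
class PrimaryKeySketch {
    private final List<String> columnNames;
    private final List<Object[]> rows = new ArrayList<>();
    // Maps the list of key column values to the position of the row.
    private final Map<List<Object>, Integer> primaryIndex = new HashMap<>();
    private int[] keyColumns = new int[0];

    PrimaryKeySketch(String... columnNames) {
        this.columnNames = Arrays.asList(columnNames);
    }

    // Appends a row and registers its key if a primary key is set.
    PrimaryKeySketch append(Object... values) {
        if (values.length != columnNames.size()) {
            throw new IllegalArgumentException("expected " + columnNames.size() + " values");
        }
        if (keyColumns.length > 0) {
            List<Object> key = keyOf(values, keyColumns);
            if (primaryIndex.containsKey(key)) {
                throw new IllegalStateException("duplicate primary key: " + key);
            }
            primaryIndex.put(key, rows.size());
        }
        rows.add(values);
        return this;
    }

    // Resolves the key columns by name and rebuilds the index over all existing rows.
    PrimaryKeySketch setPrimaryKey(String... colNames) {
        int[] cols = new int[colNames.length];
        for (int i = 0; i < colNames.length; i++) {
            cols[i] = columnNames.indexOf(colNames[i]);
            if (cols[i] < 0) {
                throw new IllegalArgumentException("unknown column: " + colNames[i]);
            }
        }
        Map<List<Object>, Integer> rebuilt = new HashMap<>();
        for (int r = 0; r < rows.size(); r++) {
            List<Object> key = keyOf(rows.get(r), cols);
            if (rebuilt.put(key, r) != null) {
                throw new IllegalStateException("duplicate primary key: " + key);
            }
        }
        keyColumns = cols;
        primaryIndex.clear();
        primaryIndex.putAll(rebuilt);
        return this;
    }

    // Returns the matching row, or null if no row has this key (an assumption of this sketch).
    Object[] findByPrimaryKey(Object... keyValues) {
        Integer position = primaryIndex.get(Arrays.asList(keyValues));
        return position == null ? null : rows.get(position);
    }

    private static List<Object> keyOf(Object[] row, int[] cols) {
        List<Object> key = new ArrayList<>();
        for (int c : cols) {
            key.add(row[c]);
        }
        return key;
    }
}
```

With a map of this kind a key lookup is a single hash access instead of a scan over all rows, which is the usual motivation for maintaining a primary key.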
public DataFrame removeIndex(String name)
name - name of index

public DataFrame renameColumn(String name, String newName)
name - current column name
newName - new column name

public DataFrame addColumn(DataFrameColumn column)
Adds a column to the data frame. If the column cannot be added, a DataFrameRuntimeException is thrown.
column - column to add

public <T extends Comparable<T>> DataFrame addColumn(Class<T> type, String name)
Creates a column for a specified column value type using the default ColumnTypeMap.
T - type of column values
type - class of column values
name - column name

public <T extends Comparable<T>> DataFrame addColumn(Class<T> type, String name, ColumnTypeMap columnTypeMap)
Creates a column for a specified column value type using the provided ColumnTypeMap.
T - type of column values
type - class of column values
name - column name
columnTypeMap - provided column type map
See also: addColumn(Class, String, ColumnAppender)

public <T extends Comparable<T>,C extends DataFrameColumn<T,C>> DataFrame addColumn(Class<T> type, String name, ColumnTypeMap columnTypeMap, ColumnAppender<T> appender)
Creates and adds a new column based on a specified column value type and a ColumnTypeMap.
T - type of column values
C - type of created column
type - column value type
name - name of new column
columnTypeMap - column type map (value type / column class mapper)
appender - column appender (value generator)
See also: addColumn(Class, String, ColumnAppender)

public <T extends Comparable<T>,C extends DataFrameColumn<T,C>> DataFrame addColumn(Class<C> type, String name, ColumnAppender<T> appender)
Creates and adds a column to this data frame based on a provided column class and a ColumnAppender.
If no column appender is specified, the column is filled with NA values.
If the column cannot be created or added, a DataFrameRuntimeException is thrown.
T - type of column values
C - type of created column
type - class of created column
name - name of created column
appender - column appender (value generator)
See also: addColumn(DataFrameColumn)

public DataFrame addColumns(Collection<DataFrameColumn> columns)
columns - columns to add

public DataFrame addColumns(DataFrameColumn... columns)
columns - columns to add

public DataFrame append(Comparable... values)
Appends a new row based on Comparable values.
There must be exactly one value for each column.
The object types have to match the column types.
If the wrong number of values or a wrong type is found, a DataFrameRuntimeException is thrown.
If the data frame contains the columns StringColumn, DoubleColumn, IntegerColumn, the only correct call to this method is append(String, Double, Integer).
Empty column values must be provided as null or NA.
values - values for the appended row

public DataFrame append(DataRow row)
Appends a new data row. NA is added for all columns with no value in the provided row.
row - row containing the new values

public DataFrame update(DataRow dataRow)
Persists the updated values of a data row. [...] NA instead.
dataRow - data row with updated values

public DataFrame set(Collection<DataRow> rows)
Clears all rows in this data frame and sets new rows using the provided DataRow collection.
rows - new collection of rows

public DataFrame set(DataFrameHeader header, Collection<DataRow> rows)
header - new header
rows - new rows

public DataFrame removeColumn(String header)
header - column header name

public DataFrame removeColumn(DataFrameColumn column)
column - column to remove

public DataFrame sort(SortColumn... columns)
Sorts the rows in this data frame by one or more SortColumn.
columns - sort columns

public DataFrame sort(Comparator<DataRow> comp)
Sorts the rows in this data frame using a custom Comparator.
comp - comparator used to sort the rows

public DataFrame sort(String name)
name - sort column

public DataFrame sort(String name, SortColumn.Direction dir)
name - sort column
dir - sort direction

public DataFrame shuffle()

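The sort(String name, SortColumn.Direction dir) entry above sorts rows by one column in a given direction. A minimal sketch of the same idea with plain Java collections follows. It does not call the library: `SortSketch`, its `Direction` enum, and the `Comparable[]` row representation are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in, NOT the library's API: a row is a Comparable[] of column values.
class SortSketch {
    // Stand-in for SortColumn.Direction.
    enum Direction { ASCENDING, DESCENDING }

    // Sorts the rows in place by the value found at columnIndex.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static void sort(List<Comparable[]> rows, int columnIndex, Direction dir) {
        Comparator<Comparable[]> byColumn =
                (a, b) -> a[columnIndex].compareTo(b[columnIndex]);
        rows.sort(dir == Direction.DESCENDING ? byColumn.reversed() : byColumn);
    }

    public static void main(String[] args) {
        List<Comparable[]> rows = new ArrayList<>();
        rows.add(new Comparable[]{"b", 2});
        rows.add(new Comparable[]{"a", 3});
        rows.add(new Comparable[]{"c", 1});
        sort(rows, 1, Direction.DESCENDING);
        // Prints [a, 3], [b, 2], [c, 1] on separate lines.
        for (Comparable[] row : rows) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The sketch assumes that no column value is null; how the library orders NA values is not stated in this page.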
public DataFrame select(String colName, Comparable value)
colName - column name
value - input value

public DataRow selectFirst(String colName, Comparable value)
colName - column name
value - input value

public DataRow selectFirst(String predicateString)
predicateString - input predicate string
See also: select(FilterPredicate)

public DataRow selectFirst(FilterPredicate predicate)
predicate - input predicate
See also: select(FilterPredicate)

public DataFrame select(FilterPredicate predicate)
if(predicate.valid(row)) -> add(row)
predicate - filter predicate
See also: filter(FilterPredicate)

public DataFrame select(String predicateString)
if(predicate.valid(row)) -> add(row)
predicateString - predicate string
See also: select(FilterPredicate)

@Deprecated public DataFrame find(String colName, Comparable value)
Deprecated. Use select(String,Comparable) instead.
colName - column name
value - input value

@Deprecated public DataRow findFirst(String colName, Comparable value)
Deprecated. Use selectFirst(String,Comparable) instead.
colName - column name
value - input value

@Deprecated public DataRow findFirst(FilterPredicate predicate)
Deprecated. Use selectFirst(FilterPredicate) instead.
predicate - input predicate

public DataFrame filter(String predicateString)
if(!predicate.valid(row)) -> remove(row)
predicateString - filter predicate string

public DataFrame filter(FilterPredicate predicate)
if(!predicate.valid(row)) -> remove(row)
predicate - filter predicate

@Deprecated public DataFrame find(FilterPredicate predicate)
Deprecated. Use select(FilterPredicate) instead.
if(predicate.valid(row)) -> add(row)
predicate - filter predicate
See also: filter(FilterPredicate)

public List<DataRow> selectRows(String predicateString)
Finds data rows using a FilterPredicate.
predicateString - input predicate string

public List<DataRow> selectRows(FilterPredicate predicate)
Finds data rows using a FilterPredicate.
predicate - input predicate

@Deprecated public List<DataRow> findRows(FilterPredicate predicate)
Deprecated. Use selectRows(FilterPredicate) instead.
Finds data rows using a FilterPredicate.
predicate - input predicate

public DataFrame transform(DataFrameTransform transformer)
transformer - the applied transformer

public DataRow findByPrimaryKey(Comparable... keyValues)
keyValues - input key values

public DataFrame reverse()

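The select and filter entries above use the same predicate but differ in what they change: select follows `if(predicate.valid(row)) -> add(row)` and produces a new result, while filter follows `if(!predicate.valid(row)) -> remove(row)` and modifies the data frame itself. The sketch below illustrates that difference on plain lists. `SelectFilterSketch` is a hypothetical helper, and `java.util.function.Predicate` stands in for FilterPredicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in, NOT the library's API: a list of rows plays the role of the data frame.
class SelectFilterSketch {
    // select: if (predicate.valid(row)) -> add(row) to a NEW list; the input is left untouched.
    static <R> List<R> select(List<R> rows, Predicate<R> predicate) {
        List<R> result = new ArrayList<>();
        for (R row : rows) {
            if (predicate.test(row)) {
                result.add(row);
            }
        }
        return result;
    }

    // filter: if (!predicate.valid(row)) -> remove(row) from the input list itself.
    static <R> List<R> filter(List<R> rows, Predicate<R> predicate) {
        rows.removeIf(predicate.negate());
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> even = select(rows, n -> n % 2 == 0);
        System.out.println(even + " selected, source still has " + rows.size() + " rows");
        filter(rows, n -> n % 2 == 0);
        System.out.println("after filter the source has " + rows.size() + " rows");
    }
}
```

In short, select leaves the source unchanged, while filter discards the rows that do not match.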
public DataFrame addIndex(String indexName, String... columnNames)
Values in index columns must be unique for all rows.
indexName - name of new index
columnNames - index columns

public DataFrame addIndex(String indexName, DataFrameColumn... columns)
Values in index columns must be unique for all rows.
indexName - name of new index
columns - index columns

public int size()

public DataFrame subset(int from, int to)
from - lowest remaining row index
to - highest remaining row index

public DataFrame createSubset(int from, int to)
from - lowest row index
to - highest row index

public List<DataRow> getRows(int from, int to)
from - lowest row index
to - highest row index

public List<DataRow> getRows()

public DataFrameHeader getHeader()
Specified by: getHeader in interface DataContainer<DataFrameHeader,DataRow>

public DataFrame concat(DataFrame other)
Throws a DataFrameRuntimeException if the data frames are not compatible.
other - other data frame

public DataFrame concat(Collection<DataFrame> dataFrames)
Throws a DataFrameRuntimeException if the data frames are not compatible.
dataFrames - other data frames

public DataFrame concat(DataFrame... dataFrames)
Throws a DataFrameRuntimeException if the data frames are not compatible.
dataFrames - other data frames

public boolean isCompatible(DataFrame input)
input - input data frame
See also: BasicTypeHeader.equals(Object)

public DataRow getRow(int i)
i - index of data row

public Comparable[] getRowValues(int i)
i - index of data row

public Collection<String> getColumnNames()

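The concat and isCompatible entries above state that rows are only appended when the headers are equal, and that an exception is thrown otherwise. A minimal sketch of that rule follows. It does not use the library: `ConcatSketch` is hypothetical, the header is reduced to a list of column names (a real header presumably also carries column types), and `IllegalStateException` stands in for DataFrameRuntimeException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the library's DataFrame.
class ConcatSketch {
    final List<String> header;                       // simplified header: column names only
    final List<Object[]> rows = new ArrayList<>();

    ConcatSketch(List<String> header) {
        this.header = header;
    }

    // Two frames are compatible when their headers are equal.
    boolean isCompatible(ConcatSketch other) {
        return header.equals(other.header);
    }

    // Appends the rows of all other frames to this one.
    ConcatSketch concat(ConcatSketch... others) {
        // Design choice of this sketch: validate every input first,
        // so a failure leaves this frame unchanged.
        for (ConcatSketch other : others) {
            if (!isCompatible(other)) {
                throw new IllegalStateException("incompatible headers: " + header + " vs " + other.header);
            }
        }
        for (ConcatSketch other : others) {
            rows.addAll(other.rows);
        }
        return this;
    }
}
```

Whether the library validates all inputs before appending, as this sketch does, is not stated in this page.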
public DataFrameColumn getColumn(String name)
name - column name

public <T extends DataFrameColumn> T getColumn(String name, Class<T> cl)
Returns a column as a specified column type. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
T - type of column
name - column name
cl - class of column

public NumberColumn getNumberColumn(String name)
Returns a NumberColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public StringColumn getStringColumn(String name)
Returns a StringColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public DoubleColumn getDoubleColumn(String name)
Returns a DoubleColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public IntegerColumn getIntegerColumn(String name)
Returns an IntegerColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public FloatColumn getFloatColumn(String name)
Returns a FloatColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public BooleanColumn getBooleanColumn(String name)
Returns a BooleanColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public ByteColumn getByteColumn(String name)
Returns a ByteColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public LongColumn getLongColumn(String name)
Returns a LongColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public ShortColumn getShortColumn(String name)
Returns a ShortColumn. If the column is not found or has the wrong type, a DataFrameRuntimeException is thrown.
name - column name

public DataGrouping groupBy(String... column)
column - group columns
Returns: data grouping
See also: GroupUtil.groupBy(DataFrame, String...)

public JoinedDataFrame joinLeft(DataFrame dataFrame, String... joinColumns)
dataFrame - other data frame
joinColumns - join columns
See also: JoinUtil.leftJoin(DataFrame, DataFrame, JoinColumn...)

public JoinedDataFrame joinLeft(DataFrame dataFrame, JoinColumn... joinColumns)
dataFrame - other data frame
joinColumns - join columns
See also: JoinUtil.leftJoin(DataFrame, DataFrame, JoinColumn...)

public JoinedDataFrame joinLeft(DataFrame dataFrame, String suffixA, String suffixB, JoinColumn... joinColumns)
dataFrame - other data frame
suffixA - suffix for columns from this data frame
suffixB - suffix for columns from the other data frame
joinColumns - join columns
See also: JoinUtil.leftJoin(DataFrame, DataFrame, String, String, JoinColumn...)

public JoinedDataFrame joinRight(DataFrame dataFrame, String... joinColumns)
dataFrame - other data frame
joinColumns - join columns
See also: JoinUtil.rightJoin(DataFrame, DataFrame, JoinColumn...)

public JoinedDataFrame joinRight(DataFrame dataFrame, JoinColumn... joinColumns)
dataFrame - other data frame
joinColumns - join columns
See also: JoinUtil.rightJoin(DataFrame, DataFrame, JoinColumn...)

public JoinedDataFrame joinRight(DataFrame dataFrame, String suffixA, String suffixB, JoinColumn... joinColumns)
dataFrame - other data frame
suffixA - suffix for columns from this data frame
suffixB - suffix for columns from the other data frame
joinColumns - join columns
See also: JoinUtil.rightJoin(DataFrame, DataFrame, String, String, JoinColumn...)

public JoinedDataFrame joinInner(DataFrame dataFrame, String... joinColumns)
dataFrame - other data frame
joinColumns - join columns
See also: JoinUtil.innerJoin(DataFrame, DataFrame, JoinColumn...)

public JoinedDataFrame joinInner(DataFrame dataFrame, JoinColumn... joinColumns)
dataFrame - other data frame
joinColumns - join columns
See also: JoinUtil.innerJoin(DataFrame, DataFrame, JoinColumn...)

public JoinedDataFrame joinInner(DataFrame dataFrame, String suffixA, String suffixB, JoinColumn... joinColumns)
dataFrame - other data frame
suffixA - suffix for columns from this data frame
suffixB - suffix for columns from the other data frame
joinColumns - join columns
See also: JoinUtil.innerJoin(DataFrame, DataFrame, String, String, JoinColumn...)

public DataFrame copy()

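The joinInner, joinLeft and joinRight entries above differ only in which unmatched rows survive the join. The sketch below shows the usual INNER / LEFT / RIGHT join semantics on plain lists. It does not use the library and makes no claim about how JoinUtil is implemented: `JoinSketch`, the two-element `{key, value}` row layout, and the use of `null` for a missing side (where the library would presumably use NA) are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in, NOT the library's JoinUtil.
class JoinSketch {
    enum Type { INNER, LEFT, RIGHT }

    // Each input row is {key, value}. Each output row is [key, leftValue, rightValue];
    // null marks the side that had no matching row.
    static List<List<Object>> join(List<Object[]> left, List<Object[]> right, Type type) {
        List<List<Object>> out = new ArrayList<>();
        if (type == Type.RIGHT) {
            // A right join keeps every right row: run a left join with swapped
            // inputs, then swap the two value columns back.
            for (List<Object> row : join(right, left, Type.LEFT)) {
                out.add(Arrays.asList(row.get(0), row.get(2), row.get(1)));
            }
            return out;
        }
        for (Object[] l : left) {
            boolean matched = false;
            for (Object[] r : right) {
                if (Objects.equals(l[0], r[0])) {
                    out.add(Arrays.asList(l[0], l[1], r[1]));
                    matched = true;
                }
            }
            // Only a left join keeps left rows that found no partner.
            if (!matched && type == Type.LEFT) {
                out.add(Arrays.asList(l[0], l[1], null));
            }
        }
        return out;
    }
}
```

For left rows `{1, "a"}`, `{2, "b"}` and right rows `{2, "x"}`, `{3, "y"}`, the inner join yields only key 2, the left join additionally keeps key 1 with a missing right value, and the right join additionally keeps key 3 with a missing left value. The nested loop is quadratic and chosen for readability only.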
public boolean containsColumn(DataFrameColumn column)
column - input column

protected void notifyColumnValueChanged(DataFrameColumn column, int index, Comparable value)
column - changed column
index - changed index
value - new value

protected void notifyColumnChanged(DataFrameColumn column)
column - changed column

public boolean isIndexColumn(DataFrameColumn column)
column - input column

public List<DataRow> findByIndex(String name, Comparable... values)
name - name of index
values - index values

public DataRow findFirstByIndex(String name, Comparable... values)
name - name of index
values - index values

public Collection<DataFrameColumn> getColumns()

protected Indices getIndices()

public <T> List<T> map(Class<T> cl)
Description copied from interface: DataContainer
Maps this data container to a list of entities.
Specified by: map in interface DataContainer<DataFrameHeader,DataRow>
T - type of entities
cl - class of resulting entities
See also: DataMapper.map(DataContainer, Class)

public Iterator<DataRow> iterator()
Returns an iterator over the rows in this data frame. Iterator.remove() is not supported.

Copyright © 2017. All rights reserved.