DataFrame

class DataFrame extends IndexOps[DataFrame] with Dynamic
trait Dynamic
trait IndexOps[DataFrame]
class Object
trait Matchable
class Any
class DataMap[K]

Value members

Concrete methods

@targetName("update")
def &[T](namedSeries: (String, Series[T])): DataFrame

Updates a column with all values that are defined in the Series.

Value Params
namedSeries

Tuple of column name to be updated and Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
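
The update rule above can be illustrated with a small plain-Scala model. This is only a sketch, not the library's implementation: the object `UpdateModel`, the representation of a column as `Vector[Option[T]]` and of a Series as a map from index position to optional value are assumptions made for illustration, and the index compatibility checks (MergeIndexException, SeriesCastException) are left out.

```scala
// Plain-Scala model of the documented update semantics (hypothetical names).
object UpdateModel {
  /** Overwrites column entries with the defined values of `series`.
    * `series` maps an index position to an optional value (None = undefined). */
  def update[T](column: Vector[Option[T]], series: Map[Int, Option[T]]): Vector[Option[T]] =
    column.zipWithIndex.map { case (old, pos) =>
      // A defined Series value wins; otherwise the existing entry is kept.
      series.get(pos).flatten.orElse(old)
    }
}
```

In this model an undefined Series entry never erases an existing value, and the length of the column (the DataFrame index) stays unchanged.
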
@targetName("updateOperator")
def &[T](series: Series[T]): DataFrame

Updates a column with all values that are defined in the Series. The name of the Series defines the column.

Value Params
series

Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
@targetName("prepend")
def ::[T](series: Series[T]): DataFrame

Concatenates the Series on the left and the DataFrame on the right hand side.

Value Params
series

Series.

Returns

DataFrame, where the Series is the first column.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the left Series must be included in the right DataFrame.
  • Data on the left hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the right operand.
  • The operators | and :: are equivalent if indices on the left and right side are equal.
@targetName("prepend")

Concatenates the DataFrame on the left and the DataFrame on the right hand side.

Value Params
df

DataFrame.

Returns

DataFrame, where the left DataFrame columns are prior to the columns of the right DataFrame.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the left DataFrame must be included in the right DataFrame.
  • Data on the left hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the right operand.
  • The operators | and :: are equivalent if indices on the left and right side are equal.
def append[T](name: String, series: Series[T]): DataFrame

Appends a column if it does not already exist in the DataFrame. If the column exists, it is not altered.

Value Params
name

Column name.

series

Series.

Returns

DataFrame with the column appended on the right hand side.

Throws
MergeIndexException

If indices are not compatible. If the column exists, no exception is thrown.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • Data of the Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
def append(series: Series[_]*): DataFrame

Appends columns if they do not already exist in the DataFrame. If a column exists, it is not altered.

Value Params
series

Series to be added. The column names are taken from the Series.

Returns

DataFrame with the columns appended on the right hand side.

Throws
MergeIndexException

If indices are not compatible. If the column exists, no exception is thrown.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • Data of the Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
def apply(col: String): Series[Any]

Returns a column as a Series.

Value Params
col

Column name.

Returns

Column as a Series with the name of the column.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply[T](row: Int, col: String)(implicit evidence$1: RequireType[T], evidence$2: Typeable[T], evidence$3: ClassTag[T]): Option[T]

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value as Option.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def apply[T](row: Option[Int], col: String)(implicit evidence$4: RequireType[T], evidence$5: Typeable[T], evidence$6: ClassTag[T]): Option[T]

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value as Option. None if row is None.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def apply[T](row: Int, col: String, default: => T)(implicit evidence$7: Typeable[T], evidence$8: ClassTag[T]): T

Returns the value for a column and a row, using a default value for undefined entries.

Value Params
col

Column name.

default

Default value for undefined values.

row

Row.

Returns

Value.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def apply[T](row: Option[Int], col: String, default: => T)(implicit evidence$9: Typeable[T], evidence$10: ClassTag[T]): T

Returns the value for a column and a row, using a default value for undefined entries.

Value Params
col

Column name.

default

Default value for undefined values.

row

Row.

Returns

Value. Default value if row is None.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.
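
The relationship between the Option-returning and the default-value variants of apply can be sketched with plain Scala collections. The snippet is a model under stated assumptions, not the library's code: the object `CellAccessModel`, its sample column `price` and the method names `value` and `valueOrElse` are hypothetical, and an out-of-range row raises the standard IndexOutOfBoundsException here instead of IndexBoundsException.

```scala
// Plain-Scala model of cell access (hypothetical names, not the pd API).
object CellAccessModel {
  // A column modelled as a vector of optional values (None = undefined entry).
  val price: Vector[Option[Double]] = Vector(Some(10.0), None, Some(20.0))

  /** Counterpart of apply(row, col): the value as an Option. */
  def value(row: Int): Option[Double] = price(row)

  /** Counterpart of apply(row, col, default): falls back to `default` for undefined entries. */
  def valueOrElse(row: Int, default: => Double): Double =
    price(row).getOrElse(default)

  /** Counterpart of apply(Option[row], col, default): a None row also yields the default. */
  def valueOrElse(row: Option[Int], default: => Double): Double =
    row.flatMap(r => price(r)).getOrElse(default)
}
```

Because `default` is passed by name, it is only evaluated when the entry (or the row) is undefined.
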

def apply(range: Range, col: String): Series[Any]

Extracts a column and slices the index by intersecting the current index with a range.

Value Params
col

Column name.

range

Range.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply(seq: Seq[Int], col: String): Series[Any]

Extracts a column and slices the index by intersecting the current index with a sequence of index positions.

Value Params
col

Column name.

seq

Sequence of index positions.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply(array: Array[Int], col: String): Series[Any]

Extracts a column and slices the index by intersecting the current index with an array of index positions.

Value Params
array

Array of index positions.

col

Column name.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply(series: Series[Boolean], col: String): Series[Any]

Extracts a column and slices the index by intersecting the current index with a boolean Series.

Value Params
col

Column name.

series

Boolean Series used as a mask: only index positions whose value is true are kept.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

@targetName("applyCols")
def apply(cols: Seq[String]): DataFrame

Extracts a DataFrame with the selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If one of the columns is not found.

Since

0.1.0

def canEqual(a: Any): Boolean

Determines if an object is a DataFrame.

Value Params
a

Any object.

Returns

True if the object is a DataFrame and false otherwise.

Since

0.1.0

def col[T](series: Series[T]): DataFrame

Appends (or replaces) one column.

Value Params
series

Series to be concatenated.

Returns

DataFrame, where the Series is appended on the right side.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • An existing column with the same name is replaced.
  • The index of the Series must be included in the left DataFrame.
  • Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
  • This operation is equivalent to the | operator.
def col[T](col: String, series: Series[T]): DataFrame

Appends (or replaces) a column.

Value Params
col

Name of column to be appended.

series

Series.

Returns

DataFrame, where the Series is appended on the right side.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • An existing column with the same name is replaced.
  • The index of the Series must be included in the left DataFrame.
  • Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
  • This operation is equivalent to the | operator.

Appends (or replaces) multiple columns.

Returns

Appender object which appends Series on the right side.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Existing columns with the same name are replaced by the rightmost column.
  • The index of the Series must be included in the left DataFrame.
  • Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
  • For one column, this operation is equivalent to the | operator.
Example
 df.cols("price" -> Series(10.0, 20.0), "quantity" -> Series(5, 2))
 df.cols(Series(10.0, 20.0) as "price", Series(5, 2) as "quantity")
 df.cols(Series("price")(10.0, 20.0), Series("quantity")(5, 2))
 df.cols(price = Series(10.0, 20.0), quantity = Series(5, 2))
def columnArray: Array[Series[Any]]

All columns as an array of Series.

Returns

Array with all columns as Series in defined order. All Series have the same index as the DataFrame.

Since

0.1.0

def columnIterator: Iterable[Series[Any]]

Iterates over all columns.

Returns

Iterable over all columns in defined order.

Since

0.1.0

def columns: Seq[String]

Sequence with column names.

Returns

Sequence with column names in defined order.

Since

0.1.0

def contains(col: String): Boolean

Determines if a column is in the DataFrame.

Value Params
col

Column name.

Returns

True if DataFrame has the column col and false otherwise.

Since

0.1.0

def display(n: Int, width: Int, colWidth: Int): Unit

Prints the DataFrame as a table with an index column and annotated column types.

Value Params
colWidth

The width of each column.

n

The maximum number of rows.

width

The maximal width of a line.

Since

0.1.0

Drops all rows with undefined (null) values.

Returns

DataFrame restricted to rows without undefined values in all columns.

Since

0.1.0

def dropUndefined(cols: String*): DataFrame

Drops all rows with undefined (null) values with respect to specified columns.

Value Params
cols

Columns.

Returns

DataFrame restricted to rows without undefined values in columns cols.

Throws
ColumnNotFoundException

If one of the columns cols does not exist.

Since

0.1.0
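
The row filter described for dropUndefined(cols) can be sketched with plain Scala collections. This is an illustrative model only: the object `DropUndefinedModel`, the `Row` type alias and the `sample` data are assumptions, and a generic IllegalArgumentException (via `require`) stands in for ColumnNotFoundException.

```scala
// Plain-Scala model of dropping rows with undefined values (hypothetical names).
object DropUndefinedModel {
  // A row maps column names to optional values (None = undefined).
  type Row = Map[String, Option[Any]]

  val sample: Seq[Row] = Seq(
    Map("price" -> Some(10.0), "label" -> None),
    Map("price" -> Some(20.0), "label" -> Some("x"))
  )

  /** Keeps only the rows in which every column listed in `cols` is defined. */
  def dropUndefined(rows: Seq[Row], cols: String*): Seq[Row] = {
    // Reject unknown columns up front, mirroring the documented failure mode.
    cols.foreach(c => require(rows.forall(_.contains(c)), s"column not found: $c"))
    rows.filter(row => cols.forall(c => row(c).isDefined))
  }
}
```

Restricting the check to "price" keeps both sample rows, while checking "label" removes the first one.
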

override def equals(df: Any): Boolean

Determines if the object is a DataFrame and is equivalent. Equivalence implies:

  • The same index. The indices are equal if they have the same elements, the same order and the same base index.
  • The column names are the same (but may have different order).
  • The values in all columns are equal.
Value Params
df

DataFrame (or other object) to compare to.

Returns

True if equal, false otherwise.

Since

0.1.0

Definition Classes
Any
def get[T](row: Int, col: String)(implicit evidence$11: RequireType[T], evidence$12: Typeable[T], evidence$13: ClassTag[T]): T

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value.

Throws
ColumnNotFoundException

If the column is not found.

NoSuchElementException

If the value is undefined or row is not in the index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def get[T](row: Option[Int], col: String)(implicit evidence$14: RequireType[T], evidence$15: Typeable[T], evidence$16: ClassTag[T]): T

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value.

Throws
ColumnNotFoundException

If the column is not found.

NoSuchElementException

If the value is undefined, row is not in the index or row is None.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def groupBy(col: String): Groups[Any]

Groups the DataFrame by a column.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.
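
The note on Double.NaN follows from IEEE 754 comparison, where NaN is not equal to itself. A minimal plain-Scala sketch of the two grouping rules is given below. It is a model, not the library's implementation: the object `NaNGroupingModel` and the function `groupPositions` are hypothetical, and keys are modelled as `Option[Double]` with None standing for an undefined value.

```scala
// Plain-Scala model of the documented grouping rules (hypothetical names).
object NaNGroupingModel {
  /** Groups row positions by key. None (undefined) keys are skipped. */
  def groupPositions(keys: Seq[Option[Double]]): Vector[(Double, Vector[Int])] =
    keys.zipWithIndex.foldLeft(Vector.empty[(Double, Vector[Int])]) {
      case (groups, (None, _)) => groups // undefined key: the row is ignored
      case (groups, (Some(key), row)) =>
        // Primitive == follows IEEE 754, so NaN never matches an existing group.
        val i = groups.indexWhere(_._1 == key)
        if (i >= 0) groups.updated(i, (key, groups(i)._2 :+ row))
        else groups :+ ((key, Vector(row)))
    }
}
```

For the keys 1.0, undefined, NaN, 1.0, NaN this model produces three groups: rows 0 and 3 for the key 1.0, and one separate group for each NaN row.
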

def groupBy(col1: String, col2: String): Groups[(Any, Any)]

Groups the DataFrame by columns.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupBy(col1: String, col2: String, col3: String): Groups[(Any, Any, Any)]

Groups the DataFrame by columns.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupBy(cols: Seq[String]): Groups[Seq[Any]]

Groups the DataFrame by a sequence of columns (of arbitrary length).

Value Params
cols

Sequence with column names.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByCol[T](col: String)(implicit evidence$17: Typeable[T], evidence$18: ClassTag[T]): Groups[T]

Groups the DataFrame by a typed column.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByCol[T1, T2](col1: String, col2: String)(implicit evidence$19: Typeable[T1], evidence$20: ClassTag[T1], evidence$21: Typeable[T2], evidence$22: ClassTag[T2]): Groups[(T1, T2)]

Groups the DataFrame by typed columns.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByCol[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$23: Typeable[T1], evidence$24: ClassTag[T1], evidence$25: Typeable[T2], evidence$26: ClassTag[T2], evidence$27: Typeable[T3], evidence$28: ClassTag[T3]): Groups[(T1, T2, T3)]

Groups the DataFrame by typed columns.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByColOption[T](col: String)(implicit evidence$29: Typeable[T], evidence$30: ClassTag[T]): Groups[Option[T]]

Groups the DataFrame by a typed column including undefined values.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByColOption[T1, T2](col1: String, col2: String)(implicit evidence$31: Typeable[T1], evidence$32: ClassTag[T1], evidence$33: Typeable[T2], evidence$34: ClassTag[T2]): Groups[(Option[T1], Option[T2])]

Groups the DataFrame by typed columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByColOption[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$35: Typeable[T1], evidence$36: ClassTag[T1], evidence$37: Typeable[T2], evidence$38: ClassTag[T2], evidence$39: Typeable[T3], evidence$40: ClassTag[T3]): Groups[(Option[T1], Option[T2], Option[T3])]

Groups the DataFrame by typed columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(col: String): Groups[Option[Any]]

Groups the DataFrame by a column including undefined values.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(col1: String, col2: String): Groups[(Option[Any], Option[Any])]

Groups the DataFrame by columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(col1: String, col2: String, col3: String): Groups[(Option[Any], Option[Any], Option[Any])]

Groups the DataFrame by columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(cols: Seq[String]): Groups[Seq[Option[Any]]]

Groups the DataFrame by a sequence of columns (of arbitrary length) including undefined values.

Value Params
cols

Sequence with column names.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexBy[T](col: String)(implicit evidence$41: Typeable[T], evidence$42: ClassTag[T]): DataMap[T]

Indexes the DataFrame by a (typed) column.

Value Params
col

Column name.

Returns

DataMap.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexBy[T1, T2](col1: String, col2: String)(implicit evidence$43: Typeable[T1], evidence$44: ClassTag[T1], evidence$45: Typeable[T2], evidence$46: ClassTag[T2]): DataMap[(T1, T2)]

Indexes the DataFrame by (typed) columns.

Value Params
col1

Column name.

col2

Column name.

Returns

DataMap.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexBy[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$47: Typeable[T1], evidence$48: ClassTag[T1], evidence$49: Typeable[T2], evidence$50: ClassTag[T2], evidence$51: Typeable[T3], evidence$52: ClassTag[T3]): DataMap[(T1, T2, T3)]

Indexes the DataFrame by (typed) columns.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

DataMap.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexBy(cols: Seq[String]): DataMap[Seq[Any]]

Indexes the DataFrame by a sequence of columns (of arbitrary length).

Value Params
cols

Sequence with column names.

Returns

DataMap.

Throws
IllegalOperation

If sequence of columns is empty.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexByOption[T](col: String)(implicit evidence$53: Typeable[T], evidence$54: ClassTag[T]): DataMap[Option[T]]

Indexes the DataFrame by a (typed) column including undefined values.

Value Params
col

Column name.

Returns

DataMap with Option keys.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexByOption[T1, T2](col1: String, col2: String)(implicit evidence$55: Typeable[T1], evidence$56: ClassTag[T1], evidence$57: Typeable[T2], evidence$58: ClassTag[T2]): DataMap[(Option[T1], Option[T2])]

Indexes the DataFrame by (typed) columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

Returns

DataMap with Option keys.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexByOption[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$59: Typeable[T1], evidence$60: ClassTag[T1], evidence$61: Typeable[T2], evidence$62: ClassTag[T2], evidence$63: Typeable[T3], evidence$64: ClassTag[T3]): DataMap[(Option[T1], Option[T2], Option[T3])]

Indexes the DataFrame by (typed) columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

DataMap with Option keys.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexIterator: Iterator[Int]

Iterator over the index.

Returns

Iterator over the index in the current order.

Since

0.1.0

def info: String

Information string describing the DataFrame.

Returns

Info string.

Since

0.1.0

Adapter to write the DataFrame to a specific output. For example, use

import pd.io.parquet.implicits

to import the respective format and write a DataFrame via

df.io.parquet.write(...)
Returns

WriteAdapter.

Since

0.1.0

def joinInner(df: DataFrame, cols: String*): DataFrame

Performs an inner join equivalent to a SQL inner join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
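
The rule that two undefined keys never match can be sketched with a plain-Scala inner join over optional keys. This is only a model of the documented behaviour, not the library's implementation: the object `JoinModel`, the function `innerJoin` and the representation of rows as (key, payload) pairs are assumptions for illustration.

```scala
// Plain-Scala model of an inner join with SQL-style null handling (hypothetical names).
object JoinModel {
  /** Joins rows on their key. A None (undefined) key never satisfies the join condition. */
  def innerJoin[K, A, B](left: Seq[(Option[K], A)], right: Seq[(Option[K], B)]): Seq[(K, A, B)] =
    for {
      (keyL, a) <- left
      (keyR, b) <- right
      k <- keyL.toList       // rows with an undefined left key are skipped
      if keyR.contains(k)    // an undefined right key never matches
    } yield (k, a, b)
}
```

In this model a left row with an undefined key and a right row with an undefined key do not produce a joined row, which matches the SQL convention that a comparison with NULL is never true.
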
def joinLeft(df: DataFrame, cols: String*): DataFrame

Performs a left join equivalent to a SQL left outer join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
def joinOuter(df: DataFrame, cols: String*): DataFrame

Performs an outer join equivalent to a SQL full outer join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
def joinRight(df: DataFrame, cols: String*): DataFrame

Performs a right join equivalent to a SQL right outer join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.

Merges two DataFrames by index.

Value Params
df

DataFrame to merge.

Returns

DataFrame with all columns of df concatenated on the right side.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The order of index positions is only preserved if the indices are equal.
  • Use the join methods if you intend to join via column values.
def merge(series: Series[_]*): DataFrame

Merges one or more Series into a DataFrame by index.

Value Params
series

Series to concatenate.

Returns

DataFrame with all Series concatenated as columns on the right side.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The order of index positions is only preserved if all indices are equal.
  • Use the join methods if you intend to join via column values.
def numCols: Int

Number of columns.

Returns

Number of columns.

Since

0.1.0

def numRows: Int

Number of rows.

Returns

Number of rows of the DataFrame, i.e. the number of elements in the index.

Since

0.1.0

def numRowsBase: Int

Number of rows of the underlying data vectors.

Returns

Number of elements in the base index.

Since

0.1.0

def plot: Plot

Creates a plot.

Returns

Plot object.

Since

0.1.0

Requirement object which is used to throw exceptions if conditions are not met.

Returns

Requirement object.

Since

0.1.0

Resets the index to a UniformIndex with index positions 0 to numRows - 1 while keeping the order of the elements. If the current index is not a uniform index, the columns are copied into new vectors with the order of the current index.

Returns

DataFrame with a UniformIndex.

See also

sortIndex for sorting the index by index positions.

Since

0.1.0
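
The copy step described above can be sketched with plain Scala collections. This is a model under stated assumptions, not the library's implementation: the object `ResetIndexModel`, the representation of a column as a base `Vector` and of the current index as a sequence of positions into that vector are hypothetical.

```scala
// Plain-Scala model of resetting an index (hypothetical names).
object ResetIndexModel {
  /** Copies the column into the order of `index` and returns it with the new uniform index. */
  def resetIndex[T](base: Vector[T], index: Seq[Int]): (Vector[T], Range) = {
    // One element per index position, in the order of the current index.
    val copied = index.map(pos => base(pos)).toVector
    // The new index runs from 0 to numRows - 1.
    (copied, 0 until copied.length)
  }
}
```

In this model a base vector of four elements with the current index 3, 1 yields a new two-element vector, so the number of rows and the number of base rows coincide after the reset.
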

def select(cols: Array[String]): DataFrame

Extracts a DataFrame with the selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If a column is not found.

Since

0.1.0

@targetName("selectSeq")
def select(cols: Seq[String]): DataFrame

Extracts a DataFrame with the selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If a column is not found.

Since

0.1.0

@targetName("selectCols")
def select(cols: String*): DataFrame

Extracts a DataFrame with the selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If a column is not found.

Since

0.1.0

@unused
def selectDynamic(col: String): Series[Any]

Selects a column via dot notation.

Value Params
col

Column name.

Returns

Column as a Series with the name of the column.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

Example
df.myColumn
def show(n: Int, width: Int, annotateIndex: Boolean, annotateType: Boolean, colWidth: Int): Unit

Prints the DataFrame as a table.

Value Params
annotateIndex

If true, an index column is displayed.

annotateType

If true, the type of each column is displayed.

colWidth

The width of each column.

n

The maximum number of rows.

width

The maximal width of a line.

Since

0.1.0

@targetName("sortValuesCols")
def sortValues(cols: String*): DataFrame

Sorts the DataFrame in ascending order with respect to columns cols for supported types.

Value Params
cols

Columns which define the order.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
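
The two notes above can be reproduced with the standard collections. This is a sketch of the stated semantics, not the library's sorting code; the rows value and the ignoreCase ordering are illustrative names.

```scala
// Hypothetical rows: (name, id). The id records the original position.
val rows = Vector(("b", 1), ("A", 2), ("a", 3), ("B", 4))

// Case-insensitive ordering built on String.compareToIgnoreCase.
val ignoreCase: Ordering[String] =
  Ordering.fromLessThan[String](_.compareToIgnoreCase(_) < 0)

// sortBy is stable: "A"/"a" and "b"/"B" compare as equal and therefore
// keep their input order.
val sortedRows = rows.sortBy(_._1)(ignoreCase)
```

Here "A" stays ahead of "a" and "b" ahead of "B" because that was their order in the input.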
def sortValues(keys: (String, Order)*): DataFrame

Sorts the DataFrame with respect to the columns given in keys for supported types.

Value Params
keys

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
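
Sorting by several keys with different directions can be modelled with composed Ordering instances. The sketch below uses only the standard library; the case class Person and the sample data are hypothetical, and the library's Order values are not used.

```scala
// Hypothetical row type, not part of the library.
final case class Person(city: String, age: Int)

val people = Vector(
  Person("Bern", 30), Person("Aarau", 25), Person("Bern", 41), Person("Aarau", 52)
)

// Primary key: city ascending; secondary key: age descending.
val byCityThenAgeDesc: Ordering[Person] =
  Ordering.by[Person, String](_.city).orElse(Ordering.by[Person, Int](_.age).reverse)

val ordered = people.sorted(byCityThenAgeDesc)
```

The secondary key is consulted only for rows that compare as equal on the primary key.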
def sorted[T](col: String)(implicit ordering: Ordering[T]): DataFrame

Sorts the DataFrame in ascending order with respect to column col.

Value Params
col

Column which defines the order.

ordering

Implicit ordering that must exist for the type T, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T1, T2](col1: String, col2: String)(implicit ordering1: Ordering[T1], ordering2: Ordering[T2]): DataFrame

Sorts the DataFrame in ascending order with respect to two columns.

Value Params
col1

Column which defines the order.

col2

Column which defines the secondary order.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T1, T2, T3](col1: String, col2: String, col3: String)(implicit ordering1: Ordering[T1], ordering2: Ordering[T2], ordering3: Ordering[T3]): DataFrame

Sorts the DataFrame in ascending order with respect to three columns.

Value Params
col1

Column which defines the order.

col2

Column which defines the secondary order.

col3

Column which defines the tertiary order.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

ordering3

Implicit ordering that must exist for the type T3, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T](key: (String, Order))(implicit ordering: Ordering[T]): DataFrame

Sorts the DataFrame with respect to a column.

Value Params
key

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst. Implicit ordering must exist for the type T, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
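
The "NullsFirst" variants can be pictured with Option values and the standard Option ordering. This is a sketch under assumptions: undefined values are modelled as None, and how the library treats them under plain Order.asc and Order.desc is not stated here, so only the two NullsFirst cases are modelled.

```scala
// A column with undefined values, modelled as Option.
val ages: Vector[Option[Int]] = Vector(Some(30), None, Some(25), None, Some(41))

// The standard Option ordering places None before every Some, which
// mirrors the "nulls first" idea for ascending order.
val ascNullsFirst = ages.sorted(Ordering.Option(Ordering.Int))

// Reversing only the inner ordering keeps None first while the defined
// values run in descending order.
val descNullsFirst = ages.sorted(Ordering.Option(Ordering.Int.reverse))
```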
def sorted[T1, T2](key1: (String, Order), key2: (String, Order))(implicit ordering1: Ordering[T1], ordering2: Ordering[T2]): DataFrame

Sorts the DataFrame with respect to two columns.

Value Params
key1

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst. Implicit ordering must exist for the type T, i.e. classes must extend the trait Ordered.

key2

Secondary sorting key. See key1 for details.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T1, T2, T3](key1: (String, Order), key2: (String, Order), key3: (String, Order))(implicit ordering1: Ordering[T1], ordering2: Ordering[T2], ordering3: Ordering[T3]): DataFrame

Sorts the DataFrame with respect to three columns.

Value Params
key1

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst. Implicit ordering must exist for the type T, i.e. classes must extend the trait Ordered.

key2

Secondary sorting key. See key1 for details.

key3

Tertiary sorting key. See key1 for details.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

ordering3

Implicit ordering that must exist for the type T3, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def toArray[T](col: String)(implicit evidence$65: RequireType[T], evidence$66: Typeable[T], evidence$67: ClassTag[T]): Array[Option[T]]

Copies a column into an array.

Value Params
col

Column name.

Returns

Array of type Option[T] with numRows elements.

Since

0.1.0

def toFlatArray[T](col: String)(implicit evidence$68: RequireType[T], evidence$69: Typeable[T], evidence$70: ClassTag[T]): Array[T]

Copies a column into an array.

Value Params
col

Column name.

Returns

Array of type T.

Since

0.1.0
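
The difference between the Option-based and the flat return types can be sketched with a plain Vector. This model is an assumption, not the library's behaviour: in particular, it assumes that the flat variant drops undefined values, which is not stated above.

```scala
// Hypothetical column with one undefined value.
val column: Vector[Option[Int]] = Vector(Some(1), None, Some(3))

// Analogue of toArray: one slot per row, undefined values stay visible as None.
val withOptions: Array[Option[Int]] = column.toArray

// Analogue of toFlatArray, ASSUMING undefined values are dropped.
val flat: Array[Int] = column.flatten.toArray
```

Under this assumption the flat array can be shorter than numRows, while the Option-based array always has one element per row.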

def toFlatList[T](col: String)(implicit evidence$71: RequireType[T], evidence$72: Typeable[T], evidence$73: ClassTag[T]): List[T]

Copies a column into a List.

Value Params
col

Column name.

Returns

List of type T.

Since

0.1.0

def toFlatSeq[T](col: String)(implicit evidence$74: RequireType[T], evidence$75: Typeable[T], evidence$76: ClassTag[T]): Seq[T]

Copies a column into a sequence.

Value Params
col

Column name.

Returns

Sequence of type T.

Since

0.1.0

def toList[T](col: String)(implicit evidence$77: RequireType[T], evidence$78: Typeable[T], evidence$79: ClassTag[T]): List[Option[T]]

Copies a column into a List.

Value Params
col

Column name.

Returns

List of type Option[T] with numRows elements.

Since

0.1.0

def toSeq[T](col: String)(implicit evidence$80: RequireType[T], evidence$81: Typeable[T], evidence$82: ClassTag[T]): Seq[Option[T]]

Copies a column into a sequence.

Value Params
col

Column name.

Returns

Sequence of type Option[T] with numRows elements.

Since

0.1.0

override def toString: String

Renders the DataFrame as a table using default parameters.

Returns

Formatted table.

Since

0.1.0

Definition Classes
Any
def toString(n: Int, width: Int, annotateIndex: Boolean, annotateType: Boolean, colWidth: Int, indexWidth: Int): String

Renders the DataFrame as a table.

Value Params
annotateIndex

If true, an index column is displayed.

annotateType

If true, the type of each column is displayed.

colWidth

The width of each column.

indexWidth

The width of the index column.

n

The maximum number of rows.

width

The maximal width of a line.

Returns

Formatted table.

Since

0.1.0

Appends row-wise one or multiple DataFrame objects. The method materializes all indices into a uniform index.

Value Params
df

DataFrame object to be appended.

Returns

DataFrame with exactly the same columns.

Throws
ColumnNotFoundException

If the DataFrame objects to be appended don't have the same columns.

SeriesCastException

If the underlying types of the columns do not match.

Since

0.1.0

def update[T](col: String, series: Series[T]): DataFrame

Updates a column with all values that are defined in the Series.

Value Params
col

Column name to be updated.

series

Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
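
The update semantics and the two notes above can be modelled with plain maps. This is a sketch only: the function update below is a hypothetical stand-in that represents a column as a map from index position to an optional value, and it signals an incompatible index with require rather than MergeIndexException.

```scala
// Hypothetical model: a column as a map from index position to optional value.
val column: Map[Int, Option[Double]] = Map(0 -> Some(1.0), 1 -> None, 2 -> Some(3.0))

// The updating series covers a subset of the index; only defined values count.
val series: Map[Int, Option[Double]] = Map(1 -> Some(2.0), 2 -> None)

def update[T](col: Map[Int, Option[T]], s: Map[Int, Option[T]]): Map[Int, Option[T]] = {
  // The series index must be included in the column's index.
  require(s.keySet.subsetOf(col.keySet), "series index not included in column index")
  // Replace a value only where the series defines one; keys never change.
  col.map { case (pos, old) => pos -> s.get(pos).flatten.orElse(old) }
}

val updated = update(column, series)
```

Position 2 keeps its old value because the series holds no defined value there, and the set of index positions is unchanged.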
def update[T](series: Series[T]): DataFrame

Updates a column with all values that are defined in the Series. The name of the Series defines the column.

Value Params
series

Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
def update(series: Series[_]*): DataFrame

Updates columns with all values that are defined in the Series objects. The names of the Series define the column names.

Value Params
series

Series.

Returns

DataFrame with updated columns.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data types do not match with the existing columns.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
def valueCounts(cols: String*): DataFrame

Counts the number of rows for unique value pairs in key columns (including undefined key values).

Value Params
cols

Key columns.

Returns

DataFrame with the key columns and a "count" column.

Since

0.1.0

def valueCounts(cols: Seq[String], countCol: String, dropUndefined: Boolean, order: Order, asFraction: Boolean): DataFrame

Counts the number of rows for unique value pairs in key columns.

Value Params
asFraction

If true, returns the fraction of rows for each unique key value, as a Double, relative to the total number of rows (including undefined values). Otherwise, the number of rows is returned as an Int column.

cols

Key columns.

countCol

Name of the resulting column.

dropUndefined

If true, drops rows with undefined values in a key column (the default includes undefined values).

order

Order of the countCol column.

Returns

DataFrame with the key columns and the countCol column.

Since

0.1.0
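
The counting options can be illustrated with the standard collections. This is a sketch of the described semantics for a single key column, not the library's implementation; the value names are illustrative, and ordering of the result is left out.

```scala
// Hypothetical key column with an undefined value.
val keys: Vector[Option[String]] = Vector(Some("a"), Some("b"), Some("a"), None, Some("a"))

// Count rows per unique key; None is kept as its own key here.
val counts: Map[Option[String], Int] = keys.groupMapReduce(k => k)(_ => 1)(_ + _)

// Analogue of dropUndefined = true.
val definedCounts: Map[String, Int] = counts.collect { case (Some(k), n) => k -> n }

// Analogue of asFraction = true: relative to all rows, undefined ones included.
val fractions: Map[Option[String], Double] =
  counts.map { case (k, n) => k -> n.toDouble / keys.size }
```

Because the fractions are relative to all rows, they still sum to 1 only when the undefined key is included.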

@targetName("concat")
def |[T](series: Series[T]): DataFrame

Concatenates the DataFrame on the left and the Series on the right hand side.

Value Params
series

Series to be concatenated.

Returns

DataFrame, where the Series is the last column.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the right Series must be included in the left DataFrame.
  • Data on the right hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the left operand.
  • The operator is equivalent to the method col.
@targetName("concat")
def |[T](namedSeries: (String, Series[T])): DataFrame

Concatenates the DataFrame on the left and the Series on the right hand side.

Value Params
namedSeries

Tuple of column name and Series to be concatenated.

Returns

DataFrame, where the Series is the last column.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the right Series must be included in the left DataFrame.
  • Data on the right hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the left operand.
@targetName("concat")
def |(df: DataFrame): DataFrame

Concatenates the DataFrame on the left and the DataFrame on the right hand side.

Value Params
df

DataFrame to be concatenated.

Returns

DataFrame, where the left DataFrame columns are prior to the columns of the right DataFrame.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the right DataFrame must be included in the left DataFrame.
  • Data on the right hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the left operand.

Inherited methods

def apply(series: Series[Boolean]): DataFrame

Slices the index by intersecting the current index with a boolean Series.

Value Params
series

Boolean Series used as a mask; only index positions whose value is true are kept.

Returns

Object with sliced index and order of series.

Since

0.1.0

Inherited from
IndexOps
def apply(array: Array[Int]): DataFrame

Slices the index by intersecting it with an array of index positions.

Value Params
array

Array of index positions.

Returns

Object with sliced index and order of array.

Since

0.1.0

Inherited from
IndexOps
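
Slicing by intersection can be sketched as follows. The function slice and the sample index are hypothetical; the sketch assumes that requested positions missing from the current index are simply dropped and that the result follows the order of the array, as the Returns text suggests.

```scala
// Hypothetical current index of an object (positions into the backing data).
val index: Vector[Int] = Vector(0, 2, 4, 6)

// Keep only requested positions that are part of the index, in the
// order given by the request.
def slice(idx: Vector[Int], positions: Array[Int]): Vector[Int] = {
  val present = idx.toSet
  positions.toVector.filter(present)
}

val sliced = slice(index, Array(6, 1, 2))
```

Position 1 is not part of the index and is therefore absent from the result.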
def apply(seq: Seq[Int]): DataFrame

Slices the index by intersecting it with a sequence of index positions.

Value Params
seq

Sequence of index positions.

Returns

Object with sliced index and order of seq.

Since

0.1.0

Inherited from
IndexOps
def apply(range: Range): DataFrame

Slices the index by intersecting it with a range.

Value Params
range

Range.

Returns

Object with sliced index with ascending order.

Since

0.1.0

Inherited from
IndexOps
def head(n: Int): DataFrame

Head of object.

Value Params
n

Number of rows.

Returns

First n rows in index.

Since

0.1.0

Inherited from
IndexOps

Sorts the index (ascending).

Returns

Object with sorted index.

Since

0.1.0

Inherited from
IndexOps
def tail(n: Int): DataFrame

Tail of object.

Value Params
n

Number of rows.

Returns

Last n rows in index.

Since

0.1.0

Inherited from
IndexOps