DataSet

abstract class DataSet[T <: Product & DataSet[T]] extends IndexOps[T]

A DataSet is similar to a DataFrame, but its column names and column types are static and can be checked at compile time.

Type Params
T

Case class that extends DataSet.

Since

0.1.0

Example
case class Members(
 firstName: Series[String],
 familyName: Series[String],
 fee: Series[Double],
) extends DataSet[Members]
Companion
object
trait IndexOps[T]
class Object
trait Matchable
class Any

Value members

Concrete methods

@targetName("update")
def &[T](namedSeries: (String, Series[T])): DataFrame
Implicitly added by toDataFrame

Updates a column with all values that are defined in the Series.

Value Params
namedSeries

Tuple of column name to be updated and Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
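The update rule above ("all values that are defined in the Series") can be illustrated with a minimal sketch in plain Scala. It does not use the library. A column is modeled as a Vector[Option[Double]] with None standing in for an undefined value, both vectors are assumed to share the same index, and the names UpdateSketch and update are hypothetical.

```scala
// Sketch only: models a column as Vector[Option[Double]]; not the library's implementation.
object UpdateSketch {
  type Column = Vector[Option[Double]]

  // Take the updating value where it is defined, otherwise keep the existing value.
  def update(existing: Column, updates: Column): Column =
    existing.zip(updates).map { case (old, upd) => upd.orElse(old) }

  val fee: Column     = Vector(Some(10.0), Some(20.0), None)
  val changes: Column = Vector(None, Some(25.0), Some(30.0))
  val updated: Column = update(fee, changes)
}
```

In this model the first entry keeps its old value because the update is undefined there, while the other two entries are overwritten.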
@targetName("updateOperator")
def &[T](series: Series[T]): DataFrame
Implicitly added by toDataFrame

Updates a column with all values that are defined in the Series. The name of the Series defines the column.

Value Params
series

Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
@targetName("prepend")
def ::[T](series: Series[T]): DataFrame
Implicitly added by toDataFrame

Concatenates the Series on the left and the DataFrame on the right hand side.

Value Params
series

Series.

Returns

DataFrame, where the Series is the first column.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the left Series must be included in the right DataFrame.
  • Data on the left hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the right operand.
  • The operators | and :: are equivalent if indices on the left and right side are equal.
@targetName("prepend")
Implicitly added by toDataFrame

Concatenates the DataFrame on the left and the DataFrame on the right hand side.

Value Params
df

DataFrame.

Returns

DataFrame, where the left DataFrame columns are prior to the columns of the right DataFrame.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the left DataFrame must be included in the right DataFrame.
  • Data on the left hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the right operand.
  • The operators | and :: are equivalent if indices on the left and right side are equal.
def append[T](name: String, series: Series[T]): DataFrame
Implicitly added by toDataFrame

Appends a column if not existing in DataFrame. If the column exists, it is not altered.

Value Params
name

Column name.

series

Series.

Returns

DataFrame with the column appended on the right hand side.

Throws
MergeIndexException

If indices are not compatible. If the column exists, no exception is thrown.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • Data of the Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
def append(series: Series[_]*): DataFrame
Implicitly added by toDataFrame

Appends columns if not existing in DataFrame. If a column exists, it is not altered.

Value Params
series

Series to be added. The column names are taken from the Series.

Returns

DataFrame with the columns appended on the right hand side.

Throws
MergeIndexException

If indices are not compatible. If the column exists, no exception is thrown.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • Data of the Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
def apply(col: String): Series[Any]
Implicitly added by toDataFrame

Returns a column as a Series.

Value Params
col

Column name.

Returns

Column as a Series with the name of the column.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply[T](row: Int, col: String)(implicit evidence$1: RequireType[T], evidence$2: Typeable[T], evidence$3: ClassTag[T]): Option[T]
Implicitly added by toDataFrame

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value as Option.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def apply[T](row: Option[Int], col: String)(implicit evidence$4: RequireType[T], evidence$5: Typeable[T], evidence$6: ClassTag[T]): Option[T]
Implicitly added by toDataFrame

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value as Option. None if row is None.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def apply[T](row: Int, col: String, default: => T)(implicit evidence$7: Typeable[T], evidence$8: ClassTag[T]): T
Implicitly added by toDataFrame

Returns the value for a column and a row, using a default value for undefined entries.

Value Params
col

Column name.

default

Default value for undefined values.

row

Row.

Returns

Value.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def apply[T](row: Option[Int], col: String, default: => T)(implicit evidence$9: Typeable[T], evidence$10: ClassTag[T]): T
Implicitly added by toDataFrame

Returns the value for a column and a row, using a default value for undefined entries.

Value Params
col

Column name.

default

Default value for undefined values.

row

Row.

Returns

Value. Default value if row is None.

Throws
ColumnNotFoundException

If the column is not found.

IndexBoundsException

If row is not part of the base index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.
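The difference between the Option-returning apply variants and the variants with a default value can be sketched in plain Scala. This is a simplified model, not the library: the column is a Vector[Option[Double]], and the names CellAccessSketch, cell, cellOrElse and cellOrElseOpt are hypothetical.

```scala
// Sketch only: a column with one undefined entry, modeled with Option.
object CellAccessSketch {
  val fee: Vector[Option[Double]] = Vector(Some(12.5), None, Some(7.0))

  // Option-returning access: an undefined entry yields None.
  // (A Vector throws IndexOutOfBoundsException for an invalid row;
  // the documented methods throw IndexBoundsException instead.)
  def cell(row: Int): Option[Double] = fee(row)

  // Access with a default: the by-name default is evaluated only for undefined entries.
  def cellOrElse(row: Int, default: => Double): Double =
    fee(row).getOrElse(default)

  // Optional row: a missing row (None) also falls back to the default.
  def cellOrElseOpt(row: Option[Int], default: => Double): Double =
    row.flatMap(r => fee(r)).getOrElse(default)
}
```

Because the default is passed by name, an expensive fallback expression costs nothing for rows whose value is defined.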

def apply(range: Range, col: String): Series[Any]
Implicitly added by toDataFrame

Extracts a column and slices the index by intersecting the current index with a range.

Value Params
col

Column name.

range

Range.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply(seq: Seq[Int], col: String): Series[Any]
Implicitly added by toDataFrame

Extracts a column and slices the index by intersecting the current index with a sequence of index positions.

Value Params
col

Column name.

seq

Sequence of index positions.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply(array: Array[Int], col: String): Series[Any]
Implicitly added by toDataFrame

Extracts a column and slices the index by intersecting the current index with an array of index positions.

Value Params
array

Array of index positions.

col

Column name.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

def apply(series: Series[Boolean], col: String): Series[Any]
Implicitly added by toDataFrame

Extracts a column and slices the index by intersecting the current index with a boolean Series.

Value Params
col

Column name.

series

Boolean Series used as a mask; only index positions whose value is true are kept.

Returns

Series with sliced index.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0
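Slicing by a boolean mask, as in apply(series: Series[Boolean], col: String), can be pictured as filtering the current index. The following is a minimal sketch in plain Scala rather than library code. The mask is modeled as a Map from index position to Boolean, positions missing from the mask are treated as false (an assumption of this sketch), and the names MaskSketch and slice are hypothetical.

```scala
// Sketch only: intersects an index with the positions where a mask is true.
object MaskSketch {
  // Keep the positions of `index` for which the mask is defined and true.
  def slice(index: Vector[Int], mask: Map[Int, Boolean]): Vector[Int] =
    index.filter(i => mask.getOrElse(i, false))

  val index: Vector[Int]       = Vector(0, 1, 2, 3)
  val mask: Map[Int, Boolean]  = Map(0 -> true, 1 -> false, 3 -> true) // position 2 is undefined
  val kept: Vector[Int]        = slice(index, mask)
}
```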

@targetName("applyCols")
def apply(cols: Seq[String]): DataFrame
Implicitly added by toDataFrame

Extracts DataFrame with selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If one of the columns is not found.

Since

0.1.0

def canEqual(a: Any): Boolean
Implicitly added by toDataFrame

Determines if an object is a DataFrame.

Value Params
a

Any object.

Returns

True if the object is a DataFrame and false otherwise.

Since

0.1.0

def col[T](series: Series[T]): DataFrame
Implicitly added by toDataFrame

Appends (or replaces) one column.

Value Params
series

Series to be concatenated.

Returns

DataFrame, where the Series is appended on the right side.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • An existing column with the same name is replaced.
  • The index of the Series must be included in the left DataFrame.
  • Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
  • This operation is equivalent to the | operator.
def col[T](col: String, series: Series[T]): DataFrame
Implicitly added by toDataFrame

Appends (or replaces) a column.

Value Params
col

Name of column to be appended.

series

Series.

Returns

DataFrame, where the Series is appended on the right side.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • An existing column with the same name is replaced.
  • The index of the Series must be included in the left DataFrame.
  • Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
  • This operation is equivalent to the | operator.
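The contrast between append (an existing column is not altered) and col (an existing column is replaced) can be sketched in plain Scala. This models a frame as an ordered Vector of (name, values) pairs and is not the library's implementation. The names AppendVsColSketch, Frame, append and col are local to the sketch, and keeping a replaced column at its original position is an assumption.

```scala
// Sketch only: a frame is modeled as an ordered list of (name, values) pairs.
object AppendVsColSketch {
  type Frame = Vector[(String, Vector[Double])]

  // append-like: add the column on the right only if the name is not present yet.
  def append(frame: Frame, name: String, values: Vector[Double]): Frame =
    if (frame.exists(_._1 == name)) frame else frame :+ (name -> values)

  // col-like: replace an existing column of the same name, otherwise add it on the right.
  // Assumption: a replaced column keeps its position.
  def col(frame: Frame, name: String, values: Vector[Double]): Frame =
    if (frame.exists(_._1 == name))
      frame.map { case (n, v) => if (n == name) (n, values) else (n, v) }
    else frame :+ (name -> values)

  val frame: Frame    = Vector("price" -> Vector(10.0, 20.0))
  val appended: Frame = append(frame, "price", Vector(1.0, 2.0)) // unchanged
  val replaced: Frame = col(frame, "price", Vector(1.0, 2.0))    // values replaced
}
```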
Implicitly added by toDataFrame

Appends (or replaces) multiple columns.

Returns

Appender object which appends Series on the right side.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Existing columns with the same name are replaced by the rightmost column.
  • The index of the Series must be included in the left DataFrame.
  • Series might be copied if indices are not equivalent.
  • The index of the DataFrame is not altered.
  • For one column, this operation is equivalent to the | operator.
Example
 df.cols("price" -> Series(10.0, 20.0), "quantity" -> Series(5, 2))
 df.cols(Series(10.0, 20.0) as "price", Series(5, 2) as "quantity")
 df.cols(Series("price")(10.0, 20.0), Series("quantity")(5, 2))
 df.cols(price = Series(10.0, 20.0), quantity = Series(5, 2))
def columnArray: Array[Series[Any]]
Implicitly added by toDataFrame

All columns as array of Series.

Returns

Array with all columns as Series in defined order. All Series have the same index as the DataFrame.

Since

0.1.0

def columnIterator: Iterable[Series[Any]]
Implicitly added by toDataFrame

Iterates over all columns.

Returns

Iterable over all columns in defined order.

Since

0.1.0

def columns: Seq[String]
Implicitly added by toDataFrame

Sequence with column names.

Returns

Sequence with column names in defined order.

Since

0.1.0

def contains(col: String): Boolean
Implicitly added by toDataFrame

Determines if a column is in the DataFrame.

Value Params
col

Column name.

Returns

True if DataFrame has the column col and false otherwise.

Since

0.1.0

def display(n: Int, width: Int, colWidth: Int): Unit
Implicitly added by toDataFrame

Prints the DataFrame as a table with an index column and annotated column types.

Value Params
colWidth

The width of each column.

n

The maximum number of rows.

width

The maximum width of a line.

Since

0.1.0

Implicitly added by toDataFrame

Drops all rows with undefined (null) values.

Returns

DataFrame restricted to rows without undefined values in all columns.

Since

0.1.0

def dropUndefined(cols: String*): DataFrame
Implicitly added by toDataFrame

Drops all rows with undefined (null) values with respect to specified columns.

Value Params
cols

Columns.

Returns

DataFrame restricted to rows without undefined values in columns cols.

Throws
ColumnNotFoundException

If one of the columns cols does not exist.

Since

0.1.0
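The two dropUndefined variants differ only in which columns are checked. A minimal sketch in plain Scala, not using the library, models rows with Option fields. The names DropUndefinedSketch and Row and the sample data are hypothetical.

```scala
// Sketch only: rows with Option fields stand in for a DataFrame with undefined values.
object DropUndefinedSketch {
  final case class Row(firstName: Option[String], fee: Option[Double])

  val rows: Vector[Row] = Vector(
    Row(Some("Ada"), Some(10.0)),
    Row(None, Some(5.0)),
    Row(Some("Bob"), None)
  )

  // Like dropUndefined without arguments: every column must be defined.
  val complete: Vector[Row] = rows.filter(r => r.firstName.isDefined && r.fee.isDefined)

  // Like dropUndefined("fee"): only the listed column is checked.
  val withFee: Vector[Row] = rows.filter(_.fee.isDefined)
}
```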

def get[T](row: Int, col: String)(implicit evidence$11: RequireType[T], evidence$12: Typeable[T], evidence$13: ClassTag[T]): T
Implicitly added by toDataFrame

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value.

Throws
ColumnNotFoundException

If the column is not found.

NoSuchElementException

If the value is undefined or row is not in the index.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def get[T](row: Option[Int], col: String)(implicit evidence$14: RequireType[T], evidence$15: Typeable[T], evidence$16: ClassTag[T]): T
Implicitly added by toDataFrame

Returns the value for a column and a row.

Value Params
col

Column name.

row

Row.

Returns

Value.

Throws
ColumnNotFoundException

If the column is not found.

NoSuchElementException

If the value is undefined, row is not in the index or row is None.

Since

0.1.0

Note

For an optimal performance in a loop, first extract the column as a Series.

def groupBy(col: String): Groups[Any]
Implicitly added by toDataFrame

Groups the DataFrame by a column.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupBy(col1: String, col2: String): Groups[(Any, Any)]
Implicitly added by toDataFrame

Groups the DataFrame by columns.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupBy(col1: String, col2: String, col3: String): Groups[(Any, Any, Any)]
Implicitly added by toDataFrame

Groups the DataFrame by columns.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupBy(cols: Seq[String]): Groups[Seq[Any]]
Implicitly added by toDataFrame

Groups the DataFrame by a sequence of columns (of arbitrary length).

Value Params
cols

Sequence with column names.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByCol[T](col: String)(implicit evidence$17: Typeable[T], evidence$18: ClassTag[T]): Groups[T]
Implicitly added by toDataFrame

Groups the DataFrame by a typed column.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByCol[T1, T2](col1: String, col2: String)(implicit evidence$19: Typeable[T1], evidence$20: ClassTag[T1], evidence$21: Typeable[T2], evidence$22: ClassTag[T2]): Groups[(T1, T2)]
Implicitly added by toDataFrame

Groups the DataFrame by typed columns.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByCol[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$23: Typeable[T1], evidence$24: ClassTag[T1], evidence$25: Typeable[T2], evidence$26: ClassTag[T2], evidence$27: Typeable[T3], evidence$28: ClassTag[T3]): Groups[(T1, T2, T3)]
Implicitly added by toDataFrame

Groups the DataFrame by typed columns.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def groupByColOption[T](col: String)(implicit evidence$29: Typeable[T], evidence$30: ClassTag[T]): Groups[Option[T]]
Implicitly added by toDataFrame

Groups the DataFrame by a typed column including undefined values.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByColOption[T1, T2](col1: String, col2: String)(implicit evidence$31: Typeable[T1], evidence$32: ClassTag[T1], evidence$33: Typeable[T2], evidence$34: ClassTag[T2]): Groups[(Option[T1], Option[T2])]
Implicitly added by toDataFrame

Groups the DataFrame by typed columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByColOption[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$35: Typeable[T1], evidence$36: ClassTag[T1], evidence$37: Typeable[T2], evidence$38: ClassTag[T2], evidence$39: Typeable[T3], evidence$40: ClassTag[T3]): Groups[(Option[T1], Option[T2], Option[T3])]
Implicitly added by toDataFrame

Groups the DataFrame by typed columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(col: String): Groups[Option[Any]]
Implicitly added by toDataFrame

Groups the DataFrame by a column including undefined values.

Value Params
col

Column name.

Returns

Groups.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(col1: String, col2: String): Groups[(Option[Any], Option[Any])]
Implicitly added by toDataFrame

Groups the DataFrame by columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(col1: String, col2: String, col3: String): Groups[(Option[Any], Option[Any], Option[Any])]
Implicitly added by toDataFrame

Groups the DataFrame by columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def groupByOption(cols: Seq[String]): Groups[Seq[Option[Any]]]
Implicitly added by toDataFrame

Groups the DataFrame by a sequence of columns (of arbitrary length) including undefined values.

Value Params
cols

Sequence with column names.

Returns

Groups.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.
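How the groupBy and groupByOption families treat undefined keys can be sketched with plain Scala collections. This is a model of the documented behavior, not the library's implementation: the key column is a Vector[Option[String]], groups are lists of row positions, and the names GroupingSketch, city, ignoringUndefined and includingUndefined are hypothetical. The Double.NaN rule is not modeled here.

```scala
// Sketch only: grouping row positions by an optional key.
object GroupingSketch {
  val city: Vector[Option[String]] =
    Vector(Some("Bern"), None, Some("Basel"), Some("Bern"), None)

  // Row positions per key; None collects all rows with an undefined key.
  val rowsByKey: Map[Option[String], Vector[Int]] =
    city.indices.toVector.groupBy(i => city(i))

  // groupBy-like behavior: rows with an undefined key are ignored.
  val ignoringUndefined: Map[String, Vector[Int]] =
    rowsByKey.collect { case (Some(k), rows) => k -> rows }

  // groupByOption-like behavior: undefined keys form one group under None.
  val includingUndefined: Map[Option[String], Vector[Int]] = rowsByKey
}
```

The Option-keyed result has one more group than the plain result, which is the group of rows whose key is undefined.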

def indexBy[T](col: String)(implicit evidence$41: Typeable[T], evidence$42: ClassTag[T]): DataMap[T]
Implicitly added by toDataFrame

Indexes the DataFrame by a (typed) column.

Value Params
col

Column name.

Returns

DataMap.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexBy[T1, T2](col1: String, col2: String)(implicit evidence$43: Typeable[T1], evidence$44: ClassTag[T1], evidence$45: Typeable[T2], evidence$46: ClassTag[T2]): DataMap[(T1, T2)]
Implicitly added by toDataFrame

Indexes the DataFrame by (typed) columns.

Value Params
col1

Column name.

col2

Column name.

Returns

DataMap.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexBy[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$47: Typeable[T1], evidence$48: ClassTag[T1], evidence$49: Typeable[T2], evidence$50: ClassTag[T2], evidence$51: Typeable[T3], evidence$52: ClassTag[T3]): DataMap[(T1, T2, T3)]
Implicitly added by toDataFrame

Indexes the DataFrame by (typed) columns.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

DataMap.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexBy(cols: Seq[String]): DataMap[Seq[Any]]
Implicitly added by toDataFrame

Indexes the DataFrame by a sequence of columns (of arbitrary length).

Value Params
cols

Sequence with column names.

Returns

DataMap.

Throws
IllegalOperation

If sequence of columns is empty.

Since

0.1.0

Note

Undefined (null) grouping values are ignored. Each Double.NaN value represents an individual group.

def indexByOption[T](col: String)(implicit evidence$53: Typeable[T], evidence$54: ClassTag[T]): DataMap[Option[T]]
Implicitly added by toDataFrame

Indexes the DataFrame by a (typed) column including undefined values.

Value Params
col

Column name.

Returns

DataMap with Option keys.

Throws
ColumnNotFoundException

If the column does not exist.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexByOption[T1, T2](col1: String, col2: String)(implicit evidence$55: Typeable[T1], evidence$56: ClassTag[T1], evidence$57: Typeable[T2], evidence$58: ClassTag[T2]): DataMap[(Option[T1], Option[T2])]
Implicitly added by toDataFrame

Indexes the DataFrame by (typed) columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

Returns

DataMap with Option keys.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexByOption[T1, T2, T3](col1: String, col2: String, col3: String)(implicit evidence$59: Typeable[T1], evidence$60: ClassTag[T1], evidence$61: Typeable[T2], evidence$62: ClassTag[T2], evidence$63: Typeable[T3], evidence$64: ClassTag[T3]): DataMap[(Option[T1], Option[T2], Option[T3])]
Implicitly added by toDataFrame

Indexes the DataFrame by (typed) columns including undefined values.

Value Params
col1

Column name.

col2

Column name.

col3

Column name.

Returns

DataMap with Option keys.

Since

0.1.0

Note

Undefined (null) grouping values are assembled in one group. Each Double.NaN value represents an individual group.

def indexIterator: Iterator[Int]
Implicitly added by toDataFrame

Iterator over the index.

Returns

Iterator over the index in the current order.

Since

0.1.0

def info: String
Implicitly added by toDataFrame

Information string describing the DataFrame.

Returns

Info string.

Since

0.1.0

Implicitly added by toDataFrame

Adapter to write the DataFrame to a specific output. For example, use

import pd.io.parquet.implicits

to import the respective format and write a DataFrame via

df.io.parquet.write(...)
Returns

WriteAdapter.

Since

0.1.0

def joinInner(df: DataFrame, cols: String*): DataFrame
Implicitly added by toDataFrame

Performs an inner join equivalent to a SQL inner join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
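The rule that two undefined key values never satisfy a join condition can be sketched in plain Scala. This is a model of an inner join on a single key, not the library's implementation. The case classes Member and Payment, the object name InnerJoinSketch and the sample data are hypothetical.

```scala
// Sketch only: inner join on an optional key where None never matches None.
object InnerJoinSketch {
  final case class Member(id: Option[Int], name: String)
  final case class Payment(id: Option[Int], fee: Double)

  val members: Vector[Member] =
    Vector(Member(Some(1), "Ada"), Member(None, "Bob"), Member(Some(3), "Cy"))
  val payments: Vector[Payment] =
    Vector(Payment(Some(1), 10.0), Payment(None, 99.0), Payment(Some(2), 5.0))

  // Only a defined key on the left can match, and only against an equal defined key on the right.
  val joined: Vector[(Int, String, Double)] =
    for {
      m <- members
      p <- payments
      k <- m.id.toVector // skips rows whose left key is undefined
      if p.id.contains(k) // false whenever the right key is undefined
    } yield (k, m.name, p.fee)
}
```

Both inputs contain a row with an undefined key, yet the result holds only the row for key 1.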
def joinLeft(df: DataFrame, cols: String*): DataFrame
Implicitly added by toDataFrame

Performs a left join equivalent to a SQL left outer join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
def joinOuter(df: DataFrame, cols: String*): DataFrame
Implicitly added by toDataFrame

Performs an outer join equivalent to a SQL full outer join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
def joinRight(df: DataFrame, cols: String*): DataFrame
Implicitly added by toDataFrame

Performs a right join equivalent to a SQL right outer join with respect to the specified key columns.

Value Params
cols

Key columns to be joined.

df

DataFrame to be joined.

Returns

The joined DataFrame.

Throws
IllegalOperation

If not at least one key column is specified.

Since

0.1.0

Note
  • Non-key columns of df that also appear in the original DataFrame are dropped. To keep these columns in the result, rename them first.
  • The comparison of two undefined values (null) is false and does not satisfy a join condition.
Implicitly added by toDataFrame

Merges two DataFrames by index.

Value Params
df

DataFrame to merge.

Returns

DataFrame with all columns of df concatenated on the right side.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The order of index positions is only preserved if the indices are equal.
  • Use the join methods if you intend to join via column values.
def merge(series: Series[_]*): DataFrame
Implicitly added by toDataFrame

Merges one or more Series into a DataFrame by index.

Value Params
series

Series to concatenate.

Returns

DataFrame with all Series concatenated on the right side.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The order of index positions is only preserved if all indices are equal.
  • Use the join methods if you intend to join via column values.
def numCols: Int
Implicitly added by toDataFrame

Number of columns.

Returns

Number of columns.

Since

0.1.0

def numRows: Int
Implicitly added by toDataFrame

Number of rows.

Returns

Number of rows, i.e. number of elements in the index.

Since

0.1.0

def numRowsBase: Int
Implicitly added by toDataFrame

Number of rows of the underlying data vectors.

Returns

Number of elements in the base index.

Since

0.1.0

def plot: Plot
Implicitly added by toDataFrame

Creates a plot.

Returns

Plot object.

Since

0.1.0

Implicitly added by toDataFrame

Requirement object which is used to throw exceptions if conditions are not met.

Returns

Requirement object.

Since

0.1.0

Implicitly added by toDataFrame

Resets the index to a UniformIndex with index positions 0 to numRows - 1 while keeping the order of the elements. If the current index is not a uniform index, the columns are copied into new vectors with the order of the current index.

Returns

DataFrame with a UniformIndex.

See also

sortIndex for sorting the index by index positions.

Since

0.1.0
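
As a rough illustration of the reset described above, the following library-independent sketch models a column as a base vector plus an index of positions. The names IndexedColumn and resetIndexSketch are hypothetical and not part of the library; the actual implementation may differ.

```scala
// Hypothetical model, not the library's implementation:
// `base` holds the underlying data, `index` lists the visible positions in order.
final case class IndexedColumn(base: Vector[String], index: Vector[Int]):
  def numRows: Int = index.length      // number of elements in the index
  def numRowsBase: Int = base.length   // number of elements in the underlying vector

// Copies the visible elements in index order and assigns positions 0 to numRows - 1.
def resetIndexSketch(col: IndexedColumn): IndexedColumn =
  val copied = col.index.map(col.base)
  IndexedColumn(copied, copied.indices.toVector)

val sliced = IndexedColumn(Vector("a", "b", "c", "d", "e"), Vector(4, 1, 3))
val reset = resetIndexSketch(sliced)
// reset.base  == Vector("e", "b", "d")
// reset.index == Vector(0, 1, 2)
```

In this model the element order is kept, while the number of base rows shrinks from 5 to 3 because only the visible elements are copied.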

def select(cols: Array[String]): DataFrame
Implicitly added by toDataFrame

Extracts DataFrame with selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If a column is not found.

Since

0.1.0

@targetName("selectSeq")
def select(cols: Seq[String]): DataFrame
Implicitly added by toDataFrame

Extracts DataFrame with selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If a column is not found.

Since

0.1.0

@targetName("selectCols")
def select(cols: String*): DataFrame
Implicitly added by toDataFrame

Extracts DataFrame with selected columns.

Value Params
cols

Column names.

Returns

DataFrame with the selected columns.

Throws
ColumnNotFoundException

If a column is not found.

Since

0.1.0

@unused
def selectDynamic(col: String): Series[Any]
Implicitly added by toDataFrame

Selects a column via dot notation.

Value Params
col

Column name.

Returns

Column as a Series with the name of the column.

Throws
ColumnNotFoundException

If the column is not found.

Since

0.1.0

Example
df.myColumn
def show(n: Int, width: Int, annotateIndex: Boolean, annotateType: Boolean, colWidth: Int): Unit
Implicitly added by toDataFrame

Prints the DataFrame as a table.

Value Params
annotateIndex

If true, an index column is displayed.

annotateType

If true, the type of each column is displayed.

colWidth

The width of each column.

n

The maximum number of rows.

width

The maximum width of a line.

Since

0.1.0

@targetName("sortValuesCols")
def sortValues(cols: String*): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame in ascending order with respect to columns cols for supported types.

Value Params
cols

Columns which define the order.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
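
The two notes above can be illustrated with plain Scala collections. This is a sketch of the documented behaviour only, not a call into the library; the sample rows are made up.

```scala
// Plain Scala collections, not the library API: rows are (name, id) pairs.
val people = Seq(("bob", 1), ("Alice", 2), ("alice", 3), ("Bob", 4))

// sortWith is stable, so rows whose names compare equal keep their original order.
val byNameIgnoringCase =
  people.sortWith((a, b) => a._1.compareToIgnoreCase(b._1) < 0)
// Seq(("Alice", 2), ("alice", 3), ("bob", 1), ("Bob", 4))
```

"Alice" and "alice" compare as equal when case is ignored, so stability decides their relative order.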
def sortValues(keys: (String, Order)*): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame with respect to the columns and orders given in keys for supported types.

Value Params
keys

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T](col: String)(implicit ordering: Ordering[T]): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame in ascending order with respect to column col.

Value Params
col

Column which defines the order.

ordering

Implicit ordering that must exist for the type T, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T1, T2](col1: String, col2: String)(implicit ordering1: Ordering[T1], ordering2: Ordering[T2]): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame in ascending order with respect to two columns.

Value Params
col1

Column which defines the order.

col2

Column which defines the secondary order.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
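
Primary and secondary order can be pictured with a tuple ordering on plain Scala collections. The sketch below does not use the library; note that the default Ordering[String] used here is case-sensitive, so the sample values share the same capitalisation.

```scala
// Plain Scala collections, not the library API: rows are (city, age) pairs.
val rows = Seq(("Oslo", 30), ("Bern", 41), ("Oslo", 25), ("Bern", 19))

// The tuple ordering compares the first component and uses the second only to break ties.
val byCityThenAge = rows.sortBy(r => (r._1, r._2))
// Seq(("Bern", 19), ("Bern", 41), ("Oslo", 25), ("Oslo", 30))
```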
def sorted[T1, T2, T3](col1: String, col2: String, col3: String)(implicit ordering1: Ordering[T1], ordering2: Ordering[T2], ordering3: Ordering[T3]): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame in ascending order with respect to three columns.

Value Params
col1

Column which defines the order.

col2

Column which defines the secondary order.

col3

Column which defines the tertiary order.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

ordering3

Implicit ordering that must exist for the type T3, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T](key: (String, Order))(implicit ordering: Ordering[T]): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame with respect to a column.

Value Params
key

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst. Implicit ordering must exist for the type T, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T1, T2](key1: (String, Order), key2: (String, Order))(implicit ordering1: Ordering[T1], ordering2: Ordering[T2]): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame with respect to two columns.

Value Params
key1

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst. Implicit ordering must exist for the type T, i.e. classes must extend the trait Ordered.

key2

Secondary sorting key. See key1 for details.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def sorted[T1, T2, T3](key1: (String, Order), key2: (String, Order), key3: (String, Order))(implicit ordering1: Ordering[T1], ordering2: Ordering[T2], ordering3: Ordering[T3]): DataFrame
Implicitly added by toDataFrame

Sorts the DataFrame with respect to three columns.

Value Params
key1

Tuple column -> order with column name and order for sorting. Possible values are Order.asc, Order.desc, Order.ascNullsFirst and Order.descNullsFirst. Implicit ordering must exist for the type T, i.e. classes must extend the trait Ordered.

key2

Secondary sorting key. See key1 for details.

key3

Tertiary sorting key. See key1 for details.

ordering1

Implicit ordering that must exist for the type T1, i.e. classes must extend the trait Ordered.

ordering2

Implicit ordering that must exist for the type T2, i.e. classes must extend the trait Ordered.

ordering3

Implicit ordering that must exist for the type T3, i.e. classes must extend the trait Ordered.

Returns

DataFrame with sorted index.

Since

0.1.0

Note
  • The sorting algorithm is stable.
  • String values are sorted lexicographically ignoring case differences (see String.compareToIgnoreCase).
def toArray[T](col: String)(implicit evidence$65: RequireType[T], evidence$66: Typeable[T], evidence$67: ClassTag[T]): Array[Option[T]]
Implicitly added by toDataFrame

Copies a column into an array.

Value Params
col

Column name.

Returns

Array of type Option[T] with numRows elements.

Since

0.1.0

inline def toDf: DataFrame

Converts the DataSet into a regular DataFrame object.

Returns

DataFrame.

Since

0.1.0

def toFlatArray[T](col: String)(implicit evidence$68: RequireType[T], evidence$69: Typeable[T], evidence$70: ClassTag[T]): Array[T]
Implicitly added by toDataFrame

Copies a column into an array.

Value Params
col

Column name.

Returns

Array of type T.

Since

0.1.0
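
The difference between the Option-wrapped and the flat return types can be sketched with plain Scala arrays. This page does not state how the flat variants treat undefined values; the sketch assumes they are simply left out, which is an assumption and not confirmed behaviour.

```scala
// Plain Scala, not the library API: a column with one undefined value.
val fee: Array[Option[Double]] = Array(Some(12.5), None, Some(40.0))

// Option-wrapped form: one element per row, None marks an undefined value.
val wrappedLength = fee.length

// Flat form under the stated assumption: undefined values are dropped.
val flatFee: Array[Double] = fee.collect { case Some(v) => v }
```

Under this assumption the flat array can be shorter than numRows, whereas the Option-wrapped array always has one element per row.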

def toFlatList[T](col: String)(implicit evidence$71: RequireType[T], evidence$72: Typeable[T], evidence$73: ClassTag[T]): List[T]
Implicitly added by toDataFrame

Copies a column into a List.

Value Params
col

Column name.

Returns

List of type T.

Since

0.1.0

def toFlatSeq[T](col: String)(implicit evidence$74: RequireType[T], evidence$75: Typeable[T], evidence$76: ClassTag[T]): Seq[T]
Implicitly added by toDataFrame

Copies a column into a sequence.

Value Params
col

Column name.

Returns

Sequence of type T.

Since

0.1.0

def toList[T](col: String)(implicit evidence$77: RequireType[T], evidence$78: Typeable[T], evidence$79: ClassTag[T]): List[Option[T]]
Implicitly added by toDataFrame

Copies a column into a List.

Value Params
col

Column name.

Returns

List of type Option[T] with numRows elements.

Since

0.1.0

def toSeq[T](col: String)(implicit evidence$80: RequireType[T], evidence$81: Typeable[T], evidence$82: ClassTag[T]): Seq[Option[T]]
Implicitly added by toDataFrame

Copies a column into a sequence.

Value Params
col

Column name.

Returns

Sequence of type Option[T] with numRows elements.

Since

0.1.0

def toString(n: Int, width: Int, annotateIndex: Boolean, annotateType: Boolean, colWidth: Int, indexWidth: Int): String
Implicitly added by toDataFrame

Renders the DataFrame as a table.

Value Params
annotateIndex

If true, an index column is displayed.

annotateType

If true, the type of each column is displayed.

colWidth

The width of each column.

indexWidth

The width of the index column.

n

The maximum number of rows.

width

The maximum width of a line.

Returns

Formatted table.

Since

0.1.0

Implicitly added by toDataFrame

Appends row-wise one or multiple DataFrame objects. The method materializes all indices into a uniform index.

Value Params
df

DataFrame object to be appended.

Returns

DataFrame with exactly the same columns.

Throws
ColumnNotFoundException

If the DataFrame objects to be appended don't have the same columns.

SeriesCastException

If the underlying types of the columns do not match.

Since

0.1.0

def update[T](col: String, series: Series[T]): DataFrame
Implicitly added by toDataFrame

Updates a column with all values that are defined in the Series.

Value Params
col

Column name to be updated.

series

Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
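
The update rules above can be pictured with a small model in which a column maps index positions to optional values. The names ColumnSketch and updateSketch are hypothetical, and the sketch assumes that undefined values in the Series leave the existing value untouched; it is not the library's implementation.

```scala
// Hypothetical model, not the library's implementation:
// a column maps index positions to optional values.
type ColumnSketch = Map[Int, Option[Double]]

def updateSketch(column: ColumnSketch, series: ColumnSketch): ColumnSketch =
  // The library documents a MergeIndexException for this case; the sketch uses `require`.
  require(series.keySet.subsetOf(column.keySet), "Series index must be included in the column index")
  // Only defined values overwrite; the set of index positions stays the same.
  column.map { case (pos, old) =>
    pos -> series.get(pos).flatten.orElse(old)
  }

val feeColumn: ColumnSketch = Map(0 -> Some(10.0), 1 -> Some(20.0), 2 -> None)
val patch: ColumnSketch = Map(1 -> Some(25.0), 2 -> None)
val updatedFee = updateSketch(feeColumn, patch)
// Map(0 -> Some(10.0), 1 -> Some(25.0), 2 -> None)
```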
def update[T](series: Series[T]): DataFrame
Implicitly added by toDataFrame

Updates a column with all values that are defined in the Series. The name of the Series defines the column.

Value Params
series

Series.

Returns

DataFrame with updated column.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data type does not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
def update(series: Series[_]*): DataFrame
Implicitly added by toDataFrame

Updates columns with all values that are defined in the Series objects. The names of the Series define the column names.

Value Params
series

Series.

Returns

DataFrame with updated columns.

Throws
MergeIndexException

If indices are not compatible.

SeriesCastException

If the data types do not match with the existing column.

Since

0.1.0

Note
  • The index of the Series must be included in the DataFrame.
  • The index of the DataFrame is not altered.
def valueCounts(cols: String*): DataFrame
Implicitly added by toDataFrame

Counts the number of rows for unique value pairs in key columns (including undefined key values).

Value Params
cols

Key columns.

Returns

DataFrame with the key columns and a "count" column.

Since

0.1.0

def valueCounts(cols: Seq[String], countCol: String, dropUndefined: Boolean, order: Order, asFraction: Boolean): DataFrame
Implicitly added by toDataFrame

Counts the number of rows for unique value pairs in key columns.

Value Params
asFraction

If true, returns for each unique key value the fraction of rows as a Double, relative to the total number of rows (including undefined values). Otherwise the number of rows is returned as an Int column.

cols

Key columns.

countCol

Name of the resulting column.

dropUndefined

If true, drops rows with undefined values in a key column (the default includes undefined values).

order

Order of the countCol column.

Returns

DataFrame with the key columns and the countCol column.

Since

0.1.0
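
The counting semantics described above can be sketched on a single key column with plain Scala collections. This does not call the library; it only mirrors the documented meaning of counts, fractions and dropping undefined keys, using made-up sample values.

```scala
// Plain Scala, not the library API: one key column with an undefined value.
val status: Seq[Option[String]] = Seq(Some("paid"), None, Some("paid"), Some("open"))

// Number of rows per distinct key; None (undefined) is counted as its own key.
val counts: Map[Option[String], Int] =
  status.groupMapReduce(identity)(_ => 1)(_ + _)

// Fraction relative to the total number of rows, including undefined ones.
val fractions: Map[Option[String], Double] =
  counts.view.mapValues(_.toDouble / status.length).toMap

// Dropping undefined keys before counting, in the spirit of dropUndefined = true.
val definedCounts: Map[String, Int] =
  status.flatten.groupMapReduce(identity)(_ => 1)(_ + _)
```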

@targetName("concat")
def |[T](series: Series[T]): DataFrame
Implicitly added by toDataFrame

Concatenates the DataFrame on the left and the Series on the right hand side.

Value Params
series

Series to be concatenated.

Returns

DataFrame, where the Series is the last column.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the right Series must be included in the left DataFrame.
  • Data on the right hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the left operand.
  • The operator is equivalent to the method col.
@targetName("concat")
def |[T](namedSeries: (String, Series[T])): DataFrame
Implicitly added by toDataFrame

Concatenates the DataFrame on the left and the Series on the right hand side.

Value Params
namedSeries

Tuple of column name and Series to be concatenated.

Returns

DataFrame, where the Series is the last column.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the right Series must be included in the left DataFrame.
  • Data on the right hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the left operand.
@targetName("concat")
def |(df: DataFrame): DataFrame
Implicitly added by toDataFrame

Concatenates the DataFrame on the left and the DataFrame on the right hand side.

Value Params
df

DataFrame to be concatenated.

Returns

DataFrame, where the left DataFrame columns are prior to the columns of the right DataFrame.

Throws
MergeIndexException

If indices are not compatible.

Since

0.1.0

Note
  • Columns with the same name are replaced by the rightmost column.
  • The index of the right DataFrame must be included in the left DataFrame.
  • Data on the right hand side might be copied if indices are not equivalent.
  • The resulting index is equivalent to the left operand.

Inherited methods

def apply(series: Series[Boolean]): DataFrame
Implicitly added by toDataFrame

Slices the index by intersecting the current index with a boolean Series.

Value Params
series

Boolean Series used as a mask; only index positions whose value is true are kept.

Returns

Object with sliced index and order of series.

Since

0.1.0

Inherited from
IndexOps
def apply(array: Array[Int]): DataFrame
Implicitly added by toDataFrame

Slices the index by intersecting it with an array of index positions.

Value Params
array

Array of index positions.

Returns

Object with sliced index and order of array.

Since

0.1.0

Inherited from
IndexOps
def apply(seq: Seq[Int]): DataFrame
Implicitly added by toDataFrame

Slices the index by intersecting it with a sequence of index positions.

Value Params
seq

Sequence of index positions.

Returns

Object with sliced index and order of seq.

Since

0.1.0

Inherited from
IndexOps
def apply(range: Range): DataFrame
Implicitly added by toDataFrame

Slices the index by intersecting it with a range.

Value Params
range

Range.

Returns

Object with sliced index with ascending order.

Since

0.1.0

Inherited from
IndexOps
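
The slicing variants above differ in the order of the resulting index. A minimal model, treating an index as an ordered list of positions, might look as follows. The helper names sliceBySeq and sliceByRange are hypothetical and the library's internal representation may differ.

```scala
// Hypothetical model, not the library's implementation:
// an index is an ordered list of positions into the underlying data.
val currentIndex = Vector(4, 1, 3, 7)

// Slicing by a sequence keeps the positions present in the index, in the order of the sequence.
def sliceBySeq(index: Vector[Int], seq: Seq[Int]): Vector[Int] =
  val present = index.toSet
  seq.filter(present).toVector

// Slicing by a range keeps the positions inside the range, in ascending order.
def sliceByRange(index: Vector[Int], range: Range): Vector[Int] =
  index.filter(p => range.contains(p)).sorted

val bySeq = sliceBySeq(currentIndex, Seq(7, 2, 4))   // Vector(7, 4)
val byRange = sliceByRange(currentIndex, 2 to 7)     // Vector(3, 4, 7)
```

Position 2 is not part of the current index, so it is dropped by the intersection in the first call.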
def apply(series: Series[Boolean]): T

Slices the index by intersecting the current index with a boolean Series.

Value Params
series

Boolean Series used as a mask; only index positions whose value is true are kept.

Returns

Object with sliced index and order of series.

Since

0.1.0

Inherited from
IndexOps
def apply(array: Array[Int]): T

Slices the index by intersecting it with an array of index positions.

Value Params
array

Array of index positions.

Returns

Object with sliced index and order of array.

Since

0.1.0

Inherited from
IndexOps
def apply(seq: Seq[Int]): T

Slices the index by intersecting it with a sequence of index positions.

Value Params
seq

Sequence of index positions.

Returns

Object with sliced index and order of seq.

Since

0.1.0

Inherited from
IndexOps
def apply(range: Range): T

Slices the index by intersecting it with a range.

Value Params
range

Range.

Returns

Object with sliced index with ascending order.

Since

0.1.0

Inherited from
IndexOps
def head(n: Int): DataFrame
Implicitly added by toDataFrame

Head of object.

Value Params
n

Number of rows.

Returns

First n rows in index.

Since

0.1.0

Inherited from
IndexOps
def head(n: Int): T

Head of object.

Value Params
n

Number of rows.

Returns

First n rows in index.

Since

0.1.0

Inherited from
IndexOps
def sortIndex: DataFrame
Implicitly added by toDataFrame

Sorts the index (ascending).

Returns

Object with sorted index.

Since

0.1.0

Inherited from
IndexOps
def sortIndex: T

Sorts the index (ascending).

Returns

Object with sorted index.

Since

0.1.0

Inherited from
IndexOps
def tail(n: Int): DataFrame
Implicitly added by toDataFrame

Tail of object.

Value Params
n

Number of rows.

Returns

Last n rows in index.

Since

0.1.0

Inherited from
IndexOps
def tail(n: Int): T

Tail of object.

Value Params
n

Number of rows.

Returns

Last n rows in index.

Since

0.1.0

Inherited from
IndexOps