Package-level declarations

Types

data class Adam(val alpha: Double = 0.001, val beta1: Double = 0.9, val beta2: Double = 0.999, val epsilon: Double = 1.0E-8, val statistics: Statistics) : MultiPassOptimizer, Statistics

Adam optimizer. Based on the research paper: https://arxiv.org/pdf/1412.6980
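As a hedged illustration of the update rule described in the paper above (not this library's API), a single-parameter Adam step might look like the following sketch; the names `m`, `v`, and `t` follow the paper's notation:

```kotlin
import kotlin.math.pow
import kotlin.math.sqrt

// Hypothetical single-parameter sketch of the Adam update rule.
// Parameter names mirror the Adam data class above; internal state
// names (m, v, t) follow the paper, not this library.
class AdamSketch(
    val alpha: Double = 0.001,
    val beta1: Double = 0.9,
    val beta2: Double = 0.999,
    val epsilon: Double = 1e-8
) {
    private var m = 0.0 // first-moment (mean) estimate of the gradient
    private var v = 0.0 // second-moment (uncentered variance) estimate
    private var t = 0   // timestep, used for bias correction

    /** Returns the parameter value after one Adam step given its gradient. */
    fun step(theta: Double, grad: Double): Double {
        t += 1
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad * grad
        val mHat = m / (1 - beta1.pow(t)) // bias-corrected first moment
        val vHat = v / (1 - beta2.pow(t)) // bias-corrected second moment
        return theta - alpha * mHat / (sqrt(vHat) + epsilon)
    }
}
```

Because of the bias correction, the very first step moves the parameter by roughly `alpha` in the direction opposite the gradient, regardless of the gradient's magnitude.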

data class ConstantLearningRate(val learningRate: Float) : LearningRateSchedule
open class GradientDescent(val learningRate: LearningRateSchedule, val entropy: Random = Random.Default) : SinglePassOptimizer

An optimizer that works by caching calculations during the forward pass and calculating gradients during the backward pass.

interface LearningRateSchedule

Allows a SinglePassOptimizer to delegate the learning rate, increasing the composability of different optimizers.

interface MultiPassOptimizer

An optimizer that performs multiple passes over the training data, updating the model parameters multiple times per epoch.

interface Optimizer

A Trainer will track the activations and derivatives of the model during the forward pass and provide them to the SinglePassOptimizer to update the model parameters.


Implemented per lecture slides: https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

interface SinglePassOptimizer

An optimizer that performs only a single pass over the training data before updating model parameters.

class StochasticGradientDescent(val batchSize: Int, val learningRate: LearningRateSchedule, val entropy: Random = Random.Default, val discardExtras: Boolean = false) : GradientDescent

Stochastic Gradient Descent (SGD) optimizer with an adjustable learning rate.
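One plausible reading of how the constructor parameters interact when drawing minibatches is sketched below; this is an assumption about the batching behavior implied by `batchSize`, `entropy`, and `discardExtras`, not the library's actual implementation:

```kotlin
import kotlin.random.Random

// Hypothetical sketch of minibatch construction. Shuffles the data with the
// supplied entropy source, splits it into batchSize-sized chunks, and — if
// discardExtras is set — drops a trailing batch smaller than batchSize.
fun <T> minibatches(
    data: List<T>,
    batchSize: Int,
    entropy: Random = Random.Default,
    discardExtras: Boolean = false
): List<List<T>> {
    val batches = data.shuffled(entropy).chunked(batchSize)
    return if (discardExtras && batches.isNotEmpty() && batches.last().size < batchSize)
        batches.dropLast(1)
    else
        batches
}
```

With 10 examples and `batchSize = 3`, this yields four batches (the last holding one example), or three full batches when `discardExtras` is true.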

data class WarmRestartExponentialLearningRate(val initialLearningRate: Float, val decayMax: Float = 100.0f, val decayPeriod: Int = 10000) : LearningRateSchedule

This learning rate schedule begins at the initialLearningRate and decays exponentially until it reaches initialLearningRate / decayMax over decayPeriod epochs.
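The exact formula is not shown in this reference, but the class name suggests the decay restarts ("warm restart") every decayPeriod epochs. One reading consistent with the description above is sketched here; the function name and formula are assumptions:

```kotlin
import kotlin.math.exp
import kotlin.math.ln

// Hypothetical sketch: exponential decay from initialLearningRate down to
// initialLearningRate / decayMax over decayPeriod epochs, restarting at the
// initial rate when each period ends. Not the library's actual formula.
fun warmRestartRate(
    epoch: Int,
    initialLearningRate: Float,
    decayMax: Float = 100.0f,
    decayPeriod: Int = 10000
): Float {
    val phase = epoch % decayPeriod // position within the current decay cycle
    // At phase = 0 the factor is 1; approaching phase = decayPeriod it tends
    // toward 1 / decayMax, since exp(-ln(decayMax)) = 1 / decayMax.
    val decay = exp(-ln(decayMax.toDouble()) * phase / decayPeriod)
    return (initialLearningRate * decay).toFloat()
}
```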