org.apache.spark.ml.regression.odkl
Suggested depth for treeAggregate (greater than or equal to 2). If the dimensions of features or the number of partitions are large, this param could be adjusted to a larger size. Default is 2.
Set the ElasticNet mixing parameter. For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 penalty. For alpha in (0,1), the penalty is a combination of L1 and L2. Default is 0.0 which is an L2 penalty.
Note: Fitting with huber loss supports only no regularization or L2 regularization, so an exception is thrown if this param is set to a non-zero value.
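For reference, the combined penalty can be written out as below. This is a sketch of the formulation given in the Spark MLlib documentation, where the symbols are assumptions of this note: λ stands for the regularization parameter, α for the ElasticNet mixing parameter, and w for the coefficient vector.

```latex
\lambda \left( \alpha \, \lVert w \rVert_1 + \frac{1 - \alpha}{2} \, \lVert w \rVert_2^2 \right),
\qquad \alpha \in [0, 1], \quad \lambda \ge 0
```

Setting α = 0 leaves only the L2 term and α = 1 leaves only the L1 term, matching the cases described above.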
Sets the value of param epsilon. Default is 1.35.
Set if we should fit the intercept. Default is true.
Sets the value of param loss. Default is "squaredError".
Set the maximum number of iterations. Default is 100.
Set the regularization parameter. Default is 0.0.
Set the solver algorithm used for optimization. In case of linear regression, this can be "l-bfgs", "normal" and "auto". The "normal" solver is limited to LinearRegression.MAX_FEATURES_FOR_NORMAL_SOLVER features.
Note: Fitting with huber loss doesn't support the normal solver, so an exception is thrown if this param is set to "normal".
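The huber-related restrictions noted above can be sketched as a configuration. This example uses the stock org.apache.spark.ml.regression.LinearRegression (Spark 2.3 or later) and assumes the wrapper in this package exposes the same setters; the object name HuberConfigSketch and the regParam value 0.1 are hypothetical.

```scala
import org.apache.spark.ml.regression.LinearRegression

// Hypothetical holder object, used only for illustration.
object HuberConfigSketch {
  // A configuration that respects the two huber restrictions:
  // an iterative solver instead of "normal", and a pure L2 penalty.
  val huber: LinearRegression = new LinearRegression()
    .setLoss("huber")
    .setEpsilon(1.35)          // documented default
    .setSolver("l-bfgs")       // "normal" is not supported with huber loss
    .setRegParam(0.1)          // illustrative value, not a recommendation
    .setElasticNetParam(0.0)   // must stay 0.0 (L2 only) with huber loss
}
```

No SparkSession is needed to build the estimator; one is only required once fit is called on a DataFrame.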
Whether to standardize the training features before fitting the model. The coefficients of the model are always returned on the original scale, so standardization is transparent to users. Default is true.
With or without standardization, the model should always converge to the same solution when no regularization is applied. In R's GLMNET package, the default behavior is true as well.
Set the convergence tolerance of iterations. A smaller value leads to higher accuracy at the cost of more iterations. Default is 1E-6.
Whether to over-/under-sample training instances according to the given weights in weightCol. If not set or empty, all instances are treated equally (weight 1.0). Default is not set, so all instances have weight one.
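The documented defaults can be spelled out explicitly as in the sketch below. It is written against the stock org.apache.spark.ml.regression.LinearRegression (Spark 2.3 or later), on the assumption that the wrapper in this package exposes the same setters; the object name DefaultsSketch and the column name "weight" are hypothetical.

```scala
import org.apache.spark.ml.regression.LinearRegression

// Hypothetical holder object, used only for illustration.
object DefaultsSketch {
  // Every value below is the default stated in the parameter descriptions,
  // so this estimator should behave like an unconfigured one.
  val lr: LinearRegression = new LinearRegression()
    .setAggregationDepth(2)
    .setElasticNetParam(0.0)
    .setEpsilon(1.35)
    .setFitIntercept(true)
    .setLoss("squaredError")
    .setMaxIter(100)
    .setRegParam(0.0)
    .setSolver("auto")
    .setStandardization(true)
    .setTol(1e-6)
  // weightCol is left unset, so every instance has weight 1.0.
  // To weight instances, one would add: .setWeightCol("weight")
}
```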
Simple wrapper around the SparkML linear regression used to attach summary blocks. TODO: Add unit tests