class CodegenContext extends Logging
A context for codegen, tracking a list of objects that could be passed into a generated Java function.
Instance Constructors
- new CodegenContext()
Type Members
-
class
MutableStateArrays extends AnyRef
This class holds the names of the mutableStateArrays that are used for compacting mutable states of a certain type, along with the next available slot of the current compacted array.
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
var
INPUT_ROW: String
Holds the variable name of the input row of the current operator; it is used by BoundReference to generate code.
Note that if currentVars is not null, BoundReference prefers currentVars over INPUT_ROW to generate code. To make sure the generated code uses INPUT_ROW, set currentVars to null, or set currentVars(i) to null for certain columns, before calling Expression.genCode.
-
def
addBufferedState(dataType: DataType, variableName: String, initCode: String): ExprCode
Add a buffer variable which stores data coming from an InternalRow. This method guarantees that the variable is safely stored, which is important for (potentially) byte array backed data types such as UTF8String, ArrayData, MapData & InternalRow.
-
def
addImmutableStateIfNotExists(javaType: String, variableName: String, initFunc: (String) ⇒ String = _ => ""): Unit
Add an immutable state as a field to the generated class only if a field with that name does not exist yet. This helps reduce the number of fields in the generated class, since the same variable can be reused by many functions.
Even though the added variables are not declared as final, they should never be reassigned in the generated code, to prevent errors and unexpected behavior.
Internally, this method calls addMutableState.
- javaType
Java type of the field.
- variableName
Name of the field.
- initFunc
A function that returns the statement(s) to put into the init() method to initialize this field. The argument is the name of the mutable state variable.
-
def
addInnerClass(code: String): Unit
Add extra source code to the outermost generated class.
- code
verbatim source code of the inner class to be added.
-
def
addMutableState(javaType: String, variableName: String, initFunc: (String) ⇒ String = _ => "", forceInline: Boolean = false, useFreshName: Boolean = true): String
Add a mutable state as a field to the generated class. c.f. the comments above.
- javaType
Java type of the field. Note that short names can be used for some types, e.g. InternalRow, UnsafeRow, UnsafeArrayData, etc. Other types will have to specify the fully-qualified Java type name. See the code in doCompile() for the list of default imports available. Also, generic type arguments are accepted but ignored.
- variableName
Name of the field.
- initFunc
A function that returns the statement(s) to put into the init() method to initialize this field. The argument is the name of the mutable state variable. If left blank, the field will be default-initialized.
- forceInline
whether the declaration and initialization code may be inlined rather than compacted. Set forceInline to true in either of the following cases:
1. the original name of the state must be used
2. the state is expected to be generated infrequently (e.g. there are not many sort operators in one stage)
- useFreshName
If this is false and the mutable state ends up inlined in the outer class, the name is not changed.
- returns
the name of the mutable state variable, which is the original name or a fresh name if the variable is inlined into the outer class, or an array access if the variable is stored in an array of variables of the same type. A variable is inlined into the outer class when one of the following conditions is satisfied:
1. forceInline is true
2. its type is a primitive type and the total number of inlined mutable variables is less than OUTER_CLASS_VARIABLES_THRESHOLD
3. its type is a multi-dimensional array
When a variable is compacted into an array, the maximum size of the array for compaction is given by MUTABLESTATEARRAY_SIZE_LIMIT.
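The inline-versus-compact rule for addMutableState can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the class MutableStateSketch, its two constructor parameters (stand-ins for OUTER_CLASS_VARIABLES_THRESHOLD and MUTABLESTATEARRAY_SIZE_LIMIT), and the stateArray_... naming scheme are hypothetical, and initFunc and useFreshName are left out.

```scala
import scala.collection.mutable

// Simplified, hypothetical model of the inline-versus-compact decision.
// It is not Spark's implementation and does not depend on Spark.
final class MutableStateSketch(outerClassVariablesThreshold: Int, arraySizeLimit: Int) {
  private val primitiveTypes =
    Set("boolean", "byte", "short", "char", "int", "long", "float", "double")

  // Fields declared directly in the outer class, as (javaType, name) pairs.
  val inlined = mutable.ArrayBuffer.empty[(String, String)]

  // Number of compacted slots handed out so far for each Java type.
  private val slotsUsed = mutable.HashMap.empty[String, Int]

  def addMutableState(
      javaType: String,
      variableName: String,
      forceInline: Boolean = false): String = {
    val isMultiDimensionalArray = javaType.contains("[][]")
    // Assumption: every inlined field counts toward the threshold.
    val canInline = forceInline || isMultiDimensionalArray ||
      (primitiveTypes.contains(javaType) && inlined.length < outerClassVariablesThreshold)

    if (canInline) {
      inlined += ((javaType, variableName))
      variableName
    } else {
      val used = slotsUsed.getOrElse(javaType, 0)
      slotsUsed(javaType) = used + 1
      // Start a new array once the current one reaches the size limit.
      val arrayIndex = used / arraySizeLimit
      val slot = used % arraySizeLimit
      val typeTag = javaType.replaceAll("[^A-Za-z0-9]", "_")
      s"stateArray_${typeTag}_$arrayIndex[$slot]"
    }
  }
}
```

In this model, callers must always use the returned string rather than variableName, because the state may have been placed in an array slot.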
-
def
addNewFunction(funcName: String, funcCode: String, inlineToOuterClass: Boolean = false): String
Adds a function to the generated class. If the code for the OuterClass grows too large, the function will be inlined into a new private, inner class, and a class-qualified name for the function will be returned. Otherwise, the function will be inlined into the OuterClass and the simple funcName will be returned.
- funcName
the class-unqualified name of the function
- funcCode
the body of the function
- inlineToOuterClass
whether the given code must be inlined into the OuterClass. This can be necessary when a function is declared outside of the context in which it is eventually referenced and a returned qualified function name cannot otherwise be accessed.
- returns
the name of the function, qualified by class if it will be inlined into a private, inner class
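The placement behaviour of addNewFunction can be pictured with the standalone sketch below. It rests on assumptions: the class FunctionPlacementSketch, the maxClassSize parameter, and the NestedClassN / nestedClassInstanceN names are all hypothetical, and code size is approximated by string length.

```scala
import scala.collection.mutable

// Hypothetical model of function placement; not Spark's implementation.
final class FunctionPlacementSketch(outerClassName: String, maxClassSize: Int) {
  // A generated class, the instance name used to reach it (None for the
  // outer class), and the size of the code placed in it so far.
  private final class ClassInfo(val name: String, val instance: Option[String]) {
    var size: Int = 0
  }

  private val classes = mutable.ArrayBuffer(new ClassInfo(outerClassName, None))

  def addNewFunction(
      funcName: String,
      funcCode: String,
      inlineToOuterClass: Boolean = false): String = {
    val target =
      if (inlineToOuterClass) {
        classes.head
      } else {
        // Open a new nested class when the current one would grow too large.
        if (classes.last.size + funcCode.length > maxClassSize) {
          val n = classes.length
          classes += new ClassInfo(s"NestedClass$n", Some(s"nestedClassInstance$n"))
        }
        classes.last
      }
    target.size += funcCode.length
    // Functions in a nested class are returned with a qualified name.
    target.instance.fold(funcName)(inst => s"$inst.$funcName")
  }
}
```

As with the real method, the caller should emit calls using the returned name, since it may or may not be qualified.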
- def addPartitionInitializationStatement(statement: String): Unit
-
def
addReferenceObj(objName: String, obj: Any, className: String = null): String
Add an object to references.
Returns the code to access it.
This does not store the object in a field; instead, it refers to the object through the references field at the time of use, because the number of fields in a class is limited and should be kept low.
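A minimal standalone sketch of the idea behind addReferenceObj follows: the object is appended to a references buffer and the caller gets back a Java expression that casts the array element at use time. The class ReferencesSketch and the exact shape of the returned expression are assumptions made for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model of reference registration; not Spark's implementation.
final class ReferencesSketch {
  // Objects that will be handed to the generated class at construction time.
  val references: ArrayBuffer[Any] = ArrayBuffer.empty[Any]

  def addReferenceObj(objName: String, obj: Any, className: String = null): String = {
    val idx = references.length
    references += obj
    // Fall back to the runtime class name when no class name is given.
    val clsName = Option(className).getOrElse(obj.getClass.getName)
    // Java expression that reads the object from the array where it is used.
    s"(($clsName) references[$idx] /* $objName */)"
  }
}
```

The returned expression can be spliced directly into generated Java code, so no additional field is declared for the object.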
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
var
currentVars: Seq[ExprCode]
Holds a list of generated columns as the input of the current operator; it is used by BoundReference to generate code.
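The preference of currentVars over INPUT_ROW can be illustrated with a standalone sketch. Everything here is a simplifying assumption: BoundReferenceSketch is hypothetical, generated columns are modelled as plain variable-name strings rather than ExprCode, and the row accessor is assumed to follow the getInt / getLong naming pattern.

```scala
// Hypothetical model of how a bound column reference picks its input.
object BoundReferenceSketch {
  def genCode(
      ordinal: Int,
      javaType: String,
      inputRow: String,
      currentVars: Seq[String]): String = {
    if (currentVars != null && currentVars(ordinal) != null) {
      // An already generated column variable wins over the input row.
      currentVars(ordinal)
    } else {
      // Otherwise read the column from the input row variable.
      s"$inputRow.get${javaType.capitalize}($ordinal)"
    }
  }
}
```

Passing null for currentVars, or a null entry for one column, forces the row-based access for that column in this model.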
-
def
declareAddedFunctions(): String
Declares all function code. If there are too many added functions, they are split into nested sub-classes to avoid hitting the Java compiler's constant pool limit.
- def declareMutableStates(): String
-
def
emitExtraCode(): String
Emits extra inner classes added with addInnerClass.
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
freshName(name: String): String
Returns a term name that is unique within this instance of a CodegenContext.
-
var
freshNamePrefix: String
A prefix used to generate fresh names.
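One way to obtain names that are unique within a context, combining freshName with freshNamePrefix, is a per-name counter. The sketch below is an assumption about how such a scheme could look, not Spark's actual naming logic; the class FreshNameSketch and the underscore-separated suffix format are hypothetical.

```scala
import scala.collection.mutable

// Hypothetical model of unique name generation; not Spark's implementation.
final class FreshNameSketch {
  // Optional prefix prepended to every generated name.
  var freshNamePrefix: String = ""

  // Next numeric suffix to hand out for each prefixed base name.
  private val nextId = mutable.HashMap.empty[String, Int]

  def freshName(name: String): String = {
    val fullName = if (freshNamePrefix.isEmpty) name else s"${freshNamePrefix}_$name"
    val id = nextId.getOrElse(fullName, 0)
    nextId(fullName) = id + 1
    // The numeric suffix keeps repeated requests for the same name distinct.
    s"${fullName}_$id"
  }
}
```

Because the counter lives in the instance, names are unique only within one context object, which matches the guarantee stated for freshName.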
-
def
freshVariable(name: String, javaClass: Class[_]): VariableValue
Creates an ExprValue representing a local Java variable of the required Java class.
-
def
freshVariable(name: String, dt: DataType): VariableValue
Creates an ExprValue representing a local Java variable of the required data type.
-
def
genComp(dataType: DataType, c1: String, c2: String): String
Generates code for comparing two expressions.
- dataType
data type of the expressions
- c1
name of the variable of expression 1's output
- c2
name of the variable of expression 2's output
-
def
genEqual(dataType: DataType, c1: String, c2: String): String
Generates code for equal expression in Java.
-
def
genGreater(dataType: DataType, c1: String, c2: String): String
Generates code for the greater-than comparison of two expressions.
- dataType
data type of the expressions
- c1
name of the variable of expression 1's output
- c2
name of the variable of expression 2's output
-
def
generateExpressions(expressions: Seq[Expression], doSubexpressionElimination: Boolean = false): Seq[ExprCode]
Generates code for expressions. If doSubexpressionElimination is true, subexpression elimination will be performed. Subexpression elimination assumes that the code for each expression will be combined in the expressions order.
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getPlaceHolderToComments(): Map[String, String]
Gets a map from each placeholder to its corresponding comment.
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def initMutableStates(): String
- def initPartition(): String
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
nullArrayElementsSaveExec(nullElements: Boolean, isNull: String, arrayData: String)(execute: String): String
Generates code to do null safe execution when accessing properties of complex ArrayData elements.
- nullElements
used to decide whether the ArrayData might contain null or not.
- isNull
a variable indicating whether the result will be evaluated to null or not.
- arrayData
a variable name representing the ArrayData.
- execute
the code that should be executed only if the ArrayData doesn't contain any null.
-
def
nullSafeExec(nullable: Boolean, isNull: String)(execute: String): String
Generates code to do null safe execution, i.e. only execute the code when the input is not null, by adding a null check if necessary.
- nullable
used to decide whether we should add null check or not.
- isNull
the code to check if the input is null.
- execute
the code that should only be executed when the input is not null.
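The contract of nullSafeExec can be shown with a standalone sketch: the guarded code is wrapped in a Java null check only when the input may be null. The object NullSafeExecSketch and the exact layout of the emitted Java text are assumptions for illustration.

```scala
// Hypothetical model of null-safe code wrapping; not Spark's implementation.
object NullSafeExecSketch {
  def nullSafeExec(nullable: Boolean, isNull: String)(execute: String): String = {
    if (nullable) {
      // Guard the code with the caller-supplied null-check expression.
      s"if (!$isNull) {\n  $execute\n}"
    } else {
      // Non-nullable input: emit the code unchanged, with no extra branch.
      execute
    }
  }
}
```

Skipping the check for non-nullable inputs keeps the generated Java smaller, which matters given the size limits on generated methods.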
- val outerClassName: String
-
val
partitionInitializationStatements: ArrayBuffer[String]
Code statements to initialize states that depend on the partition index. An integer partitionIndex will be made available within the scope.
-
def
reassignIfGreater(dataType: DataType, partialResult: ExprCode, item: ExprCode): String
Generates code for updating partialResult if item is greater than it.
- dataType
data type of the expressions
- partialResult
ExprCode representing the partial result which has to be updated
- item
ExprCode representing the new expression to evaluate for the result
-
def
reassignIfSmaller(dataType: DataType, partialResult: ExprCode, item: ExprCode): String
Generates code for updating partialResult if item is smaller than it.
- dataType
data type of the expressions
- partialResult
ExprCode representing the partial result which has to be updated
- item
ExprCode representing the new expression to evaluate for the result
-
val
references: ArrayBuffer[Any]
Holds a list of objects that can be passed into the generated class.
-
def
registerComment(text: ⇒ String, placeholderId: String = "", force: Boolean = false): Block
Registers a comment and returns the corresponding placeholder.
- placeholderId
an optionally specified identifier for the comment's placeholder. The caller should make sure this identifier is unique within the compilation unit. If this argument is not specified, a fresh identifier will be automatically created and used as the placeholder.
- force
whether to force registering the comments
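The placeholder mechanism behind registerComment and getPlaceHolderToComments can be sketched as a small registry. This is a simplified assumption: CommentRegistrySketch is hypothetical, the placeholder is returned as a plain String rather than a Block, the force flag is omitted, and the c_N identifier format is invented for the example.

```scala
import scala.collection.mutable

// Hypothetical model of comment placeholders; not Spark's implementation.
final class CommentRegistrySketch {
  private val placeHolderToComments = mutable.LinkedHashMap.empty[String, String]
  private var counter = 0

  def registerComment(text: => String, placeholderId: String = ""): String = {
    // Use the caller's identifier if given, otherwise create a fresh one.
    val name =
      if (placeholderId.nonEmpty) placeholderId
      else { counter += 1; s"c_$counter" }
    // Store the comment as Java line comments, one per line of text.
    placeHolderToComments(name) = text.split("\n").map("// " + _).mkString("\n")
    s"/*$name*/"
  }

  def getPlaceHolderToComments(): Map[String, String] = placeHolderToComments.toMap
}
```

Keeping only short placeholders in the generated code means the full comment text can be substituted later, for example when the code is printed for debugging.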
-
def
splitExpressions(expressions: Seq[String], funcName: String, arguments: Seq[(String, String)], returnType: String = "void", makeSplitFunction: (String) ⇒ String = identity, foldFunctions: (Seq[String]) ⇒ String = _.mkString("", ";\n", ";")): String
Splits the generated code of expressions into multiple functions, because a function has a 64kb code size limit in the JVM. If the class into which the function would be inlined would grow beyond 1000kb, we declare a private, inner sub-class and inline the function into it instead, because classes have a constant pool limit of 65,536 named values.
- expressions
the codes to evaluate expressions.
- funcName
the split function name base.
- arguments
the list of (type, name) of the arguments of the split function.
- returnType
the return type of the split function.
- makeSplitFunction
makes split function body, e.g. add preparation or cleanup.
- foldFunctions
folds the split function calls.
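The splitting strategy can be approximated with the standalone sketch below: code snippets are packed greedily into blocks, each block becomes a function, and the calls are folded into one string. The object SplitExpressionsSketch, the maxBlockLength parameter (measured in characters, as a stand-in for the real size limit), the funcName_N naming, and returning the function definitions alongside the call code are all assumptions made so the example runs without Spark.

```scala
// Hypothetical model of expression splitting; not Spark's implementation.
object SplitExpressionsSketch {
  // Greedily packs snippets into blocks of at most maxBlockLength characters
  // (a single oversized snippet still gets a block of its own).
  def buildBlocks(expressions: Seq[String], maxBlockLength: Int): Seq[String] = {
    val blocks = Vector.newBuilder[String]
    val current = new StringBuilder
    for (code <- expressions) {
      if (current.nonEmpty && current.length + code.length > maxBlockLength) {
        blocks += current.toString
        current.clear()
      }
      current.append(code)
    }
    if (current.nonEmpty) blocks += current.toString
    blocks.result()
  }

  // Returns the code that invokes the split functions and their definitions.
  def splitExpressions(
      expressions: Seq[String],
      funcName: String,
      arguments: Seq[(String, String)],
      returnType: String = "void",
      maxBlockLength: Int = 1024,
      makeSplitFunction: String => String = identity,
      foldFunctions: Seq[String] => String = _.mkString("", ";\n", ";")): (String, Seq[String]) = {
    val blocks = buildBlocks(expressions, maxBlockLength)
    if (blocks.length <= 1) {
      // Small enough: keep the code inline and define no extra functions.
      (blocks.headOption.getOrElse(""), Seq.empty)
    } else {
      val params = arguments.map { case (tpe, name) => s"$tpe $name" }.mkString(", ")
      val args = arguments.map(_._2).mkString(", ")
      val functions = blocks.zipWithIndex.map { case (body, i) =>
        s"private $returnType ${funcName}_$i($params) {\n${makeSplitFunction(body)}\n}"
      }
      val calls = blocks.indices.map(i => s"${funcName}_$i($args)")
      (foldFunctions(calls), functions)
    }
  }
}
```

A custom foldFunctions is useful when the split functions return values that must be combined, and makeSplitFunction can add a preamble or a return statement to each body.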
-
def
splitExpressionsWithCurrentInputs(expressions: Seq[String], funcName: String = "apply", extraArguments: Seq[(String, String)] = Nil, returnType: String = "void", makeSplitFunction: (String) ⇒ String = identity, foldFunctions: (Seq[String]) ⇒ String = _.mkString("", ";\n", ";")): String
Splits the generated code of expressions into multiple functions, because a function has a 64kb code size limit in the JVM. If the class into which the function would be inlined would grow beyond 1000kb, we declare a private, inner sub-class and inline the function into it instead, because classes have a constant pool limit of 65,536 named values.
Note that, unlike splitExpressions, this method extracts the current inputs of this context and passes them to the generated functions. The input is INPUT_ROW for the normal codegen path, and currentVars for the whole stage codegen path. The whole stage codegen path is not supported yet.
- expressions
the codes to evaluate expressions.
- funcName
the split function name base.
- extraArguments
the list of (type, name) of the arguments of the split function, except for the current inputs like ctx.INPUT_ROW.
- returnType
the return type of the split function.
- makeSplitFunction
makes split function body, e.g. add preparation or cleanup.
- foldFunctions
folds the split function calls.
-
def
subexprFunctionsCode: String
Returns the code for subexpression elimination after splitting it if necessary.
-
def
subexpressionEliminationForWholeStageCodegen(expressions: Seq[Expression]): SubExprCodes
Checks and sets up the state and codegen for subexpression elimination. This finds the common subexpressions, generates the code snippets that evaluate those expressions, and populates the mapping of common subexpressions to the generated code snippets. The generated code snippets are returned and should be inserted into the generated code before these common subexpressions are first used.
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
withSubExprEliminationExprs(newSubExprEliminationExprs: Map[Expression, SubExprEliminationState])(f: ⇒ Seq[ExprCode]): Seq[ExprCode]
Performs a function which generates a sequence of ExprCodes using a given mapping between expressions and common expressions, instead of the mapping in the current context.