class Analyzer extends RuleExecutor[LogicalPlan] with CheckAnalysis with LookupCatalog
Provides a logical query plan analyzer, which translates UnresolvedAttributes and UnresolvedRelations into fully typed objects using information in a SessionCatalog.
Instance Constructors
Type Members
- case class ResolveNamespace(catalogManager: CatalogManager) extends Rule[LogicalPlan] with LookupCatalog with Product with Serializable
-
case class
Batch(name: String, strategy: Strategy, rules: Rule[TreeType]*) extends Product with Serializable
A batch of rules.
- Attributes
- protected
- Definition Classes
- RuleExecutor
-
case class
FixedPoint(maxIterations: Int, errorOnExceed: Boolean = false, maxIterationsSetting: String = null) extends Strategy with Product with Serializable
A strategy that runs until a fixed point is reached or maxIterations times, whichever comes first. In particular, a FixedPoint(1) batch is supposed to run only once.
- Definition Classes
- RuleExecutor
-
abstract
class
Strategy extends AnyRef
An execution strategy for rules that indicates the maximum number of executions. If the execution reaches a fixed point (i.e. converges) before maxIterations, it will stop.
- Definition Classes
- RuleExecutor
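The relationship between Batch, FixedPoint and Once described above can be illustrated with a minimal sketch of a fixed-point loop. This is a toy model and not Spark's RuleExecutor: ToyExecutor, ToyBatch, and the Int-valued "plan" are hypothetical names introduced only for illustration.

```scala
// Toy model of batch execution strategies; not Spark's implementation.
object ToyExecutor {
  type Plan = Int            // stand-in for a LogicalPlan
  type Rule = Plan => Plan   // stand-in for Rule[LogicalPlan]

  // A batch applies its rules in order, at most maxIterations times.
  final case class ToyBatch(name: String, maxIterations: Int, rules: Seq[Rule])

  // Runs one batch until the plan stops changing or the limit is hit.
  def runBatch(batch: ToyBatch, plan: Plan): Plan = {
    var current = plan
    var iteration = 0
    var changed = true
    while (changed && iteration < batch.maxIterations) {
      val next = batch.rules.foldLeft(current)((p, rule) => rule(p))
      changed = next != current // fixed point reached when nothing changes
      current = next
      iteration += 1
    }
    current
  }

  // Batches run serially, each feeding its result to the next.
  def execute(batches: Seq[ToyBatch], plan: Plan): Plan =
    batches.foldLeft(plan)((p, b) => runBatch(b, p))
}
```

In this sketch a batch with maxIterations = 1 plays the role of Once, while a larger limit behaves like FixedPoint and stops early as soon as an iteration leaves the plan unchanged.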
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
lazy val
batches: Seq[Batch]
Defines a sequence of rule batches, to be overridden by the implementation.
- Definition Classes
- Analyzer → RuleExecutor
-
val
blacklistedOnceBatches: Set[String]
Once batches that are blacklisted in the idempotence checker
- Attributes
- protected
- Definition Classes
- RuleExecutor
-
def
canEvaluate(expr: Expression, plan: LogicalPlan): Boolean
Returns true if expr can be evaluated using only the output of plan. This method can be used to determine when it is acceptable to move expression evaluation within a query plan.
For example, consider a join between two relations R(a, b) and S(c, d).
- canEvaluate(EqualTo(a,b), R) returns true
- canEvaluate(EqualTo(a,c), R) returns false
- canEvaluate(Literal(1), R) returns true as literals CAN be evaluated on any plan
- Attributes
- protected
- Definition Classes
- PredicateHelper
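One way to read the canEvaluate examples above is as a subset test between the attributes an expression references and the attributes a plan outputs. The sketch below models that reading with plain string sets; ToyExpr and ToyPlan are hypothetical stand-ins, not Catalyst classes.

```scala
// Toy model: an expression is summarized by the attribute names it references.
object CanEvaluateSketch {
  final case class ToyExpr(references: Set[String])
  final case class ToyPlan(output: Set[String])

  // True when every referenced attribute is produced by the plan.
  // A literal references nothing, so it is evaluable on any plan.
  def canEvaluate(expr: ToyExpr, plan: ToyPlan): Boolean =
    expr.references.subsetOf(plan.output)
}
```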
-
def
canEvaluateWithinJoin(expr: Expression): Boolean
Returns true iff expr could be evaluated as a condition within join.
- Attributes
- protected
- Definition Classes
- PredicateHelper
-
val
catalogManager: CatalogManager
- Definition Classes
- Analyzer → LookupCatalog
-
def
checkAnalysis(plan: LogicalPlan): Unit
- Definition Classes
- CheckAnalysis
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
containsMultipleGenerators(exprs: Seq[Expression]): Boolean
- Attributes
- protected
- Definition Classes
- CheckAnalysis
-
def
currentCatalog: CatalogPlugin
Returns the current catalog set.
- Definition Classes
- LookupCatalog
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
execute(plan: LogicalPlan): LogicalPlan
Executes the batches of rules defined by the subclass. The batches are executed serially using the defined execution strategy. Within each batch, rules are also executed serially.
- Definition Classes
- Analyzer → RuleExecutor
- def executeAndCheck(plan: LogicalPlan, tracker: QueryPlanningTracker): LogicalPlan
-
def
executeAndTrack(plan: LogicalPlan, tracker: QueryPlanningTracker): LogicalPlan
Executes the batches of rules defined by the subclass, and also tracks timing info for each rule using the provided tracker.
- Definition Classes
- RuleExecutor
-
val
extendedCheckRules: Seq[(LogicalPlan) ⇒ Unit]
Override to provide additional checks for correct analysis. These rules will be evaluated after our built-in check rules.
- Definition Classes
- CheckAnalysis
-
val
extendedResolutionRules: Seq[Rule[LogicalPlan]]
Override to provide additional rules for the "Resolution" batch.
-
def
failAnalysis(msg: String): Nothing
- Attributes
- protected
- Definition Classes
- CheckAnalysis
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
findExpressionAndTrackLineageDown(exp: Expression, plan: LogicalPlan): Option[(Expression, LogicalPlan)]
Find the origin of where the input references of expression exp were scanned in the tree of plan, and if they originate from a single leaf node. Returns optional tuple with Expression, undoing any projections and aliasing that has been done along the way from plan to origin, and the origin LeafNode plan from which all the exp
- Definition Classes
- PredicateHelper
-
val
fixedPoint: FixedPoint
If the plan cannot be resolved within maxIterations, the analyzer will throw an exception to inform the user to increase the value of SQLConf.ANALYZER_MAX_ITERATIONS.
- Attributes
- protected
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hasMapType(dt: DataType): Boolean
- Attributes
- protected
- Definition Classes
- CheckAnalysis
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isPlanIntegral(plan: LogicalPlan): Boolean
Defines a check function that checks for structural integrity of the plan after the execution of each rule. For example, we can check whether a plan is still resolved after each rule in Optimizer, so we can catch rules that return invalid plans. The check function returns false if the given plan doesn't pass the structural integrity check.
- Attributes
- protected
- Definition Classes
- RuleExecutor
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
isView(nameParts: Seq[String]): Boolean
- Definition Classes
- Analyzer → CheckAnalysis
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
mapColumnInSetOperation(plan: LogicalPlan): Option[Attribute]
- Attributes
- protected
- Definition Classes
- CheckAnalysis
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
val
postHocResolutionRules: Seq[Rule[LogicalPlan]]
Override to provide rules to do post-hoc resolution. Note that these rules will be executed in an individual batch. This batch is to run right after the normal resolution batch and execute its rules in one pass.
-
def
replaceAlias(condition: Expression, aliases: AttributeMap[Expression]): Expression
- Attributes
- protected
- Definition Classes
- PredicateHelper
-
def
resolveExpressionBottomUp(expr: Expression, plan: LogicalPlan, throws: Boolean = false): Expression
Resolves the attribute, column value and extract value expression(s) by traversing the input expression in a bottom-up manner. In order to resolve the nested complex type fields correctly, this function makes use of the throws parameter to control when to raise an AnalysisException.
Example: SELECT a.b FROM t ORDER BY b[0].d
In the above example, b needs to be resolved before d can be resolved. Given we are doing a bottom-up traversal, it will first attempt to resolve d and fail as b has not been resolved yet. If throws is false, this function will handle the exception by returning the original attribute. In this case d will be resolved in subsequent passes after b is resolved.
- Attributes
- protected[sql]
- def resolver: Resolver
-
def
splitConjunctivePredicates(condition: Expression): Seq[Expression]
- Attributes
- protected
- Definition Classes
- PredicateHelper
-
def
splitDisjunctivePredicates(condition: Expression): Seq[Expression]
- Attributes
- protected
- Definition Classes
- PredicateHelper
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
object
AsTableIdentifier
Extract legacy table identifier from a multi-part identifier.
For legacy support only. Please use CatalogAndIdentifier instead on DSv2 code paths.
- Definition Classes
- LookupCatalog
-
object
CatalogAndIdentifier
Extract catalog and identifier from a multi-part name with the current catalog if needed. Catalog name takes precedence over identifier, but for a single-part name, identifier takes precedence over catalog name.
Note that this pattern is used to look up permanent catalog objects like tables, views, functions, etc. If you need to look up temp objects like temp views, please do it separately before calling this pattern, as temp objects don't belong to any catalog.
- Definition Classes
- LookupCatalog
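The precedence rule described for CatalogAndIdentifier can be sketched with a small pattern match. This is a simplified model under stated assumptions: CatalogNameSketch and its parameters are hypothetical, catalogs are plain strings, the input is assumed non-empty, and the current namespace is ignored for brevity.

```scala
// Toy model of multi-part name resolution; not the LookupCatalog extractor itself.
object CatalogNameSketch {
  // Returns (catalog, namespace, name) for a non-empty multi-part name.
  def resolve(
      nameParts: Seq[String],
      knownCatalogs: Set[String],
      currentCatalog: String): (String, Seq[String], String) = nameParts match {
    // A single part is always an identifier, even if it matches a catalog name.
    case Seq(name) => (currentCatalog, Seq.empty, name)
    // Otherwise a leading catalog name wins over treating it as a namespace.
    case head +: rest if knownCatalogs.contains(head) => (head, rest.init, rest.last)
    // No catalog prefix: fall back to the current catalog.
    case parts => (currentCatalog, parts.init, parts.last)
  }
}
```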
-
object
CatalogAndNamespace
Extract catalog and namespace from a multi-part name with the current catalog if needed. Catalog name takes precedence over namespaces.
- Definition Classes
- LookupCatalog
-
object
ExtractGenerator extends Rule[LogicalPlan]
Extracts Generator from the projectList of a Project operator and creates Generate operator under Project.
This rule will throw AnalysisException for the following cases:
1. Generator is nested in expressions, e.g. SELECT explode(list) + 1 FROM tbl
2. More than one Generator is found in projectList, e.g. SELECT explode(list), explode(list) FROM tbl
3. Generator is found in other operators that are not Project or Generate, e.g. SELECT * FROM tbl SORT BY explode(list)
-
object
ExtractWindowExpressions extends Rule[LogicalPlan]
Extracts WindowExpressions from the projectList of a Project operator and aggregateExpressions of an Aggregate operator and creates individual Window operators for every distinct WindowSpecDefinition.
This rule handles three cases:
- A Project having WindowExpressions in its projectList;
- An Aggregate having WindowExpressions in its aggregateExpressions.
- A Filter->Aggregate pattern representing GROUP BY with a HAVING clause and the Aggregate has WindowExpressions in its aggregateExpressions.
Note: If there is a GROUP BY clause in the query, aggregations and corresponding filters (expressions in the HAVING clause) should be evaluated before any WindowExpression. If a query has SELECT DISTINCT, the DISTINCT part should be evaluated after all WindowExpressions.
For every case, the transformation works as follows:
1. For a list of Expressions (a projectList or an aggregateExpressions), partitions it into two lists of Expressions, one for all WindowExpressions and another for all regular expressions.
2. For all WindowExpressions, groups them based on their WindowSpecDefinitions and WindowFunctionTypes.
3. For every distinct WindowSpecDefinition and WindowFunctionType, creates a Window operator and inserts it into the plan tree.
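The first two steps of the transformation (partitioning, then grouping by window specification) can be sketched with ordinary collections. The types below are hypothetical stand-ins: a window spec is reduced to a string, and the WindowFunctionType part of the grouping key is left out for brevity.

```scala
// Toy model of partitioning and grouping window expressions; not the Catalyst rule.
object WindowGroupingSketch {
  sealed trait ToyExpr
  final case class Regular(name: String) extends ToyExpr
  // spec stands in for a WindowSpecDefinition.
  final case class Windowed(name: String, spec: String) extends ToyExpr

  // Step 1: split window expressions from regular ones.
  // Step 2: group window expressions by spec; each group would become one Window operator.
  def plan(exprs: Seq[ToyExpr]): (Seq[Regular], Map[String, Seq[Windowed]]) = {
    val windowed = exprs.collect { case w: Windowed => w }
    val regular = exprs.collect { case r: Regular => r }
    (regular, windowed.groupBy(_.spec))
  }
}
```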
-
object
GlobalAggregates extends Rule[LogicalPlan]
Turns projections that contain aggregate expressions into aggregations.
-
object
HandleNullInputsForUDF extends Rule[LogicalPlan]
Correctly handle null primitive inputs for UDF by adding extra If expression to do the null check. When a user defines a UDF with primitive parameters, there is no way to tell if the primitive parameter is null or not, so here we assume the primitive input is null-propagatable and we should return null if the input is null.
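The null-propagation idea can be shown outside Spark with a wrapper around a function over a primitive Int. This is only a sketch of the concept: NullSafeUdfSketch and nullSafe are hypothetical names, and the real rule rewrites expressions in the plan rather than wrapping Scala functions.

```scala
// Toy model of guarding a primitive-typed function with a null check.
object NullSafeUdfSketch {
  // A null boxed input yields null instead of being unboxed and passed to f.
  def nullSafe(f: Int => Int): java.lang.Integer => java.lang.Integer =
    boxed => if (boxed == null) null else Int.box(f(boxed.intValue))
}
```

Without such a guard, unboxing a null into a primitive would either fail or silently produce a default value, which is why the input is treated as null-propagating.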
-
object
LookupFunctions extends Rule[LogicalPlan]
Checks whether a function identifier referenced by an UnresolvedFunction is defined in the function registry. Note that this rule doesn't try to resolve the UnresolvedFunction. It only performs a simple existence check according to the function identifier to quickly identify undefined functions without triggering relation resolution, which may incur a potentially expensive partition/schema discovery process in some cases. In order to avoid duplicate external function lookups, the external function identifier is stored in the local hash set externalFunctionNameSet.
- See also
https://issues.apache.org/jira/browse/SPARK-19737
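The caching behaviour mentioned above (remembering external function names so they are looked up only once) can be sketched as follows. Checker, registry and externalLookup are hypothetical names; only externalFunctionNameSet is taken from the description, and the real rule operates on function identifiers rather than plain strings.

```scala
import scala.collection.mutable

// Toy model of an existence check with a local cache of external names.
object FunctionExistenceSketch {
  // registry: names known locally; externalLookup: a possibly expensive remote check.
  class Checker(registry: Set[String], externalLookup: String => Boolean) {
    private val externalFunctionNameSet = mutable.HashSet.empty[String]

    // Consults the cache before the external lookup, and records successful lookups.
    def exists(name: String): Boolean =
      registry.contains(name) ||
        externalFunctionNameSet.contains(name) ||
        (externalLookup(name) && { externalFunctionNameSet += name; true })
  }
}
```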
-
object
NonSessionCatalogAndIdentifier
Extract non-session catalog and identifier from a multi-part identifier.
- Definition Classes
- LookupCatalog
-
object
PullOutNondeterministic extends Rule[LogicalPlan]
Pulls out nondeterministic expressions from a LogicalPlan that is not a Project or Filter, puts them into an inner Project and finally projects them away at the outer Project.
-
object
ResolveAggAliasInGroupBy extends Rule[LogicalPlan]
Replace unresolved expressions in grouping keys with resolved ones in SELECT clauses. This rule is expected to run after ResolveReferences has been applied.
-
object
ResolveAggregateFunctions extends Rule[LogicalPlan]
This rule finds aggregate expressions that are not in an aggregate operator, for example those in a HAVING clause or ORDER BY clause. These expressions are pushed down to the underlying aggregate operator and then projected away after the original operator.
-
object
ResolveAliases extends Rule[LogicalPlan]
Replaces UnresolvedAlias expressions with concrete aliases.
-
object
ResolveAlterTableChanges extends Rule[LogicalPlan]
Rule to mostly resolve, normalize and rewrite column names based on case sensitivity.
-
object
ResolveBinaryArithmetic extends Rule[LogicalPlan]
For Add:
1. if both sides are intervals, stays the same;
2. else if one side is date and the other is interval, turns it to DateAddInterval;
3. else if one side is interval, turns it to TimeAdd;
4. else if one side is date, turns it to DateAdd;
5. else stays the same.
For Subtract:
1. if both sides are intervals, stays the same;
2. else if the left side is date and the right side is interval, turns it to DateAddInterval(l, -r);
3. else if the right side is an interval, turns it to TimeSub;
4. else if one side is timestamp, turns it to SubtractTimestamps;
5. else if the right side is date, turns it to DateDiff/SubtractDates;
6. else if the left side is date, turns it to DateSub;
7. else stays the same.
For Multiply:
1. if one side is interval, turns it to MultiplyInterval;
2. otherwise, stays the same.
For Divide:
1. if the left side is interval, turns it to DivideInterval;
2. otherwise, stays the same.
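The ordered cases for Add read naturally as a pattern match in which earlier cases shadow later ones. The sketch below covers only Add and returns the target expression name as a string; ToyType and its members are hypothetical stand-ins for Catalyst data types.

```scala
// Toy decision table for the Add cases; not the Catalyst rule.
object AddResolutionSketch {
  sealed trait ToyType
  case object IntervalT extends ToyType
  case object DateT extends ToyType
  case object OtherT extends ToyType

  // Returns the name of the expression an Add would be rewritten to,
  // checking the five Add cases in order.
  def resolveAdd(left: ToyType, right: ToyType): String = (left, right) match {
    case (IntervalT, IntervalT)                  => "Add"
    case (DateT, IntervalT) | (IntervalT, DateT) => "DateAddInterval"
    case (IntervalT, _) | (_, IntervalT)         => "TimeAdd"
    case (DateT, _) | (_, DateT)                 => "DateAdd"
    case _                                       => "Add"
  }
}
```

Because the match is evaluated top to bottom, a date plus an interval is caught by the second case and never reaches the more general interval or date cases.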
-
object
ResolveDeserializer extends Rule[LogicalPlan]
Replaces UnresolvedDeserializer with the deserialization expression that has been resolved to the given input attributes.
-
object
ResolveFunctions extends Rule[LogicalPlan]
Replaces UnresolvedFunctions with concrete Expressions.
-
object
ResolveGenerate extends Rule[LogicalPlan]
Rewrites table generating expressions that need one or more of the following in order to be resolved:
- concrete attribute references for their output.
- to be relocated from a SELECT clause (i.e. from a Project) into a Generate.
Names for the output Attributes are extracted from Alias or MultiAlias expressions that wrap the Generator.
- object ResolveGroupingAnalytics extends Rule[LogicalPlan]
- object ResolveInsertInto extends Rule[LogicalPlan]
-
object
ResolveMissingReferences extends Rule[LogicalPlan]
In many dialects of SQL it is valid to sort by attributes that are not present in the SELECT clause. This rule detects such queries and adds the required attributes to the original projection, so that they will be available during sorting. Another projection is added to remove these attributes after sorting.
The HAVING clause could also use a grouping column that is not present in the SELECT clause.
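The widen, sort, then project-away sequence can be mimicked on in-memory rows. This is a sketch of the idea only: MissingReferenceSketch is a hypothetical name, rows are plain maps, and the sort column is assumed to exist in every input row.

```scala
// Toy model of sorting by a column that is absent from the select list.
object MissingReferenceSketch {
  type Row = Map[String, Int]

  // SELECT selectCols FROM rows ORDER BY sortCol, where sortCol may be absent from selectCols.
  def selectSorted(rows: Seq[Row], selectCols: Seq[String], sortCol: String): Seq[Row] = {
    // Widen the projection so the sort column is available.
    val widened = (selectCols :+ sortCol).distinct
    val projected = rows.map(r => r.filter { case (k, _) => widened.contains(k) })
    val sorted = projected.sortBy(r => r(sortCol))
    // Project the extra column away again after sorting.
    sorted.map(r => r.filter { case (k, _) => selectCols.contains(k) })
  }
}
```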
-
object
ResolveNaturalAndUsingJoin extends Rule[LogicalPlan]
Removes natural or using joins by calculating output columns based on the output from both sides, then applying a Project on a normal Join to eliminate the natural or using join.
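As a rough illustration of "calculating output columns", the sketch below derives the column order of an inner natural join from the two input column lists: shared columns appear once, followed by the remaining columns of each side. NaturalJoinSketch is a hypothetical name, columns are plain strings, and outer-join variants are not modelled.

```scala
// Toy model of the output column list for an inner natural join.
object NaturalJoinSketch {
  // Shared columns once (in left-side order), then the rest of each side.
  def outputColumns(left: Seq[String], right: Seq[String]): Seq[String] = {
    val joinKeys = left.filter(c => right.contains(c))
    joinKeys ++
      left.filterNot(c => joinKeys.contains(c)) ++
      right.filterNot(c => joinKeys.contains(c))
  }
}
```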
-
object
ResolveNewInstance extends Rule[LogicalPlan]
Resolves NewInstance by finding and adding the outer scope to it if the object being constructed is an inner class.
-
object
ResolveOrdinalInOrderByAndGroupBy extends Rule[LogicalPlan]
In many dialects of SQL it is valid to use ordinal positions in order/sort by and group by clauses. This rule converts ordinal positions to the corresponding expressions in the select list. This support was introduced in Spark 2.0.
- When the sort references or group by expressions are not integer but foldable expressions, just ignore them.
- When spark.sql.orderByOrdinal/spark.sql.groupByOrdinal is set to false, ignore the position numbers too.
Before the release of Spark 2.0, the literals in order/sort by and group by clauses had no effect on the results.
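The ordinal substitution can be sketched as a lookup into the select list. The model below is an assumption-laden simplification: OrdinalSketch is a hypothetical name, expressions are strings, Left(n) models an integer literal, and out-of-range positions are simply left untouched, which may differ from what the real rule does.

```scala
// Toy model of replacing ordinal positions with select-list expressions.
object OrdinalSketch {
  // Left(n) models an integer literal, Right(expr) any other expression text.
  def resolve(
      selectList: Seq[String],
      orderBy: Seq[Either[Int, String]],
      ordinalEnabled: Boolean = true): Seq[String] =
    orderBy.map {
      // 1-based position within range: substitute the select-list expression.
      case Left(n) if ordinalEnabled && n >= 1 && n <= selectList.size => selectList(n - 1)
      // Ordinals disabled or out of range: keep the literal as it is.
      case Left(n) => n.toString
      case Right(expr) => expr
    }
}
```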
-
object
ResolveOutputRelation extends Rule[LogicalPlan]
Resolves columns of an output table from the data in a logical plan. This rule will:
- Reorder columns when the write is by name
- Insert casts when data types do not match
- Insert aliases when column names do not match
- Detect plans that are not compatible with the output table and throw AnalysisException
- object ResolvePivot extends Rule[LogicalPlan]
-
object
ResolveRandomSeed extends Rule[LogicalPlan]
Set the seed for random number generation.
-
object
ResolveReferences extends Rule[LogicalPlan]
Replaces UnresolvedAttributes with concrete AttributeReferences from a logical plan node's children.
-
object
ResolveRelations extends Rule[LogicalPlan]
Replaces UnresolvedRelations with concrete relations from the catalog.
-
object
ResolveSubquery extends Rule[LogicalPlan] with PredicateHelper
This rule resolves and rewrites subqueries inside expressions.
Note: CTEs are handled in CTESubstitution.
-
object
ResolveSubqueryColumnAliases extends Rule[LogicalPlan]
Replaces unresolved column aliases for a subquery with projections.
-
object
ResolveTables extends Rule[LogicalPlan]
Resolve table relations with concrete relations from v2 catalog.
ResolveRelations still resolves v1 tables.
-
object
ResolveTempViews extends Rule[LogicalPlan]
Resolve relations to temp views. This is not an actual rule, and is called by ResolveTables and ResolveRelations.
-
object
ResolveUpCast extends Rule[LogicalPlan]
Replace the UpCast expression by Cast, and throw exceptions if the cast may truncate.
-
object
ResolveWindowFrame extends Rule[LogicalPlan]
Check and add proper window frames for all window functions.
-
object
ResolveWindowOrder extends Rule[LogicalPlan]
Check and add order to AggregateWindowFunctions.
-
object
SessionCatalogAndIdentifier
Extract session catalog and identifier from a multi-part identifier.
- Definition Classes
- LookupCatalog
-
object
WindowsSubstitution extends Rule[LogicalPlan]
Substitute child plan with WindowSpecDefinitions.
-
object
Once extends Strategy with Product with Serializable
A strategy that is run once and idempotent.
- Definition Classes
- RuleExecutor