Skip to content

Commit 240840f

Browse files
gengliangwangmaropu
authored andcommitted
[SPARK-30515][SQL] Refactor SimplifyBinaryComparison to reduce the time complexity
### What changes were proposed in this pull request? The changes in the rule `SimplifyBinaryComparison` from #27008 could bring performance regression in the optimizer when there are a large set of filter conditions. We need to improve the implementation and reduce the time complexity. ### Why are the changes needed? Need to fix the potential performance regression in the optimizer. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing unit tests. Also run a micor benchmark in `BinaryComparisonSimplificationSuite` ``` object Optimize extends RuleExecutor[LogicalPlan] { val batches = Batch("Constant Folding", FixedPoint(50), SimplifyBinaryComparison) :: Nil } test("benchmark") { val a = Symbol("a") val condition = (1 to 500).map(i => EqualTo(a, a)).reduceLeft(And) val finalCondition = And(condition, IsNotNull(a)) val plan = nullableRelation.where(finalCondition).analyze val start = System.nanoTime() Optimize.execute(plan) println((System.nanoTime() - start) /1000000) } ``` Before the changes: 2507ms After the changes: 3ms Closes #27212 from gengliangwang/SimplifyBinaryComparison. Authored-by: Gengliang Wang <[email protected]> Signed-off-by: Takeshi Yamamuro <[email protected]>
1 parent 51d2917 commit 240840f

File tree

1 file changed

+25
-24
lines changed

1 file changed

+25
-24
lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala

Lines changed: 25 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -410,37 +410,38 @@ object SimplifyBinaryComparison
410410
extends Rule[LogicalPlan] with PredicateHelper with ConstraintHelper {
411411

412412
private def canSimplifyComparison(
413-
plan: LogicalPlan, left: Expression, right: Expression): Boolean = {
413+
left: Expression,
414+
right: Expression,
415+
notNullExpressions: => ExpressionSet): Boolean = {
414416
if (left.semanticEquals(right)) {
415-
if (!left.nullable && !right.nullable) {
416-
true
417-
} else {
418-
// We do more checks for non-nullable cases
419-
plan match {
420-
case Filter(fc, _) =>
421-
splitConjunctivePredicates(fc).exists { condition =>
422-
condition.semanticEquals(IsNotNull(left))
423-
}
424-
case _ => false
425-
}
426-
}
417+
(!left.nullable && !right.nullable) || notNullExpressions.contains(left)
427418
} else {
428419
false
429420
}
430421
}
431422

432423
def apply(plan: LogicalPlan): LogicalPlan = plan transform {
433-
case l: LogicalPlan => l transformExpressionsUp {
434-
// True with equality
435-
case a EqualNullSafe b if a.semanticEquals(b) => TrueLiteral
436-
case a EqualTo b if canSimplifyComparison(l, a, b) => TrueLiteral
437-
case a GreaterThanOrEqual b if canSimplifyComparison(l, a, b) => TrueLiteral
438-
case a LessThanOrEqual b if canSimplifyComparison(l, a, b) => TrueLiteral
439-
440-
// False with inequality
441-
case a GreaterThan b if canSimplifyComparison(l, a, b) => FalseLiteral
442-
case a LessThan b if canSimplifyComparison(l, a, b) => FalseLiteral
443-
}
424+
case l: LogicalPlan =>
425+
lazy val notNullExpressions = ExpressionSet(l match {
426+
case Filter(fc, _) =>
427+
splitConjunctivePredicates(fc).collect {
428+
case i: IsNotNull => i.child
429+
}
430+
case _ => Seq.empty
431+
})
432+
433+
l transformExpressionsUp {
434+
// True with equality
435+
case a EqualNullSafe b if a.semanticEquals(b) => TrueLiteral
436+
case a EqualTo b if canSimplifyComparison(a, b, notNullExpressions) => TrueLiteral
437+
case a GreaterThanOrEqual b if canSimplifyComparison(a, b, notNullExpressions) =>
438+
TrueLiteral
439+
case a LessThanOrEqual b if canSimplifyComparison(a, b, notNullExpressions) => TrueLiteral
440+
441+
// False with inequality
442+
case a GreaterThan b if canSimplifyComparison(a, b, notNullExpressions) => FalseLiteral
443+
case a LessThan b if canSimplifyComparison(a, b, notNullExpressions) => FalseLiteral
444+
}
444445
}
445446
}
446447

0 commit comments

Comments
 (0)