Skip to content

fix: query tolerance= in SQL file tests now also asserts Comet native execution#3797

Open
andygrove wants to merge 4 commits intoapache:mainfrom
andygrove:fix-sql-test-tolerance-coverage
Open

fix: query tolerance= in SQL file tests now also asserts Comet native execution#3797
andygrove wants to merge 4 commits intoapache:mainfrom
andygrove:fix-sql-test-tolerance-coverage

Conversation

@andygrove
Copy link
Copy Markdown
Member

@andygrove andygrove commented Mar 25, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

CometSqlFileTestSuite supports a query tolerance=<value> directive for floating-point queries where small numeric differences are expected. However, the WithTolerance mode called checkSparkAnswerWithTolerance, which hardcodes assertCometNative = false. This meant that all 50+ queries using this directive (covering sin, cos, tan, asin, acos, atan, atan2, sinh, cosh, tanh, exp, expm1, log, log2, log10, pow, sqrt, cot, stddev, variance, avg, sum, corr, covariance) were silently skipping the assertion that Comet actually executed the expression natively.

What changes are included in this PR?

  • Add checkSparkAnswerAndOperatorWithTolerance to CometTestBase, which combines tolerance-based result comparison with assertCometNative = true.
  • Change the WithTolerance case in CometSqlFileTestSuite to call the new method instead of checkSparkAnswerWithTolerance.
  • Fix tan.sql to add allowIncompatible config so the test runs natively.
  • Add native support for Spark's Logarithm expression (two-arg log(base, value)), mapping it to DataFusion's log(base, value) function.

How are these changes tested?

The existing SQL file tests for all affected math and aggregate expressions now enforce native execution as part of their normal CI run.

… execution The WithTolerance mode in CometSqlFileTestSuite called checkSparkAnswerWithTolerance, which hardcodes assertCometNative=false. This meant that all 50+ SQL test queries using 'query tolerance=...' (sin, cos, tan, stddev, avg, etc.) silently skipped the check that Comet actually executed the expression natively. Add checkSparkAnswerAndOperatorWithTolerance that combines tolerance- based result comparison with the native operator assertion, and use it in the WithTolerance case.
@andygrove andygrove marked this pull request as draft March 25, 2026 23:57
Maps Spark's Logarithm expression to DataFusion's log(base, value) function. Applies nullIfNegative to both base and value to match Spark's behavior of returning null when the result would be NaN (inputs <= 0).
@andygrove andygrove marked this pull request as ready for review March 26, 2026 05:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant