feat: Support Spark expression hours by 0lai0 · Pull Request #3804 · apache/datafusion-comet

0lai0 · 2026-03-27T07:15:00Z

Which issue does this PR close?

Rationale for this change

Comet previously did not support the Spark hours expression (a V2 partition transform).
Queries using the hours function for partitioning would fall back to Spark's JVM execution instead of running natively on DataFusion. By adding native support for this expression, we allow more Spark workloads (especially those partitioned by hourly intervals) to benefit from Comet's native acceleration.

What changes are included in this PR?

This change adds end-to-end native support for the hours partition transform. Since Hours is a PartitionTransformExpression (and not a TimeZoneAwareExpression), the timezone is injected from the session configuration during the planning phase.
The native implementation uses Arrow's unary and try_unary kernels for vectorized computation. It handles pre-epoch (negative) timestamps correctly by using Euclidean division (div_euclid), which rounds toward negative infinity when the divisor is positive. The two timestamp types are handled separately: TimestampType applies the timezone offset, while TimestampNTZType is computed directly from the wall-clock value.
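To illustrate why floor semantics matter for pre-epoch values, here is a minimal standalone sketch. It assumes the input is a Spark timestamp in microseconds since the Unix epoch; the helper names `hours_since_epoch` and `hours_truncating` are hypothetical and are not the functions added in hours.rs.

```rust
/// Spark stores timestamps as microseconds since the Unix epoch.
const MICROS_PER_HOUR: i64 = 3_600_000_000;

/// Hours since epoch with floor semantics (hypothetical helper for illustration).
/// For a positive divisor, `div_euclid` rounds toward negative infinity.
fn hours_since_epoch(micros: i64) -> i64 {
    micros.div_euclid(MICROS_PER_HOUR)
}

/// Plain `/` truncates toward zero, shown here only for comparison.
fn hours_truncating(micros: i64) -> i64 {
    micros / MICROS_PER_HOUR
}

fn main() {
    // One microsecond before the epoch belongs to hour -1, not hour 0.
    assert_eq!(hours_since_epoch(-1), -1);
    assert_eq!(hours_truncating(-1), 0);

    // Both approaches agree at the epoch and for positive timestamps.
    assert_eq!(hours_since_epoch(0), 0);
    assert_eq!(hours_since_epoch(3_599_999_999), 0);
    assert_eq!(hours_since_epoch(MICROS_PER_HOUR), 1);
    assert_eq!(hours_truncating(MICROS_PER_HOUR), 1);

    println!("pre-epoch floor division checks passed");
}
```

With truncating division, the hour before the epoch and the hour after it would both map to 0, so two distinct hourly partitions would collapse into one.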

expr.proto: Added HoursTransform message definition to pass the child expression and session timezone.
datetime.scala: Added CometHours serde handler to intercept the Spark Hours expression and read the timezone from SQLConf.
QueryPlanSerde.scala: Registered the CometHours handler in the temporal expressions map.
hours.rs: Added SparkHoursTransform UDF using efficient Arrow kernels.
temporal.rs & expression_registry.rs: Registered the native Builder for the new expression.

How are these changes tested?

Added tests in both Rust and Scala:

Rust Unit Tests: Added unit tests in hours.rs covering:
- Positive and negative (pre-epoch) timestamps
- Epoch boundary (zero)
- Timezone offset handling
- Null propagation
- Proper isolation of TimestampNTZType (ensuring it ignores timezone offsets)
```
cargo test -p datafusion-comet-spark-expr -- datetime_funcs::hours
```
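The cases listed above can be pictured with a small standalone sketch that uses plain `Vec<Option<i64>>` in place of Arrow arrays. Several things here are assumptions made for illustration: the `TimestampKind` enum and `hours_column` function are hypothetical, the timezone is modelled as a fixed UTC offset in seconds (real zones can have DST-dependent offsets), and the offset is added before dividing. The actual tests in hours.rs may be structured differently.

```rust
const MICROS_PER_HOUR: i64 = 3_600_000_000;
const MICROS_PER_SECOND: i64 = 1_000_000;

/// Hypothetical stand-in for the two Spark timestamp types.
#[derive(Clone, Copy)]
enum TimestampKind {
    /// TimestampType: shift by the session timezone offset before dividing.
    /// Assumption: a fixed offset in seconds, which ignores DST transitions.
    WithZone { utc_offset_secs: i64 },
    /// TimestampNTZType: wall-clock value, no timezone shift applied.
    Ntz,
}

/// Maps each non-null value to its hour index and keeps nulls as nulls.
/// Overflow of the addition is not handled in this sketch.
fn hours_column(values: &[Option<i64>], kind: TimestampKind) -> Vec<Option<i64>> {
    let offset_micros = match kind {
        TimestampKind::WithZone { utc_offset_secs } => utc_offset_secs * MICROS_PER_SECOND,
        TimestampKind::Ntz => 0,
    };
    values
        .iter()
        .map(|v| v.map(|micros| (micros + offset_micros).div_euclid(MICROS_PER_HOUR)))
        .collect()
}

fn main() {
    let values = vec![Some(0), None, Some(-1), Some(1_800_000_000)];

    // UTC+1: every value is shifted forward by one hour before dividing.
    let zoned = hours_column(&values, TimestampKind::WithZone { utc_offset_secs: 3600 });
    assert_eq!(zoned, vec![Some(1), None, Some(0), Some(1)]);

    // NTZ: same input, no offset applied.
    let ntz = hours_column(&values, TimestampKind::Ntz);
    assert_eq!(ntz, vec![Some(0), None, Some(-1), Some(0)]);

    println!("column checks passed");
}
```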
Scala Integration Tests: Evaluated end-to-end execution in CometTemporalExpressionSuite.
```
./mvnw test -pl spark -Dsuites='org.apache.comet.CometTemporalExpressionSuite'
```

0lai0 added 2 commits March 27, 2026 14:50

feat: Add Spark V2 partition transform Hours to calculate hours sin…

f7cf339

…ce epoch from timestamps.

fix style

ebe4073

0lai0 wants to merge 2 commits into apache:main from 0lai0:support_spark_hours

0lai0 commented Mar 27, 2026

1 participant
