Commit c5b7fda
feat: Implement ST_LENGTH geography function (#1791)
* feat: Implement ST_LENGTH geography function

This commit introduces the ST_LENGTH function for BigQuery DataFrames. ST_LENGTH computes the length of GEOGRAPHY objects in meters.

The implementation includes:
- A new operation `geo_st_length_op` in `bigframes.operations.geo_ops`.
- The user-facing function `st_length` in `bigframes.bigquery._operations.geo`.
- Exposure of the new operation and function in relevant `__init__.py` files.
- Comprehensive unit tests covering various geometry types (Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection), empty geographies, and NULL inputs.

The function behaves as per the BigQuery ST_LENGTH documentation:
- Returns 0 for POINT, MULTIPOINT, and empty GEOGRAPHYs.
- Returns the perimeter for POLYGON and MULTIPOLYGON.
- Returns the total length for LINESTRING and MULTILINESTRING.
- For GEOMETRYCOLLECTION, sums the lengths/perimeters of its constituent linestrings and polygons.

* feat: Add NotImplemented length property to GeoSeries

This commit adds a `length` property to the `GeoSeries` class. Accessing this property will raise a `NotImplementedError`, guiding you to utilize the `bigframes.bigquery.st_length()` function instead.

This change includes:
- The `length` property in `bigframes/geopandas/geoseries.py`.
- A unit test in `tests/system/small/geopandas/test_geoseries.py` to verify that the correct error is raised with the specified message when `GeoSeries.length` is accessed.

* Update bigframes/bigquery/_operations/__init__.py

* fix lint

* add missing compilation method

* use pandas for the expected values in tests

* fix: Apply patch for ST_LENGTH and related test updates

This commit applies a user-provided patch that includes:
- Removing `st_length` from `bigframes/bigquery/_operations/__init__.py`.
- Adding an Ibis implementation for `geo_st_length_op` in `bigframes/core/compile/scalar_op_compiler.py`.
- Modifying `KMeans` in `bigframes/ml/cluster.py` to handle `init="k-means++"`.
- Updating geo tests in `tests/system/small/bigquery/test_geo.py` to use `to_pandas()` and `pd.testing.assert_series_equal`.

Note: System tests requiring Google Cloud authentication were not executed due to limitations in my current environment.

* feat: Add use_spheroid parameter to ST_LENGTH and update docs

This commit introduces the `use_spheroid` parameter to the `ST_LENGTH` geography function, aligning it more closely with the BigQuery ST_LENGTH(geography_expression[, use_spheroid]) signature.

Key changes:
- `bigframes.operations.geo_ops.GeoStLengthOp` is now a dataclass that accepts `use_spheroid` (defaulting to `False`). A check is included to raise `NotImplementedError` if `use_spheroid` is `True`, as this is the current limitation in BigQuery.
- The Ibis compiler implementation for `geo_st_length_op` in `bigframes.core.compile.scalar_op_compiler.py` has been updated to accept the new `GeoStLengthOp` operator type.
- The user-facing `st_length` function in `bigframes.bigquery._operations.geo.py` now includes the `use_spheroid` keyword argument.
- The docstring for `st_length` has been updated to match the official BigQuery documentation, clarifying that only lines contribute to the length (points and polygons result in 0 length), and detailing the `use_spheroid` parameter. Examples have been updated accordingly.
- Tests in `tests/system/small/bigquery/test_geo.py` have been updated to:
  - Reflect the correct behavior (0 length for polygons/points).
  - Test calls with both default `use_spheroid` and explicit `use_spheroid=False`.
  - Verify that `use_spheroid=True` raises a `NotImplementedError`.

Note: System tests requiring Google Cloud authentication were not re-executed for this specific commit due to environment limitations identified in previous steps. The changes primarily affect the operator definition, function signature, and client-side validation, with the core Ibis compilation logic for length remaining unchanged.
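The corrected semantics described in this commit (only lines contribute to the length; points and polygons yield 0) can be sketched in plain Python. This is an illustrative sketch, not the BigQuery or BigQuery DataFrames implementation: the tuple-based geometry encoding, the names `sketch_length` and `haversine_m`, and the mean Earth radius constant are all assumptions made for the example, and the simple spherical model may not match BigQuery's results exactly.

```python
import math

# Assumed mean Earth radius in meters; chosen for this sketch only.
EARTH_RADIUS_M = 6371008.8


def haversine_m(p, q):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def sketch_length(geom):
    """Length in meters of the line parts of a geometry.

    Geometries are hypothetical (kind, payload) tuples; None stands in for NULL.
    """
    if geom is None:
        return None
    kind, payload = geom
    if kind == "LineString":
        # Sum the distances between consecutive vertices.
        return sum((haversine_m(a, b) for a, b in zip(payload, payload[1:])), 0.0)
    if kind == "MultiLineString":
        return sum((sketch_length(("LineString", line)) for line in payload), 0.0)
    if kind == "GeometryCollection":
        # Only the line members end up contributing a non-zero length.
        return sum((sketch_length(member) for member in payload), 0.0)
    # Point, MultiPoint, Polygon, MultiPolygon: no line content, so 0.
    return 0.0
```

For example, `sketch_length(("Polygon", [(0, 0), (1, 0), (1, 1), (0, 0)]))` returns `0.0`, while a collection holding a point and a line returns just the length of the line.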
* feat: Implement use_spheroid for ST_LENGTH via Ibis UDF

This commit refactors the ST_LENGTH implementation to correctly pass the `use_spheroid` parameter to BigQuery by using Ibis's `ibis_udf.scalar.builtin('ST_LENGTH', ...)` function.

Key changes:
- `bigframes.operations.geo_ops.GeoStLengthOp`: The client-side `NotImplementedError` for `use_spheroid=True` (raised in `__post_init__`) has been removed. BigQuery DataFrames will now pass this parameter directly to BigQuery.
- `bigframes.core.compile.scalar_op_compiler.geo_length_op_impl`: The implementation now always uses `ibis_udf.scalar.builtin('ST_LENGTH', x, op.use_spheroid)` instead of `x.length()`. This ensures the `use_spheroid` parameter is included in the SQL generated for BigQuery.
- `tests/system/small/bigquery/test_geo.py`:
  - The test expecting a client-side `NotImplementedError` for `use_spheroid=True` has been removed.
  - A new test `test_st_length_use_spheroid_true_errors_from_bq` has been added. This test calls `st_length` with `use_spheroid=True` and asserts that an exception is raised from BigQuery, as BigQuery itself currently only supports `use_spheroid=False` for the `ST_LENGTH` function.
  - Existing tests for `st_length` were already updated in a previous commit to reflect that only line geometries contribute to the length, and these continue to verify behavior with `use_spheroid=False`.

This change ensures that BigQuery DataFrames accurately reflects BigQuery's `ST_LENGTH` capabilities concerning the `use_spheroid` parameter.

* refactor: Use Ibis UDF for ST_LENGTH BigQuery builtin

This commit refactors the ST_LENGTH geography operation to use an Ibis UDF defined via `@ibis_udf.scalar.builtin`. This aligns with the pattern exemplified by other built-in functions like ST_DISTANCE when a direct Ibis method with all necessary parameters is not available.

Key changes:
- A new `st_length` function is defined in `bigframes/core/compile/scalar_op_compiler.py` using `@ibis_udf.scalar.builtin`. This UDF maps to BigQuery's `ST_LENGTH(geography, use_spheroid)` function.
- The `geo_length_op_impl` in the same file now calls this `st_length` Ibis UDF, replacing the previous use of `op_typing.ibis_function`.
- The `GeoStLengthOp` in `bigframes/operations/geo_ops.py` and the user-facing `st_length` function in `bigframes/bigquery/_operations/geo.py` remain unchanged from the previous version, as they correctly define the operation's interface and parameters.

This change provides a cleaner and more direct way to map the BigQuery DataFrames operation to the specific BigQuery ST_LENGTH SQL function signature, while maintaining the existing BigQuery DataFrames operation structure. The behavior of the `st_length` function, including its handling of the `use_spheroid` parameter and error conditions from BigQuery, remains the same.

* refactor: Consolidate st_length tests in test_geo.py

This commit refactors the system tests for the `st_length` geography function in `tests/system/small/bigquery/test_geo.py`.

The numerous individual test cases for different geometry types have been combined into a single, comprehensive test function `test_st_length_various_geometries`. This new test uses a single GeoSeries with a variety of inputs (Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection, None/Empty) and compares the output of `st_length` (with both default and explicit `use_spheroid=False`) against a pandas Series of expected lengths.

This consolidation improves the conciseness and maintainability of the tests for `st_length`. The test for `use_spheroid=True` (expecting an error from BigQuery) remains separate.

* fix: Correct export of GeoStLengthOp in operations init

This commit fixes an ImportError caused by an incorrect name being used for the ST_LENGTH geography operator in `bigframes/operations/__init__.py`.

When `geo_st_length_op` (a variable) was replaced by the dataclass `GeoStLengthOp`, the import and `__all__` list in this `__init__.py` file were not updated. This commit changes the import from `.geo_ops` to correctly import `GeoStLengthOp` and updates the `__all__` list to export `GeoStLengthOp`.

* fix system test and some linting

* fix lint

* fix doctest

* fix docstring

* Update bigframes/core/compile/scalar_op_compiler.py

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

1 parent d2154c8 commit c5b7fda
File tree
8 files changed, +170 -1 lines changed
- bigframes
  - bigquery
    - _operations
  - core/compile
  - geopandas
  - operations
- tests/system/small
  - bigquery
  - geopandas