Releases: googleapis/python-bigquery-dataframes

v2.0.0

17 Apr 19:46
881e4f0

2.0.0 (2025-04-17)

⚠ BREAKING CHANGES

  • make dataset and name params mandatory in udf (#1619)
  • Locational endpoints support is not available in BigFrames 2.0.
  • change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
  • change default ingress setting for remote_function to internal-only (#1544)
  • make remote_function params keyword only (#1537)
  • make remote_function default service account explicit (#1537)
  • set allow_large_results=False by default (#1541)
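
A rough sketch of what the keyword-only change means for callers. The function below is a hypothetical stand-in, not the real remote_function signature. Only the parameter names input_types, output_type, dataset, ingress_settings and cloud_function_service_account come from these notes; the service account default and the "all" value are placeholders.

```python
# Hypothetical stand-in for illustration only; this is NOT bigframes code.
# The three leading parameters may be passed positionally (#1560); everything
# after the bare `*` must be passed by keyword (#1537).
def remote_function_like(
    input_types=None,
    output_type=None,
    dataset=None,
    *,
    ingress_settings="internal-only",  # new default per #1544
    cloud_function_service_account=None,  # placeholder default (assumption)
):
    return {
        "input_types": input_types,
        "output_type": output_type,
        "dataset": dataset,
        "ingress_settings": ingress_settings,
        "cloud_function_service_account": cloud_function_service_account,
    }


# Positional use of the first three parameters still works.
config = remote_function_like([int], str, "my_dataset", ingress_settings="all")

# A fourth positional argument is rejected with TypeError.
try:
    remote_function_like([int], str, "my_dataset", "all")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Code that passed later arguments by position would need to switch to keywords when upgrading.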

Features

  • Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
  • Add component to manage temporary tables (#1559) (0a4e245)
  • Add Series.to_pandas_batches() method (#1592) (09ce979)
  • Add support for creating a Matrix Factorization model (#1330) (b5297f9)
  • Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
  • Allow pandas.cut 'labels' parameter to accept a list of strings (#1549) (af842b1)
  • Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
  • Detect duplicate column/index names in read_gbq before sending the query (#1615) (40d6960)
  • Drop support for locational endpoints (#1542) (4bf2e43)
  • Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
  • Improve local data validation (#1598) (815e471)
  • Make remote_function default service account explicit (#1537) (9eb9089)
  • Set allow_large_results=False by default (#1541) (e9fb712)
  • Support bigquery connection in managed function (#1554) (f6f697a)
  • Support bq connection path format (#1550) (e7eb918)
  • Support gemini-2.0-X models (#1558) (3104fab)
  • Support inlining small list, struct, json data (#1589) (2ce891f)
  • Support time range rolling on Series. (#1590) (6e98a2c)
  • Use session temp tables for all ephemeral storage (#1569) (9711b83)
  • Use validated local storage for data uploads (#1612) (aee4159)
  • Warn about the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)
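
A minimal sketch of the rolling additions (the on parameter and time range windows), written against plain pandas, whose API bigframes.pandas mirrors. It assumes BigFrames follows pandas semantics here; the column names and data are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2025-01-01 00:00:00",
                "2025-01-01 00:00:01",
                "2025-01-01 00:00:02",
                "2025-01-01 00:00:05",
            ]
        ),
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# `on` names the datetime column that defines the window, so the frame does
# not need a datetime index. "2s" is a time range window: each row sums the
# values whose timestamp falls in the 2 seconds up to and including its own.
rolled = df.rolling("2s", on="ts")["value"].sum()
print(rolled.tolist())  # [1.0, 3.0, 5.0, 4.0]
```

The last row stands alone because the gap to the previous timestamp is larger than the window.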

Bug Fixes

  • to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
  • Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
  • Include role and service account in IAM exception (#1564) (8c50755)
  • Make dataset and name params mandatory in udf (#1619) (637e860)
  • Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
  • Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
  • Read_csv supports tilde (~) local paths and includes the index for the bigquery_stream write engine (#1580) (352e8e4)
  • Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)
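
The pandas.cut items in this release (a list of strings for labels, and labels=False returning bin positions) can be illustrated with plain pandas. This is a sketch of the pandas behavior that BigFrames is assumed to match; the data is invented.

```python
import pandas as pd

scores = pd.Series([5, 15, 25])
bins = [0, 10, 20, 30]

# A list of strings names each bin.
named = pd.cut(scores, bins=bins, labels=["low", "mid", "high"])
print(named.astype(str).tolist())  # ['low', 'mid', 'high']

# labels=False returns the integer position of the bin instead of a label.
codes = pd.cut(scores, bins=bins, labels=False)
print(codes.tolist())  # [0, 1, 2]
```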

Performance Improvements

  • Directly read gbq table for simple plans (#1607) (6ad38e8)

v2.0.0.dev0

31 Mar 13:41

Pre-release

2.0.0.dev0 (2025-03-31)

⚠ BREAKING CHANGES

  • Locational endpoints support is not available in BigFrames 2.0.
  • change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
  • change default ingress setting for remote_function to internal-only (#1544)
  • make remote_function params keyword only (#1537)
  • make remote_function default service account explicit (#1537)
  • set allow_large_results=False by default (#1541)

Features

  • Add component to manage temporary tables (#1559) (0a4e245)
  • Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
  • Allow pandas.cut 'labels' parameter to accept a list of strings (#1549) (af842b1)
  • Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
  • Drop support for locational endpoints (#1542) (4bf2e43)
  • Make remote_function default service account explicit (#1537) (9eb9089)
  • Set allow_large_results=False by default (#1541) (e9fb712)
  • Support bigquery connection in managed function (#1554) (f6f697a)
  • Support bq connection path format (#1550) (e7eb918)
  • Support gemini-2.0-X models (#1558) (3104fab)

Bug Fixes

  • Include role and service account in IAM exception (#1564) (8c50755)
  • Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
  • Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)

Documentation

  • Add message to remove default model for version 3.0 (#1563) (910be2b)
  • Add warning for bigframes 2.0 (#1557) (3f0eaa1)
  • Remove gemini-1.5 deprecation warning for GeminiTextGenerator (#1562) (0cc6784)
  • Use restructured text to allow publishing to PyPI (#1565) (d1e9ec2)

Miscellaneous Chores

  • Make remote_function params keyword only (#1537) (9eb9089)

v1.42.0

27 Mar 07:46
b6b82ec

1.42.0 (2025-03-27)

Features

  • Add closed parameter in rolling() (#1539) (8bcc89b)
  • Add GeoSeries.difference() and bigframes.bigquery.st_difference() (#1471) (e9fe815)
  • Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() (#1529) (8542bd4)
  • Add df.take and series.take (#1509) (7d00be6)
  • Add LinearRegression.global_explain() (#1446) (7e5b6a8)
  • Allow iloc to support lists of negative indices (#1497) (a9cf215)
  • Support dry_run in to_pandas() (#1436) (75fc7e0)
  • Support window partition by geo column (#1512) (bdcb1e7)
  • Upgrade BQ managed udf to preview (#1536) (4a7fe4d)
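
A short sketch of three of the features above (closed in rolling(), negative positions in iloc, and take), using plain pandas as a reference for the semantics. It assumes the BigFrames versions behave the same way; the data is illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# closed="right" (the default) includes the current row in each window;
# closed="left" excludes it, so the window covers the two rows before it.
right = s.rolling(2, closed="right").sum()
left = s.rolling(2, closed="left").sum()
print(right.iloc[3], left.iloc[3])  # 7.0 5.0

# Negative positions count from the end, as in Python lists.
last_two = s.iloc[[-1, -2]]
ends = s.take([0, -1])
print(last_two.tolist(), ends.tolist())  # [4, 3] [1, 4]
```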

Bug Fixes

  • Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X (#1534) (c93e720)
  • Change the default value for pdf extract/chunk (#1517) (a70a607)
  • Local data always has sequential index (#1514) (014bd33)
  • Read_pandas inline returns None when exceeds limit (#1525) (578081e)
  • Temporary fix for a backend bug that stops StreamingDataFrame from working (#1533) (6ab4ffd)
  • Tolerate BQ connection service account propagation delay (#1505) (6681f1f)

Documentation

  • Update GeoSeries.difference() and bigframes.bigquery.st_difference() docs (#1526) (d553fa2)

v1.41.0

19 Mar 19:38
0cdc874

1.41.0 (2025-03-19)

Features

  • Add support for the 'right' parameter in 'pandas.cut' (#1496) (8aff128)
  • Support BQ managed functions through read_gbq_function (#1476) (802183d)
  • Warn when the BigFrames version is more than a year old (#1455) (00e0750)
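
For reference, this is how the 'right' parameter of pandas.cut behaves in plain pandas; the BigFrames implementation is assumed to follow it. Values that sit exactly on a bin edge move to the next bin when right=False.

```python
import pandas as pd

values = pd.Series([10, 20])
bins = [0, 10, 20, 30]

# right=True (default): bins are (0, 10], (10, 20], (20, 30].
right_closed = pd.cut(values, bins=bins, labels=False)
print(right_closed.tolist())  # [0, 1]

# right=False: bins are [0, 10), [10, 20), [20, 30).
left_closed = pd.cut(values, bins=bins, right=False, labels=False)
print(left_closed.tolist())  # [1, 2]
```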

Bug Fixes

  • Fix pandas.cut errors with empty bins (#1499) (434fb5d)
  • Fix read_gbq with ORDER BY query and index_col set (#963) (de46d2f)

v1.40.0

11 Mar 23:15
5273d36

1.40.0 (2025-03-11)

⚠ BREAKING CHANGES

  • reading JSON data as a custom arrow extension type (#1458)

Features

  • Reading JSON data as a custom arrow extension type (#1458) (e720f41)
  • Support list output for managed function (#1457) (461e9e0)

Bug Fixes

  • Fix list-like indexers in partial ordering mode (#1456) (fe72ada)
  • Fix the merge issue between #1424 and #1373 (#1461) (7b6e361)
  • Use == instead of is for timedelta type equality checks (#1480) (0db248b)
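
Why the == fix matters, as a generic Python sketch: two type objects can describe the same type without being the same object, so an identity check can fail where an equality check succeeds. DurationType is a hypothetical stand-in, not a BigFrames or pyarrow class.

```python
from dataclasses import dataclass


# Hypothetical type descriptor used only for this illustration.
@dataclass(frozen=True)
class DurationType:
    unit: str


a = DurationType("us")
b = DurationType("us")

same_value = a == b   # True: the fields match
same_object = a is b  # False: two separate instances
print(same_value, same_object)  # True False
```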

Performance Improvements

  • Compilation no longer bounded by recursion (#1464) (27ab028)

v1.39.0

05 Mar 20:03
c928920

1.39.0 (2025-03-05)

Features

  • (Preview) Support diff() for date series (#1423) (521e987)
  • (Preview) Support aggregations over timedeltas (#1418) (1251ded)
  • (Preview) Support arithmetics between dates and timedeltas (#1413) (962b152)
  • (Preview) Support automatic load of timedelta from BQ tables. (#1429) (b2917bb)
  • Add allow_large_results option to many I/O methods. Set to False to reduce latency (#1428) (dd2f488)
  • Add GeoSeries.boundary() (#1435) (32cddfe)
  • Add allow_large_results to peek (#1448) (67487b9)
  • Add groupby.rank() (#1433) (3a633d5)
  • Iloc multiple columns selection. (#1437) (ddfd02a)
  • Support interface for BigQuery managed functions (#1373) (2bbf53f)
  • Warn if default ingress_settings is used in remote_functions (#1419) (dfd891a)
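
A small sketch of groupby.rank() and of diff() producing timedeltas, shown with plain pandas as a stand-in for the BigFrames API. Note that pandas stores these values as datetime64 rather than a date type, so this only approximates the date series case; the data is made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"team": ["a", "a", "b", "b"], "score": [10, 30, 20, 5]}
)

# Rank each score within its own team (1.0 is the smallest).
ranks = df.groupby("team")["score"].rank()
print(ranks.tolist())  # [1.0, 2.0, 2.0, 1.0]

# diff() over dates yields timedeltas, which can then be aggregated.
dates = pd.Series(pd.to_datetime(["2025-03-01", "2025-03-03", "2025-03-08"]))
gaps = dates.diff()
longest_gap = gaps.max()
print(longest_gap)  # 5 days 00:00:00
```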

Bug Fixes

  • Do not compare schema description during schema validation (#1452) (03a3a56)
  • Remove warnings for null index and partial ordering mode in prep for GA (#1431) (6785aee)
  • Warn if default cloud_function_service_account is used in remote_function (#1424) (fe7463a)
  • Window operations over JSON columns (#1451) (0070e77)
  • Write chunked text instead of dummy text for pdf chunk (#1444) (96b0e8a)

Documentation

  • Add snippet for explaining the linear regression model prediction (#1427) (7c37c7d)

v1.38.0

24 Feb 19:31
aeb5063

1.38.0 (2025-02-24)

Features

  • (Preview) Support diff aggregation for timestamp series. (#1405) (abe48d6)
  • Add GeoSeries.from_wkt() and GeoSeries.to_wkt() (#1401) (2993b28)
  • Support DF.__array__(copy=True) (#1403) (693ed8c)
  • Support routines with ARRAY return type in read_gbq_function (#1412) (4b60049)

Bug Fixes

  • Calling to_timedelta() over timedeltas no longer changes their values (#1411) (650a190)
  • Replace empty dict with None to avoid mutable default arguments (#1416) (fa4e3ad)
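
The mutable default argument fix refers to a general Python pitfall. The sketch below shows the pattern with hypothetical helper functions; it is not the code changed in #1416.

```python
# Hypothetical helpers for illustration only.
def add_option_buggy(key, value, options={}):
    # The default dict is created once and shared across calls.
    options[key] = value
    return options


def add_option_fixed(key, value, options=None):
    # None as the default gives each call its own fresh dict.
    if options is None:
        options = {}
    options[key] = value
    return options


first = add_option_buggy("a", 1)
second = add_option_buggy("b", 2)
print(second)  # {'a': 1, 'b': 2}: state leaked from the first call

clean_first = add_option_fixed("a", 1)
clean_second = add_option_fixed("b", 2)
print(clean_second)  # {'b': 2}
```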

Dependencies

  • Remove scikit-learn and sqlalchemy as required dependencies (#1296) (fd8bc89)

Documentation

  • Add samples using SQL methods via the bigframes.bigquery module (#1358) (f54e768)
  • Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial (#1310) (c6c9120)

v1.37.0

19 Feb 16:18
4df61b4

1.37.0 (2025-02-19)

Features

  • JSON dtype support for read_pandas and Series constructor (#1391) (44f4137)
  • Support add, sub, mult, div, and more between timedeltas (#1396) (ffa63d4)
  • Support comparison, ordering, and filtering for timedeltas (#1387) (34d01b2)
  • Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns (#1390) (50ad3a5)
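
The timedelta operations listed above, sketched in plain pandas. BigFrames is assumed to follow the same semantics; the values are illustrative.

```python
import pandas as pd

start = pd.Series(pd.to_datetime(["2025-02-01 00:00", "2025-02-02 00:00"]))
delta = pd.Series(pd.to_timedelta(["1h", "36h"]))

# Subtract a timedelta column from a datetime column.
shifted = start - delta
print(shifted.iloc[0])  # 2025-01-31 23:00:00

# Arithmetic between timedeltas and scalars.
doubled = delta * 2

# Comparison and filtering.
long_ones = delta[delta > pd.Timedelta("1D")]

# Aggregation.
total = delta.sum()
print(total)  # 1 days 13:00:00
```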

Bug Fixes

  • Ensure binops with pandas objects returns bigquery dataframes (#1404) (3cee24b)

v1.36.0

11 Feb 12:26
641abea

1.36.0 (2025-02-11)

Features

  • Add bigframes.bigquery.st_area and suggest it from GeoSeries.area (#1318) (8b5ffa8)
  • Add GeoSeries.from_xy() (#1364) (3c3e14c)

Bug Fixes

  • Dtype parameter ineffective in Series/DataFrame construction (#1354) (b9bdca8)
  • Translate labels to col ids when copying dataframes (#1372) (0c55b07)
  • Fixed an AttributeError related to sqlglot that occurred when using bigframes (#1379) (24962cd)

v1.35.0

04 Feb 21:12
9a21f25

1.35.0 (2025-02-04)

Features

  • Add Series.keys() (#1342) (deb015d)
  • Allow case_when to change dtypes if case list contains the condition (True, some_default_value) (#1311) (5c2a2c6)
  • Support python type as astype arg (#1316) (b26e135)
  • Support time_series_id_col in ARIMAPlus (#1282) (97532c9)
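
A brief sketch of Series.keys() and of passing a Python type to astype, using plain pandas as the reference behavior that BigFrames is assumed to mirror. The resulting dtype in BigFrames may differ from the pandas dtype shown here.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# keys() returns the index, matching the dict-like interface.
labels = list(s.keys())
print(labels)  # ['a', 'b', 'c']

# A Python type can be passed where a dtype is expected.
as_float = s.astype(float)
print(as_float.tolist())  # [1.0, 2.0, 3.0]
```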

Bug Fixes

  • Exclude DataFrame and Series __call__ from unimplemented API metrics (#1351) (f2d5264)
  • Make DataFrame __getattr__ and __setattr__ more robust to subclassing (#1352) (417de3a)

Dependencies

  • Add support for Python 3.13 for everything but remote functions (#1307) (533db96)

Documentation

  • Add GeoSeries docs (#1327) (05f83d1)
  • Add link to DataFrames intro to improve SEO (#1176) (aafb5be)
  • Add snippet to explain the univariate model's forecast result in the Forecast a single time series with a univariate model tutorial (#1272) (c22126b)