BigQuery API - Module Google::Cloud::Bigquery::External (v1.50.0)

Reference documentation and code samples for the BigQuery API module Google::Cloud::Bigquery::External.

External

Creates a new DataSource (or subclass) object that represents an external data source, which can be queried directly even though its data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

See DataSource, CsvSource, JsonSource, SheetsSource, BigtableSource

Examples

    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new

    csv_url = "gs://bucket/path/to/data.csv"
    csv_table = bigquery.external csv_url do |csv|
      csv.autodetect = true
      csv.skip_leading_rows = 1
    end

    data = bigquery.query "SELECT * FROM my_ext_table",
                          external: { my_ext_table: csv_table }

    # Iterate over the first page of results
    data.each do |row|
      puts row[:name]
    end
    # Retrieve the next page of results
    data = data.next if data.next?

Hive partitioning options:

    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new

    gcs_uri = "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/*"
    source_uri_prefix = "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/"
    external_data = bigquery.external gcs_uri, format: :parquet do |ext|
      ext.hive_partitioning_mode = :auto
      ext.hive_partitioning_require_partition_filter = true
      ext.hive_partitioning_source_uri_prefix = source_uri_prefix
    end

    external_data.hive_partitioning? #=> true
    external_data.hive_partitioning_mode #=> "AUTO"
    external_data.hive_partitioning_require_partition_filter? #=> true
    external_data.hive_partitioning_source_uri_prefix #=> source_uri_prefix
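
To illustrate what the source URI prefix means in the Hive partitioning example, the sketch below shows how partition keys are conventionally encoded as `key=value` directory segments after the prefix. It is a plain-Ruby illustration only: `hive_partition_keys` is a hypothetical helper, not part of google-cloud-bigquery, and the bucket and file paths are made up for the example.

```ruby
# Hypothetical helper (not part of google-cloud-bigquery): extracts the
# key=value directory segments that follow the source URI prefix.
def hive_partition_keys(file_uri, source_uri_prefix)
  unless file_uri.start_with?(source_uri_prefix)
    raise ArgumentError, "#{file_uri} is not under #{source_uri_prefix}"
  end

  relative_path = file_uri.delete_prefix(source_uri_prefix)
  # Drop the trailing file name; only directory segments encode partitions.
  directories = relative_path.split("/")[0...-1]

  directories.each_with_object({}) do |segment, keys|
    key, value = segment.split("=", 2)
    # Skip segments that are not in key=value form.
    keys[key] = value if value && !key.empty?
  end
end

# Assumed example paths; a real layout depends on how the data was written.
prefix = "gs://example-bucket/tables/"
file   = "gs://example-bucket/tables/dt=2020-11-15/lang=en/part-000.parquet"

partitions = hive_partition_keys(file, prefix)
partitions.each { |key, value| puts "#{key} = #{value}" }
```

With a layout like this, everything up to the prefix is shared by all files, and the segments after it are what a partition filter in a query would refer to.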