1 vote · 0 answers · 55 views

I am trying to use pipelines in Databricks to ingest data from an external location to the data lake using Auto Loader, and I am facing this issue. I have noticed other posts with similar errors, but in ...
MattSt • 1,203
0 votes · 0 answers · 191 views

Issue: I have a Databricks Workflow/job running a pytest test that is being marked as "Failed" because one of the Autoloader pipelines within it fails, despite the overall job succeeding and ...
r_g_s_ • 306
0 votes · 1 answer · 140 views

I am using Databricks Autoloader with PySpark to stream Parquet files into a Delta table. Here's a simplified version of what I am doing: spark.readStream \ .format("cloudFiles") \ ....
Zeruno • 1,689
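For context, the truncated snippet above follows the standard Auto Loader (cloudFiles) read pattern. A hedged sketch of the full wiring is below; the paths and table name are hypothetical placeholders, and `spark` is the session Databricks predefines in a notebook:

```python
# Minimal sketch of an Auto Loader (cloudFiles) Parquet -> Delta stream.
# All paths and table names below are hypothetical placeholders.

def autoloader_options(schema_location: str) -> dict:
    """Build the cloudFiles options for a Parquet Auto Loader stream."""
    return {
        "cloudFiles.format": "parquet",
        # Where Auto Loader tracks the inferred/evolving schema:
        "cloudFiles.schemaLocation": schema_location,
    }

# On a Databricks cluster this would be wired up roughly as:
#
# (spark.readStream
#     .format("cloudFiles")
#     .options(**autoloader_options("/mnt/checkpoints/_schema"))
#     .load("/mnt/raw/events")
#     .writeStream
#     .option("checkpointLocation", "/mnt/checkpoints/events")
#     .trigger(availableNow=True)
#     .toTable("bronze.events"))
```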
1 vote · 0 answers · 96 views

Hi all, I am using Databricks Autoloader with PySpark to ingest Parquet files from a directory. Here's a simplified version of my current setup: spark.readStream \ .format("cloudFiles") \ ....
Zeruno • 1,689
0 votes · 0 answers · 164 views

I'm using Databricks Autoloader to load Parquet files: def run_autoloader(table_name, checkpoint_path, latest_file_location, new_columns): # Configure Auto Loader to ingest parquet data to a Delta ...
Boris • 906
0 votes · 0 answers · 131 views

We are using Databricks Autoloader to process Parquet files into Delta format. The job is scheduled to run once per day and the code looks like this: def run_autoloader(table_name, checkpoint_path, ...
Boris • 906
0 votes · 1 answer · 335 views

I am trying to read a Delta table in Delta Shares shared from other environments. The pipeline runs okay; however, as the Delta table is updated in the source (Delta Share in GCP), the code below gets ...
Diego • 11
0 votes · 2 answers · 258 views

I'm struggling to understand how to control the backfill process baked into Autoloader: https://docs.databricks.com/en/ingestion/cloud-object-storage/auto-loader/production.html#trigger-regular-...
Andy McWilliams
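The linked production-docs section covers `cloudFiles.backfillInterval`, which schedules periodic directory-listing backfills so files whose notification events were dropped are still eventually ingested. A sketch of setting it (the source format and interval are illustrative assumptions, not prescriptive):

```python
# Sketch: file-notification options plus a scheduled backfill.
# The source format and interval below are illustrative only.

def notification_options(backfill_interval: str = "1 day") -> dict:
    """cloudFiles options enabling notifications plus periodic backfills."""
    return {
        "cloudFiles.format": "json",
        "cloudFiles.useNotifications": "true",
        # Completeness guarantee: Auto Loader also lists the directory on
        # this cadence to pick up files whose notification events were lost.
        "cloudFiles.backfillInterval": backfill_interval,
    }
```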
-1 votes · 1 answer · 267 views

I am using Databricks Autoloader to process files in streaming (micro-batch) mode. The source files are in text format. While the checkpoints are created and the stream does not fail, the Delta table ...
Shoaib Maroof
0 votes · 0 answers · 795 views

Looking for a solution to ingest empty Parquet files into Databricks using Autoloader into Unity Catalog Delta tables without causing the stream to fail. We use a batch process that processes ...
Htape • 1
0 votes · 0 answers · 186 views

We use the Autoloader pattern in Databricks to fill from raw files in our storage account into Delta tables. Though successful for almost all tables, for two tables we face strange behaviour. It takes ...
FrancoisDuCoq
1 vote · 0 answers · 213 views

I intend to use Autoloader in file notification mode. I want to control the naming of the Event Grid subscription and storage queues, so I am using the module recommended in the official docs to ...
Andy McWilliams
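One documented route to controlling queue naming is to provision the Event Grid subscription and queue yourself (e.g. via your own IaC) and hand the pre-created queue to Auto Loader with `cloudFiles.queueName`. A sketch, with a hypothetical queue name and source format:

```python
# Sketch: point Auto Loader at a pre-provisioned storage queue instead of
# letting it auto-create one, so naming stays under your IaC's control.

def notification_options_with_queue(queue_name: str) -> dict:
    return {
        "cloudFiles.format": "csv",              # illustrative source format
        "cloudFiles.useNotifications": "true",
        # Existing queue created by your own Event Grid/queue deployment:
        "cloudFiles.queueName": queue_name,
    }
```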
0 votes · 2 answers · 616 views

I am seeking guidance on handling full load scenarios within Databricks using Autoloader. Please don't go too hard on me, since I lack practical experience at this point in time. My scenario is a ...
Erhnam • 1,031
0 votes · 1 answer · 204 views

I am using Autoloader in Trigger Once mode to load Parquet files from an S3 location. My goal is to implement change data capture by comparing the source and target Delta tables to identify and ...
user26458184
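A common pattern for the compare-and-apply step described above is a MERGE inside `foreachBatch`; the helper and commented wiring below are a sketch with hypothetical table and key names (note that on current runtimes `.trigger(availableNow=True)` supersedes Trigger Once):

```python
# Sketch of change data capture via MERGE in foreachBatch.
# Table and key names are hypothetical.

def merge_condition(keys):
    """Build the ON clause for a Delta MERGE keyed on the given columns."""
    return " AND ".join(f"t.{k} = s.{k}" for k in keys)

# def upsert(batch_df, batch_id):
#     from delta.tables import DeltaTable
#     target = DeltaTable.forName(batch_df.sparkSession, "silver.orders")
#     (target.alias("t")
#            .merge(batch_df.alias("s"), merge_condition(["order_id"]))
#            .whenMatchedUpdateAll()
#            .whenNotMatchedInsertAll()
#            .execute())
#
# The Auto Loader stream would then be started with
# .writeStream.foreachBatch(upsert).trigger(availableNow=True).
```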
3 votes · 0 answers · 638 views

I am using Autoloader with Schema Inference to automatically load some data into S3. I have one column that is a Map which is overwhelming Autoloader (it tries to infer it as struct -> creating a ...
Robert Kossendey
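When inference turns a map-like column into an ever-growing struct, `cloudFiles.schemaHints` lets you pin that column's type explicitly instead of letting inference widen it. A sketch, where the column name, format, and schema location are hypothetical:

```python
# Sketch: override schema inference for one column via schemaHints.
# The column name, format, and schema location are hypothetical.

def options_with_schema_hint(column: str, hint_type: str) -> dict:
    return {
        "cloudFiles.format": "json",
        "cloudFiles.schemaLocation": "/mnt/checkpoints/_schema",
        # Pin the problematic column to a fixed type instead of letting
        # inference widen it into an ever-growing struct:
        "cloudFiles.schemaHints": f"{column} {hint_type}",
    }
```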
